Tuesday, August 8, 2017

[ Python FAQ ] UnicodeDecodeError: 'utf8' codec can't decode byte 0x9c

Source From Here 
Question 
I have a socket server that is supposed to receive valid UTF-8 text from clients. The problem is that some clients (mainly hackers) send all kinds of invalid data over it. 

I need to be able to turn the received data into a valid UTF-8 string, with or without those characters. 

How-To 
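Please refer to the "Unicode HOWTO". In Python 2, decode the byte string with an explicit encoding and an error handler (without the encoding argument, unicode() falls back to the default ASCII codec): 
  s = unicode(s, 'utf-8', errors='replace')  
Or 
  s = unicode(s, 'utf-8', errors='ignore')  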
Note: errors='ignore' strips out the offending bytes and returns the string without them. Use it only if you need to remove them rather than convert them; errors='replace' substitutes the replacement character U+FFFD instead. 
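
The difference between the two error handlers can be seen in a minimal sketch. It assumes Python 3, where the same handlers are passed to bytes.decode() instead of unicode(); the payload below is a made-up example, not data from the original question.

```python
# Hypothetical payload: ASCII text with one stray byte (0x9c) that is not valid UTF-8.
raw = b"hello \x9c world"

# errors="replace" substitutes U+FFFD (the replacement character) for the bad byte.
replaced = raw.decode("utf-8", errors="replace")

# errors="ignore" silently drops the bad byte.
ignored = raw.decode("utf-8", errors="ignore")

# Strict decoding (the default) raises UnicodeDecodeError on the same input.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
```

With 'replace' the position of the bad data stays visible in the result, which tends to be easier to debug than silently dropping bytes.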

Alternatively, use the open function from the codecs module to read in the file: 
  import codecs  
  with codecs.open(file_name, "r", encoding='utf-8', errors='ignore') as fdata:  
      ...  
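
In Python 3 the built-in open() accepts the same encoding and errors arguments, so the codecs module is not required there. The following is a rough sketch under that assumption; read_text_lenient and the temporary file contents are hypothetical names and data used only for illustration.

```python
import os
import tempfile


def read_text_lenient(path, errors="replace"):
    """Read a file as UTF-8, handling invalid bytes according to `errors`."""
    with open(path, "r", encoding="utf-8", errors=errors) as fdata:
        return fdata.read()


# Demo: write a file containing valid UTF-8 ("café") plus one invalid byte (0x9c).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"caf\xc3\xa9 \x9c ok")
    tmp_path = tmp.name

text_replaced = read_text_lenient(tmp_path)                  # bad byte becomes U+FFFD
text_ignored = read_text_lenient(tmp_path, errors="ignore")  # bad byte is dropped
os.remove(tmp_path)
```

The valid multi-byte sequence for "é" is decoded normally in both cases; only the stray byte is affected by the error handler.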

