Tuesday, August 8, 2017

[ Python FAQ ] UnicodeDecodeError: 'utf8' codec can't decode byte 0x9c

Source From Here 
Question 
I have a socket server that is supposed to receive valid UTF-8 characters from clients. The problem is that some clients (mainly hackers) send all the wrong kinds of data over it. 

I need to be able to make the string valid UTF-8, with or without those characters. 

How-To 
Please refer to "Unicode HowTo" 
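    text = unicode(data, 'utf-8', errors='replace')
Or
    text = unicode(data, 'utf-8', errors='ignore')
Note: errors='replace' substitutes the replacement character U+FFFD for the undecodable bytes, while errors='ignore' strips those bytes out and returns the string without them. Only use 'ignore' if you need to drop them rather than convert them. The encoding is passed explicitly because Python 2's unicode() otherwise falls back to the default ASCII codec, and the result is stored under a new name so the built-in str is not shadowed. In Python 3 the equivalent call is data.decode('utf-8', errors='replace').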
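To make the difference between the two error handlers concrete, here is a minimal sketch. The payload and the helper name decode_lenient are made up for illustration; it uses bytes.decode, which accepts the same errors argument in Python 2.7 and Python 3.

```python
# Hypothetical payload: plain text with a stray 0x9c byte in the middle.
raw = b"hello \x9c world"

def decode_lenient(data, errors="replace"):
    """Decode bytes as UTF-8 without raising on malformed input.

    decode_lenient is an illustrative helper, not a library function.
    """
    return data.decode("utf-8", errors=errors)

replaced = decode_lenient(raw, "replace")  # bad byte becomes U+FFFD
ignored = decode_lenient(raw, "ignore")    # bad byte is silently dropped

# The default handler is 'strict', which raises the error from the title.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
```

With 'replace' the position of the bad data stays visible in the result, which can help when logging what a misbehaving client sent; with 'ignore' that information is lost.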

Alternatively, if the data comes from a file, use the open function from the codecs module to read it in: 
    import codecs
    with codecs.open(file_name, "r", encoding='utf-8', errors='ignore') as fdata:
        ...
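The same idea can be tried end to end with a throwaway file. This is only a sketch: the file contents are invented, and it uses io.open (the built-in open in Python 3, also available in Python 2.6+), which takes the same encoding and errors arguments as codecs.open.

```python
import io
import os
import tempfile

# Write a temporary file containing one invalid byte (0x9c) between valid text.
fd, file_name = tempfile.mkstemp()
with os.fdopen(fd, "wb") as fout:
    fout.write(b"line one\nbad \x9c byte\n")

# errors='ignore' drops the undecodable byte instead of raising
# UnicodeDecodeError while reading.
with io.open(file_name, "r", encoding="utf-8", errors="ignore") as fdata:
    lines = fdata.read().splitlines()

os.remove(file_name)  # clean up the temporary file
```

Passing errors='replace' instead would keep a U+FFFD marker where the bad byte was.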

