Source From Here
Question
I have a socket server that is supposed to receive valid UTF-8 from clients. The problem is that some clients (mainly hackers) send arbitrary bytes that are not valid UTF-8.
I need to be able to turn the received data into a valid UTF-8 string, with or without the offending bytes.
How-To
Please refer to "Unicode HowTo"
Note: Decoding with errors='ignore' strips out the invalid bytes and returns the string without them, while errors='replace' substitutes the replacement character U+FFFD for each one. Use 'ignore' only if you need to drop the bytes rather than mark them.
- # Python 2: name the encoding explicitly, since the default is ASCII
- text = unicode(data, 'utf-8', errors='replace')
- text = unicode(data, 'utf-8', errors='ignore')
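In Python 3, where the `unicode` built-in no longer exists, the same two error handlers are available through `bytes.decode`. The following is a minimal sketch rather than part of the original answer; the helper name `to_valid_text` and the sample payload are made up for illustration.

```python
def to_valid_text(raw: bytes, drop_invalid: bool = False) -> str:
    """Decode raw bytes as UTF-8 without raising on malformed input."""
    handler = "ignore" if drop_invalid else "replace"
    return raw.decode("utf-8", errors=handler)


# 0xff can never appear in valid UTF-8, so it stands in for bad client data.
payload = b"caf\xc3\xa9 \xff ok"

replaced = to_valid_text(payload)                     # bad byte becomes U+FFFD
stripped = to_valid_text(payload, drop_invalid=True)  # bad byte is removed

# Re-encoding the decoded text yields valid UTF-8 bytes.
clean_bytes = stripped.encode("utf-8")
```

With `replace`, the receiver can still see where data was damaged; with `ignore`, the damage is silent, which is why it suits pure sanitizing.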
Alternatively, if the data comes from a file, use the open function from the codecs module to read it:
- import codecs
- with codecs.open(file_name, "r", encoding='utf-8', errors='ignore') as fdata:
- ...
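For Python 3, a comparable sketch can use the built-in `open`, which takes the same `encoding` and `errors` keyword arguments. The temporary file and its contents below are assumptions made only so the example runs on its own.

```python
import os
import tempfile

# Create a small file containing one byte (0xff) that is invalid in UTF-8.
fd, file_name = tempfile.mkstemp()
with os.fdopen(fd, "wb") as raw:
    raw.write(b"good line\nbad \xff byte\n")

# errors='ignore' silently drops the invalid byte while reading.
with open(file_name, "r", encoding="utf-8", errors="ignore") as fdata:
    lines = [line.rstrip("\n") for line in fdata]

os.remove(file_name)
```

Switching `errors` to `'replace'` would keep a U+FFFD marker in place of the dropped byte.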