Thursday, July 12, 2018

[ Python FAQ ] Find the length of a sentence with English words and Chinese characters

Source From Here 
Question 
The sentence may include non-English characters, e.g. Chinese: 
  1. 你好,hello world  
The expected length is 14 (2 Chinese characters, 10 English letters, 1 space and 1 comma). 

How-To 
Decoding is required (in Python 2, where a plain string literal is a byte string): 
>>> s = "你好,hello world" 
>>> len(s) 
18  # Each of the 2 Chinese characters takes 3 bytes in UTF-8, i.e. 2 bytes more than one character, so 14 + 2 * 2 = 18 
>>> len(s.decode('utf-8'))  # After decoding, len() counts characters instead of bytes, which gives the correct length 
14 
>>> for c in s.decode('utf-8'): 
...     print(c) 
... 
你 
好 
, 
h 
e 
l 
l 
o 

w 
o 
r 
l 
d
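
The session above is Python 2, where a plain string literal holds bytes. As a minimal sketch for comparison, assuming Python 3 (where `str` already holds Unicode code points), the same check could look like the following. The variable names `char_count`, `byte_count`, `raw`, and `decoded_count` are illustrative only and are not from the original post.

```python
# Python 3 sketch: str stores Unicode code points, bytes stores encoded data.
s = "你好,hello world"

# len() on a str counts code points, so no decoding step is needed.
char_count = len(s)

# Encoding to UTF-8 shows the byte length that Python 2 reported for len(s).
byte_count = len(s.encode("utf-8"))

# Data that arrives as raw bytes (e.g. from a file opened in binary mode)
# still has to be decoded before len() counts characters.
raw = s.encode("utf-8")
decoded_count = len(raw.decode("utf-8"))

print(char_count, byte_count, decoded_count)  # 14 18 14
```

In Python 3, calling `s.decode('utf-8')` on a `str` raises `AttributeError`, because only `bytes` objects have a `decode` method.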

