Thursday, July 12, 2018

[ Python FAQ ] Find the length of a sentence with English words and Chinese characters

Source From Here 
Question 
The sentence may include non-English characters, e.g. Chinese: 
  1. 你好,hello world  
The expected length is 14 (2 Chinese characters, 10 English letters, 1 space and 1 comma). 

How-To 
Decoding is required (this is Python 2, where a plain str holds bytes): 
>>> s = "你好,hello world" 
>>> len(s)  # Counts bytes: each Chinese character takes 3 bytes in UTF-8, 2 more than a single character, so 14 + 2 * 2 = 18 
18 
>>> len(s.decode('utf-8'))  # After decoding, len() counts characters and gives the correct length 
14 
>>> for c in s.decode('utf-8'): 
...     print(c) 
... 
你 
好 
, 
h 
e 
l 
l 
o 

w 
o 
r 
l 
d
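
The same byte-versus-character distinction can be sketched as a small script. This is only an illustration, not part of the original answer: the helper name char_length is hypothetical, and the script assumes the text is UTF-8 encoded. It is written to run on both Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
# Minimal sketch; char_length is a hypothetical helper, not a library function.


def char_length(text, encoding="utf-8"):
    """Return the number of characters (code points) in text.

    Accepts either a byte string or an already-decoded string.
    The default encoding is an assumption; pass another one if needed.
    """
    if isinstance(text, bytes):
        # Byte strings must be decoded first; otherwise len() counts bytes.
        text = text.decode(encoding)
    return len(text)


sentence = u"你好,hello world"
raw = sentence.encode("utf-8")  # the UTF-8 byte string

print(len(raw))               # 18 bytes
print(char_length(raw))       # 14 characters
print(char_length(sentence))  # 14 characters

# UTF-8 width of the first three characters: two Chinese characters, one comma.
widths = [len(ch.encode("utf-8")) for ch in sentence[:3]]
print(widths)                 # [3, 3, 1]
```

Note that len() on a decoded string counts code points, so characters built from several combining code points would still be counted more than once.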

