Thursday, July 12, 2018

[ Python FAQ ] Find the length of a sentence with English words and Chinese characters

Source From Here 
Question 
The sentence may include non-English characters, e.g. Chinese: 
  1. 你好,hello world  
The expected length is 14 (2 Chinese characters, 10 English letters, 1 space, and 1 comma). 

How-To 
In Python 2, a str holds raw bytes, so decoding is required: 
>>> s = "你好,hello world" 
>>> len(s) 
18   # bytes, not characters: UTF-8 encodes each of the 2 Chinese characters as 3 bytes, i.e. 2 extra bytes each, so 14 + 2 * 2 = 18 
>>> len(s.decode('utf-8'))   # after decoding to unicode, len() counts characters 
14 
>>> for c in s.decode('utf-8'): 
...     print(c) 
... 
你 
好 
, 
h 
e 
l 
l 
o 

w 
o 
r 
l 
d
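
For comparison, here is a minimal sketch of the same check on Python 3. It assumes Python 3, where str is already a sequence of Unicode characters, so len() gives the character count directly and encoding to UTF-8 gives the byte count. The helper name char_and_byte_lengths is hypothetical and used only for illustration.

```python
# Sketch for Python 3 (an assumption: the session above is Python 2).
# char_and_byte_lengths is a hypothetical helper, not part of any library.

def char_and_byte_lengths(text):
    """Return (number of Unicode characters, number of UTF-8 bytes)."""
    return len(text), len(text.encode("utf-8"))


s = "你好,hello world"
n_chars, n_bytes = char_and_byte_lengths(s)
print(n_chars)  # 14 characters
print(n_bytes)  # 18 bytes: each Chinese character takes 3 bytes in UTF-8

# Raw bytes (e.g. read from a file or socket) still need decoding first.
raw = s.encode("utf-8")
print(len(raw.decode("utf-8")))  # 14
```

On Python 3, calling s.decode('utf-8') on a str raises AttributeError, because only bytes objects have decode().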

