Wednesday, December 25, 2019

[ Python FAQ ] How to find all occurrences of a substring?

Source From Here
Question
Python has str.find() and str.rfind() to get the index of a substring in a string.

I'm wondering whether there is something like str.find_all() which can return all found indexes (not only the first from the beginning or the first from the end). For example:
string = "test test test test"

print(string.find('test'))   # 0
print(string.rfind('test'))  # 15

# This is the goal (str.find_all is hypothetical and does not exist):
# print(string.find_all('test'))  # [0, 5, 10, 15]
How-To
There is no simple built-in string function that does what you're looking for, but you could use the more powerful regular expression module re:
>>> import re
>>> ts = 'test test test test'
>>> [m.start() for m in re.finditer('test', ts)]
[0, 5, 10, 15]
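
Note that re.finditer treats its first argument as a regular expression, so a substring containing metacharacters such as . or ( needs to be escaped first. A minimal sketch, assuming a helper named find_all (a hypothetical name, not part of str or re):

```python
import re

def find_all(text, sub):
    """Return the start index of every non-overlapping occurrence of sub.

    find_all is an illustrative helper, not a built-in.
    re.escape makes sub match literally, even if it contains regex metacharacters.
    """
    return [m.start() for m in re.finditer(re.escape(sub), text)]

print(find_all('test test test test', 'test'))
print(find_all('a.b a.b', '.'))
```

Without re.escape, the second call would match every character, because an unescaped . matches any character except a newline.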

If you want to find overlapping matches, lookahead will do that:
>>> [m.start() for m in re.finditer('(?=tt)', 'ttt')]
[0, 1]
>>> [m.start() for m in re.finditer('tt', 'ttt')]
[0]
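
If you would rather avoid regular expressions, the same two behaviours can be sketched with a plain str.find loop. This is only one possible approach; the name iter_find and its overlapping flag are assumptions made for illustration:

```python
def iter_find(text, sub, overlapping=False):
    """Yield the start index of each occurrence of sub in text, left to right.

    iter_find is an illustrative helper, not a built-in.
    """
    if not sub:
        raise ValueError("sub must be a non-empty string")
    # Advance by one character to allow overlaps, or by len(sub) to skip past the match.
    step = 1 if overlapping else len(sub)
    start = text.find(sub)
    while start != -1:
        yield start
        start = text.find(sub, start + step)

print(list(iter_find('ttt', 'tt', overlapping=True)))
print(list(iter_find('ttt', 'tt')))
```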

If you want a reverse find-all without overlaps, you can combine positive and negative lookahead into an expression like this:
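import re

search = 'tt'
# Keep a match only if no other occurrence starts within the next len(search)-1 characters
pattern = '(?=%s)(?!.{1,%d}%s)' % (search, len(search) - 1, search)
print([m.start() for m in re.finditer(pattern, 'ttt')])  # [1]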
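
One caveat about the lookahead pattern above: as far as I can tell, it only rejects a match when another occurrence starts within the next len(search)-1 characters, so on longer runs such as 'tttt' it reports only index 2 and misses the second right-to-left match at 0. It also builds the quantifier {1,0} for a one-character search string, which re rejects. A sketch of an alternative that scans from the right with str.rfind, assuming a hypothetical helper named rfind_all:

```python
def rfind_all(text, sub):
    """Return start indexes of non-overlapping occurrences, found right to left.

    rfind_all is an illustrative helper, not a built-in.
    """
    if not sub:
        raise ValueError("sub must be a non-empty string")
    positions = []
    end = len(text)
    while True:
        # Search only text[0:end], so the next match cannot overlap the previous one.
        pos = text.rfind(sub, 0, end)
        if pos == -1:
            break
        positions.append(pos)
        end = pos
    return positions

print(rfind_all('ttt', 'tt'))
print(rfind_all('tttt', 'tt'))
```

The indexes come back in right-to-left order; wrap the result in sorted() if ascending order is needed.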
re.finditer returns an iterator, so you can change the [] in the list comprehensions above to () to get a generator expression instead of a list. That is more memory-efficient if you only iterate through the results once.
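
A short illustration of the difference, reusing the ts string from the earlier example (the variable names starts, first and rest are just for this sketch):

```python
import re

ts = 'test test test test'

# Parentheses instead of brackets: nothing is searched until a value is requested.
starts = (m.start() for m in re.finditer('test', ts))

first = next(starts)   # pulls only the first match
rest = list(starts)    # consumes whatever is left

print(first)
print(rest)
```

Once consumed, the generator is exhausted, so build a list instead if the indexes are needed more than once.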
