Question
I want to use re.MULTILINE but NOT re.DOTALL, so that I can have a regex that includes both an "any character" wildcard and the normal . wildcard that doesn't match newlines.
Is there a way to do this? What should I use to match any character in those instances that I want to include newlines?
How-To
To match any character, including a newline, without re.S/re.DOTALL, you can use any of the following character classes:
- [\s\S]
- [\w\W]
- [\d\D]
Compared with (.|\s) and other alternation-based variants, the character class solution is much more efficient because it involves far less backtracking when used with a * or + quantifier. In one small example, (?:.|\n)+ takes 45 steps to complete, while [\s\S]+ takes just 2.
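Python's re module does not report step counts, so the figures above cannot be reproduced directly with it. As a rough proxy, the sketch below checks that both patterns match the same text and then times them with timeit. The sample string and repeat count are arbitrary assumptions, and the timings will vary by machine and Python version.

```python
import re
import timeit

# Arbitrary sample input (an assumption for illustration): many short lines.
text = "line one\nline two\n" * 200

alternation = re.compile(r"(?:.|\n)+")  # alternation-based "any character"
char_class = re.compile(r"[\s\S]+")     # character-class "any character"

# Both patterns should consume the entire string.
same = alternation.match(text).group() == char_class.match(text).group()
print("same match:", same)

# Rough timing comparison; no fixed result is implied here.
t_alt = timeit.timeit(lambda: alternation.match(text), number=200)
t_cls = timeit.timeit(lambda: char_class.match(text), number=200)
print(f"alternation: {t_alt:.4f}s  character class: {t_cls:.4f}s")
```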
Let's test this solution a little bit:
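A minimal sketch of such a test, using a made-up sample string (the variable names and the text are assumptions for illustration). It uses re.MULTILINE without re.DOTALL, so . keeps its normal meaning while [\s\S] stands in for "any character":

```python
import re

# Made-up sample text for illustration.
text = "start\nline one\nline two\nend\n"

# Without re.DOTALL, '.' stops at the first newline, so this finds nothing.
dot_match = re.search(r"start.*end", text, re.MULTILINE)
print(dot_match)  # None

# [\s\S] matches any character, newlines included.
any_match = re.search(r"start[\s\S]*end", text, re.MULTILINE)
print(repr(any_match.group()))  # 'start\nline one\nline two\nend'

# '.' still behaves normally, and ^/$ anchor at each line under re.MULTILINE.
line_matches = re.findall(r"^line .+$", text, re.MULTILINE)
print(line_matches)  # ['line one', 'line two']

# Both wildcards in one pattern: skip across lines, then capture a single line.
mixed = re.search(r"^start$[\s\S]*?^(line t.+)$", text, re.MULTILINE)
print(mixed.group(1))  # line two
```

The same patterns work with [\w\W] or [\d\D] in place of [\s\S], since each class pairs a shorthand with its complement.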
Supplement
* String services : re — Regular expression operations