Unexpected result from re.split with the re.DOTALL flag in Python 2.7.1

I have a Mac running Lion and Python 2.7.1. I noticed something very strange in the re module. If I run the following line:

print re.split(r'\s*,\s*', 'a, b,\nc, d, e, f, g, h, i, j, k,\nl, m, n, o, p, q, r') 

I get this result:

 ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r'] 

But if I run it with the re.DOTALL flag as follows:

 print re.split(r'\s*,\s*', 'a, b,\nc, d, e, f, g, h, i, j, k,\nl, m, n, o, p, q, r', re.DOTALL) 

Then I get this result:

 ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q, r'] 

Note that 'q, r' comes back as a single element instead of two.

Why is this happening? I do not understand why the re.DOTALL flag would matter when there are no dots in my pattern. Am I doing something wrong, or is this a bug?

1 answer
 >>> import re
 >>> s = 'a, b,\nc, d, e, f, g, h, i, j, k,\nl, m, n, o, p, q, r'
 >>> re.split(r'\s*,\s*', s)
 ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r']
 >>> re.split(r'\s*,\s*', s, maxsplit=16)
 ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q, r']
 >>> re.split(r'\s*,\s*', s, flags=re.DOTALL)
 ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r']

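The problem is that you pass re.DOTALL positionally. The signature is re.split(pattern, string, maxsplit=0, flags=0), so the third positional argument is bound to maxsplit, not to flags. re.DOTALL is a constant equal to 16, so your call performs at most 16 splits and returns the remainder, 'q, r', as the last element. Pass the flag by keyword instead: flags=re.DOTALL.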
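To make the explanation above concrete, here is a minimal sketch that reproduces the effect without relying on positional arguments. The variable names (pattern, dotall_value, limited, intended, compiled) are illustrative and not from the question. Compiling the pattern first is shown only as one possible way to avoid the mix-up, since a compiled pattern's split() method takes no flags argument.

```python
import re

s = 'a, b,\nc, d, e, f, g, h, i, j, k,\nl, m, n, o, p, q, r'
pattern = r'\s*,\s*'

# re.DOTALL is an integer-valued flag; its numeric value is 16.
dotall_value = int(re.DOTALL)

# What the positional call effectively does: the third parameter of
# re.split is maxsplit, so the flag's value caps the number of splits.
limited = re.split(pattern, s, maxsplit=dotall_value)

# What was intended: pass the flag by keyword.
intended = re.split(pattern, s, flags=re.DOTALL)

# Alternative: attach the flag when compiling, then split.
compiled = re.compile(pattern, re.DOTALL)
via_compiled = compiled.split(s)

print(dotall_value)
print(limited[-1])
print('%d %d %d' % (len(limited), len(intended), len(via_compiled)))
```

With 17 commas in the string, a cap of 16 splits leaves the final 'q, r' untouched, which matches the output in the question.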


Source: https://habr.com/ru/post/1380859/

