Is Python's re.split with multiple delimiters buggy?

I looked at the answers to this previously asked question:

Split strings with multiple delimiters?

For my version of this problem, I wanted to split on everything that is not in a specific character set. This led me to a solution that I liked, until I ran into this apparent bug. Is this a bug, or some Python quirk I am not familiar with?

>>> import re
>>> b = "Which_of'these-markers/does,it:choose to;split!on?"
>>> b1 = re.split("[^a-zA-Z0-9_'-/]+", b)
>>> b1
["Which_of'these-markers/does,it", 'choose', 'to', 'split', 'on', '']

I don't understand why it does not split on the comma (','), given that the comma is not in my list of allowed characters.

+4
2 answers

The '-/ inside the character class creates a range, running from ' (ASCII 39) to / (ASCII 47), and that range contains the comma (ASCII 44).
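To see exactly which characters that range covers, a small sketch (the variable names are illustrative only) can list every code point from ' to /:

```python
# Enumerate the characters covered by the range '-/ in a character class.
start, end = ord("'"), ord("/")   # 39 and 47
covered = [chr(code) for code in range(start, end + 1)]
print(covered)
# → ["'", '(', ')', '*', '+', ',', '-', '.', '/']
print("," in covered)
# → True
```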

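In Python re, a literal hyphen can be placed inside a character class in the following ways:

  • at the start: [-A-Z] (matches an uppercase ASCII letter or -)
  • at the end: [A-Z()-] (matches an uppercase ASCII letter, (, ) or -)
  • right after a valid range: [A-Z-+] (matches an uppercase ASCII letter, - or +)
  • escaped with a backslash.

It cannot be placed after a shorthand class and right before another character (as in [\w-+], which raises a "bad character range" error). That is accepted in .NET, but not in Python re.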
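The placement rules above can be checked with a short sketch. The list name valid_patterns and the flag compiled are made up for this illustration; only re.match, re.compile and re.error come from the standard library.

```python
import re

# Ways to include a literal hyphen in a character class:
# at the start, at the end, right after a range, or escaped.
valid_patterns = [r"[-A-Z]", r"[A-Z()-]", r"[A-Z-+]", r"[A-Z\-+]"]
for pattern in valid_patterns:
    # Each of these classes should match a lone hyphen.
    assert re.match(pattern, "-") is not None

# A hyphen between a shorthand class and another character is rejected.
try:
    re.compile(r"[\w-+]")
    compiled = True
except re.error:
    compiled = False

print(compiled)
# → False
```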

So put the hyphen at the end of the character class, or escape it. Use:

re.split(r"[^a-zA-Z0-9_'/-]+", b)

In Python 2.7, where \w matches only [a-zA-Z0-9_] by default, this can even be shortened to:

re.split(r"[^\w'/-]+", b)
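As a quick check, here is a sketch that runs the corrected patterns against the string from the question. The names fixed, escaped and short are illustrative, and passing re.ASCII is an assumption for running the \w variant on Python 3, where \w otherwise also matches non-ASCII word characters.

```python
import re

b = "Which_of'these-markers/does,it:choose to;split!on?"

# Hyphen moved to the end of the class, so it is a literal character.
fixed = re.split(r"[^a-zA-Z0-9_'/-]+", b)
print(fixed)
# → ["Which_of'these-markers/does", 'it', 'choose', 'to', 'split', 'on', '']

# Equivalent form with the hyphen escaped instead of moved.
escaped = re.split(r"[^a-zA-Z0-9_'\-/]+", b)

# Shorthand version; re.ASCII keeps \w limited to [a-zA-Z0-9_] on Python 3.
short = re.split(r"[^\w'/-]+", b, flags=re.ASCII)
print(fixed == escaped == short)
# → True
```

The trailing empty string appears because the input ends with a delimiter ('?').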
+7

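'-/ is interpreted as a range, from ASCII 39 to 47, which includes the comma, ASCII 44.

Put the - at the end of the character class so that it is not interpreted as a range.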

+2

Source: https://habr.com/ru/post/1674592/

