Split a string around any characters not specified

I want to split a string into a list around every character that is not a digit or a dot. The split method only lets me specify the separators to match, not the characters to keep. Is a regular expression the best tool for this situation?

For example, the string "10.23, 10.13.21; 10.1 10.5 and 10.23.32" should return the list ['10.23', '10.13.21', '10.1', '10.5', '10.23.32'].

As such, I believe the best regular expression to use in this situation would be [\d\.]+

Is this the best way to handle this?

3 answers

If you are thinking of using that pattern with re.findall, an alternative is re.split with the negated version of your character class:

    In [1]: import re

    In [2]: s = "10.23, 10.13.21; 10.1 10.5 and 10.23.32"

    In [3]: re.split(r'[^\d\.]+', s)
    Out[3]: ['10.23', '10.13.21', '10.1', '10.5', '10.23.32']
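To compare the two calls side by side, here is a minimal sketch. The `padded` string is a made-up input, added only to show an edge case: when the text starts or ends with a separator, re.split also returns empty strings, while re.findall does not.

```python
import re

s = "10.23, 10.13.21; 10.1 10.5 and 10.23.32"

# re.findall collects every run of digits and dots directly.
found = re.findall(r"[\d.]+", s)

# re.split cuts on every run of characters that are NOT digits or dots.
pieces = re.split(r"[^\d.]+", s)

# Hypothetical input with separators at both ends (not from the question).
padded = "; " + s + " ;"
padded_split = re.split(r"[^\d.]+", padded)

# Dropping the empty strings makes the split result match findall again.
filtered = [p for p in padded_split if p]

print(found)
print(pieces == found)
print(padded_split[0] == "" and padded_split[-1] == "")
print(filtered == found)
```

Inside a character class the dot does not need escaping, so `[\d.]` and `[\d\.]` behave the same.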

If you need a non-regex solution, you can use str.translate to turn every character other than '.0123456789' into a space and then call split():

    In [69]: mystr
    Out[69]: '10.23, 10.13.21; 10.1 10.5 and 10.23.32'

    In [70]: mystr.translate(' '*46 + '. ' + '0123456789' + ' '*198).split()
    Out[70]: ['10.23', '10.13.21', '10.1', '10.5', '10.23.32']
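The 256-entry table above only covers the first 256 character codes. As a rough Python 3 sketch of the same idea, the table can be built with str.maketrans from the characters that actually occur in the input. Building it per input is my own assumption for keeping the example short, and the names `keep`, `table` and `parts` are illustrative only.

```python
import string

s = "10.23, 10.13.21; 10.1 10.5 and 10.23.32"

# Characters that should survive: the ten digits and the dot.
keep = set(string.digits + ".")

# Map every other character that occurs in s to a single space.
# str.maketrans accepts a dict whose keys are single characters.
table = str.maketrans({c: " " for c in set(s) - keep})

# After translation only digits, dots and spaces remain,
# so a plain split() on whitespace yields the pieces.
parts = s.translate(table).split()
print(parts)
```

Because split() with no arguments collapses runs of whitespace, separators such as " and " do not produce empty strings.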

Hope this helps


Perhaps a more readable form of what @inspectorG4dget suggested:

    >>> import string
    >>> s = '10.23, 10.13.21; 10.1 10.5 and 10.23.32'
    >>> allowed = set(string.digits + '.')
    >>> ''.join(c if c in allowed else ' ' for c in s).split()
    ['10.23', '10.13.21', '10.1', '10.5', '10.23.32']

This avoids regular expressions, which is often a good idea when a simple alternative exists.


Source: https://habr.com/ru/post/1447074/

