Regex for Python string matching

I want to match the numeric values in this string:

1,000 metric tonnes per contract month Five cents ($0.05) per tonne Five cents ($0.05) per tonne 1,000 metric tonnes per contract month 

My current approach:

 size = re.findall(r'(\d+(,?\d*).*?)', my_string) 

What I get with my approach:

 print size
 [(u'1,000', u',000')]

As you can see, the leading 1 is cut off in the second element of the tuple. Why? Also, can I get a hint on how to match terms like $0.05?

+4
5 answers

Something like this:

 >>> import re
 >>> strs = """1,000 metric tonnes per contract month Five cents ($0.05) per tonne Five cents ($0.05) per tonne 1,000 metric tonnes per contract month"""
 >>> [m.group(0) for m in re.finditer(r'\$?\d+([,.]\d+)?', strs)]
 ['1,000', '$0.05', '$0.05', '1,000']

Demo: http://rubular.com/r/UomzIY3SD3

+3

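re.findall() returns a tuple of the capture groups for each match when the pattern contains more than one group, and every pair of plain parentheses creates such a group. That is why you get (u'1,000', u',000'): the outer group and the inner group. Use non-capturing groups, (?:...), and write your regex as follows: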

 size = re.findall(r'\d{1,3}(?:,\d{3})*(?:\.\d+)?', my_string) 

Explanation:

 \d{1,3}      # One to three digits
 (?:,\d{3})*  # Optional thousands groups
 (?:\.\d+)?   # Optional decimal part

This assumes that all numbers use commas as thousands separators, i.e. there are no numbers like 1000000. If you need to match those too, use:

 size = re.findall(r'\d+(?:,\d{3})*(?:\.\d+)?', my_string) 
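
To make the difference concrete, here is a minimal sketch (Python 3, with a shortened sample string and variable names chosen only for illustration) that compares a pattern with capturing groups against the non-capturing version, and then runs both patterns from this answer on a number without separators:

```python
import re

# Shortened sample text (an assumption; any similar string works).
my_string = ("1,000 metric tonnes per contract month "
             "Five cents ($0.05) per tonne")

# Two capturing groups: findall returns one tuple of group contents per match.
with_groups = re.findall(r'(\d+(,\d+)?)', my_string)
print(with_groups)     # [('1,000', ',000'), ('0', ''), ('05', '')]

# Non-capturing groups only: findall returns each whole match as a string.
without_groups = re.findall(r'\d{1,3}(?:,\d{3})*(?:\.\d+)?', my_string)
print(without_groups)  # ['1,000', '0.05']

# A number without thousands separators splits under the strict pattern...
strict = re.findall(r'\d{1,3}(?:,\d{3})*(?:\.\d+)?', '1000000')
print(strict)          # ['100', '000', '0']

# ...but stays whole when the leading digit run is unbounded.
lenient = re.findall(r'\d+(?:,\d{3})*(?:\.\d+)?', '1000000')
print(lenient)         # ['1000000']
```

Note that neither pattern includes the dollar sign, so $0.05 comes back as 0.05.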
+3

Try this regex:

 (\$?\d+(?:[,.]?\d*(?:\.\d+)?)).*? 


0

Why are you grouping your regular expression? Try r'\$?\d+,?\d*\.?\d*'
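
A quick check of this ungrouped pattern, as a sketch on a made-up sentence (the sample text and variable names are assumptions, not from the question). Because every part after the first digits is optional on its own, the pattern also swallows a sentence-ending period. The second pattern is not from this answer; it simply adds an optional \$ to the non-capturing pattern suggested in another answer, for comparison:

```python
import re

# Hypothetical sample text with a number right before a full stop.
text = ("1,000 metric tonnes per contract month. "
        "Five cents ($0.05) per tonne. Lot size: 5.")

# Ungrouped pattern: no capture groups, so findall returns plain strings.
flat_matches = re.findall(r'\$?\d+,?\d*\.?\d*', text)
print(flat_matches)    # ['1,000', '$0.05', '5.']

# Stricter variant: a comma or dot only counts when digits follow it.
strict_matches = re.findall(r'\$?\d+(?:,\d{3})*(?:\.\d+)?', text)
print(strict_matches)  # ['1,000', '$0.05', '5']
```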

0

I would try this regex:

 r'[0-9]+(?:[0-9]+)(?:.[0-9])?'

Add \$? at the beginning to optionally match a leading $.

0

Source: https://habr.com/ru/post/1487285/

