Match numbers if the string starts with a keyword

Question

I have a file that looks like this:

    foo: 11.00 12.00 bar 13.00
    bar: 11.00 12.00 bar
    foo: 11.00 12.00

and would like to extract all the numbers in the lines starting with the keyword "foo:". Expected Result:

    ['11.00', '12.00', '13.00']
    ['11.00', '12.00']

Now it is easy if I use two regular expressions, for example:

    if re.match(r'^foo:', line):
        re.findall(r'\d+\.\d+', line)

but I was wondering if it is possible to combine them into one regular expression?

Thanks for your help, MD

+4

python regex

Mike d Nov 01 '11 at 13:09

3 answers

egor83 · Answer 1 · 2011-11-01T13:39:59+0000

Not quite what you asked for, but since plain string methods are generally preferred over regular expressions where they suffice, I would do something like this:

    import re

    with open('numbers.txt', 'r') as f:
        # Keep only lines that start with the keyword, then pull out the numbers.
        results = [re.findall(r'\d+\.\d+', line) for line in f if line.startswith('foo')]

UPDATE

And this will return the numbers after 'foo' even if 'foo' appears somewhere in the middle of the line rather than at the beginning:

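    import re

    with open('numbers.txt', 'r') as f:
        # partition() returns (before, separator, after); search only the part after 'foo'.
        results = [re.findall(r'\d+\.\d+', line.partition('foo')[2]) for line in f]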
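To make the difference between the two variants concrete, here is a minimal sketch that runs both on in-memory lines instead of a file. The first three lines are the sample from the question; the fourth line and the names `NUMBER`, `lines`, `starts_with` and `after_keyword` are made up for illustration.

```python
import re

NUMBER = re.compile(r"\d+\.\d+")

# Sample data from the question, plus one hypothetical line where
# 'foo:' does not sit at the start of the line.
lines = [
    "foo: 11.00 12.00 bar 13.00",
    "bar: 11.00 12.00 bar",
    "foo: 11.00 12.00",
    "baz 9.00 foo: 1.50",
]

# Variant 1: keep only lines that start with the keyword.
starts_with = [NUMBER.findall(line) for line in lines if line.startswith("foo:")]

# Variant 2: search only the text after the first 'foo', wherever it occurs.
# Lines without 'foo' are not skipped; they contribute an empty list.
after_keyword = [NUMBER.findall(line.partition("foo")[2]) for line in lines]

print(starts_with)    # [['11.00', '12.00', '13.00'], ['11.00', '12.00']]
print(after_keyword)  # [['11.00', '12.00', '13.00'], [], ['11.00', '12.00'], ['1.50']]
```

Note that the `partition` variant produces one entry per input line, so lines without the keyword show up as empty lists unless they are filtered out afterwards.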

Some programmer dude · Answer 2 · 2011-11-01T13:20:10+0000

If every matching line in the file always contains the same number of values (three in this case), you can use a single regular expression with one group per value:

    r"^foo:[^\d]*(\d+\.\d+)[^\d]*(\d+\.\d+)[^\d]*(\d+\.\d+)"

Example:

    >>> import re
    >>> line = "foo: 11.00 12.00 bar 13.00"
    >>> re.match(r"^foo:[^\d]*(\d+\.\d+)[^\d]*(\d+\.\d+)[^\d]*(\d+\.\d+)", line).groups()
    ('11.00', '12.00', '13.00')

Putting parentheses around part of a regular expression turns it into a group that can be extracted from the match object. See the Python documentation on the re module for more information.
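The "same number of values" condition matters in practice. The sketch below shows what a three-group pattern of this shape does with lines that hold two or four numbers. The pattern is written with `\d+`, so each group needs at least one digit on either side of the dot, and the names `FIXED`, `three`, `two` and `four` are illustrative only.

```python
import re

# One capturing group per expected number: exactly three are required.
FIXED = re.compile(r"^foo:\D*(\d+\.\d+)\D*(\d+\.\d+)\D*(\d+\.\d+)")

three = FIXED.match("foo: 11.00 12.00 bar 13.00")
two = FIXED.match("foo: 11.00 12.00")
four = FIXED.match("foo: 1.00 2.00 3.00 4.00")

print(three.groups())  # ('11.00', '12.00', '13.00')
print(two)             # None: the third group has nothing to match
print(four.groups())   # ('1.00', '2.00', '3.00'): the fourth number is dropped
```

So for the sample file in the question, where the second "foo:" line has only two numbers, this approach would miss that line entirely.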

Austin marshall · Answer 3 · 2011-11-01T14:15:06+0000

You can do without the first regex: filter the lines in a list comprehension by comparing the first four characters of each line with "foo:", and compile the number regex once up front:

    import re

    with open("input.txt", "r") as inp:
        prog = re.compile(r"\d+\.\d+")
        results = [prog.findall(line) for line in inp if line[:4] == "foo:"]
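The slice `line[:4]` only works because "foo:" happens to be four characters long. As a rough sketch of the same idea with the keyword as a parameter, `str.startswith` avoids the hard-coded length. The function name `extract_numbers` and the in-memory `sample` list are assumptions made for this example, not part of the answer above.

```python
import re

NUMBER = re.compile(r"\d+\.\d+")  # compiled once, reused for every line


def extract_numbers(lines, keyword="foo:"):
    """Return one list of number strings per line that starts with keyword."""
    return [NUMBER.findall(line) for line in lines if line.startswith(keyword)]


# The sample data from the question, held in memory instead of a file.
sample = [
    "foo: 11.00 12.00 bar 13.00",
    "bar: 11.00 12.00 bar",
    "foo: 11.00 12.00",
]

results = extract_numbers(sample)
print(results)  # [['11.00', '12.00', '13.00'], ['11.00', '12.00']]
```

Since an open file object is iterable line by line, the same function should also accept one directly, for example `extract_numbers(inp)` inside the `with` block.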
