Numbers in Haystack Simplification

Question

I am working on a Python course on Coursera that involves regular expressions. The goal is to read a file containing text and numbers, extract all the numbers, and sum them. For the sample data (http://py4e-data.dr-chuck.net/regex_sum_42.txt), I have the following code:

import re
handle = open("regex_sum_42.txt")
numlist=list()
for line in handle :
    line = line.rstrip()
    stuff = re.findall('([0-9.]+)',line)
    for element in stuff :
        try :
            num = int(element)
            numlist.append(num)
        except :
            continue
print(sum(numlist))

Since the "stuff" list can be empty (for lines with no numbers) and can also contain strings such as ".", I thought I needed try/except to prevent a traceback. Is there an easier way to implement this program without a second loop?

python

user21359 Aug 6 '17 at 18:06

1 answer

Willem Van Onsem · Accepted Answer · 2017-08-06T18:23:56+0000

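Yes. The second loop and the try/except can both go if (a) the regular expression matches only strings that are valid numbers, and (b) the matches are summed directly with a generator expression.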

For example:

import re

rgx = re.compile(r'\-?\d+')

the_sum = 0
with open("regex_sum_42.txt") as handle:
    for line in handle:
        the_sum += sum(int(x) for x in rgx.findall(line))

print(the_sum)

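Here the pattern matches only runs of digits, so every match can be passed to int() and no try/except is needed. The optional \-? prefix also accepts a leading minus sign, so negative numbers such as -2 are handled.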
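To see what the integer pattern above returns, here is a minimal sketch on a single made-up line (the line is an assumption for illustration and is not taken from the course file):

```python
import re

# Same integer pattern as in the answer's first program.
rgx = re.compile(r'\-?\d+')

# Hypothetical sample line, not taken from regex_sum_42.txt.
line = "Why 42 should you -2 learn 3.5 to write programs?"

matches = rgx.findall(line)
total = sum(int(x) for x in matches)

print(matches)  # ['42', '-2', '3', '5']
print(total)    # 48
```

Every match converts cleanly with int(), but note that "3.5" is split into "3" and "5" because the pattern knows nothing about decimal points.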

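The pattern above matches integers only. If the file can also contain decimal numbers, extend the pattern with an optional fractional part and convert with float() instead of int():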

import re

rgx = re.compile(r'\-?\d+(?:\.\d*)?')

the_sum = 0
with open("regex_sum_42.txt") as handle:
    for line in handle:
        the_sum += sum(float(x) for x in rgx.findall(line))

print(the_sum)

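Note that the group is a non-capture group, (?:..), so findall returns the whole match (and not just the contents of the group). For the sample file this prints: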

445833.0
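The effect of the non-capture group on findall can be checked with a small sketch. The sample line and the variable names below are assumptions made for illustration:

```python
import re

# Hypothetical sample line, not taken from regex_sum_42.txt.
line = "values 3.5 and 42"

# Same pattern twice: once with a capturing group, once with (?:...).
capturing = re.compile(r'\-?\d+(\.\d*)?')
non_capturing = re.compile(r'\-?\d+(?:\.\d*)?')

with_group = capturing.findall(line)
whole_match = non_capturing.findall(line)

print(with_group)   # ['.5', '']  (only the group's contents)
print(whole_match)  # ['3.5', '42']
```

With a capturing group, findall returns the group's text for each match (an empty string when the group did not take part), which float() cannot use directly.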

Note, however, that in 'http://www.py4e.com/code3/' the '4' and '3' are matched as numbers as well. Adding word boundaries, '\b', prevents this:

import re

rgx = re.compile(r'\b\-?\d+(?:\.\d*)?\b')

the_sum = 0
with open("regex_sum_42.txt") as handle:
    for line in handle:
        the_sum += sum(float(x) for x in rgx.findall(line))

print(the_sum)

This prints:

445822.0

That is 11 less than the previous result.
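One caveat with the word-boundary pattern: because '\b' sits in front of the optional minus sign, a sign that follows a space is no longer part of the match. The sketch below shows this on a made-up line and compares it with a possible alternative that uses a negative lookbehind, (?<!\w), in place of the leading '\b'. The alternative is an assumption for illustration, has not been run against the sample file, and may classify input such as "a-5" differently.

```python
import re

# Pattern from the answer, with \b on both sides.
bounded = re.compile(r'\b\-?\d+(?:\.\d*)?\b')

# Hypothetical alternative: require that no word character precedes
# the (optionally signed) number, instead of a leading \b.
lookbehind = re.compile(r'(?<!\w)\-?\d+(?:\.\d*)?\b')

# Hypothetical sample line, not taken from regex_sum_42.txt.
line = "http://www.py4e.com/code3/ has 7 and -2"

bounded_matches = bounded.findall(line)
lookbehind_matches = lookbehind.findall(line)

print(bounded_matches)     # ['7', '2']   (minus sign dropped)
print(lookbehind_matches)  # ['7', '-2']
```

Both patterns skip the '4' in 'py4e' and the '3' in 'code3'. If the data contains no negative numbers, the two give the same sum.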
