I know there are similar questions with answers, but after reading them I still don't have the solution I'm looking for.
Using Python 3.2.2, I need to match "Month, Day, Year", where the month is spelled out as a string and the day is two digits: no more than 30 or 31 depending on the month, 28 for February, and 29 for February in a leap year. (Basically, any real, valid date.)
This is what I have so far:
import re

pattern = "(January|February|March|April|May|June|July|August|September|October|November|December)[,][ ](0[1-9]|[12][0-9]|3[01])[,][ ]((19|20)[0-9][0-9])"
expression = re.compile(pattern)
matches = expression.findall(sampleTextFile)

I'm still not very familiar with regex syntax, so I may have characters that aren't needed (using [,] and [ ] for the commas and spaces seems like the wrong way to do it). When I try to match "January 26, 1991" in my text file, printing the items in matches gives ('January', '26', '1991', '19').
Why does the extra '19' appear at the end?
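To make the symptom easy to reproduce, here is a minimal sketch that compares two variants of the pattern. It assumes the comma after the month is optional (so that "January 26, 1991" matches), and the names `MONTHS`, `capturing`, `non_capturing` and `sample_text` are made up for the illustration. My understanding is that `findall` returns one tuple entry per capturing group, so the inner `(19|20)` group would show up as a fourth item, and writing it as `(?:19|20)` should stop it from being captured.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Assumption: the comma after the month is optional, so "January 26, 1991" matches.
# Variant 1: the century alternation is an ordinary (capturing) group.
capturing = re.compile(
    r"(" + MONTHS + r"),? (0[1-9]|[12][0-9]|3[01]), ((19|20)[0-9][0-9])"
)

# Variant 2: the same alternation written as a non-capturing group (?:...).
non_capturing = re.compile(
    r"(" + MONTHS + r"),? (0[1-9]|[12][0-9]|3[01]), ((?:19|20)[0-9][0-9])"
)

sample_text = "Born on January 26, 1991 in Boston."

with_extra = capturing.findall(sample_text)
without_extra = non_capturing.findall(sample_text)

print(with_extra)     # four items per match, including the century
print(without_extra)  # three items per match
```

If that reading is right, the only difference between the two results is the nested group.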
Also, what can I add or change in my regex so that it checks dates correctly? My plan right now is to accept almost all dates and then weed out the invalid ones afterwards in regular Python code, comparing the day group against the month and year groups to see whether the day should be at most 31, 30, 29 or 28.
Any help would be greatly appreciated, including constructive criticism of how I've designed my regex.
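As a rough sketch of that plan (not a finished solution), the post-filtering step could lean on the standard library's `calendar.monthrange`, which returns the number of days in a given month and accounts for leap years. The names `MONTH_NAMES`, `DATE_RE` and `find_valid_dates` are hypothetical, and the optional comma after the month is again an assumption about the input format.

```python
import calendar
import re

MONTH_NAMES = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]

# Loose pattern: accepts any day from 01 to 31 for every month.
# Assumption: the comma after the month is optional.
DATE_RE = re.compile(
    r"(" + "|".join(MONTH_NAMES) + r"),? (0[1-9]|[12][0-9]|3[01]), ((?:19|20)[0-9]{2})"
)

def find_valid_dates(text):
    """Return (month, day, year) tuples whose day exists in that month and year."""
    valid = []
    for month_name, day, year in DATE_RE.findall(text):
        month_number = MONTH_NAMES.index(month_name) + 1
        # monthrange returns (weekday of the 1st, number of days in the month).
        days_in_month = calendar.monthrange(int(year), month_number)[1]
        if int(day) <= days_in_month:
            valid.append((month_name, day, year))
    return valid

text = "February 29, 2000; February 29, 1900; April 31, 2010; January 26, 1991"
print(find_valid_dates(text))
```

Here the regex only handles the shape of the date, and the calendar check decides whether the day is real, which keeps the pattern itself short.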