How to find dates in a sentence using NLP, RegEx in Python

Can someone suggest a way to find and parse dates in Python, in any format (e.g. "Aug06", "Aug2006", "August 2, 2008", "August 19, 2006", "08-06", "01-08-06")?

I came across this question, but it is in Perl: "Extract inconsistently formatted date from a string (date parsing, NLP)".

Any suggestion would be helpful.

+3
2 answers

This finds all the dates in your example sentence:

import re

# Sample text for illustration; replace with your own input.
subject = "The meeting on 19th August 2006 was moved to 01-08-06."

for match in re.finditer(
    r"""(?ix)             # case-insensitive, verbose regex
    \b                    # match a word boundary
    (?:                   # match the following three times:
     (?:                  # either
      \d+                 # a number,
      (?:\.|st|nd|rd|th)* # followed by a dot, st, nd, rd, or th (optional)
      |                   # or a month name
      (?:(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*)
     )
     [\s./-]*             # followed by a date separator or whitespace (optional)
    ){3}                  # do this three times
    \b                    # and end at a word boundary.""",
    subject):
    # match.start() is the match start, match.end() the (exclusive) end,
    # and match.group() the matched text
    print(match.start(), match.end(), match.group())

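This is not perfect. It will not handle non-English dates (e.g. 21. Mai 2006 or 4ème décembre 1999), and it will match nonsense such as August Augst Aug.

The regex also cannot interpret context. Given a text like "You'll find it in box 21. August 3rd will be the shipping date.", it matches 21. August 3rd, which is not a date.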
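
One way to weed out false matches such as "21. August 3rd" is to try parsing each candidate and discard those that fail. The sketch below is only an illustration using the standard library's `datetime.strptime`; the `to_date` helper and the `FORMATS` list are assumptions made for this example, not part of the answer above, and the month names assume an English locale.

```python
import re
from datetime import datetime

# Assumed formats for illustration; extend to the styles you expect.
FORMATS = ("%d %B %Y", "%B %d %Y", "%d %b %Y", "%b %d %Y", "%d-%m-%y")

def to_date(candidate):
    """Return a datetime if the candidate fits a known format, else None."""
    # Strip ordinal suffixes (19th -> 19) and a dot after a day number (21. -> 21)
    cleaned = re.sub(r"(?<=\d)(?:st|nd|rd|th)\b|(?<=\d)\.(?=\s)", "", candidate)
    # Treat commas as whitespace and collapse runs of spaces
    cleaned = " ".join(cleaned.replace(",", " ").split())
    for fmt in FORMATS:
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            continue
    return None

for candidate in ["19th August 2006", "21. August 3rd", "August Augst Aug"]:
    print(candidate, "->", to_date(candidate))
```

Candidates for which `to_date` returns `None` can simply be dropped. An explicit format list is stricter than a general-purpose parser, so it rejects more nonsense but also needs a new entry for every date style you want to accept.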

+4
from dateutil import parser

texts = ["Aug06", "Aug2006", "August 2 2008", "19th August 2006", "08-06", "01-08-06"]
for text in texts:
    print(text, parser.parse(text))


Aug06            2010-08-06 00:00:00
Aug2006          2006-08-28 00:00:00
August 2 2008    2008-08-02 00:00:00
19th August 2006 2006-08-19 00:00:00
08-06            2010-08-06 00:00:00
01-08-06         2006-01-08 00:00:00
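
Note that in the output above, fields missing from the input (the year in "08-06", the day in "Aug2006") were filled in from the date on which the script was run, so the results change from day to day. A minimal sketch of how this might be made reproducible, using the `default`, `dayfirst` and `fuzzy` keyword arguments of `dateutil.parser.parse`; the fixed default of 2000-01-01 and the sample sentence are arbitrary choices for this example.

```python
from datetime import datetime
from dateutil import parser

# Fixed default so missing fields do not depend on today's date
# (the choice of 2000-01-01 is arbitrary).
DEFAULT = datetime(2000, 1, 1)

for text in ["Aug2006", "08-06", "19th August 2006"]:
    print(text, parser.parse(text, default=DEFAULT))

# dayfirst=True reads ambiguous numeric dates as day-month-year
print(parser.parse("01-08-06", dayfirst=True))

# fuzzy=True skips words that are not part of a date
sentence = "The shipment left on August 19, 2006"
print(parser.parse(sentence, fuzzy=True, default=DEFAULT))
```

Fuzzy parsing returns a single date per string, so it suits short sentences with one date; for text containing several dates, candidates still need to be located first, for example with a regex.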


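import re

months = ['January', 'February',...]
months.extend([mon[:3] for mon in months])  # add the abbreviations: 'Jan', 'Feb', ...

# search for numeric dates (digits, spaces and hyphens, starting with a digit):
numeric_dates = re.findall(r"\d[\d \-]*", sentence)

# search for month names:
for word in sentence.split():
    if word in months:
        ...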
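
The outline above could be fleshed out roughly as follows. This is a sketch under a few assumptions: the month names come from the standard `calendar` module (English in the default locale), the function name `find_date_hints` is made up for this example, and the numeric pattern is narrowed to digit groups joined by hyphens.

```python
import calendar
import re

# Full and abbreviated month names from the standard library
# (index 0 of each list is an empty string, so skip it).
MONTHS = {name.lower() for name in calendar.month_name[1:]}
MONTHS |= {abbr.lower() for abbr in calendar.month_abbr[1:]}

# Digit groups joined by hyphens, e.g. 08-06 or 01-08-06
NUMERIC = re.compile(r"\d+(?:-\d+)+")

def find_date_hints(sentence):
    """Return (month_words, numeric_dates) found in the sentence."""
    # Strip surrounding punctuation so "August," still counts as a month
    words = [w.strip(".,;:!?()") for w in sentence.split()]
    month_words = [w for w in words if w.lower() in MONTHS]
    numeric_dates = NUMERIC.findall(sentence)
    return month_words, numeric_dates

print(find_date_hints("Shipped on August 19, 2006; invoice dated 01-08-06."))
```

This only flags where a date might be; the neighbouring tokens around each month word would still have to be collected and parsed to get an actual date.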
+2

Source: https://habr.com/ru/post/1766902/

