Match date with same character separating values

I need to find dates in several formats in a text. I have a regex like this:

# Detection of:
# 25/02/2014 or 25/02/14 or 25.02.14
regex = r'\b(0?[1-9]|[12]\d|3[01])[-/\._](0?[1-9]|1[012])[-/\._]((?:19|20)\d\d|\d\d)\b'

The problem is that it also matches dates like 25.02/14, which is wrong because the two separator characters differ.
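A minimal reproduction of the problem, as a sketch: the `samples` list is made up for illustration, and `re.fullmatch` is used only to test each string on its own.

```python
import re

# The pattern from the question: each separator position is matched independently,
# so nothing forces the second separator to equal the first.
regex = r'\b(0?[1-9]|[12]\d|3[01])[-/\._](0?[1-9]|1[012])[-/\._]((?:19|20)\d\d|\d\d)\b'

# Illustrative inputs; the last one mixes '.' and '/'.
samples = ['25/02/2014', '25.02.14', '25.02/14']

matched = [s for s in samples if re.fullmatch(regex, s)]
print(matched)  # all three strings match, including the mixed-separator one
```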

I could, of course, write several regular expressions, one per separator character, or post-process the matches, but I would prefer a complete solution using a single regular expression. Is there any way to do this?

2 answers

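Use a backreference to the separator captured in Group 1. If you also want to avoid matching dates inside longer strings (IP addresses, serial numbers, etc.), you need different boundaries than \b. For example: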

import re

s = '25.02.19.35  6666-20-03-16-67875 25.02/2014 25.02/14 11/12/98 11/12/1998 14/12-2014 14-12-2014 14.12.1998'

found_dates = [m.group() for m in re.finditer(r'\b(?:0?[1-9]|[12]\d|3[01])([./-])(?:0?[1-9]|1[012])\1(?:19|20)?\d\d\b', s)]
print(found_dates) # initial regex

found_dates = [m.group() for m in re.finditer(r'(?<![\d.-])(?:0?[1-9]|[12]\d|3[01])([./-])(?:0?[1-9]|1[012])\1(?:19|20)?\d\d(?!\1\d)', s)]
print(found_dates) # fixed boundaries

# => ['25.02.19', '20-03-16', '11/12/98', '11/12/1998', '14-12-2014', '14.12.1998']
# => ['11/12/98', '11/12/1998', '14-12-2014', '14.12.1998']

So the second result contains neither '25.02.19' (probably part of an IP address) nor '20-03-16' (part of a serial or product number).


Pattern details:

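  • (?<![\d.-]) - negative lookbehind: there must be no digit, . or - immediately before the current position (/ is allowed, since dates often appear inside URLs)
  • (?:0?[1-9]|[12]\d|3[01]) - day part: 01/1 to 31
  • ([./-]) - Group 1 (a capturing group), matching ., / or -
  • (?:0?[1-9]|1[012]) - month part: 01/1 to 12
  • \1 - backreference to Group 1, so the second separator must be the same character as the first
  • (?:19|20)?\d\d - year part: 19 or 20 (optional), followed by any two digits
  • (?!\1\d) - negative lookahead: the match must not be followed by the same separator (the Group 1 value) and a digit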
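The breakdown can also live next to the pattern itself. Below is a sketch of the fixed-boundaries pattern compiled with re.VERBOSE, which lets each part carry its own comment. The names DATE_RE and find_dates are hypothetical, chosen only for this illustration.

```python
import re

# The fixed-boundaries pattern, unchanged, split over several lines with re.VERBOSE.
# DATE_RE and find_dates are illustrative names, not part of the original answer.
DATE_RE = re.compile(r"""
    (?<![\d.-])                 # no digit, dot or hyphen immediately before
    (?:0?[1-9]|[12]\d|3[01])    # day: 1/01 to 31
    ([./-])                     # Group 1: the separator
    (?:0?[1-9]|1[012])          # month: 1/01 to 12
    \1                          # the same separator again
    (?:19|20)?\d\d              # year: optional 19 or 20, then two digits
    (?!\1\d)                    # not followed by that separator plus a digit
""", re.VERBOSE)


def find_dates(text):
    """Return every substring of text that matches DATE_RE."""
    return [m.group() for m in DATE_RE.finditer(text)]


print(find_dates('25.02.19.35 11/12/98 14/12-2014 14.12.1998'))
# => ['11/12/98', '14.12.1998']
```

In verbose mode, whitespace and # comments inside the pattern are ignored, so the regex behaves the same as the one-line version.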

Building on Rawing's comment, use a backreference to the separator group:

regex = r'\b(0?[1-9]|[12]\d|3[01])([./-])(0?[1-9]|1[012])\2((?:19|20)\d\d|\d\d)\b'

Then test it:

import re

s = '25.02/2014 25.02/14 11/12/98 11/12/1998 14/12-2014 14-12-2014 14.12.1998'

found_dates = []
for m in re.finditer(r'\b(0?[1-9]|[12]\d|3[01])([./-])(0?[1-9]|1[012])\2((?:19|20)\d\d|\d\d)\b', s):
    found_dates.append(m.group(0))
print(found_dates)

Output: ['11/12/98', '11/12/1998', '14-12-2014', '14.12.1998']
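Because the day group is capturing here, the separator is group 2 rather than group 1, and the backreference has to be \2. One way to avoid counting groups is a named group, sketched below. The group names day, sep, month and year, the name DATE_RE, and the shortened test string are assumptions made for this illustration.

```python
import re

# Same pattern as above, but the separator is captured as a named group and
# referenced with (?P=sep), so adding or removing groups does not shift the number.
# Group names are illustrative, not from the original answer.
DATE_RE = re.compile(
    r'\b(?P<day>0?[1-9]|[12]\d|3[01])'
    r'(?P<sep>[./-])'
    r'(?P<month>0?[1-9]|1[012])'
    r'(?P=sep)'
    r'(?P<year>(?:19|20)\d\d|\d\d)\b'
)

s = '25.02/14 11/12/98 14/12-2014 14-12-2014'

# Collect the day, month and year strings of every match.
parts = [(m.group('day'), m.group('month'), m.group('year'))
         for m in DATE_RE.finditer(s)]
print(parts)
# => [('11', '12', '98'), ('14', '12', '2014')]
```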


Source: https://habr.com/ru/post/1675371/

