Multiple occurrences of the same character in a regexp string - Python

Given a string of 3 uppercase letters, 1 lowercase letter and 3 more uppercase letters, for example AAAaAAA

I cannot find a regex that matches a string in which:

  • the first 3 capital letters are all different
  • the lowercase letter can be any letter
  • the first 2 capitals after it are the same as the very first capital letter
  • the last capital letter matches the last capital letter in the first "trio"

e.g. ABCaAAC

EDIT:

It turns out I need something a little different, for example ABCaAAC, where 'a' is the lowercase version of the very first character, not just any lowercase character.

1 answer

The following should work:

    ^([A-Z])(?!.?\1)([A-Z])(?!\2)([A-Z])[a-z]\1\1\3$

For instance:

    >>> import re
    >>> regex = re.compile(r'^([A-Z])(?!.?\1)([A-Z])(?!\2)([A-Z])[a-z]\1\1\3$')
    >>> regex.match('ABAaAAA')  # fails: first three are not different
    >>> regex.match('ABCaABC')  # fails: first two of second three are not first char
    >>> regex.match('ABCaAAB')  # fails: last char is not last of first three
    >>> regex.match('ABCaAAC')  # matches!
    <_sre.SRE_Match object at 0x7fe09a44a880>

Explanation:

    ^          # start of string
    ([A-Z])    # match any uppercase character, place in \1
    (?!.?\1)   # fail if either of the next two characters is the character in \1
    ([A-Z])    # match any uppercase character, place in \2
    (?!\2)     # fail if the next character is the same as the character in \2
    ([A-Z])    # match any uppercase character, place in \3
    [a-z]      # match any lowercase character
    \1         # match capture group 1
    \1         # match capture group 1
    \3         # match capture group 3
    $          # end of string

If you want to pull these matches from a larger piece of text, just get rid of ^ and $ and use regex.search() or regex.findall() .
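As a rough sketch of the unanchored variant just described: the sample text below is made up, and finditer() is used instead of findall() because the pattern contains capture groups (see the note after the code).

```python
import re

# Same pattern as above, with the ^ and $ anchors removed.
pattern = re.compile(r'([A-Z])(?!.?\1)([A-Z])(?!\2)([A-Z])[a-z]\1\1\3')

# Made-up sample text, for illustration only.
text = 'xx ABCaAAC yy ABAaAAA zz XYZqXXZ'

# m.group(0) is the whole matched substring.
matches = [m.group(0) for m in pattern.finditer(text)]
print(matches)  # ['ABCaAAC', 'XYZqXXZ']
```

Because the pattern has three capture groups, findall() would return a list of tuples of those groups rather than the full matches, so m.group(0) from finditer() is the more convenient way to get the whole matched strings. Without anchors the pattern can also match inside a longer run of letters; add \b or similar boundaries if that matters for your input.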

However, you may find the following approach easier to understand: it uses a regular expression for basic validation and then ordinary string operations to check the remaining requirements:

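    import re

    def validate(s):
        # bool() so that a failed regex match returns False rather than None
        return (bool(re.match(r'^[A-Z]{3}[a-z][A-Z]{3}$', s)) and
                s[4] == s[0] and s[5] == s[0] and   # next two capitals repeat the first
                s[-1] == s[2] and                   # last capital repeats the third
                len(set(s[:3])) == 3)               # first three capitals all differ

    >>> validate('ABAaAAA')
    False
    >>> validate('ABCaABC')
    False
    >>> validate('ABCaAAB')
    False
    >>> validate('ABCaAAC')
    True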
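The EDIT in the question adds one more condition: the lowercase letter must be the lowercase form of the first capital. One possible way to extend the string-based check is sketched below. The name validate_strict is hypothetical, and re.fullmatch() is used here in place of the ^...$ anchors so that a trailing newline is rejected as well.

```python
import re

def validate_strict(s):
    """Hypothetical variant of validate() that also checks the EDIT requirement."""
    return (
        # Basic shape: 3 uppercase, 1 lowercase, 3 uppercase, nothing else.
        re.fullmatch(r'[A-Z]{3}[a-z][A-Z]{3}', s) is not None
        and len(set(s[:3])) == 3         # first three capitals all differ
        and s[3] == s[0].lower()         # lowercase letter is the first capital, lowercased
        and s[4] == s[0] and s[5] == s[0]  # next two capitals repeat the first
        and s[6] == s[2]                 # last capital repeats the third
    )

print(validate_strict('ABCaAAC'))  # True
print(validate_strict('ABCbAAC'))  # False: 'b' is not the lowercase of 'A'
```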

Source: https://habr.com/ru/post/1402556/

