In the following regular expression, I would like each character in the string to be replaced with an “X”, but it does not work.
In Python 2.7:
>>> import re >>> re.sub(u"[a-zA-Z]","X","dfäg") 'XX\xc3\xa4X'
or
>>> re.sub("[a-zA-Z]","X","dfäg",re.UNICODE) u'XX\xe4X'
In Python 3.4:
>>> re.sub("[a-zA-Z]","X","dfäg") 'XXäX'
Is there any way to "customize" the [a-zA-Z] template to match "ä", "ü", etc.? If this is not possible, how can I create a similar character range pattern between square brackets that will include Unicode characters in the regular full alphabet range? I mean, for example, in a language like German, “ä” will be placed somewhere next to “a” in the alphabet, so one would expect it to be included in the range “az”.
source share