This is difficult to do with Python's re module because its current implementation does not support Unicode property shortcuts such as \p{Lu} and \p{Ll}.
[A-Za-z], of course, matches only the ASCII letters, regardless of whether the Unicode option is set.
So, until the re module is updated (or unless you install the regex package, currently in development), you need to either do it programmatically (iterate over the line and call char.islower() / char.isupper() on each character) or spell out all the Unicode code points by hand, which is probably not worth the effort...
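The programmatic route mentioned above could look roughly like this. It is only a sketch using the standard library: the helper names letters_by_category and split_case are made up for illustration, and it checks the Unicode general category through unicodedata rather than calling char.isupper() / char.islower(), since those methods also accept a few characters outside the Lu and Ll categories.

```python
import unicodedata


def letters_by_category(text, category):
    """Return the characters of text whose Unicode general category
    equals category (for example "Lu" or "Ll")."""
    return [ch for ch in text if unicodedata.category(ch) == category]


def split_case(text):
    """Hypothetical helper: collect uppercase (Lu) and lowercase (Ll)
    letters separately, roughly what \\p{Lu} and \\p{Ll} would match."""
    upper = letters_by_category(text, "Lu")
    lower = letters_by_category(text, "Ll")
    return upper, lower


if __name__ == "__main__":
    upper, lower = split_case("Ärger im Büro")
    print("uppercase:", upper)
    print("lowercase:", lower)
```

If an approximate answer is good enough, replacing the category test with ch.isupper() or ch.islower() is shorter and behaves the same for ordinary letters.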