For some reason, I encountered a similar problem in my build system, having ZSH version 5.0.2 on my laptop (where Unicode works, as expected) and ZSH 4.3.17 on my build system. It seems to me that ZSH 5 has no problem with Unicode characters in regex patterns.
In particular, parsing a key / value pair:
[[ "revision/author=Ľudovít Lučenič" =~ '^([^=]+)=(.*)$' ]] echo "$match[1]:$match[2]"
is having
:
In addition, I assume some flaw in the support of ZSH 4 Unicode in general.
Update: after some research, I found that the dot in the regular expression does not match the letter “č” in ZSH 4. As soon as I updated the template to:
[[ "revision/author=Ľudovít Lučenič" =~ '^([^=]+)=((.|č)*)$' ]] echo "$match[1]:$match[2]"
I get the same result in both versions of ZSH. However, I do not know why this particular letter is the problem here. However, it can help someone solve this shortcoming.
source share