JavaScript RegExp Cyrillic pattern

I know this is a stupid question, but I have spent two days on it with no results. What RegExp pattern would allow my user to enter only Cyrillic characters and spaces? Thanks in advance!

1 answer

You cannot do this in JavaScript, because JavaScript does not provide even the most basic Level 1 Unicode support in its regular expressions. You will need to switch languages to do this correctly.

You cannot use enumerated block ranges for this. Doing so confuses blocks with scripts, which is deeply broken. There are 150 code points that have the property \p{Script=Cyrillic} but do not have the property \p{Block=Cyrillic}. They live in other blocks. Watch:

$ unichars '\p{Script=Cyrillic}' '\P{Block=Cyrillic}' | wc -l
150

In addition, the Cyrillic block contains a couple of non-Cyrillic code points (U+0485 and U+0486, whose script is Inherited).

The best you could do is enumerate all 404 Cyrillic code points in a character class, which may prove unwieldy.

$ unichars '\p{Script=Cyrillic}'  | wc -l
404
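
As a rough sketch of that enumeration approach, the ranges below are an assumption: they were transcribed by hand from the Script=Cyrillic entries of Unicode 5.2, the version that matches the counts of 404 and 150 shown above. Newer Unicode versions assign more Cyrillic code points, so treat the table as illustrative rather than complete. The names `CYRILLIC_RANGES` and `isCyrillicWithSpaces` are made up for this example.

```javascript
// Script=Cyrillic ranges, assumed from Unicode 5.2 (hand-transcribed).
// Note the gap at U+0485..U+0486: those sit in the Cyrillic block
// but are not Cyrillic by script.
const CYRILLIC_RANGES = [
  [0x0400, 0x0484],
  [0x0487, 0x0525],
  [0x1D2B, 0x1D2B],
  [0x1D78, 0x1D78],
  [0x2DE0, 0x2DFF],
  [0xA640, 0xA65F],
  [0xA662, 0xA673],
  [0xA67C, 0xA697],
];

// Format a BMP code point as a \uXXXX escape for use inside a pattern.
const toEscape = (cp) =>
  "\\u" + cp.toString(16).toUpperCase().padStart(4, "0");

// Build the body of the character class from the range table.
const cyrillicClass = CYRILLIC_RANGES
  .map(([lo, hi]) => (lo === hi ? toEscape(lo) : toEscape(lo) + "-" + toEscape(hi)))
  .join("");

// One or more characters, each either Cyrillic (per the table) or a space.
const cyrillicAndSpaces = new RegExp("^[" + cyrillicClass + " ]+$");

function isCyrillicWithSpaces(text) {
  return cyrillicAndSpaces.test(text);
}

// Sanity checks against the counts reported by unichars.
const totalCodePoints = CYRILLIC_RANGES.reduce(
  (n, [lo, hi]) => n + (hi - lo + 1),
  0
); // 404

// Code points in the table that fall outside the Cyrillic block U+0400..U+04FF.
const BLOCK = [0x0400, 0x04FF];
const outsideCyrillicBlock = CYRILLIC_RANGES.reduce((n, [lo, hi]) => {
  const overlap = Math.max(0, Math.min(hi, BLOCK[1]) - Math.max(lo, BLOCK[0]) + 1);
  return n + (hi - lo + 1) - overlap;
}, 0); // 150
```

With this table, `isCyrillicWithSpaces("Привет мир")` returns `true` and `isCyrillicWithSpaces("Hello")` returns `false`. Engines that implement the ES2018 Unicode property escapes, which are newer than this answer, should also accept `/^[\p{Script=Cyrillic} ]+$/u`, which avoids maintaining the table by hand.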

You can use the unichars script to list them all if you really want to. You may also want to grab the uniprops script while you are there.


Source: https://habr.com/ru/post/1790185/