JavaScript regex to match strings with heterogeneous Unicode character ranges

I need help with a regular expression in JavaScript.

I am trying to match any string containing only Basic Latin (ASCII) characters or only Greek Unicode characters. Strings that mix characters from these two sets should not match.

I have this regular expression that matches exactly the opposite (all lines containing at least one Greek and one Latin character), but cannot find a way to negate this:

https://regex101.com/r/JHzmhc/1

Thanks in advance.

+4
2 answers

You can use

^(?:[\u0000-\u007F]+|[\u0370-\u03FF]+)$

See the regex demo.

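  • ^ - start of string
  • (?: - start of a non-capturing group that matches either of two alternatives:
    • [\u0000-\u007F]+ - 1 or more ASCII characters
    • | - or
    • [\u0370-\u03FF]+ - 1 or more characters from the Greek and Coptic block
  • ) - end of the group
  • $ - end of string.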
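
As a quick check of the pattern above, here is a minimal sketch that runs it against a few strings. The sample inputs and the variable names are made up for illustration and are not taken from the question.

```javascript
// The pattern from the answer: the whole string must be either all ASCII
// or all characters from the range U+0370 to U+03FF.
const asciiOrGreekBlock = /^(?:[\u0000-\u007F]+|[\u0370-\u03FF]+)$/;

// Hypothetical sample inputs (assumption, not from the original question).
const samples = ['hello world', 'αβγ', 'helloαβγ', ''];

const results = samples.map((s) => asciiOrGreekBlock.test(s));
console.log(results); // [ true, true, false, false ]
```

The empty string is rejected because both alternatives use +, which requires at least one character.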
+3

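Wiktor's answer is a good start. Note, however, that [\u0370-\u03FF] only covers the Greek and Coptic block; it does not match every Greek character.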
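
To illustrate what the single range [\u0370-\u03FF] misses, here is a small sketch using a polytonic Greek word. The sample word and variable names are my own choice for illustration.

```javascript
// Only the range U+0370 to U+03FF, as in the first answer.
const greekBlockOnly = /^[\u0370-\u03FF]+$/;

// "ἀνήρ": the first letter is U+1F00 (GREEK SMALL LETTER ALPHA WITH PSILI),
// which lives in the Greek Extended block, outside U+0370 to U+03FF.
const polytonic = '\u1F00\u03BD\u03AE\u03C1';

// Same word with a plain alpha (U+03B1) instead.
const monotonic = '\u03B1\u03BD\u03AE\u03C1';

console.log(greekBlockOnly.test(polytonic)); // false
console.log(greekBlockOnly.test(monotonic)); // true
```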

With Unicode property escapes, you'd do:

/^(?:[\0-\x7F]+|\p{Script_Extensions=Greek}+)$/u
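
A minimal sketch of a property-escape pattern of this form in use, assuming an engine that supports \p{…} with the u flag (ES2018 or later). The sample strings and variable name are hypothetical.

```javascript
// Either all ASCII, or all characters whose Script_Extensions include Greek.
// The u flag is required for \p{...} to be interpreted as a property escape.
const asciiOrGreekScript = /^(?:[\0-\x7F]+|\p{Script_Extensions=Greek}+)$/u;

console.log(asciiOrGreekScript.test('abc'));                      // true
console.log(asciiOrGreekScript.test('\u03B1\u03B2\u03B3'));       // true  ("αβγ")
console.log(asciiOrGreekScript.test('\u1F00\u03BD\u03AE\u03C1')); // true  ("ἀνήρ", Greek Extended)
console.log(asciiOrGreekScript.test('abc\u03B1'));                // false (mixed)
```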

If Unicode property escapes are not supported in your target ECMAScript environment, you can transpile the pattern to:

/^(?:[\0-\x7F]+|(?:[\u0342\u0345\u0370-\u0373\u0375-\u0377\u037A-\u037D\u037F\u0384\u0386\u0388-\u038A\u038C\u038E-\u03A1\u03A3-\u03E1\u03F0-\u03FF\u1D26-\u1D2A\u1D5D-\u1D61\u1D66-\u1D6A\u1DBF-\u1DC1\u1F00-\u1F15\u1F18-\u1F1D\u1F20-\u1F45\u1F48-\u1F4D\u1F50-\u1F57\u1F59\u1F5B\u1F5D\u1F5F-\u1F7D\u1F80-\u1FB4\u1FB6-\u1FC4\u1FC6-\u1FD3\u1FD6-\u1FDB\u1FDD-\u1FEF\u1FF2-\u1FF4\u1FF6-\u1FFE\u2126\uAB65]|\uD800[\uDD40-\uDD8E\uDDA0]|\uD834[\uDE00-\uDE45])+)$/
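
The \uD800[…] and \uD834[…] parts are there because, without the u flag, code points above U+FFFF have to be matched as surrogate pairs. Here is a small sketch using only that fragment of the pattern; the variable names and the choice of test characters are mine.

```javascript
// Only the astral (above U+FFFF) alternatives from the long pattern,
// written as explicit high + low surrogate pairs.
const astralGreek = /^(?:\uD800[\uDD40-\uDD8E\uDDA0]|\uD834[\uDE00-\uDE45])+$/;

// U+10140 GREEK ACROPHONIC ATTIC ONE QUARTER is stored as two UTF-16 code units.
const atticQuarter = String.fromCodePoint(0x10140);
console.log(atticQuarter.length);            // 2
console.log(astralGreek.test(atticQuarter)); // true

// U+1D200 is the first code point of the \uD834[\uDE00-\uDE45] range.
console.log(astralGreek.test(String.fromCodePoint(0x1D200))); // true

// The single BMP range from the first answer cannot match it.
console.log(/^[\u0370-\u03FF]+$/.test(atticQuarter)); // false
```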

Demo: https://regex101.com/r/cmNTLA/1

+2

Source: https://habr.com/ru/post/1679891/

