Can someone explain this regex?

/^[\p{Ll}\p{Lm}\p{Lo}\p{Lt}\p{Lu}\p{Nd}]+$/mu 

This is the validation expression CakePHP uses to validate alphanumeric strings. I cannot understand what Ll, Lm, Lt, etc. mean. The expression is supposed to check alphanumeric strings, so it must match digits and letters. Can someone explain this expression a bit?

Thanks.

4 answers

Ll, Lm, Lo, Lt, Lu and Nd are Unicode character classes (general categories): the L* codes are kinds of letters, and Nd is decimal digits.

See about a third of the way down this page:

http://www.regular-expressions.info/unicode.html

  • \p{Ll} or \p{Lowercase_Letter}: a lowercase letter that has an uppercase variant.
  • \p{Lu} or \p{Uppercase_Letter}: an uppercase letter that has a lowercase variant.
  • \p{Lt} or \p{Titlecase_Letter}: a letter that appears at the start of a word when only the first letter of the word is capitalized.
  • \p{L&} or \p{Letter&}: a letter that exists in lowercase and uppercase variants (a combination of Ll, Lu and Lt).
  • \p{Lm} or \p{Modifier_Letter}: a special character that is used like a letter.
  • \p{Lo} or \p{Other_Letter}: a letter or ideograph that does not have lowercase and uppercase variants.
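
To see these categories in practice, here is a minimal Python sketch. It is not CakePHP's code: it uses the standard `unicodedata` module to look up each character's general category, and the helper name `is_alphanumeric` and the sample characters are made up for illustration. It only approximates the character class of the regex; it ignores the `m` modifier, which lets `^` and `$` anchor at line breaks, and Python's Unicode tables may differ slightly from PCRE's.

```python
import unicodedata

# The six general categories accepted by the character class in the regex.
ALLOWED = {"Ll", "Lm", "Lo", "Lt", "Lu", "Nd"}


def is_alphanumeric(text: str) -> bool:
    """Return True if text is non-empty and every character is in ALLOWED.

    Hypothetical helper: a rough stand-in for the regex's character class,
    without the line-by-line anchoring that the /m modifier allows.
    """
    return len(text) > 0 and all(
        unicodedata.category(ch) in ALLOWED for ch in text
    )


# Illustrative sample characters (not taken from the original answer).
samples = [
    "a",       # lowercase letter
    "A",       # uppercase letter
    "\u01c5",  # titlecase digraph (capital D with small z with caron)
    "\u02b0",  # modifier letter small h
    "\u4e2d",  # CJK ideograph, has no case
    "7",       # ASCII decimal digit
    "\u0663",  # Arabic-Indic digit three
    "_",       # underscore, not a letter or digit
]

for ch in samples:
    # Print the code point next to its two-letter general category.
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)}")
```

With this helper, `is_alphanumeric("abc123")` is accepted while `is_alphanumeric("a_b")` and `is_alphanumeric("a b")` are rejected, because the underscore and the space fall outside the letter and decimal-digit categories.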

The codes between the curly braces (Ll, Lm, Lt, etc.) are Unicode character classes. A quick search for Unicode character classes turns up, for example, the following list: http://www.siao2.com/2005/04/23/411106.aspx


If you regularly come across strange regular expressions, try one of the tools listed here: https://stackoverflow.com/questions/89718/is-there-anything-like-regexbuddy-in-the-open-source-world (although I'm not sure whether they explain these, mostly Unicode, placeholders). Otherwise, check out the reference at http://regular-expressions.info/


Source: https://habr.com/ru/post/1334952/

