Subtract from character class

Is there a way to subtract characters or a range of characters from another character class?

I need to find a substring inside a string that should contain only printable characters, but without "<" and ">".

[[:print:]] - ('<' | '>')

Because "<" and ">" are delimiters and should not occur inside the string itself.

<abc> // valid
<ab<c> // invalid
<ab\tc> // invalid
+3
2 answers

[:print:] is equivalent to [\x20-\x7E] for ASCII input, so if you do not want < (\x3C) and > (\x3E), you can use [\x20-\x3B\x3D\x3F-\x7E].

This matches the printable characters in the string except < and >:

/[\x20-\x3B\x3D\x3F-\x7E]+/
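To check the three samples from the question against this class, a minimal Java sketch could look like the following. The class name ExplicitRangeCheck and the method isValid are made up for illustration, and wrapping the class in literal < and > delimiters is an assumption about how the substring is framed.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
class ExplicitRangeCheck {
    // Printable ASCII (\x20-\x7E) with '<' (\x3C) and '>' (\x3E) left out of the ranges.
    // Assumption: the substring is framed by one '<' and one '>'.
    static final Pattern DELIMITED =
            Pattern.compile("<[\\x20-\\x3B\\x3D\\x3F-\\x7E]+>");

    // matches() requires the whole input to fit the pattern.
    static boolean isValid(String s) {
        return DELIMITED.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("<abc>"));    // true
        System.out.println(isValid("<ab<c>"));   // false: '<' inside the body
        System.out.println(isValid("<ab\tc>"));  // false: tab is not printable
    }
}
```

The tab case fails because \x09 lies below \x20, so no separate exclusion is needed for control characters.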
+4

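It depends on the regex flavor; some support set operations inside a character class.

[a[b]]

- is the union of a and b.

[a&&b]

- is the intersection of a and b.

[a&&[^b]]

- is a minus b, that is, subtraction.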
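The three forms above can be tried in Java roughly as follows. This is only a sketch: the class name ClassSetOperations, the sample ranges, and the < > framing are illustrative assumptions. \p{Print} is Java's POSIX printable class, so the last pattern expresses the subtraction the question asks for.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample ranges are illustrative.
class ClassSetOperations {
    // Union: a-c or x-z.
    static final Pattern UNION = Pattern.compile("[a-c[x-z]]");
    // Intersection: lowercase letters that are also vowels.
    static final Pattern INTERSECTION = Pattern.compile("[a-z&&[aeiou]]");
    // Subtraction: printable ASCII minus '<' and '>', framed by the delimiters.
    static final Pattern DELIMITED = Pattern.compile("<[\\p{Print}&&[^<>]]+>");

    // True when the whole input matches the pattern.
    static boolean matches(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches(UNION, "y"));          // true
        System.out.println(matches(UNION, "m"));          // false
        System.out.println(matches(INTERSECTION, "e"));   // true
        System.out.println(matches(INTERSECTION, "b"));   // false
        System.out.println(matches(DELIMITED, "<abc>"));  // true
        System.out.println(matches(DELIMITED, "<ab<c>")); // false
    }
}
```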

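This is the syntax Java uses. For example, in Java

[^\pL\pM\p{Nd}\p{Nl}\p{Pc}[\p{InEnclosedAlphanumerics}&&\p{So}]]

is a Unicode-aware negation of \w, that is, \W. (Unlike in Perl, \w in Java is not Unicode-aware by default.) A word boundary: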

(?:(?<=[\pL\pM\p{Nd}\p{Nl}\p{Pc}[\p{InEnclosedAlphanumerics}&&\p{So}]])(?![\pL\pM\p{Nd}\p{Nl}\p{Pc}[\p{InEnclosedAlphanumerics}&&\p{So}]])|(?<![\pL\pM\p{Nd}\p{Nl}\p{Pc}[\p{InEnclosedAlphanumerics}&&\p{So}]])(?=[\pL\pM\p{Nd}\p{Nl}\p{Pc}[\p{InEnclosedAlphanumerics}&&\p{So}]]))

is a Unicode-aware replacement for \b in Java. For \X, which matches a grapheme, a simple approximation is:

(?>\PM\pM*)
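As a rough illustration of what this short pattern does, the Java sketch below counts its matches in a string. The class name SimpleGraphemes and the method count are hypothetical. The pattern treats one non-mark character plus any following combining marks as a single unit, so it is only an approximation of a grapheme.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class SimpleGraphemes {
    // One non-mark code point followed by any number of combining marks.
    static final Pattern GRAPHEME = Pattern.compile("(?>\\PM\\pM*)");

    // Counts how many times the pattern matches in s.
    static int count(String s) {
        Matcher m = GRAPHEME.matcher(s);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // 'e' followed by U+0301 COMBINING ACUTE ACCENT, then 'a'.
        System.out.println(count("e\u0301a")); // 2
        System.out.println(count("abc"));      // 3
    }
}
```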

A more thorough grapheme pattern is considerably longer:

(?:(?:\u000D\u000A)|(?:[\u0E40\u0E41\u0E42\u0E43\u0E44\u0EC0\u0EC1\u0EC2\u0EC3\u0EC4\uAAB5\uAAB6\uAAB9\uAABB\uAABC]*(?:[\u1100-\u115F\uA960-\uA97C]+|([\u1100-\u115F\uA960-\uA97C]*((?:[[\u1160-\u11A2\uD7B0-\uD7C6][\uAC00\uAC1C\uAC38]][\u1160-\u11A2\uD7B0-\uD7C6]*|[\uAC01\uAC02\uAC03\uAC04])[\u11A8-\u11F9\uD7CB-\uD7FB]*))|[\u11A8-\u11F9\uD7CB-\uD7FB]+|[^[\p{Zl}\p{Zp}\p{Cc}\p{Cf}&&[^\u000D\u000A\u200C\u200D]]\u000D\u000A])[[\p{Mn}\p{Me}\u200C\u200D\u0488\u0489\u20DD\u20DE\u20DF\u20E0\u20E2\u20E3\u20E4\uA670\uA671\uA672\uFF9E\uFF9F][\p{Mc}\u0E30\u0E32\u0E33\u0E45\u0EB0\u0EB2\u0EB3]]*)|(?s:.))

, , , !

, Java .

- , Perl, Python Ruby. .

+3

Source: https://habr.com/ru/post/1778327/

