How to split a string using a regex that excludes the escaped version of my token?

In Java, I use String.split to split a string containing values separated by semicolons.

Currently, I have the following line, which works in 99% of all cases.

String[] fields = optionsTxt.split(";");

However, a new requirement allows semicolons to appear inside a value when escaped with a backslash. So the following inputs should be split into the following values:

"Foo foo;Bar bar" => [Foo foo] [Bar bar]
"Foo foo\; foo foo;Bar bar bar" => [Foo foo\; foo foo] [Bar bar bar]

It should be excruciatingly simple, but I'm really not sure how to do it. I just want to not tokenize when there is \; and only tokenize when there is a plain ;.

Does anyone know a magic formula?

4 answers

Try the following, which uses a negative lookbehind to split only on semicolons that are not preceded by a backslash:

String[] fields = optionsTxt.split("(?<!\\\\);");
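To check the lookbehind pattern against the two inputs from the question, here is a minimal sketch. The class name `SplitOnUnescaped` and the helper `splitUnescaped` are hypothetical names chosen for illustration; only the pattern itself comes from the answer.

```java
import java.util.Arrays;

class SplitOnUnescaped {
    // Hypothetical helper: splits on every ';' that is not immediately
    // preceded by a backslash. The regex is (?<!\\); once Java's own
    // string escaping is removed.
    static String[] splitUnescaped(String optionsTxt) {
        return optionsTxt.split("(?<!\\\\);");
    }

    public static void main(String[] args) {
        // Plain separators: two fields.
        System.out.println(Arrays.toString(splitUnescaped("Foo foo;Bar bar")));
        // The escaped semicolon stays inside the first field.
        System.out.println(Arrays.toString(splitUnescaped("Foo foo\\; foo foo;Bar bar bar")));
    }
}
```

Note that the backslash is kept in the resulting field; removing it would need a separate replace step on each field.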

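Alternatively, replace every \; with a placeholder such as {{ESCAPED_SEMICOLON}}, split on ;, and then replace the placeholder in each field with \;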
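One way to read this answer is as a placeholder round trip: mask each escaped semicolon, split on the plain ones, then restore the mask. The sketch below assumes that reading. The class and method names are hypothetical, and it assumes the text {{ESCAPED_SEMICOLON}} never occurs in real input.

```java
import java.util.Arrays;

class PlaceholderSplit {
    // Assumption: this marker never appears in the data being split.
    static final String PLACEHOLDER = "{{ESCAPED_SEMICOLON}}";

    static String[] split(String optionsTxt) {
        // String.replace works on literal text, so no regex escaping is needed here.
        String masked = optionsTxt.replace("\\;", PLACEHOLDER);
        String[] fields = masked.split(";");
        for (int i = 0; i < fields.length; i++) {
            // Put the escaped semicolon back into each field.
            fields[i] = fields[i].replace(PLACEHOLDER, "\\;");
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("Foo foo\\; foo foo;Bar bar bar")));
    }
}
```

This avoids lookbehind entirely, at the cost of two extra passes over the text and the risk of a collision if the placeholder ever shows up in the input.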


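With a regular expression (java.util.regex):

[^\\];

Note that, used with split, this pattern also consumes the character before the semicolon, so that character is dropped from the preceding field.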


To also handle escaped backslashes in front of the semicolon, you can use:

String[] fields = optionsTxt.split("((?<!\\\\)|(?<=[^\\\\](\\\\\\\\){0,15}));");

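This handles up to 15 escaped backslashes in a row, because Java requires lookbehind to have a bounded length.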
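To see how the longer pattern treats backslashes, the sketch below runs it on two inputs: one where the backslash before the semicolon is itself escaped, and one where the semicolon is escaped. The class name `EscapeAwareSplit` and the helper `splitEscapeAware` are hypothetical; the pattern is copied unchanged from the answer.

```java
import java.util.Arrays;

class EscapeAwareSplit {
    // Splits on ';' when it is not preceded by a backslash, or when it is
    // preceded by a non-backslash character followed by 0 to 15 pairs of
    // backslashes (each pair being one escaped backslash).
    static String[] splitEscapeAware(String optionsTxt) {
        return optionsTxt.split("((?<!\\\\)|(?<=[^\\\\](\\\\\\\\){0,15}));");
    }

    public static void main(String[] args) {
        // Text is a\\;b : the backslash is escaped, so ';' is a real separator.
        System.out.println(Arrays.toString(splitEscapeAware("a\\\\;b")));
        // Text is a\;b : the semicolon is escaped, so there is no split.
        System.out.println(Arrays.toString(splitEscapeAware("a\\;b")));
    }
}
```

Because the second lookbehind needs a non-backslash character before the run of backslashes, a field that begins with backslashes at the very start of the string is not covered by this pattern.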


Source: https://habr.com/ru/post/1785233/

