Splitting a string with a Java regex, with escaped delimiters

I'm kinda stuck trying to find a regex to split strings with the following properties:

  • Values are delimited by | (pipe)
  • If an individual value contains a pipe, it is escaped with \ (backslash)
  • If an individual value ends with a backslash, that backslash is itself escaped with a backslash

So, for example, here are a few strings that I want to split:

  • One|Two|Three should give: ["One", "Two", "Three"]
  • One\|Two\|Three should give: ["One|Two|Three"]
  • One\\|Two\|Three should give: ["One\", "Two|Three"]

Now, how could I do this split with a single regex?

UPDATE. As many of you have already pointed out, this is not a good application for a regex. In addition, the regex solution is several orders of magnitude slower than simply iterating over the characters. I ended up iterating over the characters:

    import java.text.CharacterIterator;
    import java.text.StringCharacterIterator;
    import java.util.ArrayList;
    import java.util.List;

    public static List<String> splitValues(String val) {
        final List<String> list = new ArrayList<String>();
        boolean esc = false; // true right after an unescaped backslash
        final StringBuilder sb = new StringBuilder(1024);
        final CharacterIterator it = new StringCharacterIterator(val);
        for (char c = it.first(); c != CharacterIterator.DONE; c = it.next()) {
            if (esc) {
                // Escaped character: take it literally.
                sb.append(c);
                esc = false;
            } else if (c == '\\') {
                esc = true;
            } else if (c == '|') {
                // Unescaped pipe: the current value is complete.
                list.add(sb.toString());
                sb.delete(0, sb.length());
            } else {
                sb.append(c);
            }
        }
        if (sb.length() > 0) {
            list.add(sb.toString());
        }
        return list;
    }
1 answer

The trick is not to use the split() method. That forces you to use a lookbehind to detect escaped delimiters, and it fails when the escapes themselves are escaped (as you discovered). Use find() instead, so that you match the tokens rather than the delimiters:

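    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public static List<String> splitIt(String source) {
        // A token is a run of: any char other than '|' or '\', or a '\' followed by any char.
        Pattern p = Pattern.compile("(?:[^|\\\\]|\\\\.)+");
        Matcher m = p.matcher(source);
        List<String> result = new ArrayList<String>();
        while (m.find()) {
            // Strip the escaping backslash, keeping the character it escaped.
            result.add(m.group().replaceAll("\\\\(.)", "$1"));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String[] test = {
            "One|Two|Three",
            "One\\|Two\\|Three",
            "One\\\\|Two\\|Three",
            "One\\\\\\|Two"
        };
        for (String s : test) {
            System.out.printf("%n%s%n%s%n", s, splitIt(s));
        }
    }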

Output:

    One|Two|Three
    [One, Two, Three]

    One\|Two\|Three
    [One|Two|Three]

    One\\|Two\|Three
    [One\, Two|Three]

    One\\\|Two
    [One\|Two]
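
For comparison, here is a minimal sketch of the lookbehind-based split() that the answer advises against. The class name LookbehindSplitDemo and the method naiveSplit are made up for this illustration, and the pattern is only one plausible way such a lookbehind might be written. It splits on any pipe that is not directly preceded by a backslash, so it cannot tell an escaped pipe from a pipe that follows an escaped backslash:

```java
import java.util.Arrays;
import java.util.List;

class LookbehindSplitDemo {
    // Hypothetical helper: split on a pipe that is NOT directly preceded by a backslash.
    static List<String> naiveSplit(String source) {
        return Arrays.asList(source.split("(?<!\\\\)\\|"));
    }

    public static void main(String[] args) {
        // Plain delimiters work as expected.
        System.out.println(naiveSplit("One|Two|Three"));        // [One, Two, Three]
        // Escaped pipes are not split on (but the backslashes are still in the value).
        System.out.println(naiveSplit("One\\|Two\\|Three"));    // [One\|Two\|Three]
        // The first pipe follows an escaped backslash, so it IS a delimiter,
        // yet the lookbehind sees a backslash before it and refuses to split.
        System.out.println(naiveSplit("One\\\\|Two\\|Three"));  // [One\\|Two\|Three]
    }
}
```

The third input should yield two values, but the lookbehind version returns one. Matching tokens with find(), as in splitIt above, avoids this because each backslash is consumed together with the character it escapes.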

Source: https://habr.com/ru/post/893909/

