RegEx syntax differences between Python and Java

I have a working regex in Python and am trying to convert it to Java. There seems to be a subtle difference in the implementation.

The regex is meant to match another regex (a regex literal). The regex in question:

/(\\.|[^[/\\\n]|\[(\\.|[^\]\\\n])*])+/([gim]+\b|\B) 

One of the lines that causes problems: /\s+/;

The regex must not match the trailing ; . In Python the regex works correctly (it does not match the trailing ;), but in Java the match includes the ; .

Question(s):

  • What can I do to get this regex working in Java?
  • Based on what I read here, there should be no difference for this regex. Is there a list somewhere of the differences between the regex implementations in Python and Java?
2 answers

Java parses regular expressions differently from Python in a small set of cases. In this particular case, the nested [ caused the problem. In Python you do not need to escape a [ inside a character class, but in Java you do.
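
To illustrate why the unescaped [ matters, here is a minimal sketch of how Java treats a [ inside a character class: it opens a nested class that is merged into the outer one (a union) instead of being a literal character. The class name NestedClassDemo and the helper fullMatch are hypothetical names used only for this illustration.

```java
import java.util.regex.Pattern;

class NestedClassDemo {
    // Hypothetical helper: true if the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // In Java, [ab[cd]] is the union of [ab] and [cd], i.e. [abcd].
        System.out.println(fullMatch("[ab[cd]]", "c"));   // true
        // The inner [ is syntax here, not a literal member of the class.
        System.out.println(fullMatch("[ab[cd]]", "["));   // false
        // Escaping it makes [ a literal member of the class.
        System.out.println(fullMatch("[ab\\[cd]", "["));  // true
    }
}
```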

Original RegEx (for Python):

 /(\\.|[^[/\\\n]|\[(\\.|[^\]\\\n])*])+/([gim]+\b|\B) 

Fixed RegEx (for Java and Python):

 /(\\.|[^\[/\\\n]|\[(\\.|[^\]\\\n])*\])+/([gim]+\b|\B) 
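
As a quick check, the fixed regex can be tried in Java against the problem line from the question. This is only a sketch: the class name RegexLiteralDemo and the helper firstMatch are made up for the example, and every backslash of the regex is doubled because it sits inside a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexLiteralDemo {
    // The fixed regex, with each backslash doubled for the Java string literal.
    static final Pattern REGEX_LITERAL = Pattern.compile(
        "/(\\\\.|[^\\[/\\\\\\n]|\\[(\\\\.|[^\\]\\\\\\n])*\\])+/([gim]+\\b|\\B)");

    // Hypothetical helper: returns the first match in the input, or null if there is none.
    static String firstMatch(String input) {
        Matcher m = REGEX_LITERAL.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("/\\s+/;"));    // prints /\s+/
        System.out.println(firstMatch("/ab+c/gi;"));  // prints /ab+c/gi
    }
}
```

In both cases group 0 stops before the trailing ; .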

The obvious difference between Java and Python is that in Java you need to escape a lot of characters (for example, every backslash has to be doubled inside a string literal).

In addition, you are probably running into a mismatch between the matching methods rather than a difference in the regex notation itself:

Given the following Java code:

 String regex, input; // initialized to something
 Matcher matcher = Pattern.compile( regex ).matcher( input );
  • Java matcher.matches() (also Pattern.matches( regex, input ) ) matches the entire string. The Python equivalent is re.fullmatch( regex, input ) (Python 3.4 and later); in older versions, much the same result can be achieved using re.match( regex, input ) with a regex that ends in $ .
  • Java matcher.find() and Python re.search( regex, input ) match anywhere in the string.
  • Java matcher.lookingAt() and Python re.match( regex, input ) match only at the beginning of the string.
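
The difference between the three Java methods can be seen side by side in a small sketch. The pattern \d+ and the class name MatchMethodsDemo are chosen only for illustration; the comments name the closest Python call for each method.

```java
import java.util.regex.Pattern;

class MatchMethodsDemo {
    static final Pattern DIGITS = Pattern.compile("\\d+");

    // A fresh Matcher is created per call so no state carries over between checks.
    static boolean matches(String input)   { return DIGITS.matcher(input).matches(); }   // whole string
    static boolean lookingAt(String input) { return DIGITS.matcher(input).lookingAt(); } // like Python re.match
    static boolean find(String input)      { return DIGITS.matcher(input).find(); }      // like Python re.search

    public static void main(String[] args) {
        for (String input : new String[] {"123", "123abc", "abc123"}) {
            System.out.println(input
                + " matches=" + matches(input)
                + " lookingAt=" + lookingAt(input)
                + " find=" + find(input));
        }
        // 123    matches=true  lookingAt=true  find=true
        // 123abc matches=false lookingAt=true  find=true
        // abc123 matches=false lookingAt=false find=true
    }
}
```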

For more information, read the Java Matcher documentation and compare it with the Python documentation.

Since you said that this is not the problem, I decided to run a test: http://ideone.com/6w61T It seems that Java does exactly what you need (group 0, the full match, does not contain the ; ). Your problem is elsewhere.


Source: https://habr.com/ru/post/915149/

