Regex in Java and its performance compared to indexOf

Can someone please tell me how to match "_" and a period "." exactly once per line using a regex? Also, is it more efficient to use indexOf() instead of a regular expression?

String s = "Hello_Wor.ld" or s = "12323_!£££$.asdfasd"

Basically, any number of characters can appear before and after the _ and the . ; the only requirement is that the whole line contains exactly one occurrence of _ and exactly one occurrence of .

2 answers

indexOf will be much faster than regex, and probably also easier to understand.

Just check that indexOf('_') >= 0 , and then that indexOf('_', indexOfFirstUnderScore + 1) < 0 . Do the same for the period.

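    private boolean containsOneAndOnlyOne(String s, char c) {
        // Position of the first occurrence, or -1 if c is absent.
        int firstIndex = s.indexOf(c);
        if (firstIndex < 0) {
            return false;
        }
        // Search again, starting just past the first occurrence.
        int secondIndex = s.indexOf(c, firstIndex + 1);
        return secondIndex < 0;
    }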
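
To answer the original question, the helper can be called once per character. The following is only a sketch: the class name SingleOccurrenceCheck and the method hasOneUnderscoreAndOnePeriod are made up for illustration, and the helper is made static so it can be called from main.

```java
final class SingleOccurrenceCheck {

    // True if c occurs exactly once in s (same logic as the helper above).
    static boolean containsOneAndOnlyOne(String s, char c) {
        int firstIndex = s.indexOf(c);
        if (firstIndex < 0) {
            return false;
        }
        return s.indexOf(c, firstIndex + 1) < 0;
    }

    // Hypothetical wrapper: exactly one '_' and exactly one '.' in s.
    static boolean hasOneUnderscoreAndOnePeriod(String s) {
        return containsOneAndOnlyOne(s, '_') && containsOneAndOnlyOne(s, '.');
    }

    public static void main(String[] args) {
        System.out.println(hasOneUnderscoreAndOnePeriod("Hello_Wor.ld"));        // true
        System.out.println(hasOneUnderscoreAndOnePeriod("12323_!£££$.asdfasd")); // true
        System.out.println(hasOneUnderscoreAndOnePeriod("a_b_c.d"));             // false
    }
}
```

Each call scans the string at most twice, and no pattern has to be compiled.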

This matches a string with exactly one . :

 /^[^.]*\.[^.]*$/ 

Same thing for _ :

 /^[^_]*_[^_]*$/ 

The combined regex should look something like this:

 /^(?:[^._]*\.[^._]*_[^._]*|[^._]*_[^._]*\.[^._]*)$/ 
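
In Java there are no /.../ delimiters, and Matcher.matches() already requires the whole input to match, so the ^ and $ anchors can be dropped. A possible translation of the combined regex, with the class and method names (CombinedRegexCheck, isValid) chosen only for this sketch:

```java
import java.util.regex.Pattern;

final class CombinedRegexCheck {

    // Two alternatives: '.' before '_', or '_' before '.'.
    // No other '.' or '_' is allowed anywhere in the input.
    private static final Pattern ONE_DOT_AND_ONE_UNDERSCORE = Pattern.compile(
            "(?:[^._]*\\.[^._]*_[^._]*|[^._]*_[^._]*\\.[^._]*)");

    static boolean isValid(String s) {
        // matches() succeeds only if the entire string fits the pattern.
        return ONE_DOT_AND_ONE_UNDERSCORE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("Hello_Wor.ld")); // true
        System.out.println(isValid("a.b_c"));        // true
        System.out.println(isValid("a..b_c"));       // false
    }
}
```

The pattern is compiled once into a static field, which avoids recompiling it on every call.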

It should now be obvious that indexOf is the better solution because it is simpler (performance does not matter until it has been shown to be a bottleneck).

If you are interested, notice how the combined regular expression has two alternatives, one for "a string with one . before one _ " and one for the reverse order. It would need six alternatives for three characters, and n! for n. It would be easier to run both single-character regular expressions and AND the results than to use a combined regular expression.
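
Running the two simple expressions separately and combining the results could look like the sketch below. The names TwoRegexCheck, ONE_DOT, ONE_UNDERSCORE and isValid are assumptions made for this example.

```java
import java.util.regex.Pattern;

final class TwoRegexCheck {

    // Exactly one '.', anything else around it.
    private static final Pattern ONE_DOT = Pattern.compile("[^.]*\\.[^.]*");
    // Exactly one '_', anything else around it.
    private static final Pattern ONE_UNDERSCORE = Pattern.compile("[^_]*_[^_]*");

    static boolean isValid(String s) {
        // Both conditions must hold; the order of '.' and '_' no longer matters.
        return ONE_DOT.matcher(s).matches() && ONE_UNDERSCORE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("Hello_Wor.ld")); // true
        System.out.println(isValid("a_b_c.d"));      // false
    }
}
```

Adding a third required character means adding one more pattern, not four more alternatives.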

You should always look for a simpler solution when using regular expressions.


Source: https://habr.com/ru/post/901649/