Incremental regular expression (regex) matching in Java?

Is there a way, or an efficient library, to match regular expressions incrementally in Java?

What I mean is that I would like to have an OutputStream to which I can send a couple of bytes at a time, and which keeps track of whether the data so far is consistent with the regular expression. If a byte arrives that makes it impossible for the regex to match, I would like the stream to tell me so. Otherwise, it should report the current best match, if any.

I understand that this is likely to be an extremely complex and not well-defined problem, because you can imagine regular expressions that may match the whole input, or any part of it, or that cannot be decided until the stream is closed. Even something as trivial as .* can match H, He, Hel, Hell, Hello, etc. In that case, I would like the stream to say: "Yes, this expression would match if the input ended now, and here are the groups it would return."

But if Pattern internally walks through the string and matches it character by character, maybe this is not so difficult?

1 answer

Incremental matching can be achieved by building a finite state machine that corresponds to the regular expression and performing state transitions on it as each input character is processed. Most lexers work this way. However, this approach does not handle capturing groups well.
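To make the state-machine idea concrete, here is a minimal sketch of a hand-built automaton that is fed one character at a time. The pattern `ab*c`, the class name `IncrementalDfa`, and its methods are all made up for illustration; a real solution would generate the transition table from the regex instead of hard-coding it.

```java
// Hand-built DFA for the illustrative pattern ab*c (an assumption, not taken from the question).
final class IncrementalDfa {
    private static final int START = 0, AFTER_A = 1, ACCEPT = 2, DEAD = -1;
    private int state = START;

    /** Consumes one character; returns false once no match is possible any more. */
    boolean feed(char c) {
        switch (state) {
            case START:
                state = (c == 'a') ? AFTER_A : DEAD;
                break;
            case AFTER_A:
                if (c == 'c') {
                    state = ACCEPT;
                } else if (c != 'b') {
                    state = DEAD;
                }
                break;
            default:
                // Any input after ACCEPT, or while already DEAD, cannot match ab*c.
                state = DEAD;
        }
        return state != DEAD;
    }

    /** True if the input seen so far would match if the stream ended now. */
    boolean matchesNow() { return state == ACCEPT; }

    /** True if no continuation of the input can ever match. */
    boolean isDead() { return state == DEAD; }

    public static void main(String[] args) {
        IncrementalDfa dfa = new IncrementalDfa();
        for (char c : "abbc".toCharArray()) {
            boolean alive = dfa.feed(c);
            System.out.println(c + " -> alive=" + alive + ", matchesNow=" + dfa.matchesNow());
        }
    }
}
```

Each call to `feed` is a single constant-time transition, which is why this style suits streams. Note that the automaton only answers "match / still possible / impossible"; it keeps no information about where groups begin and end.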

So perhaps you could do this in two parts: have one matcher that determines whether there is a match at all, or any chance of a match in the future. You can use this to get a quick answer after each input character. Once you have a complete match, you can run a conventional backtracking regex engine with group support to determine the corresponding groups. In some cases it might also be feasible to encode the grouping information into the automaton itself, but I cannot come up with a general way of doing this.
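A rough sketch of the two-part idea using only `java.util.regex` might look like the following. It is not the automaton-based first stage described above: it swaps in the standard backtracking engine for both stages and relies on `Matcher.hitEnd()` to tell "could still match with more input" apart from "can no longer match". The class name `IncrementalRegex` and its `Status` enum are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: buffers the input and re-checks it after every chunk.
final class IncrementalRegex {
    enum Status { MATCH, POSSIBLE, IMPOSSIBLE }

    private final Pattern pattern;
    private final StringBuilder buffer = new StringBuilder();
    private Matcher lastMatch; // non-null only while the buffered input fully matches

    IncrementalRegex(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    /** Appends a chunk and reports whether the input so far matches, may match later, or never will. */
    Status feed(CharSequence chunk) {
        buffer.append(chunk);
        // Match against an immutable snapshot so saved groups stay valid.
        Matcher m = pattern.matcher(buffer.toString());
        if (m.matches()) {
            lastMatch = m;
            return Status.MATCH;
        }
        lastMatch = null;
        // hitEnd() is true when the engine ran out of input while trying to match,
        // i.e. more characters could still turn this into a match.
        return m.hitEnd() ? Status.POSSIBLE : Status.IMPOSSIBLE;
    }

    /** Group of the current full match, or null if the input does not match right now. */
    String group(int index) {
        return lastMatch == null ? null : lastMatch.group(index);
    }

    public static void main(String[] args) {
        IncrementalRegex r = new IncrementalRegex("(\\d+)-(\\d+)");
        for (String chunk : new String[] {"12", "-", "3"}) {
            System.out.println("\"" + chunk + "\" -> " + r.feed(chunk));
        }
        System.out.println("groups: " + r.group(1) + ", " + r.group(2));
    }
}
```

The obvious cost is that the whole buffer is rescanned on every chunk, so this is quadratic in the worst case. Replacing the check in `feed` with a real automaton and keeping `Pattern` only for extracting groups after a full match would be closer to what the answer proposes.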


Source: https://habr.com/ru/post/1438700/