Why isn't the longer alternative in an alternation matched?

I am using Ruby 2.1, but the same thing can be reproduced on the Rubular website.

If this is my string:

儘管中國婦幼衛生監測辦公室制定的 

And I execute a regex with this expression:

 (中國婦幼衛生監測辦公室制定|管中) 

I expect to get the longer alternative as the match:

 中國婦幼衛生監測辦公室制定 

Instead, I get the second alternative as the match.

As far as I can tell, it works as expected with non-Chinese characters.

If this is my string:

 foobar 

And I use this regex:

 (foobar|foo) 

The match is foobar. If the alternatives are in the opposite order, the matched string is foo. That makes sense to me.
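For reference, here is a minimal Ruby sketch that reproduces both cases with String#[]. The variable names are only illustrative; the strings and patterns are the ones quoted above.

```ruby
# Reproduce the examples from the question with String#[] and a Regexp.
text = "儘管中國婦幼衛生監測辦公室制定的"

# The long alternative is listed first, yet the short one is returned.
chinese_match = text[/(中國婦幼衛生監測辦公室制定|管中)/]
puts chinese_match   # prints 管中

# With "foobar", the order of the alternatives decides the result.
foobar_first = "foobar"[/(foobar|foo)/]
foo_first    = "foobar"[/(foo|foobar)/]
puts foobar_first    # prints foobar
puts foo_first       # prints foo
```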

1 answer

Your assumption that the regex prefers the longer alternative is incorrect.

If you have some time, let's see how your regular expression works ...

Quick refresher on how a regex works: the state machine always reads from left to right, backtracking where necessary.

There are two pointers, one on the pattern:

 (cdefghijkl|bcd) 

The other on your string:

 abcdefghijklmnopqrstuvw 

The pointer on the string moves from left to right. As soon as the engine can return a match, it will:

[image: x] (source: gyazo.com)

Let me lay this out as a more "sequential" view to make it easier to follow:

[image: y] (source: gyazo.com)
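Roughly, the walk with the two pointers can be sketched in Ruby like this. The helper first_match is a hypothetical name of mine and handles only literal alternatives; it is a simplification for illustration, not how the real engine is implemented.

```ruby
# Hypothetical helper: mimics a backtracking engine for a pattern made
# only of literal alternatives, such as (cdefghijkl|bcd).
def first_match(text, alternatives)
  # Outer loop: the string pointer advances from left to right.
  (0...text.length).each do |start|
    # Inner loop: at each position the alternatives are tried in order.
    alternatives.each do |alt|
      return [start, alt] if text[start, alt.length] == alt
    end
  end
  nil # no alternative matched at any position
end

text = "abcdefghijklmnopqrstuvw"
result = first_match(text, ["cdefghijkl", "bcd"])
p result             # prints [1, "bcd"]

# The real engine agrees: the match starts at index 1 and is "bcd".
m = text.match(/(cdefghijkl|bcd)/)
p [m.begin(0), m[0]] # prints [1, "bcd"]
```

At index 1 the first alternative fails on "b", the second succeeds, and the search stops before index 2 is ever tried, which is where cdefghijkl would have started.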

Your foobar example is a different matter. As I mentioned in this post:

How a regex works: the state machine always reads from left to right. The pattern ,|,, is equivalent to , since only the first alternative will ever match.
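A quick way to check that claim in Ruby (the sample input is made up for illustration):

```ruby
# The second alternative of ,|,, is never reached, because a single
# comma already matches wherever two commas would.
input = "a,,b"

with_alternation = input.scan(/,|,,/)
plain            = input.scan(/,/)

p with_alternation   # prints [",", ","]
p plain              # prints [",", ","]
```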

That's fine, Uniedr, but how do I force it to prefer the first alternative?

Look!

 ^(?:.*?\Kcdefghijkl|.*?\Kbcd) 

There is a regex demo here.

This regex first tries to find the first alternative anywhere in the string. Only if that fails completely does it try the second alternative. \K is used here so that the reported match contains only what follows the \K construct.


*: \K is supported in Ruby since 2.0.0.
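In Ruby this can be tried as follows. The second input string and the Chinese version of the pattern are my own adaptations, added only to show the fallback and the original case from the question.

```ruby
pattern = /^(?:.*?\Kcdefghijkl|.*?\Kbcd)/

# First alternative is present: it wins even though "bcd" starts earlier.
both = "abcdefghijklmnopqrstuvw"[pattern]
puts both      # prints cdefghijkl

# First alternative is absent: the second one is used as a fallback.
fallback = "abcdxyz"[pattern]
puts fallback  # prints bcd

# The same construction applied to the string from the question.
chinese = "儘管中國婦幼衛生監測辦公室制定的"[/^(?:.*?\K中國婦幼衛生監測辦公室制定|.*?\K管中)/]
puts chinese   # prints 中國婦幼衛生監測辦公室制定
```

Note that . does not match a newline by default, so this sketch assumes single-line input.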


Ah, I was bored, so I optimized the regex:

 ^(?:(?:(?!cdefghijkl)c?[^c]*)++\Kcdefghijkl|(?:(?!bcd)b?[^b]*)++\Kbcd) 

You can see the demo here.
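As a sanity check, the two versions can be compared on a few inputs in Ruby. The sample strings other than the first are made up for this sketch.

```ruby
simple    = /^(?:.*?\Kcdefghijkl|.*?\Kbcd)/
optimized = /^(?:(?:(?!cdefghijkl)c?[^c]*)++\Kcdefghijkl|(?:(?!bcd)b?[^b]*)++\Kbcd)/

samples = ["abcdefghijklmnopqrstuvw", "abcdxyz", "no match here"]

# Collect [simple result, optimized result] for every sample.
results = samples.map { |s| [s[simple], s[optimized]] }
results.each { |pair| p pair }
```

Both patterns agree on these inputs. Since ++ requires at least one iteration of the group, an input that begins with the target itself (for example a string starting with cdefghijkl) is worth testing separately before relying on the optimized form.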


Source: https://habr.com/ru/post/1201174/

