Are regular expressions (regular expression) really regular?

I understand how regular expressions got their name and read a question related to it ( Why are regular expressions called "regular" expressions?), But I'm still curious about regular expressions.

For example, how can backlinks be regular? Does it not require some memory and, therefore, it is impossible to combine / generate using a state machine?

+5
source share
2 answers

The link is in the answer to your question (in wikipedia), unlike many regular expression engines provided by modern programming languages, which are complemented by functions that allow you to recognize languages ​​that cannot be expressed by a classic regular expression.

So, I would say that the evolution of regular expression has moved it away from the original idea of ​​expressing regular languages.

From the Wikipedia article on regular expressions :

Many of the features found in almost all modern library regular expressions provide expressive power that far exceeds languages. For example, many implementations allow you to group subexpressions with parentheses and remind the meaning that they match the same expression (backlinks). This means that, among other things, the pattern can correspond to lines of repeating words like "dad" or "WikiWiki", called squares in formal language theory. The pattern for these lines is (.+)\1 .

+4
source

Modern extensions, including backlinks, make regular expressions not candidates for ordinary languages, but IMOs can be raised to context-free languages, but not with Turing machines.

Regular grammars have a common property called the pumping lemma. You can check the example here , which proves that 0 n 1 n is not a regular grammar (which is very similar to backlinks). Here's how to show that backlinks do not satisfy the property of the pump lemma.

  • Pumpdown lemma in the current context: to show that the regular expression system is a regular grammar, there must be a finite length p such that all lines corresponding to the regular expression and having a length equal to or greater than p can be split at the top in three substrings xyz such that y is not an empty string, and all the strings represented by xy * z (y downloaded for [0, infinitely)) match the regular expression.

  • If we cannot show that such p can satisfy the conditions for a regular expression, then it is not included in the correct grammar.

  • For backlinks, we will need to have two of these pumping strings that are of equal length, one for the submatrix in the captured group and one in the backlink. This is exactly what automatic or context-free languages ​​click on. There is also the problem of swapping for context-free grammars, based on splitting on uvwxy, where v and x can be pumped the same n times. We can show that a regular expression with a reverse reference system satisfies this lemma.

+2
source

Source: https://habr.com/ru/post/1245978/


All Articles