What does "two-level regular expressions" mean?

Question

I understand basic regex, but I don't know what the quote below means (it is about how to implement a wiki parser). Can anyone provide some pseudocode to enlighten me?

Two-Level Regular Expressions
This is a very popular approach. This is pretty fast because it scans the source exactly two times.
The idea is to create two kinds of regular expressions: one splits the text into blocks of different types (paragraphs, headings, lists, preformatted blocks, etc.), and each block is then processed using its own character-level regular expression.

Quote: http://www.wikicreole.org/wiki/CommonWikiParsingTechniques

+4

regex

user1154337 Jan 17 '12 at 16:16


2 answers

"Two-level regular expressions" seems to be a (slightly ambiguous) term for what I have recommended in a few answers here on Stack Overflow for parsing a slightly complex (but still regular) language.

An example is getting all the img src= URLs from an HTML page. It is possible (but rather messy) to do this in a single regex; it makes more sense to use one regular expression to get all the <img> tags (capturing the entire tag), and then use another regular expression to get src="http://some-url-here.com" from each match. This makes the code more readable, and you only look at the text twice.
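A minimal Python sketch of that img example, under some assumptions: the names IMG_TAG, SRC_ATTR and image_urls are made up for illustration, attribute values are assumed to be quoted, and this is not meant as a robust HTML parser.

```python
import re

# Level 1: find each whole <img ...> tag in the page.
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
# Level 2: pull the quoted src value out of one tag.
# (Illustrative only: it would also accept attributes such as data-src.)
SRC_ATTR = re.compile(r"""\bsrc\s*=\s*["']([^"']*)["']""", re.IGNORECASE)


def image_urls(html):
    """Return the src value of every <img> tag that has one."""
    urls = []
    for tag in IMG_TAG.findall(html):   # first pass over the page
        match = SRC_ATTR.search(tag)    # second pass, one tag at a time
        if match:
            urls.append(match.group(1))
    return urls


sample = '<p><img src="http://some-url-here.com" alt="x"><img alt="no source"></p>'
print(image_urls(sample))
```

Each pattern stays short because it only has to solve one problem: the first does not care about attributes, and the second never sees anything outside a single tag.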

+3

Platinum azure Jan 17 '12 at 16:33


Andrew Barber · Accepted Answer · 2012-01-17T16:20:02+0000

This means that you don't try to perform several tasks in one regular expression, but divide the work into two tasks (two levels): first splitting the text, then processing each token separately.

My opinion is that people often try to do too much with regex alone, instead of making things a lot easier by breaking up the work into separate tasks like this.
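Applied to the wiki parser from the question, the two levels might look roughly like the Python sketch below. The markup rules are a simplified, Creole-like assumption (= for headings, * for list items, ** for bold, // for italic), and BLOCK, INLINE, inline and render are hypothetical names. It does no HTML escaping and would treat a line that starts with bold text as a list item.

```python
import re

# Level 1: one alternation that splits the source into typed blocks.
BLOCK = re.compile(
    r"(?P<heading>^=+[ \t]*.+$)"
    r"|(?P<list>(?:^\*[ \t]*.+\n?)+)"
    r"|(?P<paragraph>(?:^(?![=*]).+\n?)+)",
    re.MULTILINE,
)
# Level 2: character-level markup inside a single block.
INLINE = re.compile(r"\*\*(?P<bold>.+?)\*\*|//(?P<italic>.+?)//")


def inline(text):
    """Second level: convert bold and italic markers inside one block."""
    def repl(m):
        if m.group("bold") is not None:
            return "<b>" + m.group("bold") + "</b>"
        return "<i>" + m.group("italic") + "</i>"
    return INLINE.sub(repl, text)


def render(source):
    """First level: classify each block, then hand its text to inline()."""
    out = []
    for m in BLOCK.finditer(source):
        kind, text = m.lastgroup, m.group().strip()
        if kind == "heading":
            level = len(text) - len(text.lstrip("="))
            out.append("<h%d>%s</h%d>" % (level, inline(text.strip("= \t")), level))
        elif kind == "list":
            # Every line of a list block starts with "*", so drop that marker.
            items = [inline(line[1:].strip()) for line in text.splitlines()]
            out.append("<ul>" + "".join("<li>%s</li>" % i for i in items) + "</ul>")
        else:
            out.append("<p>" + inline(" ".join(text.splitlines())) + "</p>")
    return "\n".join(out)


source = (
    "= Title =\n\n"
    "Some **bold** and //italic// text\nspanning two lines.\n\n"
    "* first\n* second **item**\n"
)
print(render(source))
```

The block pattern reads the whole source once, and the inline pattern reads each block once, which is where the "scans the source exactly two times" claim in the quote comes from.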
