How do I make a match at the first occurrence?

I need to parse some BBCode with a Ruby regex.

I need to delimit the elements with the match method, and I use /pattern/m so that the pattern can span newlines.

For example, my BBCode is in the string:

 s="[b]Title[/b] \n Article text \n [b]references[/b]" 

Then I use match to delimit the parts of the text, especially the title and references sections, which are between [b] and [/b]:

 t=s.match(/\[b\](.*)\[\/b\]/m) 

I use (...) to capture a string in the regexp, and \ to escape the special characters [ and ]. The /m flag makes . match newlines as well.

Then t[1] contains:

 "Title[/b] \n Article text \n [b]references" 

instead of "Title", because the match does not stop at the first occurrence of [/b]. And t[2] is nil instead of "references" for the same reason.

How can I separate the parts of the text enclosed between ordinary BBCode tags?

2 answers

Use the non-greedy (lazy) quantifier, written by adding ? after *, in the following way:

 t=s.match(/\[b\](.*?)\[\/b\]/m) 
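
To collect every [b]...[/b] section rather than only the first, the same lazy pattern can be passed to String#scan. This is a minimal sketch that uses the sample string from the question; the names bold_re, first and sections are made up for illustration.

```ruby
# Sample input from the question (double quotes, so \n is a real newline).
s = "[b]Title[/b] \n Article text \n [b]references[/b]"

# Lazy quantifier: each match stops at the nearest closing [/b].
# The /m flag lets . match newlines as well.
bold_re = /\[b\](.*?)\[\/b\]/m

# match returns only the first occurrence.
first = s.match(bold_re)
puts first[1]            # Title

# scan returns one array of capture groups per match.
sections = s.scan(bold_re).flatten
p sections               # ["Title", "references"]
```

The pattern has a single capture group, so t[2] is always nil with match; scan is what reaches the second section.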

If you are sure that you will not encounter an opening square bracket between your bbcode tags, you can use a character class that excludes it:

 t=s.match(/\[b\]([^\[]*)\[\/b\]/) 
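
As a rough illustration of that trade-off, the sketch below runs the character-class pattern on the question's string and on a string with a nested tag. The nested sample and the variable names are assumptions made up for the example.

```ruby
# Negated character class: the capture may contain anything except "[".
# No /m flag is needed, because [^\[] already matches newlines.
bold_re = /\[b\]([^\[]*)\[\/b\]/

plain  = "[b]Title[/b] \n Article text \n [b]references[/b]"
nested = "[b]Some [i]nested[/i] text[/b]"   # hypothetical input with an inner tag

p plain.scan(bold_re).flatten   # ["Title", "references"]

# The inner [i] tag contains "[", so the capture cannot cross it.
p nested.match(bold_re)         # nil
```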

But if your [b] tags may contain other tags, you need to use a recursive pattern:

 t=s.match(/(?x)
   # definitions
   (?<tag> \[ (?<name> \w++ ) [^\]]* \] (?> [^\[]+ | \g<tag> )* \[\/\g<name>\] ){0}
   # main pattern
   \[b\] (?<content> (?> [^\[]+ | \g<tag> )* ) \[\/b\]
 /) 

And if you need to deal with self-closing tags (such as [img] or [hr], which have no closing counterpart):

 t=s.match(/(?x)
   # definitions
   (?<self> \[ (?:img|hr)\b [^\]]* \] ){0}
   (?<tag> \[ (?<name> \w++ ) [^\]]* \] (?> [^\[]+ | \g<self> | \g<tag> )* \[\/\g<name>\] ){0}
   # main pattern
   \[b\] (?<content> (?> [^\[]+ | \g<self> | \g<tag> )* ) \[\/b\]
 /) 

Note: {0} lets you define named subpatterns without matching anything at the point of definition; they can then be called later in the pattern with \g<name>.
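
A much smaller pattern may make the {0} idiom easier to see. In this sketch the group names open, close and content are hypothetical; the two definitions consume nothing where they are written and are only executed through the \g<...> calls.

```ruby
bold_re = /
  # definitions: {0} means "repeat zero times", so nothing is matched here
  (?<open>  \[b\]   ){0}
  (?<close> \[\/b\] ){0}
  # main pattern: \g<name> runs the subpattern defined above
  \g<open> (?<content> [^\[]* ) \g<close>
/x

m = bold_re.match("[b]Title[/b]")
puts m[:content]   # Title
```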


Source: https://habr.com/ru/post/1484312/