Ruby Regex, only one capture (very simple!)

This is probably a stupid mistake on my part, but the following returns an array containing only "M":

/(.)+?/.match("Many many characters!").captures => ["M"] 

Why doesn't it return an array containing each character? I must have missed something blatantly obvious, because I can't see what's wrong with this.

Edit: just realized I don't need the +?, but it still doesn't work without it.

Edit: Apologies! I'll clarify: my goal is to let users enter a regular expression, a style, and an input text file. Wherever there is a match, the text will be surrounded with an HTML element and the style will be applied. I'm not just splitting the string into characters; I only used that regular expression because it was the simplest, although that was stupid of me. How do I get capture groups from scan(), or is that not possible? I see that $1 contains "!" (the last match?) and none of the others.

Edit: God, this is really not my day. As was pointed out to me, the captures are stored in separate arrays. How do I get the offset of these captures in the original string? I would like to be able to get the offset of each capture and then surround it with another string. Or is that what gsub is for? (I thought it only replaced the whole match, not a capture group.)

Hopefully the final edit: Right, let me start this again :P

So I have a string. The user will use a configuration file to enter a regular expression, and then a style associated with each capture group. I need to be able to scan the entire string and get the start and end, or the offset and size, of each group match.

So, if the user has configured ([\w-\.]+)@((?:[\w]+\.)+)([a-zA-Z]{2,4}) (email address), I should get:

 [ ["elliotpotts", 0, 11], ["sample.", 12, 7], ["com", 19, 3] ] 

from the string: "elliotpotts@sample.com"

If that isn't clear, then something is wrong with me :P. Thanks so much, guys, and thank you for being so patient!

+6
4 answers

Because your capture group matches only a single character: the group is just the single . and (.)+ is not the same as (.+)

 >> /(.)+?/.match("Many many characters!").captures
 => ["M"]
 >> /(.+)?/.match("Many many characters!").captures
 => ["Many many characters!"]
 >> /(.+?)/.match("Many many characters!").captures
 => ["M"]

If you want to match each character in turn, use String#scan, or String#split if you don't need capture groups.
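To make the difference concrete, here is a small sketch (the sample string "Many" is only for illustration). When a quantifier is applied to a group, the group is still a single capture, and it keeps only the text from its last repetition:

```ruby
# A repeated group is still ONE group; it remembers only its last iteration.
outside = /(.)+/.match("Many")
last_iteration = outside.captures   # the group matched M, a, n, y in turn

# With the quantifier inside the group, the whole run is captured instead.
inside = /(.+)/.match("Many")
whole_run = inside.captures

p last_iteration  # => ["y"]
p whole_run       # => ["Many"]
```

With the lazy +? from the question, the repetition stops after one character, which is why the capture there is "M".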

Using scan:

 "Many many characters!".scan(/./) #=> ["M", "a", "n", "y", " ", "m", "a", "n", "y", " ", "c", "h", "a", "r", "a", "c", "t", "e", "r", "s", "!"] 

Note that another answer uses (.) instead. That is fine if you care about the capture group, but a little pointless if you don't, because scan will then return EVERY CHARACTER in its own separate array, for example:

 [["M"], ["a"], ["n"], ["y"], [" "], ["m"], ["a"], ["n"], ["y"], [" "], ["c"], ["h"], ["a"], ["r"], ["a"], ["c"], ["t"], ["e"], ["r"], ["s"], ["!"]] 

Otherwise, just use split: "Many many characters!".split('')

EDIT In response to your edit:

 reg = /([\w-\.]+)@((?:[\w]+\.)+)([a-zA-Z]{2,4})/
 str = "elliotpotts@sample.com"
 str.scan(reg).flatten.map { |capture| [capture, str.index(capture), capture.size] }
 #=> [["elliotpotts", 0, 11], ["sample.", 12, 7], ["com", 19, 3]]

Oh, and you don't need scan here: you aren't really scanning, so there is no need to walk through the whole string, at least not with the example you gave:

 str.match(reg).captures.map { |capture| [capture, str.index(capture), capture.size] }

will also work.
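One caveat with str.index(capture): it returns the first place the captured text occurs in the string, which is not necessarily where the group matched. A minimal sketch of an alternative using MatchData#begin, which reports each group's own start position, might look like this. The helper name group_spans and the second test string are made up for the example, and the pattern is the one above with the hyphen escaped, which I assume to be equivalent:

```ruby
# Hypothetical helper: [text, start, length] for each capture group of one match.
def group_spans(match)
  (1...match.size).map do |i|
    text = match[i]
    # A group that took no part in the match is nil; report nils for it.
    text.nil? ? [nil, nil, nil] : [text, match.begin(i), text.size]
  end
end

reg = /([\w\-.]+)@((?:\w+\.)+)([a-zA-Z]{2,4})/

p group_spans("elliotpotts@sample.com".match(reg))
# => [["elliotpotts", 0, 11], ["sample.", 12, 7], ["com", 19, 3]]

# "com" appears twice here, so String#index would report 0 for the last group.
p group_spans("com@mail.com".match(reg))
# => [["com", 0, 3], ["mail.", 4, 5], ["com", 9, 3]]
```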

+9

Yes, something important was missed ;-)

(...) introduces only ONE capture group: the number of times the group matches does not matter, since the group's index is determined only by the regular expression itself, not by the input.

The key is a "global regular expression", which applies the regular expression repeatedly, in order, across the input. In Ruby this is done by switching from Regexp#match to String#scan (many other languages have a "/g" regular expression modifier instead):

 "Many many characters!".scan(/(.)+?/)
 # but more simply (or see answers using String#split)
 "Many many characters!".scan(/(.)/)
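As a small illustration of that point (the string and patterns are invented for the example): the pattern fixes how many captures each result has, while scan determines how many results there are.

```ruby
# Two groups in the pattern -> every result has exactly two entries.
pairs = "a1b2c3".scan(/([a-z])(\d)/)
p pairs      # => [["a", "1"], ["b", "2"], ["c", "3"]]

# No groups in the pattern -> scan returns the whole matches as plain strings.
wholes = "a1b2c3".scan(/[a-z]\d/)
p wholes     # => ["a1", "b2", "c3"]
```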

Happy coding

+1

It returns only one character because that is all you asked it to match. You probably want to use scan:

 str = "Many many characters!"
 matches = str.scan(/(.)/)
0

The following code is taken from "Get Index of String Scan Results in ruby" and modified to my liking. Note that offset(0) returns both the start and the (exclusive) end of the match.

 [].tap { |results|
   "abab".scan(/a/) { |capture|
     results.push([capture, Regexp.last_match.offset(0)].flatten)
   }
 }
 => [["a", 0, 1], ["a", 2, 3]]
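The same idea can be extended to every capture group of every match, which seems to be what the question ultimately asks for. This is only a sketch: the helper name capture_spans is hypothetical, and it assumes you want [capture, start, length] triples and want to skip groups that did not take part in a match.

```ruby
# Hypothetical helper: for every match of `pattern` in `text`, collect
# [capture, start, length] for each capture group.
def capture_spans(text, pattern)
  spans = []
  text.scan(pattern) do
    m = Regexp.last_match          # MatchData for the match just found
    (1...m.size).each do |i|
      next if m[i].nil?            # this group did not take part in the match
      spans << [m[i], m.begin(i), m[i].size]
    end
  end
  spans
end

p capture_spans("abab", /(a)(b)/)
# => [["a", 0, 1], ["b", 1, 1], ["a", 2, 1], ["b", 3, 1]]
```

Because the positions come from the MatchData rather than from searching the string again, repeated text in the input does not confuse the result.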
0

Source: https://habr.com/ru/post/898537/

