Ruby regex question: calling the sub method on a String with a block

I am working through the Ruby Koans tutorial (a great way to learn) and came across this statement:

assert_equal __, "one two-three".sub(/(t\w*)/) { $1[0, 1] } 

In this statement, __ is where I have to put the expected result so that the test passes. I looked at this for a while and worked out most of it, but I can't figure out what the last bit means:

 { $1[0, 1] } 

The expected answer is:

 "one t-three" 

and I expected:

 "tt" 
3 answers

{ $1[0, 1] } is a block containing the expression $1[0, 1]. That expression evaluates to the first character of the string $1, which holds the contents of the first capture group of the most recent successful regular expression match.

When sub is called with a regular expression and a block, it will find the first match of the regular expression, call the block, and then replace the matched substring with the result of the block.

So, "one two-three".sub(/(t\w*)/) { $1[0, 1] } searches for the pattern t\w*. This finds the substring "two". Since the whole pattern is inside the capture group, this substring is stored in $1. Now the block is called and returns "two"[0, 1], which is equal to "t". Therefore, "two" is replaced by "t", and you get "one t-three".
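
To watch those steps happen, here is a minimal sketch (assuming Ruby 1.9 or later; the variable names seen and result are only for illustration) that records what $1 holds each time the block runs:

```ruby
# Trace what the block sees and what it returns during the substitution.
seen = []

result = "one two-three".sub(/(t\w*)/) do
  seen << $1   # $1 holds the text of the first capture group
  $1[0, 1]     # start at index 0, take 1 character
end

puts seen.inspect   # ["two"]  (the block ran once, for the first match only)
puts result         # one t-three
```

The block's return value is what gets spliced into the string in place of the match, which is why "two" becomes "t" and the rest is left alone.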

It is important to note that sub, unlike gsub, replaces only the first occurrence of the pattern, not every occurrence.
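
A short side-by-side sketch of that difference, using the same pattern and block as the koan (the variable names are only for illustration):

```ruby
text = "one two-three"

# sub stops after the first match, so only "two" is replaced.
first_only = text.sub(/(t\w*)/) { $1[0, 1] }

# gsub keeps going, so "two" and "three" are both replaced.
every_match = text.gsub(/(t\w*)/) { $1[0, 1] }

puts first_only    # one t-three
puts every_match   # one t-t
```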


@sepp2k already gave a really good answer; I just wanted to add how you could use IRB to work it out yourself:

 >> "one two-three".sub(/(t\w*)/) { $1 } #=> "one two-three"
 >> "one two-three".sub(/(t\w*)/) { $1[0] } #=> "one t-three"
 >> "one two-three".sub(/(t\w*)/) { $1[1] } #=> "one w-three"
 >> "one two-three".sub(/(t\w*)/) { $1[2] } #=> "one o-three"
 >> "one two-three".sub(/(t\w*)/) { $1[3] } #=> "one -three"
 >> "one two-three".sub(/(t\w*)/) { $1[0,3] } #=> "one two-three"
 >> "one two-three".sub(/(t\w*)/) { $1[0,2] } #=> "one tw-three"
 >> "one two-three".sub(/(t\w*)/) { $1[0,1] } #=> "one t-three"

Drawing on the documentation ( http://ruby-doc.org/core/classes/String.html#M001185 ), here are the answers to your two questions: "Why is the return value "one t-three"?" and "What does { $1[0, 1] } mean?"

What does { $1[0, 1] } mean? The String#sub method can take either two arguments, or one argument and a block. The latter is the form used here, and it works just like the Integer#times method, which also takes a block:

 5.times { puts "hello!" } 

So that explains the curly braces.

$1 is the substring matched by the first capture group of the regular expression. [0, 1] is a call to the String method "[]", which returns a substring given a start index and a length; here, that is the first character.
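
As a quick illustration of the two-argument form of "[]" (start index, then length), here is a small sketch; the variable names are made up for the example:

```ruby
word = "two"

first_char = word[0, 1]   # start at index 0, take 1 character
first_two  = word[0, 2]   # start at index 0, take 2 characters
tail       = word[1, 2]   # start at index 1, take 2 characters

puts first_char   # t
puts first_two    # tw
puts tail         # wo

# After a successful match, $1 is an ordinary String, so the same call works on it.
"one two-three" =~ /(t\w*)/
captured = $1
puts captured         # two
puts captured[0, 1]   # t
```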

Putting it together, { $1[0, 1] } is a block that returns the first character of $1, where $1 is the substring matched by the first capture group the last time a regular expression successfully matched a string.

Why is the return value "one t-three"? The String#sub ("substitute") method, unlike its sibling String#gsub ("global substitute"), replaces only the first part of the string that matches the regular expression. So this call replaces the first substring matching (t\w*) with the value of the block described above, i.e. with its first character. Since "two" is the first substring matching (t\w*) (a "t" followed by any number of word characters), it is replaced by its first character, "t".


Source: https://habr.com/ru/post/1334664/

