How to get the index of a subpattern (capture group) in a JavaScript regexp?

I wrote a regular expression in JavaScript to search for searchedUrl in a string:

    var input = '1234 url( test ) 5678';
    var searchedUrl = 'test';
    var regexpStr = "url\\(\\s*" + searchedUrl + "\\s*\\)";
    var regex = new RegExp(regexpStr, 'i');
    var match = input.match(regex);
    console.log(match); // logs the match array

Output:

 ["url( test )", index: 5, input: "1234 url( test ) 5678"] 

Now I would like to get the position of searchedUrl (in the above example, the position of test in 1234 url( test ) 5678).

How can I do this?

+4
4 answers

As far as I can tell, it is not possible to get the offset of a sub-match automatically; you have to do the calculation yourself, using either the lastIndex property of the RegExp or the index property of the match object returned by exec(). Depending on which one you use, you will have to either add or subtract the lengths of the groups leading up to your sub-match. This does mean you have to group the first or last part of the regular expression, right up to the pattern you want to locate.

lastIndex comes into play when the global flag g is used, and it records the index just after the whole match. Therefore, if you want to use lastIndex, you will need to work backward from the end of your pattern.
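As a rough sketch of the lastIndex approach, the function below groups the part of the pattern that follows the searched text and subtracts backward from lastIndex. The name offsetFromLastIndex is made up for this illustration, and the sketch assumes searchedUrl is plain text with no regular expression metacharacters, so that the matched text has the same length as searchedUrl.

```javascript
// Hypothetical helper: locate searchedUrl by working backward from lastIndex.
// Assumes searchedUrl contains no regexp metacharacters.
function offsetFromLastIndex(input, searchedUrl) {
  // Group the part of the pattern that FOLLOWS the text we want to locate.
  // The g flag is needed so that exec() updates lastIndex.
  var regex = new RegExp('url\\(\\s*' + searchedUrl + '(\\s*\\))', 'gi');
  var match = regex.exec(input);
  if (match === null) {
    return -1; // no match found
  }
  // lastIndex points just past the whole match; step back over the
  // trailing group and then over the searched text itself.
  return regex.lastIndex - match[1].length - searchedUrl.length;
}

console.log(offsetFromLastIndex('1234 url( test ) 5678', 'test')); // 10
```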

For more information on the exec() method, see here:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/exec

The following briefly shows the solution in action:

    var str = '---hello123';
    var r = /([a-z]+)([0-9]+)/;
    var m = r.exec(str);
    alert(m.index + m[1].length); // will give the position of 123

Update

This can be applied to your problem as follows:

    var input = '1234 url( test ) 5678';
    var searchedUrl = 'test';
    var regexpStr = "(url\\(\\s*)(" + searchedUrl + ")\\s*\\)";
    var regex = new RegExp(regexpStr, 'i');
    var match = regex.exec(input);

Then, to get the offset of the sub-match, you can use:

 match.index + match[1].length 

match[1] now contains url( plus the whitespace that follows it, thanks to the bracket grouping, which is what lets us calculate the internal offset.

Update 2

Obviously, things are slightly more complicated if your RegExp has several patterns that you want to group before the actual pattern you want to locate. It is then just a matter of adding up the length of each group.

    var s = '~- [This may or may not be random|it depends on your perspective] -~';
    var r = /(\[)([a-z ]+)(\|)([a-z ]+)(\])/i;
    var m = r.exec(s);

To get the offset of it depends on your perspective, you would use:

 m.index + m[1].length + m[2].length + m[3].length; 

Obviously, if you know that there are parts of the RegExp that never change length, you can replace them with hard-coded numeric values. However, it is probably best to keep the .length checks above, in case you (or someone else) ever change what your expression matches.
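The summing can also be wrapped in a small helper. This is only a sketch: groupOffset is a made-up name, and it assumes the groups are not nested and that groups 1 to n-1 together cover everything between the start of the match and group n.

```javascript
// Hypothetical helper: offset of capture group `groupNumber` within the input,
// assuming groups 1..groupNumber-1 are adjacent, non-nested, and start at
// the beginning of the match.
function groupOffset(match, groupNumber) {
  var offset = match.index;
  for (var i = 1; i < groupNumber; i++) {
    // A group that did not participate in the match is undefined; count it as empty.
    offset += (match[i] || '').length;
  }
  return offset;
}

var s = '~- [This may or may not be random|it depends on your perspective] -~';
var r = /(\[)([a-z ]+)(\|)([a-z ]+)(\])/i;
var m = r.exec(s);
console.log(groupOffset(m, 4)); // 34
```

If the pattern has ungrouped text between the groups, the result will be too small, so every part in front of the target needs its own group.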

+2

JS has no direct way to get the subpattern / capture group index. But you can get around this with some tricks. For instance:

    var reStr = "(url\\(\\s*)" + searchedUrl + "\\s*\\)";
    var re = new RegExp(reStr, 'i');
    var m = re.exec(input);
    if (m) {
        var index = m.index + m[1].length;
        console.log("url found at " + index);
    }
+2

You do not need an index.

This is a case where providing a little extra information would get a much better answer. I can't blame you for this; we are encouraged to create simple test cases and cut out irrelevant parts.

But one important element was missing: what are you planning to do with this index? In the meantime, we were all chasing the wrong problem. :-)

I had the feeling that something was missing; that is why I asked you about it.

As you mentioned in the comment, you want to find the URL in the input string and highlight it somehow, perhaps by wrapping it in a <b></b> tag or the like:

 '1234 url( <b>test</b> ) 5678' 

(Let me know if you meant something else by "highlight".)

You can use character indexes for this, but there is a much simpler way that uses the regular expression itself.

Index retrieval

But since you asked: if you do need the index, you can get it with code like this:

    var input = '1234 url( test ) 5678';
    var url = 'test';
    var regexpStr = "^(.*url\\(\\s*)" + url + "\\s*\\)";
    var regex = new RegExp(regexpStr, 'i');
    var match = input.match(regex);
    var start = match[1].length;

This is a bit simpler than the code in the other answers, but any of them will work equally well. This approach works by anchoring the regular expression to the beginning of the string with ^ and capturing all the characters in front of the URL in a group with (). The length of that group, match[1].length, is your index.

Slicing and dicing

Once you know the starting index of test in your string, you can use .slice() or other string methods to cut up the string and insert the tags, perhaps with code like this:

    // Wrap url in <b></b> tag by slicing and pasting strings
    var output =
        input.slice(0, start) +
        '<b>' + url + '</b>' +
        input.slice(start + url.length);
    console.log(output);

That will certainly work, but it is really doing things the hard way.

Also, I left out some error handling code. What if there is no matching URL? match will be null, and match[1] will throw an error. But instead of worrying about that, let's see how we can do this without any character indexing.
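For completeness, the missing check could be sketched roughly as follows. The function name highlightBySlicing is invented for this illustration, and the URL is captured in a second group (an addition to the code above) so that the original casing of the matched text is preserved under the i flag.

```javascript
// Hypothetical helper: index-based highlighting with a guard for "no match".
function highlightBySlicing(input, url) {
  var regex = new RegExp('^(.*url\\(\\s*)(' + url + ')\\s*\\)', 'i');
  var match = input.match(regex);
  if (match === null) {
    return input; // nothing to highlight, return the string unchanged
  }
  var start = match[1].length;        // index where the URL begins
  var end = start + match[2].length;  // index just past the URL
  return input.slice(0, start) + '<b>' + match[2] + '</b>' + input.slice(end);
}

console.log(highlightBySlicing('1234 url( test ) 5678', 'test'));
// 1234 url( <b>test</b> ) 5678
```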

Easy way

Let the regular expression do the work for you. Here is the whole thing:

    var input = '1234 url( test ) 5678';
    var url = 'test';
    var regexpStr = "(url\\(\\s*)(" + url + ")(\\s*\\))";
    var regex = new RegExp(regexpStr, 'i');
    var output = input.replace(regex, "$1<b>$2</b>$3");
    console.log(output);

This code has three groups in the regular expression: one to capture the URL itself, with groups before and after the URL to capture the other matching text so that we don't lose it. Then a simple .replace() and you're done!

You do not need to worry about any string lengths or indexes. And the code works cleanly if the URL is not found: it returns the input string unchanged.
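One caveat applies to every snippet in this thread: the searched URL is pasted into the pattern as-is, so characters such as . or ? in a real URL would be treated as regular expression syntax. A possible sketch of escaping them first is shown below; escapeRegExp and highlightUrl are names made up for this example, not built-in functions.

```javascript
// Hypothetical helper: backslash-escape characters that are special in a regexp.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Same .replace() approach as above, with the URL escaped before it is
// inserted into the pattern.
function highlightUrl(input, url) {
  var regexpStr = '(url\\(\\s*)(' + escapeRegExp(url) + ')(\\s*\\))';
  var regex = new RegExp(regexpStr, 'i');
  return input.replace(regex, '$1<b>$2</b>$3');
}

console.log(highlightUrl('url( a.png )', 'a.png')); // url( <b>a.png</b> )
console.log(highlightUrl('url( axpng )', 'a.png')); // url( axpng )
```

Without the escaping, the second call would also be highlighted, because an unescaped dot matches any character.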

+1

You should use .exec(); there is excellent documentation on sub-pattern matching on the MDN website.

-1

Source: https://habr.com/ru/post/1485577/

