Conditional regexp: return only one group

I want to map two types of URLs:

 (1) www.test.de/type1/12345/this-is-a-title.html
 (2) www.test.de/category/another-title-oh-yes.html

In the first type I want to match "12345". In the second type, I want to match the combined path "category/another-title-oh-yes".

Here is what I came up with:

 (?:(?:\.de\/type1\/([\d]*)\/)|\.de\/([\S]+)\.html) 

This returns the following:

For type (1):

 Match group 1: 12345
 Match group 2:

For type (2):

 Match group 1:
 Match group 2: category/another-title-oh-yes

As you can see, it is already working pretty well. For various reasons, I need the regex to return only one match group. Is there any way to achieve this?

2 answers

Java / PHP / Python

Get both values in a single group at index 1 by combining a negative lookahead with positive lookbehinds.

 ((?<=\.de\/type1\/)\d+|(?<=\.de\/)(?!type1)[^\.]+) 

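The pattern consists of two alternatives joined by | (OR).

The first alternative matches the digits that follow .de/type1/ (here 12345).

The second alternative matches everything after .de/ up to the first dot (here category/another-title-oh-yes), unless the path starts with type1.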


Note:

  • Each alternative must match exactly once in its type of URL
  • Wrap the whole pattern in one pair of parentheses (...|...) and drop the parentheses around [^\.]+ and \d+, so that group 1 is the only capturing group, where:

     [^\.]+  matches one or more characters up to the next dot
     \d+     matches one or more digits

Here's an online demo on regex101.


Input:

 www.test.de/type1/12345/this-is-a-title.html
 www.test.de/category/another-title-oh-yes.html

Output:

 MATCH 1
 1. [18-23] `12345`
 MATCH 2
 1. [57-86] `category/another-title-oh-yes`
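For readers who want to try the single-group pattern from code, here is a minimal sketch in JavaScript. It assumes an engine with ES2018 lookbehind support (for example a current Node.js); the variable names are illustrative and not part of the answer.

```javascript
// Sketch only: lookbehind requires an ES2018+ engine.
// Same pattern as above, written as a regex literal.
const lookbehindPattern = /((?<=\.de\/type1\/)\d+|(?<=\.de\/)(?!type1)[^.]+)/;

const sampleUrls = [
  "www.test.de/type1/12345/this-is-a-title.html",
  "www.test.de/category/another-title-oh-yes.html",
];

// Group 1 is the only capturing group, so it holds the result for both URL types.
const groupOneValues = sampleUrls.map((url) => {
  const m = url.match(lookbehindPattern);
  return m ? m[1] : null; // null when the URL has neither form
});

console.log(groupOneValues);
```

Because the lookbehinds consume nothing, the overall match and group 1 are identical here, so `m[0]` would work equally well.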

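JavaScript

JavaScript engines older than ES2018 have no lookbehind, so here the prefix has to be consumed by the match. Each pair of capturing parentheses gets its own number, so the value lands in group 2 for type (1) URLs and in group 3 for type (2) URLs; read whichever of the two is set.

 ((?:\.de\/type1\/)(\d+)|(?:\.de\/)(?!type1)([^\.]+)) 

Here's an online demo on regex101.

Input:

 www.test.de/type1/12345/this-is-a-title.html
 www.test.de/category/another-title-oh-yes.html

Output:

 MATCH 1
 1. `.de/type1/12345`
 2. `12345`
 MATCH 2
 1. `.de/category/another-title-oh-yes`
 3. `category/another-title-oh-yes`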
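Since the two alternatives in this pattern have separate capturing groups (2 and 3), calling code has to pick whichever one took part in the match. A rough sketch, assuming the `g` flag is added so that `matchAll` can walk a multi-line input; the variable names are made up for illustration.

```javascript
// The pattern from this answer, with the g flag added for matchAll.
const twoGroupPattern = /((?:\.de\/type1\/)(\d+)|(?:\.de\/)(?!type1)([^\.]+))/g;

const multiLineInput = [
  "www.test.de/type1/12345/this-is-a-title.html",
  "www.test.de/category/another-title-oh-yes.html",
].join("\n");

// For every match, group 2 (digits) or group 3 (path) is set, never both.
// The unset group is undefined, so || falls through to the other one.
const pickedValues = Array.from(
  multiLineInput.matchAll(twoGroupPattern),
  (m) => m[2] || m[3]
);

console.log(pickedValues);
```

This does not give the regex itself a single group; it only hides the two groups behind one expression on the caller's side.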

Perhaps this:

 ^www\.test\.de\/(type1\/(\d+)\/.*\.html|(.*)\.html)$

Debuggex demo

Then, for example:

 var str = "www.test.de/type1/12345/this-is-a-title.html";
 // Slashes are escaped so they do not end the regex literal.
 var regex = /^www\.test\.de\/(type1\/(\d+)\/.*\.html|(.*)\.html)$/;
 console.log(str.match(regex));

This prints an array: the first element is the whole string, the second is everything after the website address, the third is the type 1 match (the digits), and the fourth is the match for any other URL, without the .html extension. Only one of the third and fourth elements is set; the other is undefined.

You can then do something like:

 var matches = str.match(regex);
 return matches[2] || matches[3];
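One thing the snippet above leaves open is that `str.match` returns `null` when the URL fits neither form, so indexing the result directly would throw. A minimal sketch of a guard, assuming a hypothetical helper name `extractKey` and a variant of the pattern that requires digits after type1/:

```javascript
// Variant of the pattern: type1 URLs must have digits right after "type1/".
const anchoredPattern = /^www\.test\.de\/(type1\/(\d+)\/.*\.html|(.*)\.html)$/;

// extractKey is a hypothetical name used only for this sketch.
function extractKey(url) {
  const matches = anchoredPattern.exec(url);
  if (matches === null) {
    return null; // URL is not on www.test.de or does not end in .html
  }
  // Group 2 holds the digits for type1 URLs, group 3 the path otherwise.
  return matches[2] || matches[3];
}

console.log(extractKey("www.test.de/type1/12345/this-is-a-title.html"));
console.log(extractKey("www.test.de/category/another-title-oh-yes.html"));
console.log(extractKey("www.example.org/anything.html"));
```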


Source: https://habr.com/ru/post/971876/

