Simplify a regular expression that repeats a common sub-expression with different prefixes and suffixes

I have the following regular expression with 3 alternations (see the whole regex below), each with its own prefix and suffix. I feel the sub-expression is repeated excessively and would like to simplify it, if possible. I am matching values in a malformed JSON string so that values without a key can be given indexed keys.

Each alternation must pair one prefix and one suffix with the sub-expression. I currently have 3 pairs, but that could change. With a few more pairs, the whole regex would become a nightmare to read and to change whenever I need to modify the repeated sub-expression.

Question

How can I shorten the whole regex below so that the sub-expression does not have to be repeated for each of the listed prefix/suffix pairs?

Subexpression repeated in each alternation

("(?:[^\\"]+|\\.)*") 

Prefix / suffix pairs

  • { ,
  • , ,
  • , }

Whole regex

 /\{("(?:[^\\"]+|\\.)*")(?=,)|,("(?:[^\\"]+|\\.)*")(?=,)|,("(?:[^\\"]+|\\.)*")(?=\})/g 

Test lines

  • {"trailer":"","pallet":"A","date":"11-Dec-15","c","z","a"}
  • {"trailer":"","pallet":"A","a","date":"11-Dec-15"}
  • {"a","trailer":"","pallet":"A","date":"11-Dec-15"}
  • {"a","trailer":"","pallet":"A","date":"11-Dec-15","z\""}
  • {"trailer":"","pallet":"A","11-Dec-15"}
  • {"trailer\"","pallet":"A","11-Dec\"-15","z\""}

Live example

Please limit your answers to regular expressions rather than JSON validation methods; I am trying to get a better understanding of regular expressions, and this is just the example I am using.

1 answer

The regex can be simplified slightly:

 /\{("(?:[^\\"]+|\\.)*")(?=,)|,("(?:[^\\"]+|\\.)*")(?=,)|,("(?:[^\\"]+|\\.)*")(?=\})/g 

To:

 /{("(?:[^\\"]+|\\.)*")(?=,)|,("(?:[^\\"]+|\\.)*")(?=,)|,("(?:[^\\"]+|\\.)*")(?=})/g 

This removes the escaping of { and }, which is not required by the JavaScript regex engine.
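As a quick check, here is a small sketch that runs both spellings against the test lines from the question and compares the matches. The helper name allMatches is made up for this example. Note that this only holds without the u flag; in Unicode mode an unescaped { or } is a syntax error, so the escaped form is the more portable one.

```javascript
// The question's regex, with and without escaped braces.
const escaped = /\{("(?:[^\\"]+|\\.)*")(?=,)|,("(?:[^\\"]+|\\.)*")(?=,)|,("(?:[^\\"]+|\\.)*")(?=\})/g;
const unescaped = /{("(?:[^\\"]+|\\.)*")(?=,)|,("(?:[^\\"]+|\\.)*")(?=,)|,("(?:[^\\"]+|\\.)*")(?=})/g;

// Test lines from the question; String.raw keeps the backslashes literal.
const testLines = [
  String.raw`{"trailer":"","pallet":"A","date":"11-Dec-15","c","z","a"}`,
  String.raw`{"trailer":"","pallet":"A","a","date":"11-Dec-15"}`,
  String.raw`{"a","trailer":"","pallet":"A","date":"11-Dec-15"}`,
  String.raw`{"a","trailer":"","pallet":"A","date":"11-Dec-15","z\""}`,
  String.raw`{"trailer":"","pallet":"A","11-Dec-15"}`,
  String.raw`{"trailer\"","pallet":"A","11-Dec\"-15","z\""}`,
];

// Hypothetical helper: collect the full text of every match on one line.
function allMatches(re, line) {
  return Array.from(line.matchAll(re), (m) => m[0]);
}

// Both spellings should find exactly the same matches on every line.
const sameResults = testLines.every(
  (line) =>
    JSON.stringify(allMatches(escaped, line)) ===
    JSON.stringify(allMatches(unescaped, line))
);

console.log(sameResults); // true
console.log(allMatches(unescaped, testLines[0])); // [ ',"c"', ',"z"', ',"a"' ]
```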

However, it is not possible to avoid explicitly repeating the pattern ("(?:[^\\"]+|\\.)*") in a JavaScript regex itself.

JavaScript does not support all of the regular expression features that PCRE supports (PHP, C++, Perl, etc.).

For example, in PHP / C++ you can do this:

 {("(?:[^\\"]+|\\.)*")(?=,)|,((?1))(?=,)|,((?1))(?=}) 

In Perl 5.22, you need to escape the { again, so it looks something like this:

 m/\{("(?:[^\\"]+|\\.)*")(?=,)|,((?1))(?=,)|,((?1))(?=})/g 

Here (?1) is a subroutine call that matches the regular expression inside capture group 1, which in this case is ("(?:[^\\"]+|\\.)*") .
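JavaScript has no (?1), but if the goal is only to write the sub-expression once, one common workaround is to assemble the pattern from strings with new RegExp. This is a sketch rather than a pure-regex answer: the names quoted, pairs and keyless are made up here, and the "0", "1", ... key format in the replacement is only my guess at what "indexed keys" means in the question.

```javascript
// The shared sub-expression, written once. String.raw keeps the
// backslashes exactly as they would appear in a regex literal.
const quoted = String.raw`("(?:[^\\"]+|\\.)*")`;

// Prefix / suffix pairs from the question; add more rows as needed.
const pairs = [
  [String.raw`\{`, ","],
  [",", ","],
  [",", String.raw`\}`],
];

// Each pair becomes: prefix + sub-expression + lookahead for the suffix.
const source = pairs
  .map(([prefix, suffix]) => `${prefix}${quoted}(?=${suffix})`)
  .join("|");
const keyless = new RegExp(source, "g");

// Assumed replacement scheme: give each keyless value a numeric key.
const line = '{"trailer":"","pallet":"A","date":"11-Dec-15","c","z","a"}';
let index = 0;
const fixed = line.replace(keyless, (match, g1, g2, g3) => {
  const value = g1 ?? g2 ?? g3; // only one group is set per alternation
  const prefix = match[0];      // the "{" or "," that was consumed
  return `${prefix}"${index++}":${value}`;
});

console.log(fixed);
// {"trailer":"","pallet":"A","date":"11-Dec-15","0":"c","1":"z","2":"a"}
```

The generated keyless.source is character-for-character the original pattern, so the engine still sees three copies of the sub-expression; only the source code stops repeating it. Changing the sub-expression or adding a pair is then a one-line edit.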


Source: https://habr.com/ru/post/1237926/

