Regex lookahead with capture groups and if / then

I have some broken JSON files that I want to fix. The problem is that one of the fields, AcquisitionDateTime, is malformed because its value is not quoted:

{ "AcquisitionDateTime": 2016-04-28T17:09:39.515625, } 

What I want to do is wrap the value in quotation marks. I can do this with a regex:

 perl -pi -e 's/\"AcqDateTime\": (.*),/\"AcqDateTime\": \"\1\",/g' t.json 

Now I want to extend the regex so that if the JSON is not broken (the value is already quoted), the value does not get wrapped in "" a second time. The problem I am facing is that I do not know how to combine lookahead, if / then conditionals, and capture groups. Here is my attempt:

 # Lookahead: if you find a ", then capture what is between the quotes. Else capture everything.
 perl -pi -e 's/\"AcqDateTime\": (?(?=\")\"(.*)\"|(.*)),/\"AcqDateTime:\" \"\1\",/g' t.json 

This is the part I'm interested in fixing:

 Lookahead for a \" -> if yes, then capture without it.   \"(.*)\"
 Else capture all                                         (.*)
 (?(?=\")\"(.*)\"|(.*)), 

Will someone explain to me what I'm doing wrong?

Thanks in advance.

2 answers

A good start for matching the timestamp would be

 \S+ 

But that also matches the trailing comma, so we switch to

  [^\s,]+ 

You also want to avoid matching quotes, so that an already quoted value is left alone:

  [^\s",]+ 

That is all you need.

 perl -i -pe's/"AcqDateTime":\s*+\K([^\s",]+)/"$1"/g' t.json 

The following expression also matches values that are only partially wrapped in quotation marks (a quote only at the beginning or only at the end), values that are not wrapped at all, and empty values:

 perl -pi -e 's/\"AcqDateTime\": (|(?<!\")[^\"].*|.*[^\"](?!\")),/\"AcqDateTime\": \"\1\",/g' t.json 

where (|(?<!\")[^\"].*|.*[^\"](?!\")) matches one of:

  • an empty value, as in { "AcquisitionDateTime": }, or
  • (?<!\")[^\"].* : a value that does not start with a quote, as in { "AcquisitionDateTime": 2016" }, or
  • .*[^\"](?!\") : a value that does not end with a quote, as in { "AcquisitionDateTime": "2016 }.

Source: https://habr.com/ru/post/1013394/

