I have been happily using gawk with FPAT. Here is the script I use for my examples:
#!/usr/bin/gawk -f
BEGIN {
    FPAT="([^,]*)|(\"[^\"]+\")"
}
{
    for (i=1; i<=NF; i++) {
        printf "Record #%s, field #%s: %s\n", NR, i, $i
    }
}
Simple, without quotes
It works well.
$ echo 'a,b,c,d' | ./test.awk Record
With quotes
It works well.
$ echo '"a","b",c,d' | ./test.awk Record
With empty columns and quotation marks
It works well.
$ echo '"a","b",,d' | ./test.awk Record
With escaped quotation marks, empty columns and quotation marks
It works well.
$ echo '"""a"": aaa","b",,d' | ./test.awk Record
With a column containing escaped quotes and ending with a comma
Fails.
$ echo '"""a"": aaa,","b",,d' | ./test.awk Record
Expected Result:
$ echo '"""a"": aaa,","b",,d' | ./test_that_would_be_working.awk Record
Is there a regex for FPAT that will make this work, or is it just not supported by awk?
The pattern I need is "a quote, followed by anything except a single, unescaped quote". But a bracket expression matches one character at a time, so [^"] cannot treat an escaped "" as a unit.
I think there might be a lookaround option, but I'm not good enough to make it work.