How to parse contents with a specific column in a csv file in bash

Question

I am trying to parse a csv file line by line and its format looks something like this:

"name","content1,with commas as you see", "content2, also may contain commas", "..." ... ...

I want to get the content of a specific column without the quotes, for example the 1st and 3rd columns. The expected output would be:

 name (if getting column 1)
 content2, also may contain commas (if getting column 3)

I tried to use awk, but that didn't work. I also tried:

 while IFS=, read col1 col2 col3 col4; do echo "got ${col1}|${col3}"; done < file

But the output still contains the quotation marks, and the content of col3 is wrong because read also splits on the commas inside the quoted fields. How do I split a format like this, where the fields themselves contain commas?

+4

bash csv multiple-columns

Qingshan Zhang Jun 11 '13 at 13:24

2 answers

If you have GNU awk, then FPAT will come to your aid.

 gawk '{print $1,$3}' FPAT="([^,]+)|(\"[^\"]+\")" my.csv

In awk we usually set FS, which defines fields by what they are not: it describes the separator, and everything between separators is a field. In this particular case we want to define the fields by what they are, and FPAT allows us to do just that.
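The command above prints the fields with their quotes still attached, and the sample line in the question has a space after some commas. A minimal sketch of one way to cope with both, assuming gawk 4.0 or later is installed: `get_col` and `csv_line` are hypothetical names used only for illustration, and the pattern does not handle empty unquoted fields or doubled `""` quotes inside a field.

```shell
# Sample line copied from the question.
csv_line='"name","content1,with commas as you see", "content2, also may contain commas", "..."'

# get_col N: read CSV lines on stdin and print field N without its quotes.
# Hypothetical helper; FPAT requires GNU awk (gawk) 4.0 or later.
get_col() {
  gawk -v col="$1" -v FPAT='( *"[^"]*")|([^,]+)' '{
    f = $col
    sub(/^ *"/, "", f)   # drop leading spaces and the opening quote
    sub(/" *$/, "", f)   # drop the closing quote and trailing spaces
    print f
  }'
}
```

It can be called as `printf '%s\n' "$csv_line" | get_col 3` or as `get_col 3 < my.csv`. The quoted alternative in FPAT is tried alongside the unquoted one and the longer match is taken, which is what keeps the embedded commas inside a single field.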

+3

jaypal singh Jun 11 '13 at 13:39

l0b0 · Accepted Answer · 2013-06-11T14:11:19+0000

Because of complexities like these, it is probably a lot easier to use an actual CSV parser such as csvtool:

 $ csvtool col 3 - <<< '"name","content1,with commas as you see", "content2, also may contain commas", "..."'
 "content2, also may contain commas"
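If neither csvtool nor gawk can be installed, the splitting can also be done in bash itself. The following is only a rough sketch, not a full CSV parser: `split_csv_line` and the `fields` array are hypothetical names, it assumes one record per line, and it does not support doubled `""` quotes or newlines inside a field.

```shell
#!/usr/bin/env bash
# split_csv_line LINE: split LINE on commas that are outside double quotes.
# The unquoted values are stored in the global array `fields` (hypothetical name).
split_csv_line() {
  local line=$1 field='' in_quotes=0 i ch
  fields=()
  for ((i = 0; i < ${#line}; i++)); do
    ch=${line:i:1}
    if [[ $ch == '"' ]]; then
      in_quotes=$((1 - in_quotes))   # toggle quoted state and drop the quote
    elif [[ $ch == ',' && $in_quotes -eq 0 ]]; then
      fields+=("$field")             # unquoted comma ends the current field
      field=''
    elif [[ $ch == ' ' && $in_quotes -eq 0 && -z $field ]]; then
      :                              # skip spaces before an opening quote
    else
      field+=$ch
    fi
  done
  fields+=("$field")                 # last field has no trailing comma
}

sample='"name","content1,with commas as you see", "content2, also may contain commas", "..."'
split_csv_line "$sample"
echo "got ${fields[0]}|${fields[2]}"
```

To process a whole file, the call can go inside the loop from the question: `while IFS= read -r line; do split_csv_line "$line"; ...; done < file`. Walking the line character by character is slow for large files, so a real CSV tool remains the better choice there.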
