How do you parse comma values ​​(csv) with awk?

I am trying to write an awk script to convert a formatted CSV table to XML for Bugzilla errors. The input CSV format is as follows (created from an XLS spreadsheet and saved as CSV):

tag_1,tag_2,...,tag_N
value1_1,value1_2,...,value1_N
value2_1,value2_2,...,value2_N
valueM_1,valueM_2,...,valueM_N

The header column shows the name of the XML tag. The above file converted to XML should look like this:

<element>
    <tag_1>value1_1</tag_1>
    <tag_2>value1_2</tag_2>
    ...
    <tag_N>value1_N</tag_N>
</element>
<element>
    <tag_1>value2_1</tag_1>
    <tag_2>value2_2</tag_2>
    ...
    <tag_N>value2_N</tag_N>
</element>
...

awk script I have to do the following:

BEGIN {OFS = "\n"}
NR == 1 {for (i = 1; i <=NF; i++)
            tag[i]=$i
         print "<bugzilla version=\"3.4.1\" urlbase=\"http://mozilla.com/\" maintainer=\"somebody@mozilla.com\" exporter=\"somebody.else@mozilla.com\">"}
NR != 1 {print "   <bug>"
         for (i = 1; i <= NF; i++)
            print "      <" tag[i] ">" $i "</" tag[i] ">"
         print "   </bug>"}
END {print "</bugzilla>"}

Actual CSV file:

cf_foo,cf_bar,short_desc,cf_zebra,cf_pizza,cf_dumpling ,assigned_to,bug_status,cf_word,cf_caslte
ABCD,A-BAR-0032,A NICE DESCRIPTION - help me,pretty,Pepperoni,,,NEW,,

Actual conclusion:

$ awk -f csvtobugs.awk bugs.csv

<bugzilla version="3.4.1" urlbase="http://mozilla.com/" maintainer="somebody@mozilla.com" exporter="somebody.else@mozilla.com">
   <bug>
      <cf_foo,cf_bar,short_desc,cf_zebra,cf_pizza,cf_dumpling>ABCD,A-BAR-0032,A</cf_foo,cf_bar,short_desc,cf_zebra,cf_pizza,cf_dumpling>
      <,assigned_to,bug_status,cf_word,cf_caslte>NICE</,assigned_to,bug_status,cf_word,cf_caslte>
      <>DESCRIPTION</>
      <>-</>
      <>help</>
      <>me,pretty,Pepperoni,,,NEW,,</>
   </bug>
   <bug>
   </bug>
</bugzilla>

It is clear that the intended result is not (I agree, I copied this script from this forum: http://www.unix.com/shell-programming-scripting/21404-csv-xml.html ). The problem is that it was SOOOOO, since I looked at awk scripts and I don't have IDEA, which means syntax.

+3
6

FS = "," BEGIN ; , , , , ( ) , "CSV", ; -).

+4

, :)

awk script , " CSV. ( , , , ), python, perl.Net .. CSV XML, , , , awk script, .

+1

, csv , :

1997,Ford,E350,"Super, luxurious truck"

", " , . csv libs "Mark" .

+1

, FS ( ):

BEGIN {
    FS=",";
    OFS = "\n"}
NR == 1 {for (i = 1; i <=NF; i++)
            tag[i]=$i
         print "<bugzilla version=\"3.4.1\" urlbase=\"http://mozilla.com/\" maintainer=\"somebody@mozilla.com\" exporter=\"somebody.else@mozilla.com\">"}
NR != 1 {print "   <bug>"
         for (i = 1; i <= NF; i++)
            print "      <" tag[i] ">" $i "</" tag[i] ">"
         print "   </bug>"}
END {print "</bugzilla>"}

:

<bugzilla version="3.4.1" urlbase="http://mozilla.com/" maintainer="somebody@mozilla.com" exporter="somebody.else@mozilla.com">
   <bug>
      <cf_foo>ABCD</cf_foo>
      <cf_bar>A-BAR-0032</cf_bar>
      <short_desc>A NICE DESCRIPTION - help me</short_desc>
      <cf_zebra>pretty</cf_zebra>
      <cf_pizza>Pepperoni</cf_pizza>
      <cf_dumpling ></cf_dumpling >
      <assigned_to></assigned_to>
      <bug_status>NEW</bug_status>
      <cf_word></cf_word>
      <cf_caslte></cf_caslte>
   </bug>
</bugzilla>
0

You can use various tricks, such as tuning FS. More tricks can be found in the Awk newsgroup. There are also parsers like mine: http://lorance.freeshell.org/csv/

0
source

Instead, you can try csvprintf . It can convert CSV to XML, which can then be used with XSLT as desired.

0
source

Source: https://habr.com/ru/post/1718047/


All Articles