Split CSV and add headers and index column with awk

Question

I am trying to split a large CSV into smaller date-based files using awk. I have a command that does the split, but it fails with the error "too many open files". I read that I should close each file, but the command as written closes the file after every line, so only one line ends up in each file.

awk -F' ' '{close($1".csv")}{print > ($1".csv")}' 2015full.csv

In addition, I would like to add a header row and an index column to each split file. My data looks like this:

2015full.csv

2015-12-24 18:20:57 -87.2788204 36.5984675 0
2015-12-24 18:20:42 -87.2784049 36.597298699999996 0
2015-12-24 18:20:26 -87.274402 36.5932405 0
2015-12-23 18:20:10 -87.25762519999999 36.572330400000006 0
2015-12-23 18:19:40 -87.25762519999999 36.572330400000006 0
2015-12-23 18:19:21 -87.25762519999999 36.572330400000006 0

And I'm trying to get:

2015-12-24.csv

num date time lon lat
1 2015-12-24 18:20:57 -87.2788204 36.5984675
2 2015-12-24 18:20:42 -87.2784049 36.597298699999996
3 2015-12-24 18:20:26 -87.274402 36.5932405

2015-12-23.csv

num date time lon lat
1 2015-12-23 18:20:10 -87.25762519999999 36.572330400000006
2 2015-12-23 18:19:40 -87.25762519999999 36.572330400000006
3 2015-12-23 18:19:21 -87.25762519999999 36.572330400000006

I have tried the following:

awk -F' ' 'NR==1{print "num", $0; "date", $1; "time", $2; "lon", $3; "lat", $4; next}{print (NR-1), $0}{close($1".csv")}{print > ($1".csv")}' 2015full.csv

but the pieces are not in an order that produces a working command for my purposes. Does anyone have a suggestion? Thanks!

bash awk csv

Luteser Dec 28 '17 at 23:33

3 answers

awk '
    BEGIN { hdr = "num" OFS "date" OFS "time" OFS "lon" OFS "lat" }
    $1!=prev { close(out); out=$1".csv"; print hdr > out; idx=0; prev=$1 }
    { print ++idx, $0 > out }
' 2015full.csv
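
The command above prints $0, so the trailing 0 column from the input is carried into each output file, while the desired output in the question shows only four data columns. A minimal sketch of a variation that prints the first four fields explicitly, assuming the input is already grouped by date as in the sample. The scratch directory created with mktemp is only there to make the example self-contained and is not part of the original answer:

```shell
# Work in a scratch directory so no existing files are overwritten (assumption).
cd "$(mktemp -d)" || exit 1

# Recreate the sample rows from the question.
cat > 2015full.csv <<'EOF'
2015-12-24 18:20:57 -87.2788204 36.5984675 0
2015-12-24 18:20:42 -87.2784049 36.597298699999996 0
2015-12-24 18:20:26 -87.274402 36.5932405 0
2015-12-23 18:20:10 -87.25762519999999 36.572330400000006 0
2015-12-23 18:19:40 -87.25762519999999 36.572330400000006 0
2015-12-23 18:19:21 -87.25762519999999 36.572330400000006 0
EOF

awk '
    BEGIN { hdr = "num date time lon lat" }
    # On a new date: close the previous file and start the next one with a header.
    $1 != prev {
        if (out != "") close(out)
        out = $1 ".csv"
        print hdr > out
        idx = 0
        prev = $1
    }
    # Write the row index plus the first four fields, leaving out the trailing 0.
    { print ++idx, $1, $2, $3, $4 > out }
' 2015full.csv

cat 2015-12-24.csv
```

Only one output file is open at any time, which is what avoids the "too many open files" error.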

Ed Morton Dec 29 '17 at 0:27

If awk is not required, a shell loop also works:

for i in $(cut -d ' ' -f1 2015full.csv|uniq);do grep -w "$i" 2015full.csv|nl -w1 -s ' ' |sed "1i num date time lon lat" >"$i.csv"; done

once Dec 29 '17 at 2:10

RavinderSingh13 · Accepted Answer · 2017-12-28T23:36:48+0000

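The following closes an output file only when the value of $1 changes, instead of after every line, so each $1 valued .csv file stays open while its rows are being written (this assumes Input_file is sorted by the first field; if it is not, sort it on the 1st field and pipe the result to awk).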

awk -F' ' 'prev!=$1{close(prev".csv")}{print > ($1".csv");prev=$1}' 2015full.csv
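
To see why the command in the question leaves a single line per file: once a file has been closed, the next `print >` to the same name reopens it and truncates it. A small sketch on made-up data (the file names in.txt, a.every and a.change are hypothetical and chosen only for this illustration):

```shell
# Work in a scratch directory (assumption, for a self-contained demo).
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'a 1' 'a 2' 'a 3' > in.txt

# Close before every print: each reopen with ">" truncates the file,
# so only the last row survives.
awk '{ close($1 ".every"); print > ($1 ".every") }' in.txt

# Close only when the key changes: the file stays open across the group.
awk 'prev != $1 { if (prev != "") close(prev ".change") }
     { print > ($1 ".change"); prev = $1 }' in.txt

cat a.every    # prints: a 3
cat a.change   # prints all three rows
```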

EDIT: To also write a header line each time $1 changes, use the following.

awk -F' ' 'prev!=$1{close(prev".csv");print "num date time lon lat" > ($1".csv")}{print > ($1".csv");prev=$1}' 2015full.csv
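
If the input is not grouped by date, a date that reappears later would reopen its file with `>` and overwrite the earlier rows. A hedged sketch that sorts on the first field before splitting and also adds the index column. It assumes a sort that supports the stable option -s (GNU and BSD sort do), so rows keep their original order within each date; the file name unsorted.csv is hypothetical:

```shell
# Work in a scratch directory (assumption, for a self-contained demo).
cd "$(mktemp -d)" || exit 1

# Sample rows from the question, deliberately interleaved by date.
cat > unsorted.csv <<'EOF'
2015-12-24 18:20:57 -87.2788204 36.5984675 0
2015-12-23 18:20:10 -87.25762519999999 36.572330400000006 0
2015-12-24 18:20:42 -87.2784049 36.597298699999996 0
2015-12-23 18:19:40 -87.25762519999999 36.572330400000006 0
EOF

# Stable sort on the date field only, then split as before.
sort -s -k1,1 unsorted.csv |
awk '
    prev != $1 {
        if (prev != "") close(prev ".csv")           # done with the previous date
        print "num date time lon lat" > ($1 ".csv")  # header for the new file
        n = 0                                        # restart the row index
    }
    { print ++n, $0 > ($1 ".csv"); prev = $1 }
'

cat 2015-12-23.csv
```

Like the command above, this prints $0, so the trailing 0 column is kept in the output.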
