How to split a CSV file into files with a specified number of lines?

I have a CSV file (about 10,000 rows, each with 300 columns) stored on a Linux server. I want to split it into 500 CSV files of 20 rows each. Each file should have the same CSV header as the original.

Is there a Linux command that can do this split?

+68
split linux unix csv
Dec 21 '13 at 16:31
4 answers

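Here it is wrapped in a function. You can call it as splitCsv <Filename> [chunkSize]:

 splitCsv() {
     local file=$1
     local chunk=${2:-1000}           # default chunk size: 1000 data rows
     local header
     header=$(head -n 1 "$file")      # save the header line
     # Split everything after the header into chunks of $chunk lines
     tail -n +2 "$file" | split -l "$chunk" - "${file}_split_"
     for part in "${file}_split_"*; do
         # Prepend the header to each chunk, going through a temporary file
         { printf '%s\n' "$header"; cat "$part"; } > "$part.tmp" && mv "$part.tmp" "$part"
     done
 }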

Found at: http://edmondscommerce.imtqy.com/linux/linux-split-file-eg-csv-and-keep-header-row.html

+65
Dec 21 '13 at 16:42

Use the Linux split command:

 split -l 20 file.txt new 

This splits the file "file.txt" into files whose names start with "new", each containing 20 lines of text.

Type man split at the Unix prompt for more information. However, you first need to remove the header from file.txt (for example, using the tail command), and then add it back to each of the split files.
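
The steps described above (take the header off, split the remaining rows, put the header back on each piece) could be sketched roughly as follows. The function name split_keep_header, its argument order, and the .csv extension on the output are illustrative assumptions, not part of the split command itself. The sketch also assumes no other files in the current directory start with the chosen prefix.

```shell
#!/bin/bash
# Hypothetical helper: split_keep_header <input> [lines] [prefix]
split_keep_header() {
    local input=$1 lines=${2:-20} prefix=${3:-new}
    local header
    header=$(head -n 1 "$input")              # 1. remember the header line
    # 2. split only the data rows (everything from line 2 onward)
    tail -n +2 "$input" | split -l "$lines" - "$prefix"
    # 3. write each piece back out with the header on top
    for piece in "$prefix"*; do
        { printf '%s\n' "$header"; cat "$piece"; } > "$piece.csv"
        rm "$piece"                            # drop the headerless piece
    done
}
```

Called as split_keep_header file.txt 20 new, this would leave files named newaa.csv, newab.csv, and so on, each starting with the original header.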

+113
Dec 21 '13 at 16:37

This should do it for you. Your files will end up named Part1 through Part500.

 #!/bin/bash
 FILENAME=10000.csv
 HDR=$(head -n 1 "$FILENAME")    # Pick up CSV header line to apply to each file
 # Split the data rows (everything after the header) into chunks of 20 lines each
 tail -n +2 "$FILENAME" | split -l 20 - xyz
 n=1
 for f in xyz*                   # Go through all newly created chunks
 do
     printf '%s\n' "$HDR" > "Part${n}"   # Write out header to new file called "Part(n)"
     cat "$f" >> "Part${n}"              # Add in the 20 lines from the "split" command
     rm "$f"                             # Remove temporary file
     ((n++))                             # Increment name of output part
 done
+13
Dec 21 '13 at 17:42

This should work.

file_name.csv = name of the file you want to split
10000 = number of lines in each split file
file_part_ = prefix for the names of the split files (file_part_00, file_part_01, file_part_02, and so on)

 split -d -l 10000 file_name.csv file_part_
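
If each numbered piece should also carry the header row, one possible variation is sketched below. It assumes GNU coreutils split, since --filter and --additional-suffix are GNU extensions. The function name split_numbered_with_header and the HEADER variable are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: split_numbered_with_header <input> [lines] [prefix]
# Assumes GNU split (uses --filter and --additional-suffix).
split_numbered_with_header() {
    local input=$1 lines=${2:-10000} prefix=${3:-file_part_}
    local header
    header=$(head -n 1 "$input")     # header line to repeat in every piece
    # Feed only the data rows to split. For each piece, split runs the
    # filter command with $FILE set to that piece's output file name.
    tail -n +2 "$input" |
        HEADER="$header" split -d -l "$lines" --additional-suffix=.csv \
            --filter='{ printf "%s\n" "$HEADER"; cat; } > "$FILE"' \
            - "$prefix"
}
```

Called as split_numbered_with_header file_name.csv 10000 file_part_, this would write file_part_00.csv, file_part_01.csv, and so on, each beginning with the header.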

+11
Jan 08 '18 at 16:47


