I would recommend using a tool like csvkit's csvjoin:
$ pip install csvkit
$ csvjoin --help
usage: csvjoin [-h] [-d DELIMITER] [-t] [-q QUOTECHAR] [-u {0,1,2,3}] [-b]
[-p ESCAPECHAR] [-z MAXFIELDSIZE] [-e ENCODING] [-S] [-v] [-l]
[--zero] [-c COLUMNS] [--outer] [--left] [--right]
[FILE [FILE ...]]
Execute a SQL-like join to merge CSV files on a specified column or columns.
positional arguments:
FILE The CSV files to operate on. If only one is specified,
it will be copied to STDOUT.
optional arguments:
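-h, --help  show this help message and exit
-d DELIMITER, --delimiter DELIMITER
Delimiting character of the input CSV file.
-t, --tabs  Specifies that the input CSV file is delimited with
tabs. Overrides "-d".
-q QUOTECHAR, --quotechar QUOTECHAR
Character used to quote strings in the input CSV file.
-u {0,1,2,3}, --quoting {0,1,2,3}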
Quoting style used in the input CSV file. 0 = Quote
Minimal, 1 = Quote All, 2 = Quote Non-numeric, 3 =
Quote None.
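-b, --doublequote  Whether or not double quotes are doubled in the input
CSV file.
-p ESCAPECHAR, --escapechar ESCAPECHAR
Character used to escape the delimiter if --quoting 3
("Quote None") is specified and to escape the
QUOTECHAR if --doublequote is not specified.
-z MAXFIELDSIZE, --maxfieldsize MAXFIELDSIZE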
Maximum length of a single field in the input CSV
file.
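-e ENCODING, --encoding ENCODING
Specify the encoding the input CSV file.
-S, --skipinitialspace
Ignore whitespace immediately following the delimiter.
-v, --verbose  Print detailed tracebacks when errors occur.
-l, --linenumbers  Insert a column of line numbers at the front of the
output. Useful when piping to grep or as a simple
primary key.
--zero  When interpreting or displaying column numbers, use
zero-based numbering instead of the default 1-based
numbering.
-c COLUMNS, --columns COLUMNS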
The column name(s) on which to join. Should be either
one name (or index) or a comma-separated list with one
name (or index) for each file, in the same order that
the files were specified. May also be left
unspecified, in which case the two files will be
joined sequentially without performing any matching.
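--outer  Perform a full outer join, rather than the default
inner join.
--left  Perform a left outer join, rather than the default
inner join. If more than two files are provided this
will be executed as a sequence of left outer joins,
starting at the left.
--right  Perform a right outer join, rather than the default
inner join. If more than two files are provided this
will be executed as a sequence of right outer joins,
starting at the right.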
Note that the join operation requires reading all files into memory. Don't try
this on very large files.
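
To make the default inner join concrete, here is a rough standard-library sketch of what a call like `csvjoin -c id people.csv cities.csv` computes. It is not csvkit's implementation. The function name `inner_join`, the sample data, and the file names are made up for illustration, and csvkit's actual column layout may differ. It also shows why the memory note applies: one whole input is indexed in memory before any output is written.

```python
import csv
import io


def inner_join(left_text, right_text, column):
    """Inner-join two CSV documents (given as strings) on a shared column name."""
    left = csv.DictReader(io.StringIO(left_text))
    right = csv.DictReader(io.StringIO(right_text))
    # Columns contributed by the right-hand file, minus the join key.
    extra = [name for name in right.fieldnames if name != column]

    # Index every right-hand row by its key. This holds the whole file in memory.
    index = {}
    for row in right:
        index.setdefault(row[column], []).append(row)

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(list(left.fieldnames) + extra)
    for row in left:
        # Rows without a match on the other side are dropped (inner join).
        for match in index.get(row[column], []):
            writer.writerow([row[n] for n in left.fieldnames] + [match[n] for n in extra])
    return out.getvalue()


# Hypothetical sample data standing in for two CSV files.
people = "id,name\n1,Ann\n2,Bob\n3,Cy\n"
cities = "id,city\n1,Oslo\n3,Rome\n"
print(inner_join(people, cities, "id"), end="")
# id,name,city
# 1,Ann,Oslo
# 3,Cy,Rome
```

Bob (id 2) has no matching row in the second input, so he is dropped. With a left outer join (`--left` in csvjoin) he would be kept, with the `city` field left empty.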