Unix: join more than two files

I have three files, each with id and value.

 sdt5z@fir-s:~/test$ ls
 a.txt  b.txt  c.txt
 sdt5z@fir-s:~/test$ cat a.txt
 id1 1
 id2 2
 id3 3
 sdt5z@fir-s:~/test$ cat b.txt
 id1 4
 id2 5
 id3 6
 sdt5z@fir-s:~/test$ cat c.txt
 id1 7
 id2 8
 id3 9

I want to create a file that looks like this ...

 id1 1 4 7
 id2 2 5 8
 id3 3 6 9

... preferably with a single command.

I know about the join and paste commands. paste duplicates the identifier column each time:

 sdt5z@fir-s:~/test$ paste a.txt b.txt c.txt
 id1 1	id1 4	id1 7
 id2 2	id2 5	id2 8
 id3 3	id3 6	id3 9

join works well, but only for two files at a time:

 sdt5z@fir-s:~/test$ join a.txt b.txt
 id1 1 4
 id2 2 5
 id3 3 6
 sdt5z@fir-s:~/test$ join a.txt b.txt c.txt
 join: extra operand `c.txt'
 Try `join --help' for more information.

I also know that paste can take STDIN as one of its arguments using "-". For example, I can replicate the join command above using:

 sdt5z@fir-s:~/test$ cut -f2 b.txt | paste a.txt -
 id1 1	4
 id2 2	5
 id3 3	6

But I'm still not sure how to change this to accommodate three files.
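One way to extend the cut/paste trick, assuming tab-separated files and a shell with process substitution (bash or zsh), is to paste one value column per extra file. This is a sketch with the sample data, not a keyed join:

```shell
# Recreate the sample files, tab-separated (cut -f2 splits on tabs by default):
printf 'id1\t1\nid2\t2\nid3\t3\n' > a.txt
printf 'id1\t4\nid2\t5\nid3\t6\n' > b.txt
printf 'id1\t7\nid2\t8\nid3\t9\n' > c.txt

# Paste the second column of each extra file next to a.txt:
paste a.txt <(cut -f2 b.txt) <(cut -f2 c.txt)
# id1	1	4	7
# id2	2	5	8
# id3	3	6	9
```

Like paste in general, this relies on every file listing the ids in the same order.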

Since I am doing this inside a Perl script, I know I could put this inside a foreach loop, something like join file1 file2 > tmp1, then join tmp1 file3 > tmp2, etc. But this gets messy, and I would like to do it with a one-liner.

+4
4 answers

join a.txt b.txt | join - c.txt

should be enough
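In shells with process substitution (bash or zsh), the same chaining can also be written as one nested command. A sketch with the sample data, assuming all files are sorted on the join field:

```shell
# Recreate the sample files:
printf 'id1 1\nid2 2\nid3 3\n' > a.txt
printf 'id1 4\nid2 5\nid3 6\n' > b.txt
printf 'id1 7\nid2 8\nid3 9\n' > c.txt

# Pipe version:
join a.txt b.txt | join - c.txt
# id1 1 4 7
# id2 2 5 8
# id3 3 6 9

# Equivalent nested version:
join a.txt <(join b.txt c.txt)
```

For more files, just keep chaining: each extra file adds one more `| join - file.txt` stage.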

+12

Since you are doing this inside a Perl script, is there a specific reason why you are NOT doing the work in Perl itself, rather than shelling out?

Something like (NOT TESTED! Caveat emptor):

 use File::Slurp;

 # Slurp the files in if they aren't too big
 my @files = qw(a.txt b.txt c.txt);
 my %file_data = map { $_ => [ read_file($_) ] } @files;

 my @id_orders;
 my %data = ();
 my $first_file = 1;
 foreach my $file (@files) {
     foreach my $line (@{ $file_data{$file} }) {
         my ($id, $value) = split(/\s+/, $line);
         push @id_orders, $id if $first_file;
         $data{$id} ||= [];
         push @{ $data{$id} }, $value;
     }
     $first_file = 0;
 }

 foreach my $id (@id_orders) {
     print "$id " . join(" ", @{ $data{$id} }) . "\n";
 }
+1

perl -lanE'$h{$F[0]} .= " $F[1]"; END{say $_.$h{$_} foreach keys %h}' *.txt

It should work; I can't check it because I'm answering from a mobile. You can also sort the output if you put sort between foreach and keys.

0
 pr -m -t -s' ' file1.txt file2.txt | gawk '{print $1"\t"$2"\t"$3"\t"$4}' > finalfile.txt

Given that file1 and file2 each have two columns, $1 and $2 are the columns from file1, and $3 and $4 are the columns from file2.

You can also print any column from each file this way, and it works for any number of files. If file1 has 5 columns, then $6 will be the first column of file2.
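A sketch of the same idea with the three sample files: pr merges the rows side by side, and awk keeps only the id plus the value columns. Note this relies on every file listing the rows in the same order; unlike join, it is not a keyed merge:

```shell
# Recreate the sample files:
printf 'id1 1\nid2 2\nid3 3\n' > a.txt
printf 'id1 4\nid2 5\nid3 6\n' > b.txt
printf 'id1 7\nid2 8\nid3 9\n' > c.txt

# Each merged line is "id1 1 id1 4 id1 7"; keep fields 1, 2, 4, 6:
pr -m -t -s' ' a.txt b.txt c.txt | awk '{print $1, $2, $4, $6}'
# id1 1 4 7
# id2 2 5 8
# id3 3 6 9
```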

0

Source: https://habr.com/ru/post/1395610/

