Identify common items in multiple files

Question

I have 8 files, each containing a single column with a different number of rows. I need to identify the elements that are common to all 8 files.

I can do this for two files, but I cannot write a working one-liner in the shell that does the same for all of them.

Any ideas?

Thanks in advance.

File 1
Paul
pawan

File 2
Raman
Paul
sweet
Barua

File 3
Sweet
Barua
Paul

Comparing these three files should give Paul.

+3

python shell perl

Angelo Jan 2 '12 at 12:13

6 answers

The following one-liner should work (change 3 to 8 to suit your case):

 $ sort * | uniq -c | grep 3
 3 Paul

It might be better to do this in python using sets ...

+7

Fredrik pihl Jan 2 '12 at 12:21

Perl

 $ perl -lnE '$c{$_}{$ARGV}++ }{ print for grep { keys %{$c{$_}} == 8 } keys %c;' file[1-8]

It should be possible to get rid of the hard-coded 8 as well, using @{[ glob "@ARGV" ]}, but I don't have time to test it.

This solution will correctly handle the existence of duplicate lines in files.
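For comparison, the same bookkeeping can be sketched in Python: remember, for every line, the set of files it was seen in, and keep the lines whose set covers every file. This is only an illustrative sketch, not part of the Perl one-liner; the function name common_lines and the in-memory line lists standing in for real files are assumptions made for the example.

```python
from collections import defaultdict

def common_lines(files):
    """Return the lines that occur in every entry of `files`.

    `files` maps a (hypothetical) file name to its list of lines.
    """
    seen_in = defaultdict(set)  # line -> names of the files containing it
    for name, lines in files.items():
        for line in lines:
            seen_in[line.rstrip("\n")].add(name)
    # A line repeated inside one file adds the same name again,
    # so it still counts only once for that file.
    return sorted(line for line, names in seen_in.items()
                  if len(names) == len(files))

example = {
    "file1": ["Paul", "pawan", "Paul"],  # duplicate inside one file
    "file2": ["Raman", "Paul", "sweet", "Barua"],
    "file3": ["Sweet", "Barua", "Paul"],
}
print(common_lines(example))  # ['Paul']
```

Because each line maps to a set of file names rather than a plain counter, duplicates within a file cannot inflate the result.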

+3

Zaid Jan 2 '12 at 13:49

I am still looking for a concise way to make sure that each match comes from a different file. If there are no duplicates within the files, this is fairly simple in Perl:

 perl -lnwE '$a{$_}++; END { for (keys %a) { print if $a{$_} == 3 } }' files*

The -l option will automatically chomp your input (remove the trailing newline) and add a newline to each print. This matters when the last line of a file has no trailing newline.

The -n option reads the input from the files named as arguments (or from stdin).

The hash assignment will count duplicates, and the END block will print which duplicates appeared 3 times. Change 3 to the number of files you have.

If you want a slightly more flexible version, you can count the arguments in a BEGIN block.

 perl -lnwE 'BEGIN { $n = scalar @ARGV } $a{$_}++; END { for (keys %a) { print if $a{$_} == $n } }' files*
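
The counting logic can also be sketched in Python. Like the one-liners above, this assumes that no line is repeated within a single file; the second call shows what goes wrong otherwise. The name lines_seen_n_times and the sample data are hypothetical.

```python
from collections import Counter

def lines_seen_n_times(files):
    """Count every line across all files; keep those seen len(files) times.

    `files` is a list of line lists, one per file. Assumes no line is
    repeated within a single file.
    """
    counts = Counter(line.rstrip("\n") for lines in files for line in lines)
    n = len(files)  # plays the role of $n = scalar @ARGV
    return sorted(line for line, c in counts.items() if c == n)

files = [
    ["Paul", "pawan"],
    ["Raman", "Paul", "sweet", "Barua"],
    ["Sweet", "Barua", "Paul"],
]
print(lines_seen_n_times(files))  # ['Paul']

# The caveat: a duplicate inside one file inflates the count.
print(lines_seen_n_times([["x", "x"], ["y"]]))  # ['x'], although 'x' is only in one file
```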

+2

TLP Jan 2 '12 at 13:51

 $ awk '++a[$0]==3' file{1..3}.txt
 Paul

Update

 $ awk '(FILENAME SUBSEP $0) in b{next}; b[FILENAME,$0]=1 && ++a[$0]==3' file{1..3}.txt
 Paul

+1

kev Jan 2 '12 at 12:25

This might work for you:

 ls file{1..3} | xargs -n1 sort -u | sort | uniq -c | sed 's/^\s*'"$(ls file{1..3} | wc -l)"'\s*//p;d'

+1

potong Jan 2 '12 at 15:22

eumiro · Accepted Answer · 2012-01-02T12:22:00+0000

 python -c 'import sys;print("".join(sorted(set.intersection(*[set(open(a).readlines()) for a in sys.argv[1:]]))))' File1 File2 File3

prints Paul for your files File1, File2 and File3.
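
A more readable variant of the same set-intersection idea might look like the sketch below. It is not the one-liner itself: the function name common_items is made up for the example, and the lines are stripped of their terminator so that a final line without a trailing newline still matches.

```python
def common_items(paths):
    """Return the sorted lines shared by every file in `paths`."""
    sets = []
    for path in paths:
        with open(path) as handle:
            # Drop the newline so "Paul" and "Paul\n" compare equal.
            sets.append({line.rstrip("\n") for line in handle})
    # set.intersection needs at least one set, so guard the empty case.
    return sorted(set.intersection(*sets)) if sets else []
```

Saved in a script, it could be driven from the command line with print("\n".join(common_items(sys.argv[1:]))) after importing sys.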
