Identify common items in multiple files

I have 8 files, each containing a single column with a different number of rows. I need to identify the elements that are common to all 8 files.

I can do this when comparing two files, but I cannot write a working one-liner in the shell that does the same for all of them.

Any ideas?

Thanks in advance.

File 1
Paul
pawan

File 2
Raman
Paul
sweet
Barua

File 3
Sweet
Barua
Paul

The answer to comparing these three files should be Paul.

+3
6 answers
python3 -c 'import sys; print("".join(sorted(set.intersection(*[set(open(a)) for a in sys.argv[1:]]))), end="")' File1 File2 File3

prints Paul for your files File1, File2 and File3.

+4

The following one-liner should work, provided no line is repeated within a single file (change 3 to 8 to suit your case):

 $ sort * | uniq -c | grep 3
       3 Paul

It might be better to do this in Python using sets.
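A minimal sketch of that set-based idea, not taken from the answer itself. The function name `common_lines` and the in-memory sample data (modeled on the three files in the question) are assumptions for illustration; each source can be any iterable of lines, including an open file handle.

```python
def common_lines(sources):
    """Return the sorted lines that are present in every source.

    Each source is any iterable of lines, e.g. a list or an open file.
    """
    # One set per source; strip the trailing newline so that a last line
    # without a newline still compares equal.
    sets = [{line.rstrip("\n") for line in src} for src in sources]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


# Hypothetical sample data standing in for File1, File2 and File3.
file1 = ["Paul\n", "pawan\n"]
file2 = ["Raman\n", "Paul\n", "sweet\n", "Barua\n"]
file3 = ["Sweet\n", "Barua\n", "Paul\n"]

print(common_lines([file1, file2, file3]))  # ['Paul']
```

With real files you could pass open handles instead, for example `common_lines([open(p) for p in paths])`, where `paths` is whatever list of file names you have.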

+7

Perl

 $ perl -lnE '$c{$_}{$ARGV}++ }{ print for grep { keys %{$c{$_}} == 8 } keys %c;' file[1-8] 

It should be possible to get rid of the hard-coded 8 by using something like @{[ glob "@ARGV" ]}, but I don't have time to test it.

This solution will correctly handle the existence of duplicate lines in files.
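To illustrate why counting per file matters, here is a rough Python sketch of the same idea (it is not a translation of the Perl above). The name `lines_in_all` and the sample data are made up for the example: each line is counted at most once per source, so a line repeated inside one file cannot fake a match.

```python
from collections import Counter


def lines_in_all(sources):
    """Return lines that occur in every source, ignoring repeats within a source."""
    sources = list(sources)
    seen_in = Counter()  # line -> number of sources that contain it
    for src in sources:
        # A set per source means a line repeated inside one file counts once.
        seen_in.update({line.rstrip("\n") for line in src})
    return sorted(line for line, n in seen_in.items() if n == len(sources))


# Hypothetical data: "sweet" occurs 3 times in total but only in 2 files,
# while "Paul" occurs 4 times in total and is present in all 3 files.
a = ["Paul\n", "Paul\n", "pawan\n"]
b = ["Raman\n", "Paul\n", "sweet\n"]
c = ["sweet\n", "sweet\n", "Paul\n"]

print(lines_in_all([a, b, c]))  # ['Paul']
```

A plain occurrence count compared against 3 would report sweet and miss Paul on this data.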

+3

I am still trying to find a concise way to make sure that each match comes from a different file. If no line is repeated within a single file, though, this is pretty simple in Perl:

 perl -lnwE '$a{$_}++; END { for (keys %a) { print if $a{$_} == 3 } }' files* 

The -l option automatically chomps your input (removes the trailing newline) and adds a newline to each print. This matters if the last line of a file has no trailing newline.

The -n option reads the input line by line from the files named as arguments (or from stdin).

The hash increment counts how many times each line occurs, and the END block prints the lines that occurred 3 times. Change 3 to the number of files you have.
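The point about trailing newlines is easy to miss, so here is a small Python illustration (the two strings are invented stand-ins for file contents). Lines compared with their newline still attached do not match when one of them is the last line of a file that lacks a final newline.

```python
first = "pawan\nPaul"        # last line has no trailing newline
second = "Paul\nRaman\n"

# keepends=True keeps the line terminators, like reading raw lines.
raw_first = first.splitlines(keepends=True)    # ['pawan\n', 'Paul']
raw_second = second.splitlines(keepends=True)  # ['Paul\n', 'Raman\n']

raw_common = set(raw_first) & set(raw_second)
chomped_common = set(first.splitlines()) & set(second.splitlines())

print(raw_common)      # set()
print(chomped_common)  # {'Paul'}
```

The same caveat applies to any approach that compares unstripped lines.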

If you want a slightly more flexible version, you can count the arguments in a BEGIN block.

 perl -lnwE 'BEGIN { $n = scalar @ARGV } $a{$_}++; END { for (keys %a) { print if $a{$_} == $n } }' files* 
+2
 $ awk '++a[$0]==3' file{1..3}.txt
 Paul

Update

 $ awk '(FILENAME SUBSEP $0) in b{next}; b[FILENAME,$0]=1 && ++a[$0]==3' file{1..3}.txt
 Paul
+1

This might work for you:

 ls file{1..3} | xargs -n1 sort -u | sort | uniq -c | sed 's/^\s*'"$(ls file{1..3} | wc -l)"'\s*//p;d' 
+1

Source: https://habr.com/ru/post/1388960/

