I have this code:
awk '!seen[$1,$2]++{a[$1]=(a[$1] ? a[$1]", " : "\t") $2} END{for (i in a) print i a[i]} ' inputfile
and I would like to extend it to collapse rows that have more than two fields, always using the first field as the index.
Input file (three columns with tab delimiters):
protein_1 membrane 1e-4
protein_1 intracellular 1e-5
protein_2 membrane 1e-50
protein_2 citosol 1e-40
Desired output (three columns with tab delimiters):
protein_1 membrane, intracellular 1e-4, 1e-5
protein_2 membrane, citosol 1e-50, 1e-40
Thanks!
Stack:
awk -F'\t' '{b[$1]=(($1 in b) ? b[$1]", " : "") $2; c[$1]=(($1 in c) ? c[$1]", " : "") $3} END{for (i in b) print i "\t" b[i] "\t" c[i]}' inputfile
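For rows with any number of fields, one possible generalization is sketched below. It is a minimal sketch rather than a definitive answer: it assumes tab-separated input, keeps every row (it does not de-duplicate on the first two fields the way the `seen[$1,$2]` test in the question does), and records the order in which keys first appear, because the order of `for (i in a)` is unspecified in awk. The function name `collapse` and the array names `order`, `col` and `maxnf` are made up for this illustration.

```shell
# Sample data from the question, written with real tabs.
printf 'protein_1\tmembrane\t1e-4\nprotein_1\tintracellular\t1e-5\nprotein_2\tmembrane\t1e-50\nprotein_2\tcitosol\t1e-40\n' > inputfile

# collapse: hypothetical helper that groups rows by field 1 and joins
# the values of every other column with ", ".
collapse() {
  awk -F'\t' -v OFS='\t' '
    {
      # Remember keys in first-seen order so the output order is stable.
      if (!($1 in seen)) { seen[$1] = 1; order[++n] = $1 }
      # One accumulator per (key, column) pair.
      for (f = 2; f <= NF; f++) {
        key = $1 SUBSEP f
        col[key] = (key in col) ? col[key] ", " $f : $f
      }
      # Track the widest row so END knows how many columns to print.
      if (NF > maxnf) maxnf = NF
    }
    END {
      for (j = 1; j <= n; j++) {
        line = order[j]
        for (f = 2; f <= maxnf; f++) line = line OFS col[order[j] SUBSEP f]
        print line
      }
    }' "$@"
}

collapse inputfile
```

With the sample input this should print one line per protein, with the second and third columns each joined by ", " and the columns separated by tabs. If duplicate rows need to be dropped first, a `!dup[$0]++` condition in front of the main block would be one way to do it.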