Need to match each occurrence of a pattern only once in a file

Question

I have several files containing lines like these:

ABCD 100
ABCD 200
EFGH 500
IJKL 50
EFGH 700
ABCD 800
IJKL 100

I want each of ABCD / EFGH / IJKL to appear only once, keeping the line with the highest number in column 2:

 ABCD 800
 EFGH 700
 IJKL 100

I tried cat *txt | sort -k 1 | ??

thanks in advance

My bad for not being explicit; I apologize for wasting your time. Below is a detailed example. The file has several columns. I used awk to pick out the ones I need and tried cat *txt | awk '{print $3, $5}' | sort -gr. Now the strings are sorted by numerical value. How do I keep only the first match for each unique string?

 <string> <numeral>
 abcde/efgh/ijkl/mnop -450.00
 dfgh/adas/gfda/adasd -100.0
 abcde/efgh/ijkl/mnop -100.00
 lk/oiojl/ojojl -0.078
 dfgh/adas/gfda/adasd 50.0
 lk/oiojl/ojojl -0.150

Output needed:

 abcde/efgh/ijkl/mnop -450.00
 dfgh/adas/gfda/adasd -100.0
 lk/oiojl/ojojl -0.150

+4

bash shell perl

user2412414 May 23 '13 at 7:00

5 answers

You can use an awk associative array to keep the largest value per key, and then sort by column 2:

 awk '{ if (!($1 in arr) || $2 > arr[$1]) arr[$1] = $2 } END { for (i in arr) print i, arr[i] }' file \
   | sort -k2 -rn
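
In the detailed example from the question, the wanted output keeps the lowest value for each string, so the comparison has to go the other way. The following is only a sketch under that reading, assuming two whitespace-separated columns; the variable names (data, result, min) are made up for illustration.

```shell
# Detailed sample from the question: <string> <numeral>
data='abcde/efgh/ijkl/mnop -450.00
dfgh/adas/gfda/adasd -100.0
abcde/efgh/ijkl/mnop -100.00
lk/oiojl/ojojl -0.078
dfgh/adas/gfda/adasd 50.0
lk/oiojl/ojojl -0.150'

# Keep the smallest number seen for each key.
# "$1 in min" tests whether the key exists without creating it,
# so negative numbers are not compared against an empty entry.
result=$(printf '%s\n' "$data" |
  awk '!($1 in min) || $2 + 0 < min[$1] + 0 { min[$1] = $2 }
       END { for (k in min) print k, min[k] }' |
  sort -k1,1)   # "for (k in min)" has no defined order, so sort by key

printf '%s\n' "$result"
```

With the sample above this prints the three lines listed under "Output needed" in the question.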

+2

PP May 23 '13 at 7:08

 cat *txt | perl -ane 'END{print "$_ $r{$_}\n" for sort keys %r} $_<$F[1] and $_=$F[1] for $r{$F[0]}'

+2

Dry27 May 23 '13 at 7:10

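If the first column is always 4 characters, then (as suggested by abasu) you can use uniq -w4.

cat *.txt | sort -k1,1 -k2,2gr | uniq -w4

This sorts by the first column and, within each key, by the second column in descending numeric order ("ABCD 800" will precede "ABCD 100"). uniq -w4 then compares only the first 4 characters when looking for duplicate lines, so only the first line of each group is kept. (A plain sort -gr on the whole line does not work here: the lines do not start with a number, so they fall back to reverse text order and "IJKL 50" would precede "IJKL 100".)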

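If the first column does not always have 4 characters, you can reverse each line with rev, use uniq -f1 to skip the first field of the reversed line (the number), and then reverse the lines back. The sort must put equal keys next to each other, largest number first:

cat *.txt | sort -k1,1 -k2,2gr | rev | uniq -f1 | rev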
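
To see the rev / uniq -f1 trick work with keys of different lengths, here is a small sketch. The input lines are invented for illustration, and the sort is keyed explicitly on the first column so that equal keys end up adjacent, which uniq requires.

```shell
# Hypothetical input with variable-length keys.
data='AB 100
ABCDEF 30
AB 800
XYZ 5
ABCDEF 7'

result=$(printf '%s\n' "$data" |
  sort -k1,1 -k2,2nr |  # group by key, largest number first within a key
  rev |                 # reverse each line: the number becomes field 1
  uniq -f1 |            # ignore field 1 when comparing, keep first of each run
  rev)                  # restore the original character order

printf '%s\n' "$result"
```

This prints AB 800, ABCDEF 30 and XYZ 5, one per line.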

If you want to target a specific word and get the highest matching number, you can use

cat *.txt | sort -k2,2gr | grep 'ABCD' | head -n 1

+1

ktm5124 May 23 '13 at 7:06

 perl -anE'$h{$F[0]}=$F[1]if!exists$h{$F[0]}or$F[1]>$h{$F[0]}}{say"$_ $h{$_}"for keys%h'

0

Hynek -Pichi- Vychodil Aug 3 '13 at 20:02

chepner · Accepted Answer · 2013-05-23T12:12:34+0000

You can use sort twice: once to sort by the numbers in descending order, and a second time to do a stable sort on the first column (so that the largest number stays first within each key), with -u to drop the duplicate rows that have lower numbers.

 sort -k2,2nr file.txt | sort -k1,1 -u --stable
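
As a quick check, the two-pass sort can be run against the sample lines from the question. This is only a sketch: the temporary file stands in for file.txt, and the variable names are arbitrary.

```shell
# Sample data from the question, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'ABCD 100' 'ABCD 200' 'EFGH 500' 'IJKL 50' \
              'EFGH 700' 'ABCD 800' 'IJKL 100' > "$tmp"

# Pass 1: order by column 2, numerically, largest first.
# Pass 2: stable sort on column 1 only; -u keeps the first line of each key,
#         which after pass 1 is the one with the largest number.
result=$(sort -k2,2nr "$tmp" | sort -k1,1 -u --stable)

printf '%s\n' "$result"
rm -f "$tmp"
```

The output is ABCD 800, EFGH 700 and IJKL 100, one per line, matching what the question asks for.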
