How to sort columns with requirements below

Question

I have a file with 3 columns:

a 03 w
a 10 x
a 01 y
b 20 w
b 01 x
c 02 w
c 10 y
c 12 z

Expected Result

a 10 x
b 20 w
c 12 z

That is, I need to sort by column 2 without changing the order of column 1, and then pick, for each value in column 1, the row with the maximum value in column 2.

linux vim awk sed

Vinay v Sep 08 '17 at 5:23

4 answers

RomanPerekhrest · Answer 1 · 2017-09-08T07:49:36+0000

Two approaches (choose one of them):

1) sort + uniq "trick":

sort -k1,1 -k2,2rn file | uniq -w1

-k1,1 - sort rows by the 1st field first
-k2,2rn - then sort rows by the 2nd field numerically, in reverse order
uniq -w1 - print only the first line of each group, comparing no more than the first 1 character of each line (the width can be changed with -w<number>; 1 is enough here because the keys in column 1 are one character long)

Output:

a 10 x
b 20 w
c 12 z

2) GNU datamash:

datamash -Wsf -g1 max 2 <file | cut -f1-3

Output:

a   10  x
b   20  w
c   12  z

Akshay Hegde · Answer 2 · 2017-09-08T06:18:40+0000

$ cat infile
a 03 w
a 10 x
a 01 y
b 20 w
b 01 x
c 02 w
c 10 y
c 12 z

$ awk -F'[[:blank:]]' '{f=($1 in b)}f && b[$1]<$2 || !f{a[$1]=$0;b[$1]=$2}END{for(i in a)print a[i]}' infile
a 10 x
b 20 w
c 12 z

awk -F'[[:blank:]]' '
                     {
                       f=($1 in b)
                     }
                     f && b[$1]<$2 || !f{
                        a[$1]=$0;
                        b[$1]=$2
                     }
                  END{
                        for(i in a)
                            print a[i]
                     }
                    ' infile

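Explanation:

-F'[[:blank:]]' - sets the field separator to a single blank (space or tab)
f=($1 in b) - the variable f holds a boolean (true=1/false=0) that tells whether the key ($1) already exists as an index in array b
f && b[$1]<$2 || !f - the block runs if f is true and the stored value (b[$1]) is less than (<) the second field of the current line ($2), or (||) if f is false (!f), i.e. the key has not been seen yet
a[$1]=$0; - array (a), indexed by the first field ($1), stores the current record/line ($0)
b[$1]=$2 - array (b), indexed by the first field ($1), stores the second field of the current line ($2)
END { for(i in a) print a[i] } - in the END block, loop over array a and print the stored lines.

Note: if your fields are separated by arbitrary whitespace (several spaces or tabs), remove -F'...' from the command above so that awk uses its default field splitting.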
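
One caveat with the END loop above: awk does not guarantee any particular order for `for (i in a)`, so the groups may not come out in input order on every implementation. The following is a minimal sketch of the same single-pass idea that also remembers the order in which keys first appear. The names `max_per_group`, `best`, `max` and `order` are illustrative and not part of the answer; it relies on awk's default whitespace splitting and keeps the first line when values tie.

```shell
# Sketch: keep, per key in column 1, the line with the largest numeric column 2,
# printing groups in the order their keys first appear in the input.
# max_per_group, best, max and order are illustrative names, not from the answer.
max_per_group() {
  awk '
    !($1 in best) {                  # first line for this key
      order[++n] = $1                # remember when the key first appeared
      best[$1] = $0; max[$1] = $2 + 0
      next
    }
    $2 + 0 > max[$1] {               # strictly larger value: replace the stored line
      best[$1] = $0; max[$1] = $2 + 0
    }
    END { for (i = 1; i <= n; i++) print best[order[i]] }
  ' "$@"
}

# Usage with the sample data from the question:
printf '%s\n' 'a 03 w' 'a 10 x' 'a 01 y' 'b 20 w' 'b 01 x' 'c 02 w' 'c 10 y' 'c 12 z' | max_per_group
```

Passing a file name instead of piping (for example `max_per_group infile`) should work as well, since the arguments are handed straight to awk.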

hek2mgl · Answer 3 · 2017-09-08T06:52:16+0000

Using UNIX sort and awk:

sort -k1,1 -k2,2nr file | awk '!seen[$1]++'

From within vim:

:%!sort -k1,1 -k2,2nr | awk '\!seen[$1]++'

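Explanation:

sort sorts the input by column 1 first and then, within equal values of column 1, by column 2 in reverse numeric order. The sorted output looks like this: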

a 10 x
a 03 w
a 01 y
b 20 w
b 01 x
c 12 z
c 10 y
c 02 w

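The awk script keeps an array, seen, indexed by column 1. Because of the negation (!), a line is printed only the first time its column 1 value is seen, which is the line with the maximum in each group: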

a 10 x  <-- print
a 03 w
a 01 y
b 20 w  <-- print
b 01 x
c 12 z  <-- print
c 10 y
c 02 w
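
For readers who find `!seen[$1]++` too terse, here is a minimal sketch of the same filter written out with an explicit test. The function name `first_per_key` is made up for illustration; the behaviour is meant to match the one-liner, printing only the first line for each value of column 1.

```shell
# Sketch: long-hand version of the awk idiom !seen[$1]++.
# first_per_key is an illustrative name, not part of the answer.
first_per_key() {
  awk '{
    if (!($1 in seen)) { print }   # key in column 1 not met before: print the line
    seen[$1] = 1                   # mark the key so later lines with it are skipped
  }' "$@"
}

# Usage with the sample data from the question: sort first, then filter.
printf '%s\n' 'a 03 w' 'a 10 x' 'a 01 y' 'b 20 w' 'b 01 x' 'c 02 w' 'c 10 y' 'c 12 z' |
  sort -k1,1 -k2,2nr | first_per_key
```

The filter only picks the maximum because sort has already placed the largest column 2 value at the top of each group; on unsorted input it would simply return the first line per key.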

RavinderSingh13 · Answer 4 · 2017-09-08T07:08:59+0000

An awk-only solution:

awk '
{
  b[$1]=a[$1]>$2?(b[$1]?b[$1]:$0):$0;
  a[$1]=a[$1]>$2?a[$1]:$2;
}
END{
  for(i in a){
     print b[i]
}
}
'   Input_file

Explanation:

awk '
{                                    ##Main block, runs for every input line.
  b[$1]=a[$1]>$2?(b[$1]?b[$1]:$0):$0;##Array b, indexed by $1, holds the best line seen so far: if the stored maximum a[$1] is greater than $2, keep b[$1] (or take the current line if b[$1] is still empty); otherwise replace it with the current line.
  a[$1]=a[$1]>$2?a[$1]:$2;           ##Array a, indexed by $1, holds the maximum $2 seen so far: keep a[$1] if it is greater than $2, otherwise set it to the current line's $2.
}
END{                       ##END block, runs after all input has been read.
  for(i in a){             ##Loop over the indexes of array a.
     print b[i]            ##Arrays a and b share the same indexes, so print the whole line stored in b.
  }
}
'  Input_file              ##Name of the input file.
