Counting the number of lines from text files

I have a text file with three columns, for example f.txt, which looks like this:

aab abb 263-455
aab abb 263-455
aab abb 263-455
bbb abb 26-455
bbb abb 26-455
bbb aka 264-266
bga bga 230-232
bga bga 230-232

For each value in the first and second columns, I want to count how many distinct third-column values it appears with; in other words, each (value, third-column range) pair should be counted only once. For example, bbb appears in the first column with two distinct ranges (26-455 and 264-266), so it counts as 2, while aab appears only with 263-455, so it counts as 1.

Output:

aab - 1
abb - 2
bbb - 2
aka - 1
bga - 2
Total no - 8
+4
5 answers
awk '
# count each (column value, column-3 value) pair only once; the 1":" / 2":"
# prefix keeps column-1 keys and column-2 keys from colliding in the s array
!s[1":"$1":"$3]++ { sU[$1]++; tot++ }
!s[2":"$2":"$3]++ { sU[$2]++; tot++ }
END {
    for (x in sU) print x, sU[x]
    print "Total No -", tot
}' input

Output

bga 2
aab 1
bbb 2
aka 1
abb 2
Total No - 8
+3

This will do the trick:

$ awk '!a[$0]++{c[$1]++;c[$2]++} END{for(k in c){print k" - "c[k];s+=c[k]}print "\nTotal No -",s}' file
aka - 1
bga - 2
aab - 1
abb - 2
bbb - 2

Total No - 8

In a more readable form, the script:

!lines[$0]++ {
    count[$1]++
    count[$2]++
}

END {
    for (line in count) {
        print line" - "count[line]
        sum += count[line]
    }
    print "\nTotal No -", sum
}

To run it in this form, save it to a file called script.awk and run:

$ awk -f script.awk file
aka - 1
bga - 2
aab - 1
abb - 2
bbb - 2

Total No - 8
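
One caveat, not from the original answer: this script de-duplicates whole lines (!a[$0]++) rather than (value, third-column) pairs, which happens to give the same result here because every repeated row in the sample is an exact duplicate. With a hypothetical input where two rows share the first and third columns but differ in the second, the counts diverge from the pair-based rule:

 $ printf 'aab abb 263-455\naab xyz 263-455\n' |
     awk '!a[$0]++{c[$1]++;c[$2]++} END{for(k in c)print k" - "c[k]}' | sort
 aab - 2
 abb - 1
 xyz - 1

Under the pair-based rule, aab should be counted once here (a single distinct range), so the pair-keyed variants in the other answers are the safer choice when such rows can occur.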
+3
  awk '!b[$1,$3]++{a[$1]++} !c[$2,$3]++{a[$2]++} END{for (i in a) {print i,a[i];sum+=a[i]}print "Total -",sum}' file 
+2

This is a slightly long command, but it is easy to understand:

 gawk '{a[$3,$1,1];a[$3,$2,2]}END{for(i in a)print i}' input |
     cut -d $'\x1c' -f 2 | sort | uniq -c |
     awk -v OFS=' - ' '{sum+=$1;print $2,$1};END{print "\nTotal No",sum}'

aab - 1
abb - 2
aka - 1
bbb - 2
bga - 2

Total No - 8
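
The cut -d $'\x1c' part works because awk stores comma-separated subscripts such as a[$3,$1,1] as a single string joined with SUBSEP, whose default value is the \034 (0x1c) file-separator byte. A quick sanity check, assuming gawk is available:

 $ gawk 'BEGIN { if (SUBSEP == "\034") print "SUBSEP is the 0x1c byte" }'
 SUBSEP is the 0x1c byte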
+1
{
    # use separate "seen" arrays for column 1 and column 2, so that rows where
    # $1 == $2 (e.g. "bga bga 230-232") are still counted once for each column
    if (a[$1][$3] != 1) {
        a[$1][$3] = 1
        total[$1]++
    }
    if (b[$2][$3] != 1) {
        b[$2][$3] = 1
        total[$2]++
    }
}

END {
    for (item in total) {
        print item, total[item]
        totalCount += total[item]
    }
    print "\nTotal no -", totalCount
}

Output:

aka 1
bga 2
aab 1
abb 2
bbb 2

Total no - 8
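
Note that the a[$1][$3] arrays-of-arrays syntax requires GNU awk 4.0 or later; it is not available in POSIX awk. Assuming the script above is saved as count.awk (a name chosen here for illustration), it can be run with:

 $ gawk -f count.awk f.txt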
0

Source: https://habr.com/ru/post/1482516/

