Sum based on two matching fields using awk

I cannot find an awk solution for this simple task. I can easily sum a column ($3) based on a single matching field ($1) with:

awk -F, '{array[$1]+=$3} END { for (i in array) {print i"," array[i]}}' datas.csv 

Now, how can I do this based on two fields, say $1 and $2? Here is an example of the data:

 P1,gram,10
 P1,tree,12
 P1,gram,34
 P2,gram,23
 ...

I just need to sum column 3 if the first and second fields match.

Thanks for any help!

2 answers

Like this:

 awk -F, '{array[$1","$2]+=$3} END { for (i in array) {print i"," array[i]}}' datas.csv 

My result:

 P1,tree,12
 P1,gram,44
 P2,gram,23

EDIT

Since the OP requires the commas to remain in the output, I edited the answer above to use the comma suggested by @yi_H.
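As a variation on the comma-joined key, awk also accepts a subscript list such as sum[$1, $2], which joins the fields with the built-in SUBSEP separator. The sketch below is only an illustration: the file name sample.csv and the function name sum_by_two_fields are made up for the example, and the trailing sort is added because the order of a for-in loop over an awk array is unspecified.

```shell
# Hypothetical sample file, built from the rows shown in the question.
printf '%s\n' 'P1,gram,10' 'P1,tree,12' 'P1,gram,34' 'P2,gram,23' > sample.csv

# sum_by_two_fields FILE: sum column 3 per ($1, $2) pair (hypothetical helper).
sum_by_two_fields() {
    awk -F, '
        { sum[$1, $2] += $3 }             # key is $1 SUBSEP $2
        END {
            for (key in sum) {
                split(key, part, SUBSEP)  # recover the two fields
                print part[1] "," part[2] "," sum[key]
            }
        }
    ' "$1" | sort                         # make the output order stable
}

sum_by_two_fields sample.csv
```

Using SUBSEP instead of a literal comma should only matter if a field could itself contain the join character; with plain comma-separated data both forms give the same sums.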


For a solution requiring less memory but first requiring sorting (nothing is free):

 sort datas.csv | awk -F "," 'NR==1{last=$1 "," $2; sum=0;}{if (last != $1 "," $2) {print last "," sum; last=$1 "," $2; sum=0;} sum += $3;}END{print last "," sum;}' 
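For readability, here is roughly the same streaming idea spread over several lines. It is a sketch rather than a drop-in replacement: sample.csv and sum_sorted_stream are hypothetical names, LC_ALL=C is my addition so that sort compares raw bytes and keeps equal keys adjacent, and the NR > 0 check in END is also an addition to avoid printing a stray line on empty input.

```shell
# Hypothetical sample file, built from the rows shown in the question.
printf '%s\n' 'P1,gram,10' 'P1,tree,12' 'P1,gram,34' 'P2,gram,23' > sample.csv

# sum_sorted_stream FILE: sort first, then keep only one running sum in memory.
sum_sorted_stream() {
    LC_ALL=C sort "$1" | awk -F, '
        { key = $1 "," $2 }
        NR > 1 && key != last {      # key changed: flush the finished group
            print last "," sum
            sum = 0
        }
        { last = key; sum += $3 }
        END { if (NR > 0) print last "," sum }   # flush the final group
    '
}

sum_sorted_stream sample.csv
```

Only the current key and its running sum are held at any time, which is where the memory saving over the array version comes from.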

Source: https://habr.com/ru/post/894464/

