Bash: summing values grouped by key in a loop

I have a file, helpfile1, in this format:

client1 bla blahblah 2542 KB
client1 bla blahblah 4342 MB
client1 bla blahblah    7 GB

client2 bla blahblah  455 MB
client2 bla blahblah  455 MB

...

And I need to get a weekly summary like this:

client1 SUM xy KB
client2 SUM yx KB

Currently I am using:

sumfunction ()
    {
    inputfile=helpfile1

    for i in `awk -F":" '{print $1}' $inputfile| sort -u | xargs`
    do
    awk -v name=$i 'BEGIN {sum=0};
    $0~name {
    print $0;
    if ($5 == "GB") sum = sum + $4*1024*1024;
    if ($5 == "MB") sum = sum + $4*1024;
    if ($5 == "KB") sum = sum + $4};
    END {print name " SUM " sum " kB"}' $inputfile
    done
    }   

sumfunction | grep SUM | sort -g -r -k 3 > weeklysize

I need to run this on a rather long file, and it takes too much time because awk rescans the whole file once per client. Is there other code (bash only) that would make this faster? Thank you.

3 answers

You can use the following awk script:

awk '/MB$/{$4*=1024};/GB$/{$4*=1024*1024};{a[$1]+=$4}END{for(i in a){printf "%s %s KB\n",i, a[i]}}' a.txt 

It looks better in this format:

/MB$/    {$4*=1024};        # handle MB
/GB$/    {$4*=1024*1024};   # handle GB

# count KB amount for the client
{a[$1]+=$4}

END{
    for(i in a){
        printf "%s %s KB\n",i, a[i]
    }
} 

Output

client1 11788782 KB
client2 931840 KB
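
To get the exact layout asked for in the question (a SUM column, largest client first), the same single-pass idea could be wrapped in a shell function. This is only a sketch: the name weekly_sums is hypothetical, and it assumes the five-column layout shown in the question, with the size in column 4 and the unit in column 5.

```shell
#!/usr/bin/env bash
# Sketch: per-client totals in KB, computed in one pass and sorted largest first.
# weekly_sums is a hypothetical helper name; $1 is the path to the input file.
weekly_sums() {
    awk '
        NF < 5     { next }                           # skip blank or short lines
        $5 == "KB" { sum[$1] += $4 }                  # already in KB
        $5 == "MB" { sum[$1] += $4 * 1024 }           # MB -> KB
        $5 == "GB" { sum[$1] += $4 * 1024 * 1024 }    # GB -> KB
        END {
            for (c in sum) printf "%s SUM %d KB\n", c, sum[c]
        }
    ' "$1" | sort -k3,3nr                             # numeric, descending on the total
}
```

It could then replace the original pipeline with something like: weekly_sums helpfile1 > weeklysize. The file is read once regardless of how many clients it contains, which is where the speed-up over the per-client loop comes from.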
#!/usr/bin/awk -f

BEGIN {
    output_unit = "KB"
    modifier["KB"] = 1
    modifier["MB"] = 1024
    modifier["GB"] = 1024 * 1024
}
NF  { sums[$1] += modifier[$5] * $4 }
END {
    for (client in sums) {
        printf "%s SUM %d %s\n", client, sums[client]/modifier[output_unit], output_unit
    }
}

Notes:

  • empty lines are skipped (the NF pattern in NF { [...] } is false for them)
  • the output unit is configured by setting output_unit to KB, MB, or GB

$ ./t.awk t.txt
client1 SUM 11788782 KB
client2 SUM 931840 KB
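
Because output_unit is assigned in the BEGIN block, passing -v output_unit=MB to the script as written would be overwritten. One possible variant, sketched here as a shell function with the hypothetical name sum_in_unit, takes the unit as an argument instead and falls back to KB when none is given.

```shell
#!/usr/bin/env bash
# Sketch: same lookup-table approach, with the output unit chosen by the caller.
# sum_in_unit is a hypothetical helper name.
#   $1 = input file
#   $2 = output unit: KB, MB or GB (defaults to KB; other values are not handled)
sum_in_unit() {
    awk -v output_unit="${2:-KB}" '
        BEGIN {
            modifier["KB"] = 1
            modifier["MB"] = 1024
            modifier["GB"] = 1024 * 1024
        }
        NF { sums[$1] += modifier[$5] * $4 }          # accumulate in KB
        END {
            for (client in sums)
                printf "%s SUM %d %s\n", client, sums[client] / modifier[output_unit], output_unit
        }
    ' "$1" | sort                                     # stable, alphabetical output
}
```

For example, sum_in_unit t.txt MB reports totals in MB. Note that %d truncates, so fractional parts of the converted total are dropped.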

Pure Bash (4.0+):

declare -Ai client                  # associative array of integers; += adds arithmetically

while read -r c1 c2 c3 c4 c5 ; do
  if [ -n "$c5" ] ; then            # skip blank or short lines
    if [ "$c5" = 'KB' ] ; then
      client[$c1]+=$c4
    elif [ "$c5" = 'MB' ] ; then
      client[$c1]+=$c4*1024
    elif [ "$c5" = 'GB' ] ; then
      client[$c1]+=$c4*1024*1024
    fi
  fi
done < "$infile"                    # infile must hold the path to the input file

for c in "${!client[@]}"; do        # print sorted results
  printf "%s %20d KB\n" "$c" "${client[$c]}"
done | sort  -k1

client1             11788782 KB
client2               931840 KB

Source: https://habr.com/ru/post/1536449/

