Reset the line counter in awk

I have a file like this

file.txt

    0 1 a
    1 1 b
    2 1 d
    3 1 d
    4 2 g
    5 2 a
    6 3 b
    7 3 d
    8 4 d
    9 5 g
    10 5 g
    .
    .
    .

I want the counter in the first column ($1) to reset to 0 whenever the value in the second column ($2) changes, using awk or a bash script.

Result

    0 1 a
    1 1 b
    2 1 d
    3 1 d
    0 2 g
    1 2 a
    0 3 b
    1 3 d
    0 4 d
    0 5 g
    1 5 g
    .
    .
    .
4 answers

As long as you don't mind the extra memory use, and the second column is sorted, I think this is the most interesting:

 awk '{$1=a[$2]+++0;print}' input.txt 
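The one-liner is terse, so here is a rough expanded sketch of what I believe it does: `a[$2]+++0` parses as `a[$2]++ + 0`, a per-key counter that yields the old count and then increments it. The array name `seen` and the inline sample rows (taken from the question) are my own choices for illustration.

```shell
# Sample rows from the question, fed in via printf instead of input.txt.
result=$(printf '%s\n' '0 1 a' '1 1 b' '2 1 d' '3 1 d' '4 2 g' '5 2 a' '6 3 b' '7 3 d' |
awk '{
    $1 = seen[$2] + 0   # rows already seen with this key: 0, then 1, 2, ...
    seen[$2]++          # record one more occurrence of this key
    print
}')
printf '%s\n' "$result"
```

Because the count is kept per key rather than per run, a key that reappears later in the file would continue from its old count instead of restarting at 0, which is why the second column needs to be sorted (or at least grouped).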

This awk one-liner works for me:

    [ghoti@pc ~]$ awk 'prev!=$2{first=0;prev=$2} {$1=first;first++} 1' input.txt
    0 1 a
    1 1 b
    2 1 d
    3 1 d
    0 2 g
    1 2 a
    0 3 b
    1 3 d
    0 4 d
    0 5 g
    1 5 g

Let's break the script apart and see what it does.

  • prev!=$2 {first=0;prev=$2} - This is what resets your counter. Since prev starts out empty, we also reset on the first line of input, which is fine.
  • {$1=first;first++} - For every line, set the first field, then increment the variable that we use to set the first field.
  • 1 is awk shorthand for "print the line". It is really a condition that always evaluates to "true", and when a condition/action pair has no action, the action defaults to "print".

Pretty simple indeed.
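For anyone who prefers to read it spread out, the same three pieces can be sketched as a multi-line awk program with comments. The program is the one from this answer; only the layout and the inline sample rows (from the question) are mine.

```shell
result=$(printf '%s\n' '0 1 a' '1 1 b' '2 1 d' '3 1 d' '4 2 g' '5 2 a' '6 3 b' |
awk '
    # Reset: runs only when column 2 differs from the previous row.
    prev != $2 { first = 0; prev = $2 }

    # Runs on every row: overwrite column 1, then bump the counter.
    { $1 = first; first++ }

    # A bare true condition with no action prints the rebuilt line.
    1
')
printf '%s\n' "$result"
```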

Of course, one side effect is that when you change the value of any field in awk, it rebuilds the line using the output field separator, which by default is a single space. If you want to adjust this, you can set the OFS variable:

    [ghoti@pc ~]$ awk -vOFS=" " 'p!=$2{f=0;p=$2}{$1=f;f++}1' input.txt | head -2
    0 1 a
    1 1 b

Salt to taste.
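To make the effect of OFS visible, here is a small sketch that uses a tab as the separator. The tab is my own pick for illustration, not necessarily what the command above used, and the sample rows are a subset of the question's data.

```shell
# OFS='\t' : awk interprets the escape sequence in -v assignments as a tab.
result=$(printf '%s\n' '0 1 a' '1 1 b' '2 1 d' '4 2 g' |
    awk -v OFS='\t' 'p != $2 { f = 0; p = $2 } { $1 = f; f++ } 1')
printf '%s\n' "$result"
```

The separator is applied only because the script assigns to $1; a line whose fields are never modified is printed exactly as it was read.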


Pure bash solution:

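    file="/PATH/TO/YOUR/OWN/INPUT/FILE"
    count=0
    old_trigger=0

    # Read the three columns into a, b and c; the original first column (a) is discarded.
    while read a b c; do
        if ((b == old_trigger)); then
            # Same second-column value as the previous line: keep counting.
            echo "$((count++)) $b $c"
        else
            # Second column changed: restart the counter and remember the new value.
            count=0
            echo "$((count++)) $b $c"
            old_trigger=$b
        fi
    done < "$file"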

This solution (IMHO) has the advantage of using a readable algorithm. I love the other answers, but they are not so easy for beginners to follow.

Note

((...)) is an arithmetic command that returns an exit status of 0 if the expression is non-zero, or 1 if the expression is zero. It is also used as a synonym for let when side effects (assignments) are needed. See http://mywiki.wooledge.org/ArithmeticExpression
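A quick way to see those exit statuses for yourself is sketched below. The helper name arith_status is made up for this illustration, and it calls bash explicitly because ((...)) is not available in plain POSIX sh.

```shell
# Hypothetical helper: run one arithmetic command in bash and print its exit status.
arith_status() {
    bash -c "(( $1 ))"
    echo "$?"
}

echo "5 > 3 : $(arith_status '5 > 3')"   # expression is non-zero (true)
echo "0     : $(arith_status '0')"       # expression is zero
echo "x = 0 : $(arith_status 'x = 0')"   # assignment whose value is zero
```

The last case is the one to watch for: an assignment that evaluates to zero still reports a "failure" status, even though the assignment itself happened.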


Perl Solution:

    perl -naE '
        $dec = $F[0] if defined $old and $F[1] != $old;
        $F[0] -= $dec;
        $old = $F[1];
        say join "\t", @F[0,1,2];'

$dec is subtracted from the first column on every line. When the second column changes (its previous value is stored in $old), $dec is set to the current value of the first column, so the first column starts from zero again. The defined check is needed for the first line.
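Since the question asked for awk or bash, here is a rough awk sketch of the same subtract-an-offset idea. It is my own translation rather than part of this answer, it reuses the names dec and old, and like the Perl version it assumes the first column holds consecutive row numbers. The sample rows come from the question.

```shell
result=$(printf '%s\n' '0 1 a' '1 1 b' '2 1 d' '3 1 d' '4 2 g' '5 2 a' '6 3 b' |
awk '
    # When column 2 changes (never on the first row), remember the current row number.
    NR > 1 && $2 != old { dec = $1 }

    # Subtract the remembered offset, keep column 2 for the next comparison, print.
    { $1 -= dec; old = $2; print }
')
printf '%s\n' "$result"
```

Unlike the Perl one-liner, this prints with awk's default single-space separator instead of tabs.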


Source: https://habr.com/ru/post/1441403/
