Finding a value in a file with two delimiters

I have a data file in the following format:

1|col2|col3|105,230,3,44,59,62|col5
2|col2|col3|43,44|col5
3|col2|col3|1,2,3,4,5,6,7,8|col5
4|col2|col3|1,2,37|col5
  • Fields are delimited by "|".
  • The 4th column is a set of numbers separated by commas.
  • I need the records whose 4th column contains the number 3 as a separate item; numbers such as 43 or 33 must not be counted.
  • The 3 may be at the beginning, in the middle, or at the end of the 4th column.

So the desired records from the data above are:

1|col2|col3|105,230,3,44,59,62|col5
3|col2|col3|1,2,3,4,5,6,7,8|col5

I am currently using the following command, but I am looking for something more efficient or better organized:

awk -F"|" '$4 ~ /,3,/ || $4 ~ /^3,/ || $4 ~ /,3$/'
+4
3 answers

A short GNU awk solution:

awk -F'|' '$4 ~ /\<3\>/' file
  • \< and \> match the beginning and the end of a word, respectively (these operators are a GNU awk extension)

Output:

1|col2|col3|105,230,3,44,59,62|col5
3|col2|col3|1,2,3,4,5,6,7,8|col5

Or, without GNU extensions:

awk -F'|' '$4 ~ /(^|,)3(,|$)/' file
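
The anchored alternation above can be generalized to any target value. The following is only a minimal sketch, not part of the original answer: the function name match_col4 is hypothetical, and it assumes the value passed in consists of digits only, because it is spliced into a regular expression.

```shell
# Hypothetical helper: print records whose 4th "|"-delimited field
# contains the value $1 as a whole comma-separated item. Reads stdin.
# Assumption: $1 is plain digits, since it becomes part of a regex.
match_col4() {
  awk -F'|' -v n="$1" '
    BEGIN { re = "(^|,)" n "(,|$)" }   # build the regex once from n
    $4 ~ re                            # default action: print the record
  '
}

# Sample data taken from the question.
printf '%s\n' \
  '1|col2|col3|105,230,3,44,59,62|col5' \
  '2|col2|col3|43,44|col5' \
  '3|col2|col3|1,2,3,4,5,6,7,8|col5' \
  '4|col2|col3|1,2,37|col5' | match_col4 3
```

With the sample data this should print records 1 and 3 only.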
+5

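The following awk may help: it goes through all values of the 4th field and, if any value equals 3, prints the line:

awk -F"|" '{num=split($4, array, ","); for(i=1;i<=num;i++){if(array[i]==3){print;next}}}' Input_file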
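
One detail worth knowing: under the POSIX comparison rules, array[i]==3 compares numerically when the item looks like a number, so an item such as 03 would also match. If an exact textual match is wanted, one possible variant forces a string comparison. This is a sketch only; the function name has_item is hypothetical.

```shell
# Hypothetical helper: print records whose 4th "|"-delimited field has
# an item that is textually identical to $1. Reads stdin.
has_item() {
  awk -F'|' -v want="$1" '
    {
      n = split($4, items, ",")
      for (i = 1; i <= n; i++)
        # Concatenating "" forces both sides to be compared as strings.
        if ((items[i] "") == (want "")) { print; next }
    }
  '
}

# Sample data taken from the question.
printf '%s\n' \
  '1|col2|col3|105,230,3,44,59,62|col5' \
  '2|col2|col3|43,44|col5' \
  '3|col2|col3|1,2,3,4,5,6,7,8|col5' \
  '4|col2|col3|1,2,37|col5' | has_item 3
```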
+2

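A GNU awk approach (it uses the \< and \> word-boundary operators). The procedure:

  • Save the whole record: rec = $0
  • Save the field separator: oFS = FS
  • Set FS=","
  • Re-parse the 4th field as the new record: $0 = $4
  • Do the matching ..
  • Restore the field separator: FS = oFS

Example: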

parse.awk

BEGIN { FS = "|" }

{ rec = $0 }

{ 
  oFS = FS
  FS  = ","
  $0  = $4
}

/\<3\>/ { 
  print rec
}

{ FS = oFS }

Run it like this:

awk -f parse.awk infile

Output:

1|col2|col3|105,230,3,44,59,62|col5
3|col2|col3|1,2,3,4,5,6,7,8|col5
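
A regex-free alternative, which is not from any of the answers above, is to pad the 4th field with a comma on each side so that every item is surrounded by delimiters, and then search for ",3," with index(). A minimal sketch under the same data assumptions; the function name find_padded is hypothetical.

```shell
# Hypothetical helper: print records whose 4th "|"-delimited field
# contains $1 as a whole comma-separated item, using a plain substring
# search instead of a regex. Reads stdin.
find_padded() {
  # "," $4 "," turns e.g. "1,2,3" into ",1,2,3," so the first and last
  # items are delimited on both sides too; index() returns 0 (false)
  # when ",<n>," is absent, and a positive position (true) otherwise.
  awk -F'|' -v n="$1" 'index("," $4 ",", "," n ",")'
}

# Sample data taken from the question.
printf '%s\n' \
  '1|col2|col3|105,230,3,44,59,62|col5' \
  '2|col2|col3|43,44|col5' \
  '3|col2|col3|1,2,3,4,5,6,7,8|col5' \
  '4|col2|col3|1,2,37|col5' | find_padded 3
```

Because the value is matched as a literal substring, it does not need escaping the way it would inside a regular expression.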
0

Source: https://habr.com/ru/post/1693868/

