I'm having trouble with the AWK field separator. The input file is shown below:
1 | all | | synonym |
1 | root | | scientific name |
2 | Bacteria | Bacteria | scientific name |
2 | Monera | Monera | in part |
2 | Prokaryota | Prokaryota | in part |
2 | Prokaryota | Prokaryota | in part |
2 | Prokaryota | Prokaryota | in part |
2 | bacteria | bacteria | blast name |
The field separator here is tab, pipe, tab (\t|\t), so this is my attempt to print only the 1st and 2nd columns:
awk -F'\t|\t' '{print $1 "\t" $2}' nodes.dmp | less
Instead of the desired output, this prints the first column followed by the pipe symbol. I tried escaping the pipe (\t\|\t), but the result stayed the same:
1 |
1 |
2 |
2 |
2 |
2 |
Printing the 1st and 3rd columns gives me the output I originally wanted:
awk -F'\t|\t' '{print $1 "\t" $3}' nodes.dmp | less
I am puzzled as to why the first command does not work as intended.
I know that the following Perl one-liner works, but I really want to use awk:
perl -aln -F"\t\|\t" -e 'print $F[0],"\t",$F[1]' nodes.dmp | less
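A minimal sketch of what appears to be happening, assuming a POSIX-style awk (gawk, mawk or similar): a multi-character -F value is treated as a regular expression, so \t|\t reads as "tab OR tab" and every single tab splits a field. That would explain why $2 is the pipe and $3 is the second column. Putting the pipe in a bracket expression is one way to make it literal. The sample rows and the variable names sample, broken and fixed are made up for illustration.

```shell
# Two sample rows in the same layout as the input (tab, pipe, tab between fields).
sample=$(printf '1\t|\tall\t|\t\t|\tsynonym\t|\n2\t|\tMonera\t|\tMonera\t|\tin part\t|\n')

# '\t|\t' is the regex "tab OR tab": each tab is a separator, so $2 is the pipe itself.
broken=$(printf '%s\n' "$sample" | awk -F'\t|\t' '{print $1 "\t" $2}')

# '[|]' matches a literal pipe, so the separator is really tab, pipe, tab.
fixed=$(printf '%s\n' "$sample" | awk -F'\t[|]\t' '{print $1 "\t" $2}')

printf 'broken:\n%s\n' "$broken"
printf 'fixed:\n%s\n' "$fixed"
```

On the real file this would be awk -F'\t[|]\t' '{print $1 "\t" $2}' nodes.dmp | less. As far as I can tell, \t\|\t does not help because awk processes escape sequences in the -F string before compiling the regex, so at least in gawk the \| collapses back to a plain |; doubling the backslash (\t\\|\t) should also work there.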