In short: how can I convert from fasta to a "phylip"-like format (without the sequence and residue counts at the top of the file) using sed?
The fasta format is as follows:
>sequence1
AATCG
GG-AT
>sequence2
AGTCG
GGGAT
The number of lines in a sequence may vary.
I want to convert it to this:
sequence1 AATCG GG-AT
sequence2 AGTCG GGGAT
My question seems simple, but I lack a real understanding of the more advanced sed commands: the multiline commands and the commands that use the hold space.
Here is the implementation idea I had: fill the pattern space with a sequence and print it only when a new sequence label is encountered.
I have gone through the manual, and I know this could be done with python, perl, or awk, but I would like to do it with sed.
The script below hardcodes the line numbers of the example above:
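# line 1: start the first record in the hold space
1h
# lines 2-3: append the residue lines to that record
2,3H
# line 4: swap in the finished record, drop '>', join the lines, print
4{x; s/^>//; s/\n/ /g; p}
5H
# line 6: append the last residue line, then flush the final record
6{H;x; s/^>//; s/\n/ /g; p}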
sed -nf fa2phy.sed my.fasta
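The idea above can be sketched without hardcoded line numbers by addressing label lines with /^>/ instead. This is only a sketch under a few assumptions: the file starts with a label line, every label is followed by at least one residue line, and there are no blank lines. It collects each record in the hold space rather than the pattern space. The scratch directory is only there for illustration, and the file names are reused from the command above.

```shell
#!/bin/sh
# Work in a scratch directory so no existing file is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample input copied from the question.
cat > my.fasta <<'EOF'
>sequence1
AATCG
GG-AT
>sequence2
AGTCG
GGGAT
EOF

cat > fa2phy.sed <<'EOF'
# Label line: strip '>', park the new label in the hold space,
# and print the record collected so far (if there is one).
/^>/{
    s/^>//
    x
    /./{
        s/\n/ /g
        p
    }
    d
}
# Residue line: append it to the record in the hold space.
H
# Last line: no further label will trigger the print, so flush here.
${
    x
    s/\n/ /g
    p
}
EOF

# -n suppresses automatic printing; only the explicit p commands emit output.
out=$(sed -n -f fa2phy.sed my.fasta)
printf '%s\n' "$out"
```

For the sample input this prints the two lines shown in the target format above. The d at the end of the label block matters: after x the pattern space holds the previous record, and without d that record would fall through to H and be appended to the new one.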