How to selectively replace a group of three characters using sed

I need to replace specific groups of three characters (triplets) in a text. The pattern should match only when it starts at position 1, 4, 7, 10, 13, etc., and not at positions 2, 3, 5, 6, etc.

For example, I want to find taa and replace it with NNN in the text ctaagctaaggcgtaaga, and I expect to get ctaagcNNNggcgtaaga.

The first occurrence of "taa" begins at position 2 and should not be matched or replaced, the second occurrence begins at position 7 and should be replaced, and the third occurrence begins at position 14 and should not be matched.


My solution so far is to separate the triplets with "+", perform the replacement, and then remove all the "+" characters. However, I am looking for a more elegant solution using sed.

echo "$dna" | sed 's/.../&+/g;s/+$//' | sed 's/taa/NNN/g' | sed 's/+//g'
+4
4 answers

With GNU sed:

echo ctaagctaaggcgtaaga | sed 's/taa/NNN/2'

ctaagcNNNggcgtaaga

This replaces the second instance, but it misses the actual requirement. The following replaces only triplets that start at the correct positions:

echo ctaagctaaggcgtaaga | 
fold -w3 | 
sed 's/taa/NNN/' | 
tr -d '\n'; echo ""

ctaagcNNNggcgtaaga
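
To see why the fold pipeline respects the triplet positions, it can help to look at the intermediate output. The following is only a sketch using the sample sequence from the question; the variable names dna and result are chosen here for illustration.

```shell
dna=ctaagctaaggcgtaaga

# Step 1: fold prints one triplet per line, so each line starts at
# position 1, 4, 7, ... of the original sequence.
printf '%s\n' "$dna" | fold -w3

# Steps 2 and 3: sed can only match "taa" inside a single triplet line;
# tr then joins the lines back into one sequence.
result=$(printf '%s\n' "$dna" | fold -w3 | sed 's/taa/NNN/' | tr -d '\n')
echo "$result"    # ctaagcNNNggcgtaaga
```

The first command prints cta, agc, taa, ggc, gta and aga on separate lines. An occurrence of "taa" that straddles two triplets ends up split across two lines, so sed never sees it.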
+1

As far as I know, you need more than one sed command for this. However, you can run all of the commands in a single sed invocation, for example:

<<<ctaagctaaggcgtaaga bsdsed 's/.../&+/g; s/taa/NNN/g; s/+//g'

Output:

ctaagcNNNggcgtaaga
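
As a rough illustration of how the separator trick works, the sketch below prints the string after the first command and then runs all three. It assumes a plain sed on the PATH and uses printf with a pipe instead of the here-string, since <<< is a bash/zsh feature; the variable names are made up for this example.

```shell
dna=ctaagctaaggcgtaaga

# After the first command every triplet is followed by "+", so "taa"
# can only match a whole triplet, never one that spans two triplets.
marked=$(printf '%s\n' "$dna" | sed 's/.../&+/g')
echo "$marked"    # cta+agc+taa+ggc+gta+aga+

# All three commands in one sed invocation, as in the answer.
result=$(printf '%s\n' "$dna" | sed 's/.../&+/g; s/taa/NNN/g; s/+//g')
echo "$result"    # ctaagcNNNggcgtaaga
```

This relies on the input not containing "+" already; for other data a different separator character would be needed.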
+1

Try this:

sed 's/^\(\(...\)*\)taa/\1NNN/g'

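This matches any number of complete triplets from the start of the line followed by taa, and replaces that taa with NNN while keeping the preceding triplets.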

Example:

sed 's/^\(\(...\)*\)taa/\1NNN/g' <<EOF
ctaagctaaggcgtaaga
EOF

Output:

ctaagcNNNggcgtaaga
+1

Using a loop (GNU sed):

sed -r ':a;s/^((...)*)taa/\1NNN/;ta' file

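The substitution is anchored at the start of the line, matches whole triplets before taa, and is repeated in a loop until it no longer succeeds, so every triplet-aligned taa is replaced.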
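
The difference between a single anchored substitution and the loop only shows up when a line contains more than one triplet-aligned taa. The sketch below uses a made-up input, taataagct, with aligned matches at positions 1 and 4. The loop is written with separate -e expressions and basic regular expressions, which is an assumption made here for portability rather than the form used in the answer.

```shell
# Hypothetical input with two triplet-aligned "taa" occurrences.
dna=taataagct

# Single anchored substitution: the greedy \(...\)* consumes as many
# triplets as it can, so only the last aligned "taa" is replaced.
single=$(printf '%s\n' "$dna" | sed 's/^\(\(...\)*\)taa/\1NNN/')
echo "$single"    # taaNNNgct

# Loop: "ta" jumps back to label "a" as long as the substitution succeeded.
looped=$(printf '%s\n' "$dna" |
    sed -e ':a' -e 's/^\(\(...\)*\)taa/\1NNN/' -e 'ta')
echo "$looped"    # NNNNNNgct
```

Because the pattern starts with ^, it can match at most once per pass, so a g flag on the single substitution would not change its result.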

0

Source: https://habr.com/ru/post/1671181/

