For example, in AF038954 there are two sequences of interest: atgaccatcctccagacatacttccggcagaacagggatga and atgaagtcttggacaacctcttggcttttgtctgtga. How can both be found?
A first version, which finds non-overlapping matches only:
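# Expects lines of "<id> <sequence>" in the __DATA__ section.
while (<DATA>) {
    chomp;
    print "processing $_\n";
    my ($id, $rna_sq) = split;
    # Start codon, as few characters as possible, then a stop codon.
    while ($rna_sq =~ /(atg.*?(?:tga|taa|tag))/g) {
        printf "\t%8s %4i %4i %s %i\n",
            $id,
            pos($rna_sq) - length($1) + 1,   # 1-based start of the match
            pos($rna_sq),                    # 1-based end of the match
            $1,
            length($1);
    }
}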
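The pattern (atg.*?(?:tga|taa|tag)) matches a start codon, then as few characters as possible (the ? after .* makes it "non-greedy"), then the first stop codon. The while loop with the /g modifier resumes each search where the previous match ended, so matches cannot overlap.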
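To see what this first loop skips, here is a small sketch that runs the same pattern on a made-up toy sequence (not real data) in which a second atg sits inside the first match. The helper name find_non_overlapping is hypothetical and only wraps the loop shown above.

```perl
use strict;
use warnings;

# Hypothetical helper: collect every match of the non-overlapping pattern.
sub find_non_overlapping {
    my ($seq) = @_;
    my @found;
    while ($seq =~ /(atg.*?(?:tga|taa|tag))/g) {
        push @found, $1;
    }
    return @found;
}

# Made-up toy sequence: the second "atg" lies inside the first match.
print "$_\n" for find_non_overlapping('ccatgaaatgcctagtt');
```

This prints only atgaaatgcctag. The shorter candidate atgcctag, which starts inside that match, is never reported because the next search begins after the stop codon.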
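To include overlapping sequences as well, search for each start codon on its own and then look for the first stop codon after it: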
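# Expects lines of "<id> <sequence>" in the __DATA__ section.
while (<DATA>) {
    chomp;
    print "processing $_\n";
    my ($id, $rna_sq) = split;
    # The outer match consumes only the start codon.
    while ($rna_sq =~ /atg/g) {
        # $' is the text after the start codon; find the first stop codon in it.
        if ($' =~ /(.*?(?:tga|taa|tag))/) {
            my $match = "atg$1";
            printf "\t%8s %4i %4i %s %i\n",
                $id,
                pos($rna_sq) - 2,                    # 1-based start
                pos($rna_sq) - 3 + length($match),   # 1-based end
                $match,
                length($match);
        }
    }
}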
Here we use the special variable $' (usually not recommended), which holds the text after the last successful match. We search it for the first stop codon and print the details. Because the outer global match on $rna_sq consumes only the start codon and not the whole sequence (unlike the first version), the next search for a start codon resumes where the previous one stopped, which is immediately after the previous start codon. Thus overlapping sequences are included.
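If you would rather avoid $', one possible alternative (a swapped-in technique, not the code above) is to put the whole pattern inside a zero-width lookahead. The match itself consumes nothing, so /g moves on one position at a time and every start codon gets its own attempt. The helper name find_overlapping and the toy sequence are made up for illustration; @- and @+ are Perl's built-in arrays of capture offsets.

```perl
use strict;
use warnings;

# Hypothetical helper: return [start, end, sequence] for every start codon
# that is followed by a stop codon. Positions are 1-based.
sub find_overlapping {
    my ($seq) = @_;
    my @hits;
    # The lookahead matches zero characters, so the capture may overlap
    # with captures found at later positions.
    while ($seq =~ /(?=(atg.*?(?:tga|taa|tag)))/g) {
        push @hits, [ $-[1] + 1, $+[1], $1 ];
    }
    return @hits;
}

# Made-up toy sequence: the second "atg" lies inside the first match.
for my $hit (find_overlapping('ccatgaaatgcctagtt')) {
    printf "%4d %4d %s\n", @$hit;
}
```

On this toy input the loop reports atgaaatgcctag at 3..15 and atgcctag at 8..15, the same two overlapping hits the $' version would give.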