1. here I will use the <> operator
Remember to chomp each line after you read it.
2. Check to make sure the file only contains acgt or die
if ( <> ne [acgt] ) { die "usage: file must only contain nucleotides \n"; }
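A bare <> reads a single line, so do the check inside a loop: while (<>) puts each line in $_, or you can name the variable yourself with while (my $line = <>).
Also, ne is a string comparison, not a pattern match. Use !~ with the character class (or =~ with the negated class, [^acgt]), and add the i flag so that upper and lower case are both accepted.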
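A minimal sketch of the check in step 2. The helper name is_valid_dna is made up for illustration, and treating an empty line as invalid is my assumption, not something the question specifies.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $line holds only a, c, g, t (either case).
# Assumption: an empty or undefined line counts as invalid.
sub is_valid_dna {
    my ($line) = @_;
    return 0 if !defined $line || $line eq '';
    # [^acgt] matches any character that is NOT a nucleotide letter,
    # so the line is valid when that pattern does not match.
    return $line !~ /[^acgt]/i ? 1 : 0;
}

for my $line (qw(acgtTGCA acgxt)) {
    print "$line: ", (is_valid_dna($line) ? "ok" : "invalid"), "\n";
}
```

In the real program you would call this on each chomped line inside the while loop and die on the first failure.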
3. Transcribe the DNA to RNA (Every A replaced by U, T replaced by A, C replaced by G, G replaced by C).
As GWW noted, the only substitution needed is T -> U. tr() does that in one step.
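A short sketch of the T -> U substitution with tr. The name dna_to_rna is made up for illustration, and handling lowercase input is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of $dna with every T replaced by U.
# Lowercase t is mapped to lowercase u (an assumption about the input).
sub dna_to_rna {
    my ($dna) = @_;
    (my $rna = $dna) =~ tr/Tt/Uu/;   # copy first so the argument is untouched
    return $rna;
}

print dna_to_rna("ATGGCCTAA"), "\n";   # prints AUGGCCUAA
```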
4. Take this transcription & break it into 3 character 'codons' starting at the first occurrence of "AUG"
not sure but I'm thinking this is where I will start a %hash variables?
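Inside the while(<>) loop, use index to look for "AUG". Carry the last two characters of each line over ( substr $line, -2, 2) and join them to the next line ( .=) before searching, so that an "AUG" split across a line break is still found.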
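Here is a rough sketch of step 4, assuming the input has already been transcribed to uppercase RNA. The name find_start is made up, and the in-memory filehandle only stands in for a real file. It carries the last two characters of each line forward so that a start codon spanning a line break is not missed, then uses unpack to cut the result into threes.

```perl
use strict;
use warnings;

# Hypothetical helper: scan the handle for the first "AUG" and return the
# sequence from there to the end of that line, or nothing if none is found.
sub find_start {
    my ($fh) = @_;
    my $tail = '';                       # last two characters seen so far
    while (defined(my $line = <$fh>)) {
        chomp $line;
        my $window = $tail . $line;      # previous tail plus current line
        my $pos = index $window, 'AUG';
        return substr($window, $pos) if $pos >= 0;
        $tail = length($window) > 2 ? substr($window, -2, 2) : $window;
    }
    return;
}

my $text = "CCA\nUGGCC\n";               # "AUG" is split across two lines
open my $fh, '<', \$text or die $!;
my $seq = find_start($fh);

# '(A3)*' cuts the string into 3-character fields; drop a short last piece.
my @codons = grep { length($_) == 3 } unpack '(A3)*', $seq;
print "@codons\n";                       # prints AUG GCC
```

A complete program would keep reading lines after the start codon is found; this sketch stops at the end of the line that contains it.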
5. Take the 3 character "codons" and give them a single letter Symbol (an uppercase one-letter amino acid name)
Assign a key a value using (there are 70 possibilities here so I'm not sure where to store or how to access)
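As GWW said, a hash is the natural place to store the table:
%codons = ( AUG => 'M', ...).
split can break a line into single characters, which are then regrouped in threes.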
6. If a gap is encountered, a new line is started and the process is repeated
not sure but we can assume that gaps are multiples of three.
Test each codon with exists $codons{$current_codon}; a gap will not be in the table.
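To illustrate steps 5 and 6 together, here is a sketch with a deliberately partial table (only four codons, for illustration). The name translate_codons is made up, and treating anything missing from the table as a gap that starts a new line is an assumption.

```perl
use strict;
use warnings;

# Deliberately partial table, for illustration only.
my %codons = (AUG => 'M', GCC => 'A', UUU => 'F', AAA => 'K');

# Hypothetical helper: map each codon to its one-letter symbol.
sub translate_codons {
    my @out;
    for my $current_codon (@_) {
        if (exists $codons{$current_codon}) {
            push @out, $codons{$current_codon};
        }
        else {
            push @out, "\n";   # assumption: anything unknown is a gap
        }
    }
    return join '', @out;
}

print translate_codons(qw(AUG GCC --- AUG AAA)), "\n";
```

With this input the two stretches on either side of the gap come out on separate lines.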
7. Am I approaching this the right way? Is there a Perl function that I'm overlooking that can simplify the main program?
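I would split the work into two functions, read_codon and translate, so that the main program only has to loop over codons.
Something like this: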
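use warnings; use strict;
use feature 'state';

# Return the next three bases from $file as an RNA codon,
# or nothing once the input is exhausted.
sub read_codon {
    my ($file) = @_;
    state @buffer;
    state $handle;
    # Open the file only once; reopening it on every call would
    # start again from the first line each time.
    if (!$handle) {
        open $handle, '<', $file or die $!;
    }
    # Refill until a whole codon is buffered (a line may be shorter than 3).
    while (@buffer < 3) {
        defined(my $new_line = <$handle>) or return;
        chomp $new_line;
        push @buffer, split //, $new_line;
    }
    return transcribe(
        join '',
        shift @buffer,
        shift @buffer,
        shift @buffer
    );
}

# DNA -> RNA: only T changes, to U.
sub transcribe {
    my ($codon) = @_;
    $codon =~ tr/T/U/;
    return $codon;
}

{
    our %codes = (
        AUG => 'M',
        # ...
    );

    # Return the symbol for $codon while inside a coding region:
    # AUG switches translation on, a stop codon switches it off.
    sub translate {
        my ($codon) = @_ or return;
        state $TRANSLATE = 0;
        $TRANSLATE = 1 if $codon =~ m/AUG/i;
        $TRANSLATE = 0 if $codon =~ m/U(AA|GA|AG)/i;
        return $codes{uc $codon} if $TRANSLATE;
        return;
    }
}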