Perl program to simulate RNA synthesis

Question


I am looking for suggestions on how to approach my Perl programming homework assignment: writing an RNA synthesis program. I have summarized and outlined the program below. In particular, I am looking for feedback on the blocks below (numbered for easy reference). I have read up to chapter 6 of "Elements of Programming with Perl" by Andrew Johnson (great book). I have also read the perlfunc and perlop pod pages, with nothing jumping out at me on where to start.

Description of the program: the program should read an input file named on the command line, transcribe it into RNA, and then translate the RNA into a sequence of uppercase one-letter amino acid names.

1. Accept a file named on the command line
here I will use the <> operator

2. Check to make sure the file only contains acgt or die

if ( <> ne [acgt] ) { die "usage: file must only contain nucleotides \n"; }

3. Transcribe the DNA to RNA (every A replaced by U, T replaced by A, C replaced by G, G replaced by C)
not sure how to do this

4. Take this transcription and break it into 3-character "codons", starting at the first occurrence of "AUG"
not sure, but I'm thinking this is where I will start using %hash variables?

5. Take the 3-character "codons" and give them a single-letter symbol (an uppercase one-letter amino acid name)
Assign a key a value (there are 70 possibilities here, so I'm not sure where to store them or how to access them)

6. If a gap is encountered, a new line is started and the process is repeated
not sure, but we can assume that gaps are multiples of three.

7. Am I approaching this the right way? Is there a Perl function that I'm overlooking that can simplify the main program?


+3

perl hash bioinformatics

Koala Nov 06 '10 at 5:06

3

.

, , , , , , .

, , , , strands. , "" .

2. , .

3. if hash

4. . .

5. , .

6. . , , №2, , ATGC.

perl, . perl, bioperl. , .

+3

GWW Nov 06 '10 at 5:16

Take a look at BioPerl and look at the source modules for indicators on how to do this.

+1

slashmais Nov 06 '10 at 5:17


Pedro Silva · Accepted Answer · 2010-11-06T07:19:52+0000

1. here I will use the <> operator

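OK, so your plan is to read the file line by line. Remember to chomp each line as you go, or you will end up with newline characters in your sequence.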

2. Check to make sure the file only contains acgt or die

if ( <> ne [acgt] ) { die "usage: file must only contain nucleotides \n"; }

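In a while loop, <> puts each line it reads into the special variable $_, unless you assign it explicitly (my $line = <>).

In the code above, you read one line from the file and then discard it. You need to save that line.

Also, ne compares two strings, not a string and a regular expression. You need the !~ operator here (or =~ with a negated character class, [^acgt]). If the test should be case-insensitive, look at the i flag for regular expression matching.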
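As a rough sketch of that check (not the asker's final program), the helper below tests one line against a negated character class. The name is_nucleotides is made up for illustration, and the sample lines are in-memory stand-ins for what while (my $line = <>) would read. It assumes gaps are not allowed yet, so step 6 would need the class widened.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the line holds only the bases a, c, g, t
# (either case, thanks to the i flag). An empty line also passes.
sub is_nucleotides {
    my ($line) = @_;
    chomp $line;                   # drop the trailing newline, if any
    return $line !~ /[^acgt]/i;    # fails as soon as any other character appears
}

# Stand-in for lines read with: while (my $line = <>) { ... }
for my $line ("acgtacgt\n", "ACGT\n") {
    is_nucleotides($line)
        or die "usage: file must only contain nucleotides\n";
}
```

Saving the line in $line (rather than testing <> directly) keeps it available for the later steps.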

3. Transcribe the DNA to RNA (Every A replaced by U, T replaced by A, C replaced by G, G replaced by C).

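As GWW said, check your biology: T -> U is the only step in transcription. The tr (transliteration) operator will help here.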
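A minimal sketch of transcription with tr, assuming the input is the coding strand and that uppercase output is wanted. The name dna_to_rna is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: transcribe a DNA coding-strand string into RNA.
sub dna_to_rna {
    my ($dna) = @_;
    (my $rna = uc $dna) =~ tr/T/U/;   # copy, uppercase, then replace T with U
    return $rna;
}

print dna_to_rna("atggcctaa"), "\n";   # prints AUGGCCUAA
```

tr works character by character and does not use regular expressions, so it is a natural fit for a one-to-one base substitution.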

4. Take this transcription & break it into 3 character 'codons' starting at the first occurrence of "AUG"

not sure but I'm thinking this is where I will start a %hash variables?

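You probably want to buffer here. Define a scalar outside the while(<>) loop. Use index to look for "AUG". If you don't find it, save the last two bases of the line in that scalar (substr $line, -2, 2 gives you those). On the next iteration, append the new line (with .=) to those two bases and test for "AUG" again. Once you get a hit, you know where the match is, so you can mark the position and start translating.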
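The buffering idea can be sketched roughly as follows. The helper name from_start_codon is made up, and it takes already chomped and transcribed lines as a list instead of reading from <>, so treat it as an illustration of the carry-over logic only.

```perl
use strict;
use warnings;

# Hypothetical helper: return the sequence from the first "AUG" onward,
# or undef if no start codon is found. Lines are assumed to be chomped RNA.
sub from_start_codon {
    my @lines = @_;
    my $tail = '';    # last two bases carried over from the previous line
    while (defined(my $line = shift @lines)) {
        my $window = $tail . $line;
        my $pos = index($window, 'AUG');
        if ($pos >= 0) {
            # Found it: keep everything from the match plus the unread lines.
            return join '', substr($window, $pos), @lines;
        }
        # Keep two bases so an "AUG" split across a line break is still seen.
        $tail = length($window) >= 2 ? substr($window, -2, 2) : $window;
    }
    return;
}

print from_start_codon('CCGA', 'UGGCC', 'UAA'), "\n";   # prints AUGGCCUAA
```

Here the start codon straddles the first two lines ("...A" then "UG..."), which is exactly the case the two-base carry-over is meant to catch.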

5. Take the 3 character "codons" and give them a single letter Symbol (an uppercase one-letter amino acid name)

Assign a key a value (there are 70 possibilities here so I'm not sure where to store or how to access)

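Again, as GWW said, build a hash table:

%codons = ( AUG => 'M', ...).

Then you can (for example) split the current line into an array of characters, build codons three elements at a time, and look up the corresponding amino acid code in the hash table.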
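Here is a small sketch of that lookup. The table is deliberately partial (four entries chosen for the example; a complete table has 64 codons, not 70), and the helper name codons_to_aa and the '?' placeholder for unknown codons are assumptions for illustration.

```perl
use strict;
use warnings;

# Deliberately partial table, for illustration only.
my %codons = (
    AUG => 'M',   # methionine (start)
    GCC => 'A',   # alanine
    UUU => 'F',   # phenylalanine
    GGC => 'G',   # glycine
);

# Hypothetical helper: split an RNA string into characters, regroup them
# three at a time, and look each codon up; unknown codons become '?'.
sub codons_to_aa {
    my ($rna) = @_;
    my @bases = split //, $rna;
    my $protein = '';
    while (@bases >= 3) {
        my $codon = join '', splice(@bases, 0, 3);
        $protein .= exists $codons{$codon} ? $codons{$codon} : '?';
    }
    return $protein;
}

print codons_to_aa('AUGGCCUUUGGC'), "\n";   # prints MAFG
```

Any trailing one or two bases are silently ignored in this sketch.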

6. If a gap is encountered a new line is started and process is repeated

not sure but we can assume that gaps are multiples of threes.

See above. You can detect a gap by testing exists $codons{$current_codon}: a gap will not be in the table.
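One possible reading of the gap rule, sketched with exists. It assumes gaps come in multiples of three characters (as the question says) and treats any chunk missing from the table as a gap, which is a simplification. The two-entry table and the name translate_with_gaps are hypothetical.

```perl
use strict;
use warnings;

# Tiny illustrative table; a real one would hold all codons.
my %codons = ( AUG => 'M', GCC => 'A' );

# Hypothetical helper: walk the RNA three characters at a time and emit a
# newline whenever the chunk is not a known codon (for example, a gap).
sub translate_with_gaps {
    my ($rna) = @_;
    my $out = '';
    for my $current_codon ($rna =~ /(.{3})/g) {
        $out .= exists $codons{$current_codon}
              ? $codons{$current_codon}
              : "\n";
    }
    return $out;
}

print translate_with_gaps('AUGGCC   AUG'), "\n";   # prints MA, then M on a new line
```

A stricter version would check for whitespace explicitly, so that a mistyped codon is reported instead of being treated as a gap.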

7. Am I approaching this the right way? Is there a Perl function that I'm overlooking that can simplify the main program?

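Looking at the above, it seems more complex than it needs to be. I came up with a few building blocks, the subroutines read_codon and translate: I think they simplify the logic of the program considerably.

I know this is homework, but it may help you get a feel for other possible approaches: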

use warnings; use strict;
use feature 'state';


# read_codon works by using the 'state' feature introduced in Perl 5.10.
# Both @buffer and $handle represent 'state' of this function:
# together they separate reading codons from processing the file
# line by line.
# The first time read_codon is called, both are initialized.
# Since $handle is a state variable, the current file handle position
# is never reset. Similarly, @buffer always holds whatever was left
# from the previous call.
# The base case is that @buffer contains fewer than 3bp, in which case
# we need to read a new line, remove the "\n" character,
# split it and push the resulting list onto the end of @buffer.
# If we encounter EOF on $handle, then we have exhausted the file,
# and @buffer as well, so we return undef.
# Otherwise we take the first 3bp of @buffer, join them into a string,
# transcribe it and return it.

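sub read_codon {
    my ($file) = @_;

    state @buffer;
    state $handle;

    # Open the file on the first call only; re-opening it on every call
    # would reset the read position to the start of the file.
    if (!defined $handle) {
        open $handle, '<', $file or die "Cannot open '$file': $!";
    }

    # Refill until a whole codon is available (a line may hold fewer than 3bp).
    while (@buffer < 3) {
        my $new_line = <$handle>;
        return unless defined $new_line;    # EOF: nothing more to read
        chomp $new_line;
        push @buffer, split //, $new_line;
    }

    # Take the first three bases off the buffer and transcribe them.
    return transcribe(join '', splice @buffer, 0, 3);
}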

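sub transcribe {
    my ($codon) = @_;
    $codon = uc $codon;    # input may be lowercase; the codon table uses uppercase keys
    $codon =~ tr/T/U/;     # transcription: T -> U is the only change
    return $codon;
}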


# translate also works by using the 'state' feature introduced in Perl 5.10.
# The $TRANSLATE state is initialized to 0.
# As codons are passed to it,
# the sub updates the state according to start and stop codons.
# Since $TRANSLATE is a state variable, it is only initialized once
# (the first time the sub is called).
# If the current state is 'translating',
# then the sub returns the appropriate amino acid from the %codes table, if any.
# This gives the caller a logical way to determine whether
# it should print an amino acid or not: if not, the sub returns undef.
# %codes could also be a state variable, but since it is not actually 'state',
# it is initialized once, in a code block visible from the sub
# but separate from the rest of the program, since it is 'private' to the sub.

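{
    # Lexical, so the table really is private to translate.
    # This block must run before translate is first called.
    my %codes = (
        AUG => 'M',
        # ... (remaining codons elided in the original answer)
    );

    sub translate {
        my ($codon) = @_ or return;

        state $TRANSLATE = 0;

        $TRANSLATE = 1 if $codon =~ m/AUG/i;            # start codon
        $TRANSLATE = 0 if $codon =~ m/U(AA|GA|AG)/i;    # stop codons

        return $codes{uc $codon} if $TRANSLATE;
        return;    # not translating: return undef explicitly
    }
}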
