Perl script to find a motif in a multi-FASTA file and print the complete sequence along with its header line

I can search for a motif in a file containing several FASTA records and print the line that contains the motif, but I need to print the whole sequence, along with the header line, of each FASTA record that contains the motif. Please help me, I'm new to Perl.

#!/usr/bin/perl -w
use strict;

print STDOUT "Enter the motif: ";
my $motif = <STDIN>;
chomp $motif;


my $line;
open (FILE, "data.fa");
while ($line = <FILE>) {
  if ($line =~ /$motif/)  {
     print $line;
   }
}
3 answers

Try the following:

Bio::DB::Fasta

Instructions are on the module's page. For more examples or instructions, just do a Google search for: "use Bio::DB::Fasta"

To install it, simply follow any of these instructions. I suggest using the CPAN.pm method as root:

Installing Perl Modules
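
As a rough sketch of how this could look, assuming BioPerl is installed and the FASTA file sits in a writable directory (Bio::DB::Fasta creates an index file next to it), something like the following returns the header and full sequence of every record containing the motif. The helper name find_motif_records is made up for illustration and is not part of BioPerl; note also that Bio::DB::Fasta expects the sequence lines within a record to have a consistent length.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::DB::Fasta;

# Hypothetical helper (not part of BioPerl): returns [header, sequence]
# pairs for every record whose full sequence matches the motif.
sub find_motif_records {
    my ($fasta_path, $motif) = @_;

    # Builds (or reuses) an index file alongside the FASTA file.
    my $db = Bio::DB::Fasta->new($fasta_path);

    my @hits;
    for my $id (sort $db->get_all_primary_ids) {
        my $seq = $db->seq($id);    # whole sequence, line breaks removed
        next unless $seq =~ /$motif/;
        # header() returns the header line without the leading '>'
        push @hits, [ $db->header($id), $seq ];
    }
    return @hits;
}

# Usage: perl script.pl data.fa ACGT
if (@ARGV == 2) {
    my ($path, $motif) = @ARGV;
    print ">$_->[0]\n$_->[1]\n" for find_motif_records($path, $motif);
}
```

Because the module returns each sequence as one string, a motif that is split across two lines in the file is still found.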


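The following script should do what you want.

The script reads the FASTA file into a hash (header line => sequence), joining sequence lines that belong to the same record, and then prints the header and the full sequence of every record whose sequence contains the motif.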

#!/usr/bin/perl

use strict;
use warnings;

print STDOUT "Enter the motif: ";
my $motif = <STDIN>;
chomp $motif;

my %seqs = %{ read_fasta_as_hash( 'data.fa' ) };
foreach my $id ( keys %seqs ) {
    if ( $seqs{$id} =~ /$motif/ ) {
        print $id, "\n";
        print $seqs{$id}, "\n";
    }
}

sub read_fasta_as_hash {
    my $fn = shift;

    my $current_id = '';
    my %seqs;
    # Three-argument open with a lexical filehandle
    open my $fh, '<', $fn or die "Cannot open $fn: $!";
    while ( my $line = <$fh> ) {
        chomp $line;
        if ( $line =~ /^(>.*)$/ ) {
            $current_id  = $1;
        } elsif ( $line !~ /^\s*$/ ) { # skip blank lines
            $seqs{$current_id} .= $line;
        }
    }
    close $fh or die "Cannot close $fn: $!";

    return \%seqs;
}

@james_thompson's answer is great. I would use it if you are looking for something more general. If you are looking for a simpler version (perhaps for learning?), this will also suffice, although note that it will miss the motif if there is a hard return (line break) in the middle of it.

#!/usr/bin/perl -w
use strict;

print STDOUT "Enter the motif: ";
my $motif = <STDIN>;
chomp $motif;


my $line;
my $defline;
open (FILE, "data.fa") or die "Cannot open data.fa: $!";
while ($line = <FILE>) {
  if ($line =~ /^>/) {
     $defline = $line;
   } elsif ($line =~ /$motif/)  {
     print($defline,$line);
   }
}
close (FILE);

You will notice that I also added an explicit close of the filehandle.
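
One possible way to keep the script short and still catch a motif that spans a line break is to read the file one FASTA record at a time by changing the input record separator $/. The following is only a sketch: the sub name matching_records and the sample data are made up for illustration, and the in-memory filehandle stands in for data.fa.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reads FASTA records from a filehandle and returns
# ">header\nsequence\n" strings for records whose joined sequence matches.
sub matching_records {
    my ($fh, $motif) = @_;
    local $/ = "\n>";               # read one record per iteration, not one line
    my @hits;
    while (my $record = <$fh>) {
        chomp $record;              # strips the trailing "\n>" separator, if any
        $record =~ s/^>//;          # the first record still has its leading '>'
        my ($header, @lines) = split /\n/, $record;
        next unless defined $header;
        my $seq = join '', @lines;  # rejoin wrapped sequence lines
        $seq =~ s/\s+//g;           # drop stray whitespace such as "\r"
        push @hits, ">$header\n$seq\n" if $seq =~ /$motif/;
    }
    return @hits;
}

# Demo on in-memory data; with a real file, open it and pass the handle.
my $fasta = ">seq1 demo\nACGT\nTTAA\n>seq2 demo\nGGGG\nCCCC\n";
open my $fh, '<', \$fasta or die "Cannot open in-memory file: $!";
print matching_records($fh, 'GTTT');   # the motif spans the line break in seq1
close $fh;
```

Unlike the hash-based answer, this keeps only one record in memory at a time and prints records in file order.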


Source: https://habr.com/ru/post/1790621/

