How to print lines matching a pattern in Perl?

Assuming file.txt has only one sentence per line:

    John Depp is a great guy.
    He is very intelligent.
    He can do anything.
    Come and meet John Depp.

The Perl code is as follows:

    open ( FILE, "file.txt" ) || die "can't open file!";
    @lines = <FILE>;
    close (FILE);
    $string = "John Depp";
    foreach $line (@lines) {
        if ($line =~ $string) {
            print "$line";
        }
    }

The output will be the first and fourth lines.

I want to make it work for a file that has arbitrary line breaks, not just one English sentence per line. That is, it should also work for the following:

  John Depp is a great guy.  He is very intelligent.  He can do anything.  Come and meet John Depp. 

The output should be the first and fourth sentences.

Any ideas please?

+4
6 answers

First, note that the famous actor's name is Johnny Depp.

Second, it is difficult to determine what is a sentence and what is not. I am going to cheat and use Lingua::Sentence:

    #!/usr/bin/perl

    use strict;
    use warnings;
    use Lingua::Sentence;

    my $splitter = Lingua::Sentence->new('en');

    while ( my $text = <DATA> ) {
        for my $sentence ( split /\n/, $splitter->split($text) ) {
            print $sentence, "\n" if $sentence =~ /John Depp/;
        }
    }

    __DATA__
    John Depp is a great guy. He is very intelligent. He can do anything. Come and meet John Depp.
    John Depp is a great guy. He is very intelligent. He can do anything. Come and meet John Depp.

Output:

  John Depp is a great guy.
 Come and meet John Depp.
 John Depp is a great guy.
 Come and meet John Depp. 
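
If installing a CPAN module is not an option, a rough approximation is to split after sentence-ending punctuation. The sketch below is only a heuristic and not a replacement for Lingua::Sentence: the helper name `naive_sentences` is hypothetical, and it will mis-split abbreviations such as "Dr. Smith".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split text wherever whitespace follows '.', '!' or '?'.
# Naive by design: abbreviations and decimal numbers are not handled.
sub naive_sentences {
    my ($text) = @_;
    return grep { length } split /(?<=[.!?])\s+/, $text;
}

my $text = "John Depp is a great guy. He is very intelligent. "
         . "He can do anything. Come and meet John Depp.";

for my $sentence ( naive_sentences($text) ) {
    print "$sentence\n" if $sentence =~ /John Depp/;
}
```

The lookbehind keeps the punctuation attached to the sentence it ends, so the printed sentences retain their final period.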
+2

More simply: if you assume that the "sentences" are separated by periods, you can use the period as the input record separator:

    $/ = '.';
    while(<>) {
        print if (/John Depp/i);
    }
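
With `$/` set to a period, each record keeps the whitespace and line breaks that preceded the sentence. A possible refinement, sketched below, localizes `$/` and normalizes whitespace before matching. The helper name `matching_sentences` and the in-memory sample text are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read period-terminated records from a filehandle
# and return the cleaned-up ones that contain $needle.
sub matching_sentences {
    my ($fh, $needle) = @_;
    local $/ = '.';    # records end at each period, only within this sub
    my @found;
    while ( my $record = <$fh> ) {
        $record =~ s/\s+/ /g;     # collapse line breaks and runs of spaces
        $record =~ s/^ | $//g;    # trim the single space left at either end
        push @found, $record if $record =~ /\Q$needle\E/;
    }
    return @found;
}

# Sample text held in memory for illustration; a real script would open a file.
my $text = "John Depp is a great guy. He is very\nintelligent. "
         . "He can do anything. Come and\nmeet John Depp.\n";
open my $fh, '<', \$text or die "can't open in-memory file: $!";
print "$_\n" for matching_sentences($fh, 'John Depp');
close $fh;
```

Because whitespace is collapsed before the match, a name broken across two lines is still found. Using `local` restores the default separator when the sub returns, so other reads in the program are unaffected.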
+2

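Assuming you have your content in a string:

    # one sentence per line inside the string; '.' does not match a newline,
    # so each match stays on a single line
    my $content = "John Depp is a great guy.
    He is very intelligent.
    He can do anything.
    Come and meet John Depp.";
    my @arr = $content =~ /.*John Depp.*/mg;
    foreach my $match (@arr) {
        print "$match\n";
    }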

Result:

John Depp is a great guy.
Come and meet John Depp.

You can change the regular expression if you want to extract only the interesting part, for example:

 my @arr = $content =~ /is (\w+? ?\w+ \w+)./mg; 

Result:

a great guy

very intelligent
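
When all the sentences share one line, `.*` would run across the whole line. One possible variation, sketched here under the assumption that every sentence ends with a period, replaces `.` with `[^.]` so that a match cannot cross a sentence boundary.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $content = "John Depp is a great guy. He is very intelligent. "
            . "He can do anything. Come and meet John Depp.";

# [^.]* cannot cross a period, so each match stays inside one sentence.
# The \s* before the capture skips the space left after the previous sentence.
my @matches = $content =~ /\s*([^.]*John Depp[^.]*\.)/g;

print "$_\n" for @matches;
```

In list context with `/g`, the match returns every captured group, so `@matches` holds one element per matching sentence.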

+1

One way:

    while (<>) {
        if (/John Depp/i) {
            # split the line into sentences on periods
            my @sentences = split /\s*\.\s*/;
            foreach my $found (@sentences) {
                if ($found =~ /John Depp/i) {
                    print $found."\n";
                }
            }
        }
    }

Output:

    $ cat file
    John Depp is a great guy. He is very inteligent. He can do anything. Come and meet John Depp.
    John Depp is a great guy. He is very inteligent. He can do anything. Come and meet John Depp.
    $ perl perl.pl file
    John Depp is a great guy
    Come and meet John Depp
    John Depp is a great guy
    Come and meet John Depp
0

Default variables such as $_ can get clobbered if you are not careful. Naming everything explicitly is a good idea.

This should help you:

    #!/usr/bin/perl -w
    use strict;

    my $targetString = "John Depp";

    while (my $line = <STDIN>) {
        chomp($line);
        # split the line into sentences on literal periods
        my @elements = split("\\.", $line);
        foreach my $element (@elements) {
            if ($element =~ m/$targetString/is) {
                print trim($element).".\n";
            }
        }
    }

    # strip leading and trailing whitespace
    sub trim {
        my $string = shift;
        $string =~ s/^\s+//;
        $string =~ s/\s+$//;
        return $string;
    }

Usage:

    $ depp.pl < file
    John Depp is a great guy.
    Come and meet John Depp.
    John Depp is a great guy.
    Come and meet John Depp.
0

This is a comment on your source code rather than a specific answer to your question. It is usually a bad idea to read the entire file into memory if you do not need to. You can process the file line by line instead:

    open ( FILE, "file.txt" ) || die "can't open file!";
    $string = "John Depp";
    while (<FILE>) {
        if (/$string/) {
            print
        }
    }
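
The same line-by-line loop can also be written with a lexical filehandle and the three-argument form of `open`, which is the style usually recommended under `use strict`. This is only a sketch: the helper name `grep_file` is hypothetical, and taking the file name from the command line is an assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the lines of $path that contain $needle literally.
sub grep_file {
    my ($path, $needle) = @_;
    open my $fh, '<', $path or die "can't open $path: $!";
    my @hits;
    while ( my $line = <$fh> ) {
        # \Q...\E quotes regex metacharacters in the search string
        push @hits, $line if $line =~ /\Q$needle\E/;
    }
    close $fh;
    return @hits;
}

# File name taken from the command line, e.g.: perl script.pl file.txt
my $file = shift @ARGV;
print grep_file($file, 'John Depp') if defined $file;
```

Including `$!` in the error message reports why the open failed, and the helper still reads only one line at a time. It does collect the matching lines in memory before returning them.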
0

Source: https://habr.com/ru/post/1305709/

