Any array element contained in a string

I have a keyword list and a blacklist. I want to remove all keywords containing any blacklist item. I'm currently doing it like this:

    my @keywords = ( 'some good keyword', 'some other good keyword', 'some bad keyword');
    my @blacklist = ( 'bad' );

    A: for my $keyword ( @keywords ) {
        B: for my $bl ( @blacklist ) {
            next A if $keyword =~ /$bl/i;    # omitting $keyword
        }
        # some keyword cleaning (for instance: erasing non a-zA-Z0-9 characters, etc)
    }

I was wondering if there is a quicker way to do this, because at the moment I have about 25 million keywords and a few hundred words on the blacklist.

+4
3 answers

The easiest option is to join the blacklist entries into one regular expression, and then grep the list of keywords for those that do not match it:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use 5.010;

    my @keywords  = ('some good keyword', 'some other good keyword', 'some bad keyword');
    my @blacklist = ('bad');

    my $re = join '|', @blacklist;
    my @good = grep { $_ !~ /$re/ } @keywords;

    say join "\n", @good;

Output:

    some good keyword
    some other good keyword
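Note that the joined pattern is interpolated as a regex and is case-sensitive, unlike the `/i` match in the question. A minimal sketch of a variant, assuming the blacklist entries are meant as literal strings and that matching should ignore case: escape each entry with `quotemeta` and compile the alternation once with `qr//`. The extra sample data (`'c++'`, `'c++ tips'`) is made up for illustration.

```perl
use strict;
use warnings;

my @keywords  = ('some good keyword', 'some other good keyword',
                 'some BAD keyword', 'c++ tips');
my @blacklist = ('bad', 'c++');

# Escape each entry so characters like "+" are matched literally,
# then compile the alternation once, case-insensitively.
my $alternation = join '|', map { quotemeta } @blacklist;
my $re = qr/$alternation/i;

# Keep only the keywords that do not match any blacklist entry.
my @good = grep { !/$re/ } @keywords;

print "$_\n" for @good;
```

One caveat with this approach: an empty blacklist yields an empty pattern, which matches every keyword, so that case is worth guarding against separately.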
+4

Precompiling the patterns, as in my @blacklist = ( qr/bad/i ), can help if you want to keep the nested loops.

Alternatively, change from my @blacklist = ( 'bad', 'awful', 'worst' ) to my $blacklist = qr/bad|awful|worst/; and then replace the inner loop with if ( $keywords[$i] =~ $blacklist ) ...
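As a rough sketch of that second suggestion applied to the loop from the question: the pattern is compiled once outside the loop and the inner loop disappears. The `/i` flag is carried over from the question, the cleaning step is an assumption (the question only hints at "erasing non a-zA-Z0-9 characters"; spaces are kept here), and `@cleaned` is a hypothetical name for the result.

```perl
use strict;
use warnings;

my @keywords  = ('some good keyword!', 'some other good keyword',
                 'some awful keyword');
my $blacklist = qr/bad|awful|worst/i;   # compiled once, outside the loop

my @cleaned;   # hypothetical name for the surviving, cleaned keywords
for my $keyword (@keywords) {
    next if $keyword =~ $blacklist;     # skip blacklisted keywords

    # Assumed cleaning step: drop everything except letters, digits and spaces.
    (my $clean = $keyword) =~ s/[^a-zA-Z0-9 ]//g;
    push @cleaned, $clean;
}

print "$_\n" for @cleaned;
```

With a single compiled alternation, each keyword is scanned by one regex match instead of once per blacklist entry, which is where the saving on 25 million keywords would be expected to come from.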

+3

This should do it:

    my @indices;
    for my $i (0..$#keywords) {
        for my $bl (@blacklist) {
            if ($keywords[$i] =~ $bl) {
                push(@indices, $i);
                last;
            }
        }
    }

    # Remove one element per index, working from the end
    # so that the remaining indices stay valid.
    for my $i (reverse @indices) {
        splice(@keywords, $i, 1);
    }
0

Source: https://habr.com/ru/post/1482531/

