Perl: match one pattern multiple times in the same line, delimited by unknown characters

Question


I found similar, but not identical, questions to this one. How do I match one regular expression pattern several times in the same line when the matches are separated by unknown characters?

For example, let's say I want to match the pattern HEY. I would like to count every occurrence in the following:

HEY
HEY HEY
HEYxjfkdsjfkajHEY

So I would expect a count of 5 HEYs. Here is my program, which works for every line except the last:

    open(FH, $ARGV[0]);
    while (<FH>) {
        foreach $w (split) {
            if ($w =~ m/HEY/g) {
                $count++;
            }
        }
    }

So my question is: how do I replace this foreach loop so that I can find patterns separated by arbitrary characters in unknown configurations (as in the last example above)?

EDIT:

Thanks for the great answers. I just realized that I need one more thing, which I also added in the comments below.

One question: is there a way to keep the matched term? In my case, is there a way to refer to $w (say, if the regex were more complex and I wanted to store each match in a hash with its number of occurrences)?

That is, suppose I matched a real regular expression (say, a sequence of alphanumeric characters) and wanted to store each match in a hash.

+6

regex perl

varatis Feb 06 '12 at 6:03


3 answers

The problem is that you don't really want to call split(). It breaks each line into words, and you will notice that your last line contains only one word (although you won't find it in the dictionary). A word here is delimited by whitespace, so it is simply "everything but spaces". Because the if test runs once per word, that word is counted only once, no matter how many HEYs it contains.

What you really want is to keep scanning each line, counting each HEY and resuming from where the previous match ended. That is exactly what the /g at the end does when the match is used in scalar context: it keeps looking.

    while (<>) {
        while (/HEY/g) {
            $count++;
        }
    }
    print "$count\n";
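As a self-contained check of the same idea, here is a minimal sketch that wraps the loop in a sub and runs it on inline data. The sub name count_hey and the inline list are made up for illustration, and the sample lines are assumed to read HEY, HEY HEY and HEYxjfkdsjfkajHEY.

```perl
use strict;
use warnings;

# Count non-overlapping occurrences of HEY in one line.
# In scalar context, /g resumes from where the previous match ended.
sub count_hey {
    my ($line) = @_;
    my $count = 0;
    $count++ while $line =~ /HEY/g;
    return $count;
}

# Inline stand-in for the input file (assumed sample lines).
my @lines = ("HEY", "HEY HEY", "HEYxjfkdsjfkajHEY");

my $total = 0;
$total += count_hey($_) for @lines;
print "$total\n";    # prints 5 (1 + 2 + 2)
```

No split is needed here: the whole line is scanned, so it does not matter which characters sit between the matches.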

There are, of course, several ways to do this, but this is close to your example. Other people will post other great examples. Learn from everyone!

+5

Wes Hardaker Feb 06 '12 at 6:14


None of the other answers worked for my similar problem. $1 doesn't seem to change inside the foreach (perl 5.16.3), so $hash{$1}++ just counts the same match n times.

To get every match, give foreach a lexical loop variable, which then holds each captured value in turn. Here is a little script that matches every number in parentheses and prints the numbers it found.

    #!/usr/bin/perl -w
    use strict;
    use warnings FATAL => 'all';

    my (%procs);
    while (<>) {
        foreach my $proc ($_ =~ m/\((\d+)\)/g) {
            $procs{$proc}++;
        }
    }
    print join("\n", keys %procs) . "\n";

I use it like this:

 pstree -p | perl extract_numbers.pl | xargs -n 1 echo

(some filters that would normally sit in this pipeline are omitted). Any pattern with a capture group should work the same way.
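To make the difference concrete, here is a small sketch that tallies the same line twice: once with $1 and once with a loop variable. The sub names and the pstree-like sample line are invented for illustration. In list context the match finishes before the loop body runs, so $1 keeps a single value for the whole loop.

```perl
use strict;
use warnings;

# Tally using $1: the list-context match has already finished when the
# loop body runs, so $1 stays the same on every iteration.
sub tally_with_capture_var {
    my ($line) = @_;
    my %seen;
    foreach ($line =~ /\((\d+)\)/g) {
        $seen{$1}++;
    }
    return \%seen;
}

# Tally using a lexical loop variable: it receives each capture in turn.
sub tally_with_loop_var {
    my ($line) = @_;
    my %seen;
    foreach my $pid ($line =~ /\((\d+)\)/g) {
        $seen{$pid}++;
    }
    return \%seen;
}

# Hypothetical pstree-like line, made up for this example.
my $line = "init(1)-+-sshd(742)---bash(1310)";

my $by_capture = tally_with_capture_var($line);
my $by_loop    = tally_with_loop_var($line);

print scalar(keys %$by_capture), " distinct key(s) via \$1\n";
print scalar(keys %$by_loop), " distinct key(s) via the loop variable\n";
```

The first tally collapses all three numbers into one key, while the second keeps them apart.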

0

Bill McGonigle Feb 13 '14 at 20:45


masaers · Accepted Answer · 2012-02-06T06:12:34+0000

One way is to capture all matches in the string and count how many you got. For instance:

    open(FH, $ARGV[0]);
    while (my $w = <FH>) {
        my @matches = $w =~ m/(HEY)/g;
        my $count = scalar(@matches);
        print "$count\t$w\n";
    }

EDIT:

Yes, there is! Just loop over all the matches and use the capture variable to increment the hash count:

    my %hash;
    open(FH, $ARGV[0]);
    while (my $w = <FH>) {
        foreach ($w =~ /(HEY)/g) {
            $hash{$1}++;
        }
    }
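For a pattern that can match different strings, such as the run of alphanumeric characters mentioned in the question, a scalar-context while loop is one option, since each iteration performs a fresh match and updates $1. The following is only a sketch: the sub name count_tokens, the character class and the sample text are assumptions, not part of the original code.

```perl
use strict;
use warnings;

# Count how often each alphanumeric token occurs in a string.
# Scalar-context /g performs one match per iteration, so $1 is
# refreshed every time around the loop.
sub count_tokens {
    my ($text) = @_;
    my %count;
    while ($text =~ /([A-Za-z0-9]+)/g) {
        $count{$1}++;
    }
    return \%count;
}

# Sample text with arbitrary separators (made up for illustration).
my $count = count_tokens("HEY abc123 HEY;abc123,HEY");
printf "%s\t%d\n", $_, $count->{$_} for sort keys %$count;
```

With the fixed pattern HEY, every capture is the same string, which is why the foreach version above still gives the right total for that case.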
