Whenever a global (/g) regular expression fails to match, Perl resets the position from which the next global match on that string will start. Therefore, when the first of your two patterns fails, it forces the second to search again from the beginning of the line.
This behavior can be disabled by adding the /c modifier, which leaves the position unchanged if the regular expression does not match.
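The difference can be seen by inspecting pos() after a failed match. The following is only a minimal sketch: the sample string and the variable names are made up for illustration and are not taken from your code.

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen only to illustrate the behavior.
my $line = 'aaa bbb ccc';

# Without /c: a failed /g match resets pos() to undef.
my $matched = ($line =~ /aaa/g);    # succeeds, pos($line) is now 3
my $pos_after_match = pos($line);
$matched = ($line =~ /zzz/g);       # fails, pos($line) is reset
my $pos_without_c = pos($line);     # undef

# With /c: a failed /g match leaves pos() where it was.
pos($line) = undef;                 # start again from the beginning
$matched = ($line =~ /aaa/gc);      # succeeds, pos($line) is now 3
$matched = ($line =~ /zzz/gc);      # fails, pos($line) stays at 3
my $pos_with_c = pos($line);

printf "after successful match: %d\n", $pos_after_match;
printf "after failure without /c: %s\n",
    defined $pos_without_c ? $pos_without_c : 'undef';
printf "after failure with /c: %d\n", $pos_with_c;
```

Without /c the second pattern would start over at offset 0; with /c it continues from where the last successful match ended.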
In addition, you can improve your patterns by removing the unnecessary escapes (" does not need to be escaped, and / does not need to be escaped if you choose a different delimiter) and the redundant +? after the capture.
Also, use warnings is much better than -w on the command line.
Here is the working version of your code.
use strict;
use warnings;

while (<STDIN>) {
    while ( m|<span class="itempp">([^<]+)</span>|gc
         or m|<font size="-1">([^<]+)</font>|gc ) {
        print "$1\n";
    }
}