I have the following input for a Perl script, and I want to get the first occurrence of a NAME="..." line in each <TABLE>...</TABLE> structure.
The entire file is read into a single string, and the regular expression is applied to that string.
However, the regex always returns the last occurrence of NAME="...". Can someone explain what is happening and how this can be fixed?
Input file:
ADSDF
<TABLE>
NAME="ORDERSAA"
line1
line2
NAME="ORDERSA"
line3
NAME="ORDERSAB"
</TABLE>
<TABLE>
line1
line2
NAME="ORDERSB"
line3
</TABLE>
<TABLE>
line1
line2
NAME="ORDERSC"
line3
</TABLE>
<TABLE>
line1
line2
NAME="ORDERSD"
line3
line3
line3
</TABLE>
<TABLE>
line1
line2
NAME="QUOTES2"
line3
NAME="QUOTES3"
NAME="QUOTES4"
line3
NAME="QUOTES5"
line3
</TABLE>
<TABLE>
line1
line2
NAME="QUOTES6"
NAME="QUOTES7"
NAME="QUOTES8"
NAME="QUOTES9"
line3
line3
</TABLE>
<TABLE>
NAME="MyName IsKhan"
</TABLE>
This is where the Perl code starts:
use warnings;
use strict;
my $nameRegExp = '(<table>((NAME="(.+)")|(.*|\n))*</table>)';
sub extractNames($$){
my ($ifh, $ofh) = @_;
my $fullFile;
read ($ifh, $fullFile, 1024);
while( $fullFile =~ m/$nameRegExp/gi ){
print "found: ".$4."\n";
}
}
sub main(){
if( ($#ARGV + 1 )!= 1){
die("Usage: extractNames infile\n");
}
my $infileName = $ARGV[0];
my $outfileName = $ARGV[1];
open my $inFile, "<$infileName" or die("Could not open log file $infileName");
my $outFile;
extractNames( $inFile, $outFile );
close( $inFile );
}
main();
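For reference, one likely cause is that a capture group nested inside a repeated group keeps only the text from its last iteration, so $4 ends up holding the last NAME that the repetition consumed. A minimal sketch of one possible workaround follows, assuming the goal is the first NAME value per block: it isolates each block with a non-greedy match and then searches inside that block only. The helper name first_names and the shortened sample string are hypothetical and not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first NAME="..." value found in each
# <TABLE>...</TABLE> block of the given string.
sub first_names {
    my ($text) = @_;
    my @names;
    # Non-greedy .*? with /s stops at the nearest closing tag, so each
    # match covers exactly one block; /i accepts <TABLE> and <table>.
    while ( $text =~ m{<table>(.*?)</table>}gis ) {
        my $block = $1;
        # An unanchored match reports the leftmost NAME in the block.
        push @names, $1 if $block =~ /NAME="([^"]*)"/;
    }
    return @names;
}

# Shortened sample in the same shape as the input file above.
my $sample = join "\n",
    '<TABLE>', 'NAME="ORDERSAA"', 'line1', 'NAME="ORDERSA"', '</TABLE>',
    '<TABLE>', 'line1', 'NAME="ORDERSB"', '</TABLE>', '';

print "found: $_\n" for first_names($sample);
```

On the shortened sample this prints ORDERSAA and ORDERSB, one per line. Splitting the work into two matches avoids relying on what a capture group holds after a quantified group has repeated.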