Grep and data mining in Perl

Question


I have HTML content stored in a variable. How can I extract the data found between a set of common tags on a page? For example, I'm interested in the data (represented by DATA) stored between a set of tags that appear one line after another:

...
<td class="jumlah">*DATA_1*</td>
<td class="ud"><a href="">*DATA_2*</a></td>
...

I would then like to keep the DATA_2 => DATA_1 mapping in a hash.

+3

html grep perl extract tags

syker May 21 '10 at 23:19


4 answers

Since it's HTML, why not use XPath?

https://metacpan.org/pod/XML::XPath

XPath is the way to go.

+2

dierre May 21 '10 at 23:23

For HTML, the answer to this kind of question is HTML::TreeBuilder or HTML::Parser.

As has been said many times on SO, parsing HTML with a RegEx is a bad idea unless you are 100% sure of the exact form of the HTML.
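A minimal sketch of the HTML::TreeBuilder route, without XPath, might look like the following. The helper name pairs_by_class is hypothetical, the class names come from the question's sample markup, and the code assumes that the "jumlah" and "ud" cells occur in matching order and equal number.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: returns a hash reference mapping DATA_2 => DATA_1.
# Assumption: the "jumlah" and "ud" cells line up one-to-one, in order.
sub pairs_by_class {
    my ($content) = @_;

    my $tree = HTML::TreeBuilder->new_from_content($content);

    # look_down in list context returns every element matching all criteria.
    my @data_1 = map { $_->as_trimmed_text }
                 $tree->look_down(_tag => 'td', class => 'jumlah');
    my @data_2 = map { $_->as_trimmed_text }
                 $tree->look_down(_tag => 'td', class => 'ud');

    $tree->delete;    # release the parse tree

    my %map;
    @map{@data_2} = @data_1;    # hash slice: DATA_2 keys, DATA_1 values
    return \%map;
}
```

Pairing two parallel lists is simple but breaks silently if a row is missing one of the two cells, so it is worth checking that both lists have the same length on real pages.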

0

DVK May 21 '10 at 23:31

Use HTML::TreeBuilder::XPath. From its description:

It adds XPath support to HTML::TreeBuilder.

0

Axeman May 21 '10 at 23:38

jasonmp85 · Accepted Answer · 2010-05-21T23:42:43+0000

Since this is HTML, you probably want an XPath module built for working with HTML: HTML::TreeBuilder::XPath.

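First, parse your string with the HTML::TreeBuilder methods. Assuming the page content is in a variable named $content, do this:

use HTML::TreeBuilder::XPath;

my $tree = HTML::TreeBuilder::XPath->new;  # the XPath subclass provides findnodes
$tree->parse_content($content);            # parse the string held in $content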

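Now you can use XPath to select the nodes you want. This expression selects the td elements inside a tr, inside a table, inside the body of the html element: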

my $tdNodes = $tree->findnodes('/html/body/table/tr/td');

Finally, iterate over the nodes and do something with each one, for example print its content:

foreach my $node ($tdNodes->get_nodelist) {
  my $data = $node->findvalue('.');  # the content of the node
  print "$data\n";
}

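See the HTML::TreeBuilder documentation for the tree methods and the NodeSet documentation for what you can do with a NodeSet. w3schools has an XPath tutorial.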
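To get the DATA_2 => DATA_1 hash the question asks for, the steps above can be combined roughly as follows. This is only a sketch: the helper name extract_pairs and the sample table are made up for illustration, and the XPath expressions assume that each td with class "jumlah" is immediately followed by a sibling td with class "ud", as in the question's snippet.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical helper: returns a hash reference mapping DATA_2 => DATA_1.
sub extract_pairs {
    my ($content) = @_;

    my $tree = HTML::TreeBuilder::XPath->new;
    $tree->parse_content($content);

    my %map;
    # Each "jumlah" cell holds DATA_1; the next td, if it has class "ud", holds DATA_2.
    for my $cell ($tree->findnodes('//td[@class="jumlah"]')) {
        my ($next) = $cell->findnodes('following-sibling::td[1][@class="ud"]');
        next unless defined $next;    # skip cells without a matching partner
        $map{ $next->as_trimmed_text } = $cell->as_trimmed_text;
    }

    $tree->delete;    # release the parse tree
    return \%map;
}

# Made-up sample input in the shape shown in the question.
my $content = <<'HTML';
<table>
  <tr>
    <td class="jumlah">42</td>
    <td class="ud"><a href="">alpha</a></td>
  </tr>
  <tr>
    <td class="jumlah">7</td>
    <td class="ud"><a href="">beta</a></td>
  </tr>
</table>
HTML

my $pairs = extract_pairs($content);
print "$_ => $pairs->{$_}\n" for sort keys %$pairs;
```

Selecting by class instead of by an absolute path such as /html/body/table/tr/td keeps the expression working when the table is nested deeper in the page. If two rows share the same DATA_2 value, the later one overwrites the earlier one in the hash.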

HTML, , . , .. XPath, , . , HTML XPath , .
