Scraping HTML files with Perl, returning only the content, in order

Using HTML::TreeBuilder (or Mojo::DOM), I would like to extract the content but keep it in document order, so that I can put the text values into an array (and then replace the text values with variables for templating).

But this, in TreeBuilder:

my $map_r = $tree->tagname_map();

my @contents = map { $_->content_list } $tree->find_by_tag_name(keys %$map_r);

foreach my $c (@contents) {
  say $c;
}

doesn't return the content in order; of course, hashes are unordered. So how do I visit the tree from the root down and save the values in the sequence they are encountered? Walk the tree recursively? Essentially, I would like something like the as_text method, but applied to each element separately. (Following this good idea, but I need it for all the elements.)


One way (with Mojo::DOM):

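use Mojo::DOM;

my @text;
Mojo::DOM->new($html)->find('*')->each(    # every element, in document order
    sub {
        my $text = shift->text;    # text directly inside this element
        $text =~ s/\s+/ /g;        # collapse runs of whitespace
        push @text, $text;
    }
);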
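
For HTML::TreeBuilder, the same document-order collection can be sketched with a recursive walk over content_list, which returns an element's children in order, with child elements as objects and text nodes as plain strings. This is only a minimal sketch: the sample $html and the helper name collect_text are made up for illustration, and trimming each text node is an assumption about what the template step needs.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample input, not taken from the question.
my $html = '<html><body><h1>Title</h1><p>Hello <b>bold</b> world</p></body></html>';

# Depth-first walk: descend into child elements, collect text nodes
# in the order they are encountered.
sub collect_text {
    my ($node, $out) = @_;
    for my $child ($node->content_list) {
        if (ref $child) {
            collect_text($child, $out);               # element: recurse
        }
        else {
            (my $text = $child) =~ s/^\s+|\s+$//g;    # trim a copy of the text node
            push @$out, $text if length $text;        # skip whitespace-only nodes
        }
    }
    return $out;
}

my $tree  = HTML::TreeBuilder->new_from_content($html);
my @texts = @{ collect_text($tree, []) };
$tree->delete;    # release the tree explicitly

print "$_\n" for @texts;
```

Unlike calling text on every element, this keeps the pieces of text around an inline tag as separate entries in their original positions, which may be closer to what a template substitution needs.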



Source: https://habr.com/ru/post/1605622/

