How can I sort XML records using LibXML and Perl?

I am parsing an XML file with LibXML and need to sort the entries by date. Each entry has two date fields: one for when the entry was published and one for when it was last updated.

<?xml version="1.0" encoding="utf-8"?>
...
<entry>
  <published>2009-04-10T18:51:04.696+02:00</published>
  <updated>2009-05-30T14:48:27.853+03:00</updated>
  <title>The title</title>
  <content>The content goes here</content>
</entry>
...

The XML file is already sorted by update date, newest first. I can easily reverse this to put the oldest entries first:

my $parser = XML::LibXML->new();
my $doc = $parser->parse_file($file);
my $xc = XML::LibXML::XPathContext->new($doc->documentElement());

foreach my $entry (reverse($xc->findnodes('//entry'))) {
  ...
}

However, I need to sort the entries by publication date rather than by update date. How can I do this? The timestamp format looks a bit awkward. Should I normalize it first?

Thanks!



You can use sort() with a custom comparison subroutine:

sub parse_date {
    # Reduces a timestamp such as 2009-04-10T18:51:04.696+02:00 to 20090410.
    # Only the calendar date is kept, so the time of day and the UTC offset
    # are ignored.
    my $date = shift;
    $date = join "", $date =~ m!\A\s*(\d{4})-(\d{2})-(\d{2})!;
    return $date;
}

sub by_published_date {
    # findvalue() returns the text content of the <published> child.
    my $a_published = parse_date( $a->findvalue('published') );
    my $b_published = parse_date( $b->findvalue('published') );

    # Putting $b_published first gives descending order (newest first).
    return $b_published <=> $a_published;
}

foreach my $entry ( sort by_published_date $xc->findnodes('//entry') ) {
    ...
}



A plain string sort does not account for the UTC offset. Consider:

 print for sort "2009-06-15T08:00:00+07:00", "2009-06-15T04:00:00+00:00";

The second timestamp is 3 hours later than the first, yet it sorts first.

Also, the timestamp format is not "wonky". It is RFC 3339.
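One way to make the comparison respect the offset is to convert each timestamp to seconds since the epoch in UTC and sort on that number. The following is only a minimal sketch using the core Time::Local module. The helper name rfc3339_to_epoch is made up for illustration, and the regex assumes every timestamp carries a full date, a time, and either Z or a +hh:mm / -hh:mm offset, as in the question's sample.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical helper: converts an RFC 3339 timestamp to (possibly
# fractional) seconds since the epoch, in UTC.
sub rfc3339_to_epoch {
    my ($stamp) = @_;
    my ($y, $mon, $d, $h, $min, $s, $frac, $zone) = $stamp =~ m{
        \A \s* (\d{4})-(\d{2})-(\d{2}) [Tt]
        (\d{2}):(\d{2}):(\d{2}) (\.\d+)?
        (Z|z|[+-]\d{2}:\d{2}) \s* \z
    }x or die "Not an RFC 3339 timestamp: $stamp";

    # Offset of the local time from UTC, in seconds (0 for Z).
    my $offset = 0;
    if ($zone =~ /\A([+-])(\d{2}):(\d{2})\z/) {
        $offset = ($2 * 3600 + $3 * 60) * ($1 eq '-' ? -1 : 1);
    }

    # timegm() treats its arguments as UTC, so subtracting the offset
    # turns the local wall-clock time into the real UTC instant.
    return timegm($s, $min, $h, $d, $mon - 1, $y) - $offset + ($frac // 0);
}

my @stamps = ("2009-06-15T08:00:00+07:00", "2009-06-15T04:00:00+00:00");

# Compute each key once, then sort numerically, oldest first.
my @sorted = map  { $_->[1] }
             sort { $a->[0] <=> $b->[0] }
             map  { [ rfc3339_to_epoch($_), $_ ] } @stamps;

print "$_\n" for @sorted;
```

Here the +07:00 timestamp is printed first, because it is the earlier instant (01:00 UTC versus 04:00 UTC). To apply the same idea to the entry nodes, the sort key could presumably be taken from rfc3339_to_epoch( $entry->findvalue('published') ), and swapping $a and $b in the comparison gives newest first.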


Source: https://habr.com/ru/post/1710400/

