Since Martijn posted an answer in Python and said that Perl would look like line noise, I felt there needed to be a Perl answer as well.
On CPAN, the Perl module directory, there is a module called Geo::Gpx. As Martijn already said, GPX is an XML format. Fortunately, someone has already written a module that handles the parsing for us. All we need to do is load that module.
Several modules are available for CSV output, but the data in this XML file is quite simple, so we don't really need one. We can do it ourselves with built-in functionality.
Take a look at the following script. I will explain it in a minute.
use strict;
use warnings;
use Geo::Gpx;
use DateTime;
Let's do it step by step:
- use strict and use warnings enforce rules such as declaring variables, and they tell you about the most common mistakes, which are otherwise hard to find.
- use Geo::Gpx and use DateTime load the modules we use. Geo::Gpx will handle the parsing for us. We need DateTime to turn unix timestamps into readable dates and times.
- The open function opens a file. $fh_in is a variable that holds a filehandle. The GPX file we want to read is fells_loop.gpx, which I borrowed from topografix.com. You can find more information on open in perlopentut.
- We create a new Geo::Gpx object named $gpx and pass it our $fh_in filehandle to tell it where to read the XML data from. By convention, most Perl modules with an object-oriented interface provide a new method.
- close closes the filehandle.
- The following open has a > to tell Perl that we want to write to this filehandle.
- We print to the filehandle by putting it as the first argument to print. Note that there is no comma after the filehandle. \n is a newline character.
- The foreach loop takes the return value of the waypoints method of the Geo::Gpx object. This value is an array reference. Think of it as a scalar that points to an array; each element of that array is in turn a hash reference (see perlref for more information on references). In each iteration of the loop, the next element of this array ref (which represents a waypoint in the GPX data) is placed in $wp. If printed with Data::Dumper, it looks like this:
$VAR1 = {
          'ele' => '64.008000',
          'lat' => '42.455956',
          'time' => 991452424,
          'name' => 'SOAPBOX',
          'sym' => 'Cemetery',
          'desc' => 'Soap Box Derby Track',
          'lon' => '-71.107483',
          'type' => 'Intersection'
        };
The postfix for is a little more complicated. As we just saw, the hashref has 8 keys. Unfortunately, some of them are sometimes missing. Since we have use warnings turned on, we would get a warning if we tried to use one of these missing values. So we have to create these keys and put an empty string ('') in them.
foreach and for are completely interchangeable in Perl, and both can also be used in postfix syntax after a single expression. We use the qw operator to create the list that for iterates over. qw stands for quoted words, and it does just that: it returns a list of the words inside it, each one quoted. We could also have written ('time', 'lat', 'lon', ...).
In the expression, we access each key of $wp. $_ is the loop variable. In the first iteration it contains 'time', then 'lat', and so on. Since $wp is a hashref, we need the -> to dereference it. The curly braces say that it is a hashref. The ||= operator assigns a value to the element of our hashref only if its current value is not true.
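The default-filling idiom described here can be tried on its own with core Perl. The sketch below is a minimal illustration, not the original script: the waypoint hash is made-up sample data with several keys deliberately left out, and the key list simply follows the eight fields shown in the dump.

```perl
use strict;
use warnings;

# Hypothetical waypoint with some optional fields missing.
my $wp = { lat => '42.455956', lon => '-71.107483', name => 'SOAPBOX' };

# Postfix for: $_ takes each word from the qw list in turn.
# ||= only assigns when the current value is false (undef, '', 0 or '0').
$wp->{$_} ||= '' for qw( time lat lon ele name sym type desc );

# All eight keys now exist, so joining them no longer triggers
# "uninitialized value" warnings.
my $line = join ',', map { $wp->{$_} } qw( time lat lon ele name sym type desc );
print "$line\n";    # ,42.455956,-71.107483,,SOAPBOX,,,
```

One caveat: ||= also replaces legitimate false values, such as an elevation of 0. On Perl 5.10 or later, the defined-or variant //= fills in only values that are actually undefined.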
Now, if there is a time value (the empty string we just assigned when no date was set counts as false), we replace the unix timestamp with the corresponding date. DateTime helps us do this. The from_epoch method takes a unix timestamp as an argument. It returns a DateTime object, on which we can directly call the iso8601 method.
This is called chaining. Some modules can do this. It is similar to what jQuery objects do in JavaScript. The unix timestamp in our hashref is replaced with the result of the DateTime operation.
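As a small illustration of the chained call, the following sketch converts the timestamp shown in the dump. It assumes the DateTime module from CPAN is installed; the variable names are only for illustration.

```perl
use strict;
use warnings;
use DateTime;

my $epoch = 991452424;    # the 'time' value from the dump

# from_epoch returns a DateTime object (in UTC unless told otherwise),
# and iso8601 is called directly on that return value.
my $iso = DateTime->from_epoch( epoch => $epoch )->iso8601;

print "$iso\n";    # 2001-06-02T03:27:04
```

The same result could be stored in two steps with an intermediate variable; chaining just skips the temporary object.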
- Now we print to our filehandle again. join is used to put commas between the values. We also add a newline at the end.
- Once the loop is finished, we close the file.
- Now everything is done! :)
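The file-handling steps above (open for writing, print to a filehandle, join, close) can be sketched with core Perl alone. This is a minimal sketch, not the original script: the output name waypoints.csv, the header line, and the sample waypoint are assumptions made for illustration. In a real run the data would come from the Geo::Gpx object.

```perl
use strict;
use warnings;

# Hypothetical, already-cleaned waypoints (all eight keys present).
my @waypoints = (
    { time => '2001-06-02T03:27:04', lat => '42.455956', lon => '-71.107483',
      ele  => '64.008000', name => 'SOAPBOX', sym => 'Cemetery',
      type => 'Intersection', desc => 'Soap Box Derby Track' },
);
my @fields = qw( time lat lon ele name sym type desc );

# '>' opens the file for writing; die reports why open failed, if it does.
open my $fh_out, '>', 'waypoints.csv' or die "Cannot write waypoints.csv: $!";

# The filehandle is the first argument to print, with no comma after it.
print $fh_out join( ',', @fields ), "\n";
foreach my $wp (@waypoints) {
    # Hash slice on the hashref: values in the order given by @fields.
    print $fh_out join( ',', @{$wp}{@fields} ), "\n";
}

close $fh_out or die "Cannot close waypoints.csv: $!";
```

A plain join is only safe as long as no field contains a comma or a quote character. For messier data, a CSV module such as Text::CSV would be the safer choice.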
All in all, I would say it is pretty simple and also quite readable, isn't it? I tried to make it a healthy mix of overly verbose syntax and Perlish flavor.