Reading an XML file in Node.js

I am learning how to use Node. At this time, I have an XML file that looks like this:

sitemap.xml

    <?xml version="1.0" encoding="utf-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
      <url>
        <loc>http://www.example.com</loc>
        <lastmod>2015-10-01</lastmod>
        <changefreq>monthly</changefreq>
      </url>
      <url>
        <loc>http://www.example.com/about</loc>
        <lastmod>2015-10-01</lastmod>
        <changefreq>never</changefreq>
      </url>
      <url>
        <loc>http://www.example.com/articles/tips-and-tricks</loc>
        <lastmod>2015-10-01</lastmod>
        <changefreq>never</changefreq>
        <article:title>Tips and Tricks</article:title>
        <article:description>Learn some of the tips-and-tricks of the trade</article:description>
      </url>
    </urlset>

I am trying to load this XML in my Node application. When loading it, I want to get only those url elements that contain <article:...> child elements. I'm stuck. Right now I am using xml2js as follows:

    var parser = new xml2js.Parser();
    fs.readFile(__dirname + '/../public/sitemap.xml', function(err, data) {
      if (!err) {
        console.log(JSON.stringify(data));
      }
    });

When the console.log statement is executed, I just see a bunch of numbers in the console window, something like this:

 {"type":"Buffer","data":[60,63,120, ...]} 

What am I missing?

8 answers

Use xml2json:

https://www.npmjs.com/package/xml2json

    var fs = require('fs');
    var parser = require('xml2json');

    fs.readFile('./data.xml', function(err, data) {
      if (err) throw err;
      // toJson returns the converted document as a JSON string by default
      var json = parser.toJson(data);
      console.log("to json ->", json);
    });

From the documentation:

Two arguments are passed to the callback (err, data), where data is the contents of the file.

If no encoding is specified, the raw buffer is returned.

If options is a string, then it indicates the encoding. Example:

 fs.readFile('/etc/passwd', 'utf8', callback); 

You did not specify an encoding, so you get a raw buffer.
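
To make the difference concrete, here is a minimal sketch. It uses fs.readFileSync instead of fs.readFile only to keep the example short (the encoding rule is the same), and the temporary file and its contents are made up for the demonstration:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write a tiny throwaway XML file so the example is self-contained.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'sitemap-demo-'));
const file = path.join(dir, 'sitemap.xml');
fs.writeFileSync(file, '<?xml version="1.0"?><urlset/>', 'utf8');

// No encoding: the result is a Buffer, which JSON.stringify renders as byte values.
const raw = fs.readFileSync(file);
console.log(JSON.stringify(raw).slice(0, 40));

// With 'utf8': the result is a plain string.
const text = fs.readFileSync(file, 'utf8');
console.log(text);

// A Buffer you already have can also be decoded after the fact.
console.log(raw.toString('utf8') === text);

// Clean up the throwaway file and directory.
fs.unlinkSync(file);
fs.rmdirSync(dir);
```

The byte values 60, 63, 120 in the question's output are simply the characters "<?x" at the start of the XML declaration.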


fs.readFile has an optional second parameter: the encoding. If you do not specify it, the callback receives a raw Buffer object instead of a string.

https://nodejs.org/api/fs.html#fs_fs_readfile_filename_options_callback

If you know the encoding, just use:

    fs.readFile(__dirname + '/../public/sitemap.xml', 'utf8', function(err, data) {
      if (!err) {
        console.log(data);
      }
    });

You can try this:

 npm install express-xml-bodyparser --save 

On the client side:

    $scope.getResp = function() {
      var posting = $http({
        method: 'POST',
        dataType: 'XML',
        url: '/getResp/' + $scope.user.BindData, // other bind variable
        data: $scope.project.XmlData, // XML data passed by the user
        headers: { "Content-Type": 'application/xml' },
        processData: true
      });
      posting.success(function(response) {
        $scope.resp1 = response;
      });
    };

On the server side:

    var xmlparser = require('express-xml-bodyparser');
    app.use(xmlparser());

    app.post('/getResp/:BindData', function(req, res, next) {
      var tid = req.params.BindData;
      var reqs = req.rawBody; // raw XML text of the request body
      console.log('Your XML ' + reqs);
    });

You can also use regexp before parsing to remove items that don't match your conditions:

    var parser = new xml2js.Parser();
    fs.readFile(__dirname + '/../public/sitemap.xml', "utf8", function(err, data) {
      // handle err...
      var re = new RegExp("<url>(?:(?!<article)[\\s\\S])*</url>", "gmi");
      data = data.replace(re, ""); // remove url nodes that do not contain an article node
      console.log(data);
      //... parse data ...
    });

Example:

    var str = "<data><url><hello>abc</hello><moto>abc</moto></url><url><hello>bcd</hello></url><url><hello>efd</hello><moto>poi</moto></url></data>";
    var re = new RegExp("<url>(?:(?!<moto>)[\\s\\S])*</url>", "gmi");
    str = str.replace(re, "");
    // "<data><url><hello>abc</hello><moto>abc</moto></url><url><hello>efd</hello><moto>poi</moto></url></data>"
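
The same idea can be turned around: instead of deleting the url blocks you do not want, collect the ones you do. Below is a rough sketch of that. The helper name urlBlocksWithArticle and the shortened sitemap are made up for illustration, and it assumes url elements are never nested and that no comments or CDATA sections contain text that looks like a url tag:

```javascript
// Hypothetical helper: returns the raw text of every <url>...</url> block
// that contains at least one element with the "article:" prefix.
function urlBlocksWithArticle(xml) {
  // Lazy match, so each block ends at the first closing </url> tag.
  const blocks = xml.match(/<url>[\s\S]*?<\/url>/gi) || [];
  return blocks.filter(function (block) {
    return /<article:/i.test(block);
  });
}

// Shortened, made-up version of the sitemap from the question.
const sitemap = [
  '<urlset>',
  '  <url><loc>http://www.example.com</loc></url>',
  '  <url><loc>http://www.example.com/about</loc></url>',
  '  <url>',
  '    <loc>http://www.example.com/articles/tips-and-tricks</loc>',
  '    <article:title>Tips and Tricks</article:title>',
  '  </url>',
  '</urlset>'
].join('\n');

const matches = urlBlocksWithArticle(sitemap);
console.log(matches.length);
console.log(matches[0]);
```

Regular expressions are fragile on real-world XML, so treat this as a quick filter rather than a replacement for a parser.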

To read an XML file in Node, I like the xml2js package. It makes it easy to work with XML in JavaScript.

    // fileData is the content of the XML file as a string
    var parser = new xml2js.Parser();
    parser.parseString(fileData.substring(0, fileData.length), function (err, result) {
      var json = JSON.stringify(result);
    });
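
Once the XML is parsed, picking out the url elements that have article: children is a plain filter over the result object. The sketch below does not call xml2js itself. It runs against a hand-written object that mimics what I understand xml2js returns with its default options (each child wrapped in an array, prefixed element names kept as keys such as 'article:title'), so treat that shape as an assumption. The helper name urlsWithArticleElements is made up:

```javascript
// Hypothetical helper: keeps only the url entries that have at least one
// child element whose name starts with "article:".
function urlsWithArticleElements(parsed) {
  const urls = (parsed && parsed.urlset && parsed.urlset.url) || [];
  return urls.filter(function (url) {
    return Object.keys(url).some(function (key) {
      return key.indexOf('article:') === 0;
    });
  });
}

// Hand-written stand-in for the parsed sitemap (assumed xml2js default shape).
const parsedSitemap = {
  urlset: {
    url: [
      { loc: ['http://www.example.com'], lastmod: ['2015-10-01'], changefreq: ['monthly'] },
      { loc: ['http://www.example.com/about'], lastmod: ['2015-10-01'], changefreq: ['never'] },
      {
        loc: ['http://www.example.com/articles/tips-and-tricks'],
        lastmod: ['2015-10-01'],
        changefreq: ['never'],
        'article:title': ['Tips and Tricks'],
        'article:description': ['Learn some of the tips-and-tricks of the trade']
      }
    ]
  }
};

const articleUrls = urlsWithArticleElements(parsedSitemap);
console.log(articleUrls.map(function (url) { return url.loc[0]; }));
```

In a real program you would call urlsWithArticleElements(result) inside the parseString callback, after checking err.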

I'm late to this topic, but here is one simple tip: if you plan to use the parsed data in JavaScript or save it as a JSON file, be sure to set explicitArray to false. The output will be much friendlier to work with.

It looks like this:

    let parser = new xml2js.Parser({ explicitArray: false });

Ref: https://github.com/Leonidas-from-XIV/node-xml2js
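
One caveat, as far as I understand the xml2js README: with explicitArray set to false, a child element is left unwrapped only when it occurs once, and repeated children still come back as an array. Code that walks the result may therefore want to normalize before iterating. A minimal sketch, where asArray is a made-up helper and both objects are hand-written stand-ins for parser output rather than real results:

```javascript
// Hypothetical helper: always returns an array, whatever shape the parser produced.
function asArray(value) {
  if (value === undefined || value === null) {
    return [];
  }
  return Array.isArray(value) ? value : [value];
}

// Assumed shape for a sitemap with a single url element (explicitArray: false).
const oneUrl = { urlset: { url: { loc: 'http://www.example.com' } } };

// Assumed shape for a sitemap with two url elements (explicitArray: false).
const twoUrls = {
  urlset: {
    url: [
      { loc: 'http://www.example.com' },
      { loc: 'http://www.example.com/about' }
    ]
  }
};

const locsOne = asArray(oneUrl.urlset.url).map(function (u) { return u.loc; });
const locsTwo = asArray(twoUrls.urlset.url).map(function (u) { return u.loc; });
console.log(locsOne, locsTwo);
```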


Why didn't anyone mention the libxmljs package? I just read about it, and parsing XML with it looked pretty easy to me.


Source: https://habr.com/ru/post/1232646/

