Parsing XML files without using loops?

I am parsing a bunch of large XML files using the XML package to extract the values of the variable "varname". The code I'm using is:

 library(XML)
 a = xmlTreeParse("/path/filename.xml")
 r = xmlRoot(a)
 namelist = list()
 for(i in 1:xmlSize(r)){
   namelist[[i]] <- xmlValue(xmlChildren(r[[i]])$varname)
 }

Since this is time consuming, I tried parallel processing:

 library(foreach)
 library(doMC)
 registerDoMC()
 namelist = list()
 namelist <- foreach(i = 1:xmlSize(r)) %dopar% {
   namelist[[i]] <- xmlValue(xmlChildren(r[[i]])$varname)
 }

It's faster, but my machine still freezes for large enough files. Is there any way around this problem?

1 answer

As the original poster of the question noted:


For anyone reading this post: the simplest solution appears to be the xmlToDataFrame function in the XML package. In my case it only required a minor adjustment to how the XML file is read. Highly recommended. I found this solution only after posting the question.
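A minimal sketch of that approach, assuming every child of the root is a flat record whose fields are simple text elements. The sample document and the element names `records`, `record` and `id` are made up for illustration; only `varname` comes from the question. For a real file, the path would be passed to xmlParse instead of an inline string.

```r
library(XML)

# Hypothetical sample document standing in for a real file.
xml_text <- paste0(
  "<records>",
  "<record><id>1</id><varname>alpha</varname></record>",
  "<record><id>2</id><varname>beta</varname></record>",
  "</records>"
)

# Parse into an internal (C-level) document rather than an R-level tree.
doc <- xmlParse(xml_text, asText = TRUE)

# One row per child of the root, one column per field; no explicit loop.
df <- xmlToDataFrame(doc, stringsAsFactors = FALSE)

# The values the question was after, as a character vector.
namelist <- df$varname
```

Whether this helps with the freezing on very large files probably depends on the data, since the whole document is still read into memory at once.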


Source: https://habr.com/ru/post/1395376/

