I am trying to get all the information from this page: http://ws.parlament.ch/affairs/19110758/?format=xml
First, I download the file and parse it with `xmlParse()`:
library(XML)
destfile <- tempfile(fileext = ".xml")  # assumption: any writable path works here
download.file(url = "http://ws.parlament.ch/affairs/19110758/?format=xml", destfile = destfile)
file <- xmlParse(destfile)
Now I want to extract the information I need, for example the title and the identification number. I tried something like this:
title <- xpathSApply(file, "//h2", xmlValue)
But this only gives me an error: unable to find an inherited method for function ‘saveXML’ for signature ‘"XMLDocument"’
The next thing I tried was:
library(plyr)
test <- ldply(xmlToList(file), function(x) { data.frame(x[!names(x) == "id"]) })
This gives me a `data.frame` with some of the information, but I lose fields such as `id` (most importantly).
I would like to get a `data.frame` with one row per case that contains all the information for that case, for example `id`, `updated`, `additionalIndexing`, `affairType`, etc.
Extracting a single field works like this (example for `id`):
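To illustrate the shape I am after, here is a minimal sketch on a made-up snippet. The wrapper element `<affairs>`, the sample values, and the names `xml_text`, `doc`, `rows`, and `result` are assumptions for illustration only; the real response has more (and nested) fields, which this flat approach would not handle as-is.

```r
library(XML)

# Made-up stand-in for the real response (assumption, not the actual data).
xml_text <- paste0(
  "<affairs>",
  "<affair><id>1</id><updated>2010-01-01</updated><affairType>A</affairType></affair>",
  "<affair><id>2</id><updated>2011-02-02</updated><affairType>B</affairType></affair>",
  "</affairs>"
)

doc <- xmlParse(xml_text, asText = TRUE)
affairs <- getNodeSet(doc, "//affair")

# Build one single-row data.frame per <affair>,
# with one column per direct child element.
rows <- lapply(affairs, function(node) {
  as.data.frame(as.list(xmlSApply(node, xmlValue)), stringsAsFactors = FALSE)
})

# Stack the rows: one line per case.
result <- do.call(rbind, rows)
print(result)
```

This only works when every case has the same flat set of child elements; cases with missing or nested children would need extra handling.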
infofile <- xmlRoot(file)
nodes <- getNodeSet(file, "//affair/id")
id <- as.numeric(lapply(nodes, function(x) xmlSApply(x, xmlValue)))