I have some data in the form:
    <people>
      <person first="Mary" last="Jane" sex="F" />
      <person first="Susan" last="Smith" sex="F" height="168" />
      <person last="Black" first="Joseph" sex="M" />
      <person first="Jessica" last="Jones" sex="F" />
    </people>
I need a data frame that looks like this:
        first  last sex height
    1    Mary  Jane   F     NA
    2   Susan Smith   F    168
    3  Joseph Black   M     NA
    4 Jessica Jones   F     NA
I got this far:
    library(XML)
    xpeople <- xmlRoot(xmlParse(xml))
    lst <- xmlApply(xpeople, xmlAttrs)
    names(lst) <- 1:length(lst)
But I can't for the life of me work out how to get the list into a data frame. I can make the list "square" (i.e., fill in the blanks) and then put it in a data frame:
    lst <- xmlApply(xpeople, function(node) {
      attrs = xmlAttrs(node)
      if (!("height" %in% names(attrs))) {
        attrs[["height"]] <- NA
      }
      attrs
    })
    df = as.data.frame(lst)
But I have the following problems:
- the data frame is transposed (one column per person instead of one row per person)
- first and last are factors, not character vectors
- height is a factor, not a number
- the first and last names have been swapped for Joseph Black (not a big problem, since my data is usually consistent, but annoying nonetheless)
How can I get the data frame in the correct form?
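For reference, here is a minimal sketch of the kind of result I am after, not a definitive solution. It assumes the set of attribute names is known up front (the `cols` vector below is my own assumption, taken from the sample data) and inlines the sample XML so that it runs on its own. Each attribute is looked up by name with `xmlGetAttr`, so attribute order in the XML should not matter and missing attributes come back as `NA`.

```r
library(XML)

# Sample document from above, inlined so the sketch is self-contained.
xml <- '<people>
  <person first="Mary" last="Jane" sex="F" />
  <person first="Susan" last="Smith" sex="F" height="168" />
  <person last="Black" first="Joseph" sex="M" />
  <person first="Jessica" last="Jones" sex="F" />
</people>'

doc <- xmlParse(xml, asText = TRUE)
nodes <- getNodeSet(doc, "//person")

# Assumption: the columns wanted are exactly these attribute names.
cols <- c("first", "last", "sex", "height")

# One named character vector per <person>, always in `cols` order;
# attributes that are absent become NA.
rows <- lapply(nodes, function(node) {
  vapply(cols,
         function(a) xmlGetAttr(node, a, default = NA_character_),
         character(1))
})

# Bind the vectors as rows (one row per person) and keep strings as character.
df <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)

# Everything arrives as text, so convert the numeric column explicitly.
df$height <- as.numeric(df$height)

str(df)
```

Binding the vectors with `rbind` rather than passing the list straight to `as.data.frame` is what keeps one row per person, and `stringsAsFactors = FALSE` keeps the names as character.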
dwurf