I ran into an encoding problem with the SPARQL package for R. I am running the following code:
library(SPARQL)
rights_query <- '
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX edm: <http://www.europeana.eu/schemas/edm/>
PREFIX ore: <http://www.openarchives.org/ore/terms/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?edmrights ?provider (COUNT(*) as ?count)
WHERE {
?agg rdf:type ore:Aggregation .
?agg edm:rights ?edmrights .
?agg edm:dataProvider ?provider .
?proxy ore:proxyIn ?agg .
?proxy edm:type "IMAGE" .
}
GROUP BY ?edmrights ?provider
ORDER BY ?provider DESC(?count)'
eur <- "http://europeana.ontotext.com/sparql"
eur_data <- SPARQL(eur, rights_query)$results
write.csv(eur_data, "results.csv")
The code runs without any errors or warnings. However, the resulting data frame, viewed in RStudio as well as in the exported CSV, clearly has encoding problems.
For example, the following value should be partly in Cyrillic: Чувашский государственный художественный музей / Chouvashia State Art Museum
However, it looks like this: ЧÑваÑÑкий гоÑÑдаÑÑÑвеннÑй ÑÑдожеÑÑвеннÑй мÑзей / Chouvashia State Art Museum
I checked the XML returned by the SPARQL query. It passes XML validation and contains the correct UTF-8 encoding declaration. R's XML package (which the SPARQL package uses to parse the XML output into a data frame) should recognize this, right?
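To narrow this down, here is a minimal base-R sketch of one possible cause: UTF-8 bytes that were read as Latin-1 and then converted to UTF-8 a second time. This is only an assumption about what happens inside the parsing step, not something I have confirmed. The variable names are made up for illustration, and the sample string is just the first three letters of the expected name (Чув), written with \u escapes so the script stays ASCII-only.

```r
# First three letters of the expected provider name, as a UTF-8 string.
expected <- "\u0427\u0443\u0432"

# Correct UTF-8 uses two bytes per Cyrillic letter: d0 a7 d1 83 d0 b2
print(charToRaw(expected))

# Simulate the suspected mistake: keep the same bytes but declare them Latin-1,
# then convert to UTF-8. Each original byte becomes its own character.
misread <- expected
Encoding(misread) <- "latin1"
double_encoded <- enc2utf8(misread)
print(nchar(double_encoded, type = "chars"))  # 6 characters instead of 3
print(length(charToRaw(double_encoded)))      # twice as many bytes as before

# Reverse the mistake: convert back to single bytes, then mark them as UTF-8.
# iconv() returns NA if a character cannot be represented in the target encoding.
repaired <- iconv(double_encoded, from = "UTF-8", to = "latin1")
Encoding(repaired) <- "UTF-8"
print(identical(repaired, expected))
```

If the real column (for example eur_data$provider) shows the same doubled byte pattern under charToRaw(), the same two repair lines might apply to it. If the bytes are already correct UTF-8 and only the encoding mark is missing, setting Encoding() alone may be enough.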
XML, CSV. I am using R 3.1.0 in RStudio on OS X Mavericks. RStudio's default encoding is set to UTF-8.
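Since the RStudio setting and the encoding used by the R session are separate things, the session itself can be checked with a few base-R calls. This is a small sketch, with variable names chosen only for illustration:

```r
# Character-type locale of the running R session (not the RStudio editor setting).
ctype <- Sys.getlocale("LC_CTYPE")

# TRUE if R considers the current locale to be a UTF-8 locale.
utf8_locale <- l10n_info()[["UTF-8"]]

# Default encoding assumed for file connections; "native.enc" means the locale's encoding.
conn_encoding <- getOption("encoding")

cat("LC_CTYPE:        ", ctype, "\n")
cat("UTF-8 locale:    ", utf8_locale, "\n")
cat("options encoding:", conn_encoding, "\n")
```

If the locale is not a UTF-8 one, that could explain garbled output even when the returned XML itself is correct.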