Reading unescaped backslashes in JSON into R

I am trying to read some data from the Facebook Graph API into R to do some text analysis. However, it looks like there are unescaped backslashes in the JSON feed, which causes rjson to fail. The following is a minimal example of the kind of input that causes problems.

    library(rjson)
    txt <- '{"data":[{"id":2, "value":"I want to \\"post\\" a picture\\video"}]}'
    fromJSON(txt)

(Note that the double backslashes in \\" and \\video become single backslashes once R has parsed the string literal, which matches what is in my real data.)

I also tried the RJSONIO package, which also gave errors, and even occasionally crashed R.

Has anyone encountered this problem before? Is there a way to fix it without manually hacking around every error that crops up? There are potentially megabytes of JSON to parse, and the error messages are not very informative about exactly where the problematic input is.

2 answers

Just replace backslashes that don't escape double quotes, tabs, or newlines with double backslashes.

In the regular expression, '\\\\' is reduced to a single backslash (two levels of escaping are needed: one for R, one for the regular expression engine). We need the Perl regex engine in order to use lookahead.

    library(stringr)
    # Double every backslash that is not followed by t, n or a double quote
    txt2 <- str_replace_all(txt, perl('\\\\(?![tn"])'), '\\\\\\\\')
    fromJSON(txt2)
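For readers without stringr, the same substitution can be sketched in base R with gsub() and perl = TRUE. This is only an equivalent of the pattern above, not a complete fix: it still treats just \t, \n and \" as legitimate escapes, so it would mangle input that already contains other valid JSON escapes such as \\ or \/. As far as I know, the perl() wrapper belongs to older stringr releases, which is another reason the base R form may be handy.

```r
txt <- '{"data":[{"id":2, "value":"I want to \\"post\\" a picture\\video"}]}'

# perl = TRUE selects the PCRE engine, which supports the (?!...) lookahead.
# Pattern: a backslash that is not followed by t, n or a double quote.
# Replacement: after R and gsub() unescaping, this is two literal backslashes.
txt2 <- gsub('\\\\(?![tn"])', '\\\\\\\\', txt, perl = TRUE)

# Print the repaired JSON text exactly as a parser would see it.
cat(txt2, "\n")
```

The repaired string can then be passed to fromJSON() as before.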

The problem is that you are trying to parse invalid JSON:

    library(jsonlite)
    txt <- '{"data":[{"id":2, "value":"I want to \\"post\\" a picture\\video"}]}'
    validate(txt)

The problem is with picture\\video, because \v is not a valid JSON escape sequence, even though it is a valid escape sequence in R and some other languages. Perhaps you meant:

    library(jsonlite)
    txt <- '{"data":[{"id":2, "value":"I want to \\"post\\" a picture\\/video"}]}'
    validate(txt)
    fromJSON(txt)

In any case, the problem lies with the JSON data source, which is generating invalid JSON. If this data really comes from Facebook, you have found a bug in their API. More likely, though, you are not retrieving it correctly.
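Since the parser errors do not say where the bad input is, one way to locate it is to scan the raw text for backslash sequences that JSON does not allow. The sketch below uses only base R; find_bad_escapes is a hypothetical helper name, and it assumes that every backslash in the payload sits inside a JSON string. The set of valid escapes (\" \\ \/ \b \f \n \r \t and \u followed by four hex digits) is the one defined by the JSON grammar.

```r
# Hypothetical helper: report the offset and text of each backslash
# escape that JSON does not allow.
find_bad_escapes <- function(json) {
  # Match every backslash together with what it escapes, left to right,
  # so an escaped backslash in the JSON text is consumed as one unit.
  m <- gregexpr('(?s)\\\\(?:u[0-9a-fA-F]{4}|.)', json, perl = TRUE)
  hits <- regmatches(json, m)[[1]]
  pos <- as.integer(m[[1]])
  # Keep only the sequences that are not on the list of valid JSON escapes.
  ok <- grepl('^\\\\(?:["\\\\/bfnrt]|u[0-9a-fA-F]{4})$', hits, perl = TRUE)
  data.frame(position = pos[!ok], escape = hits[!ok],
             stringsAsFactors = FALSE)
}

txt <- '{"data":[{"id":2, "value":"I want to \\"post\\" a picture\\video"}]}'
bad <- find_bad_escapes(txt)
print(bad)
```

Each row gives a character offset into the input, which can be used with substr() to inspect the surrounding text before deciding how to repair it.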


Source: https://habr.com/ru/post/972825/

