Convert a corpus to a data.frame in R

I use the tm package to apply stemming, and I need to convert the resulting corpus into a data frame. A solution for this can be found in "R tm package vcorpus: error in converting corpus to data frame", but in my case the corpus content prints as:

[[2195]] i was very impress 

instead of

 [[2195]] "i was very impress" 

and because of this, if I apply

 data.frame(text=unlist(sapply(mycorpus, `[`, "content")), stringsAsFactors=FALSE) 

the result will be

 <NA>. 

Any help is much appreciated!

The code below is an example:

 library(tm)
 sentence <- c("a small thread was loose on the sandals, otherwise it looked good")
 mycorpus <- Corpus(VectorSource(sentence))
 mycorpus <- tm_map(mycorpus, stemDocument, language = "english")
 inspect(mycorpus)
 [[1]]
 a small thread was loo on the sandals, otherwi it look good
 data.frame(text=unlist(sapply(mycorpus, `[`, "content")), stringsAsFactors=FALSE)
   text
 1 <NA>
2 answers

Applying

 gsub("http\\w+", "", mycorpus) 

the output has class character, so it works in my case.


I cannot reproduce the problem using tm_0.6 in R 3.1.0 on Mac:

 > data.frame(text=unlist(sapply(mycorpus, `[`, "content")), stringsAsFactors=FALSE)
                                                                    text
 content a small thread was loos on the sandals, otherwis it look good

If I got these unwanted results, I would immediately try:

  data.frame(text=unlist(sapply(mycorpus, `[[`, "content")), stringsAsFactors=FALSE) 

... reasoning that since 'content' is the name of a list element, [['content']] should be able to retrieve it from each document in turn. It also looked to me as though unlist might not be necessary with this approach:

 > data.frame(text=sapply(mycorpus, `[[`, "content"), stringsAsFactors=FALSE)
                                                              text
 1 a small thread was loos on the sandals, otherwis it look good
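
The difference between `[` and `[[` can be sketched in base R without tm. The objects below (docs_list, docs_chr) are hypothetical stand-ins, not real tm documents: the first assumes each document is a list with a "content" element, the second assumes each document is a bare character string. The second case is only one plausible explanation for the <NA> in the question, since indexing a string by a name it does not have returns NA.

```r
# Hypothetical stand-ins for corpus elements; no tm objects are used here.

# Case 1: each document is a list with a "content" element.
docs_list <- list(
  list(content = "i was very impress", meta = list(id = "1")),
  list(content = "it look good",       meta = list(id = "2"))
)

# `[` keeps the list wrapper, so sapply() returns a list and unlist() is needed.
with_single <- sapply(docs_list, `[`, "content")

# `[[` extracts the string itself, so sapply() returns a character vector.
with_double <- sapply(docs_list, `[[`, "content")

df <- data.frame(text = with_double, stringsAsFactors = FALSE)

# Case 2 (assumption): each document is a bare character string.
docs_chr <- list("i was very impress", "it look good")

# Indexing a string by a name it does not have yields NA.
na_result <- unlist(sapply(docs_chr, `[`, "content"))

# Coercing each element with as.character() avoids the name lookup.
df_chr <- data.frame(text = sapply(docs_chr, as.character),
                     stringsAsFactors = FALSE)
```

If the elements of a real corpus turn out to be plain strings, the as.character() route is the one that applies; if they are list-like documents, `[[` with "content" is enough. Checking class(mycorpus[[1]]) shows which case holds for a given tm version.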

Source: https://habr.com/ru/post/974317/

