Search for text using R

I am using the R text mining package (tm), and it's a really great tool. However, I did not find any search support, or maybe there are functions that I have missed. How can a simple VSM (vector space model) be implemented using the R text mining package?

2 answers
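# Sample R commands in support of my previous answer
library(fortunes)
library(tm)

# Build a small collection from the first ten fortunes
sentences <- vapply(1:10, function(i) as.character(fortune(i)$quote), character(1))
# VectorSource avoids the column-name requirements of DataframeSource in newer tm versions
dsc <- VCorpus(VectorSource(sentences))
dtm <- DocumentTermMatrix(dsc, control = list(weighting = weightTf, stopwords = TRUE))
# Vocabulary of the collection (Terms() replaces Dictionary(), which recent tm releases no longer provide)
dictC <- Terms(dtm)

# The query below is created from words in fortune(1) and fortune(2)
newQry <- "lets stand up and be counted seems to work undocumented"
newQryC <- VCorpus(VectorSource(newQry))
# Map the query onto the same dictionary as the collection (the original referenced an undefined dict1)
dtmNewQry <- DocumentTermMatrix(newQryC, control = list(weighting = weightTf, stopwords = TRUE, dictionary = dictC))
# Keep only the terms that actually occur in the query; the matrix has a column for every dictionary term
qryMat <- as.matrix(dtmNewQry)
dictQry <- colnames(qryMat)[qryMat[1, ] != 0]

# Below does a naive similarity (number of features in common)
apply(as.matrix(dtm), 1, function(x, y = dictQry) length(intersect(names(x)[x != 0], y)))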
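The raw overlap count above tends to favour long documents. One possible refinement, sketched here in base R on a hypothetical toy term list (the names `jaccard`, `doc_terms` and `query_terms` are made up for illustration and are not part of tm), is to normalise the overlap by the size of the union, which gives a Jaccard score:

```r
# Jaccard similarity between two term sets: |intersection| / |union|
jaccard <- function(a, b) {
  u <- union(a, b)
  if (length(u) == 0) return(0)
  length(intersect(a, b)) / length(u)
}

# Hypothetical toy data: each document reduced to its set of distinct terms
doc_terms <- list(
  d1 = c("stand", "counted", "lets"),
  d2 = c("seems", "work", "undocumented", "feature"),
  d3 = c("regression", "model")
)
query_terms <- c("lets", "stand", "counted", "seems", "work", "undocumented")

# One score per document, highest first
scores <- sapply(doc_terms, jaccard, b = query_terms)
sort(scores, decreasing = TRUE)
```

With a real DocumentTermMatrix, the term set of each document would be the column names whose count is non-zero, as in the `apply()` call above.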

Assuming VSM = vector space model, you can build a simple search engine as follows:

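  • Create a document-term matrix of the documents in your collection / corpus.
  • Choose a similarity measure (Jaccard, Euclidean, ...). RSiteSearch can help you find packages that provide one.
  • Convert the query to a document-term matrix as well (1 row, built with the same dictionary as the collection).
  • Compute the similarity between the query and each document.
  • Rank the results and keep the top n.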
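A minimal base-R sketch of such a retrieval loop, under some loud assumptions: the toy documents and query are hypothetical, the tokenizer is deliberately crude, and cosine similarity is my own pick here (the list mentions Jaccard and Euclidean; any measure could be swapped in). It does not use tm at all, so it only illustrates the idea.

```r
# Hypothetical toy collection and query (not from the original post)
docs <- c(
  d1 = "lets stand up and be counted",
  d2 = "it seems to work but it is undocumented",
  d3 = "linear models are fitted with lm"
)
query <- "stand up and be counted undocumented"

# Lower-case and split on anything that is not a letter
tokenize <- function(x) {
  tokens <- strsplit(tolower(x), "[^a-z]+")[[1]]
  tokens[nzchar(tokens)]
}

# Document-term matrix: one row per document, one column per vocabulary term
doc_tokens <- lapply(docs, tokenize)
vocab <- sort(unique(unlist(doc_tokens)))
tf <- function(tokens) as.numeric(table(factor(tokens, levels = vocab)))
dtm <- t(sapply(doc_tokens, tf))
colnames(dtm) <- vocab

# The query as a single vector over the same vocabulary
# (query words outside the vocabulary are silently dropped)
q <- tf(tokenize(query))

# Cosine similarity between the query and every document
cosine <- function(a, b) {
  denom <- sqrt(sum(a^2)) * sqrt(sum(b^2))
  if (denom == 0) 0 else sum(a * b) / denom
}
scores <- apply(dtm, 1, cosine, b = q)

# Rank and keep the top n
n <- 2
head(sort(scores, decreasing = TRUE), n)
```

For a real collection you would probably build both matrices with tm's DocumentTermMatrix and a shared dictionary, and perhaps use a tf-idf weighting instead of raw counts.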

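A non-R approach: store the documents in a text column of a PostgreSQL table (one row per document) and build a GIN index on it. With tsvector, PostgreSQL then handles the full-text search.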


Source: https://habr.com/ru/post/1772493/

