For example, I have a very large number of short phrases, and I want to cluster together the ones that are similar.
strings.to.cluster <- c(
  "Best Toyota dealer in bay area. Drive out with a new car today",
  "Largest Selection of Furniture. Stock updated everyday",
  " Unique selection of Handcrafted Jewelry",
  "Free Shipping for orders above $60. Offer Expires soon",
  "XXXX is where smart men buy anniversary gifts",
  "2012 Camrys on Sale. 0% APR for select customers",
  "Closing Sale on office desks. All Items must go"
)
Suppose this vector contains hundreds of thousands of lines. Is there a package in R for grouping these phrases by meaning? Alternatively, can someone suggest a way to rank phrases by how "similar" in meaning they are to a given phrase?
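To make the question concrete, here is a minimal sketch of a purely lexical baseline in base R: TF-IDF weighting, cosine similarity, and hierarchical clustering. It only captures shared words, not meaning, so it is a starting point rather than a solution. The names `phrases`, `tokenize`, and the choice of `k = 2` are assumptions made for illustration, and the dense similarity matrix would not scale to hundreds of thousands of phrases.

```r
# Illustrative sketch: lexical (bag-of-words) similarity, NOT semantic similarity.
# `phrases` is a small hypothetical subset of the vector in the question.
phrases <- c(
  "Best Toyota dealer in bay area. Drive out with a new car today",
  "Largest Selection of Furniture. Stock updated everyday",
  "Closing Sale on office desks. All Items must go",
  "2012 Camrys on Sale. 0% APR for select customers"
)

# Lowercase, replace punctuation with spaces, split on whitespace.
tokenize <- function(x) {
  x <- tolower(gsub("[^[:alnum:] ]", " ", x))
  strsplit(trimws(x), "\\s+")
}

tokens <- tokenize(phrases)
vocab  <- sort(unique(unlist(tokens)))

# Term-frequency matrix: one row per phrase, one column per vocabulary word.
tf <- t(vapply(
  tokens,
  function(tk) as.numeric(table(factor(tk, levels = vocab))),
  numeric(length(vocab))
))
colnames(tf) <- vocab

# Smoothed inverse document frequency (the +1 keeps every weight positive).
idf   <- log(nrow(tf) / colSums(tf > 0)) + 1
tfidf <- sweep(tf, 2, idf, `*`)

# Normalise each row to unit length; cosine similarity is then a cross product.
unit <- tfidf / sqrt(rowSums(tfidf^2))
sim  <- tcrossprod(unit)

# Cosine distance, clamped at 0 to guard against floating-point noise.
d  <- as.dist(pmax(1 - sim, 0))
hc <- hclust(d, method = "average")

# Assumption: cut the tree into 2 groups; a real k would need tuning.
clusters <- cutree(hc, k = 2)

# Rank the other phrases by similarity to the first one.
ranking <- order(sim[1, -1], decreasing = TRUE) + 1
```

Here `clusters` holds one group label per phrase, and `ranking` gives the indices of the remaining phrases ordered from most to least similar to `phrases[1]`. Two phrases that mean the same thing but share no words would still get a similarity of zero under this scheme, which is exactly the gap I am asking about.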