Tidytext: how to make commonality and comparison word clouds

Let me start with the following fully working code from Introduction to tidytext on CRAN:

library(janeaustenr)
library(dplyr)
library(stringr)

original_books <- austen_books() %>%
  group_by(book) %>%
  mutate(linenumber = row_number(),
         chapter = cumsum(str_detect(text, regex("^chapter [\\divxlc]",
                                                 ignore_case = TRUE)))) %>%
  ungroup()

original_books

library(tidytext)
tidy_books <- original_books %>%
  unnest_tokens(word, text)

tidy_books

data("stop_words")
cleaned_books <- tidy_books %>%
  anti_join(stop_words)

So far, so good. I have a tibble of the six Jane Austen novels with the standard stop words removed.

unique(cleaned_books$book)

Which gets me: Sense & Sensibility, Pride & Prejudice, Mansfield Park, Emma, Northanger Abbey, Persuasion.

So if I want to make a standard word cloud of term frequencies across all six, no problem. Like so (color added):

library(wordcloud)
library(RColorBrewer)
dark2 <- brewer.pal(8, "Dark2")

cleaned_books %>%
  count(word) %>%
  with(wordcloud(word, n, color = dark2, max.words = 100))

It works great. But how can I then make a commonality.cloud() with all six novels, and a comparison.cloud() with the same?

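One approach: count the words per book in cleaned_books, reshape the counts into a word-by-book matrix with reshape2, and pass that matrix to the cloud functions.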

set1 <- brewer.pal(8, "Set1") ## a second palette, just for the other cloud type

library(reshape2)

# title size and scale optional, obviously
cleaned_books %>%
   group_by(book) %>%
   count(word) %>%
   acast(word ~ book, value.var = "n", fill = 0) %>%
   comparison.cloud(color = dark2, title.size = 1,  scale = c(3,  0.3), random.order = FALSE, max.words = 100)


cleaned_books %>%
   group_by(book) %>%
   count(word) %>%
   acast(word ~ book, value.var = "n", fill = 0) %>%
   commonality.cloud(color = set1, title.size = 1, scale = c(3,  0.3), random.order = FALSE,  max.words = 100)
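To make the reshaping step less opaque, here is a minimal base-R sketch of the kind of matrix that acast(word ~ book, value.var = "n", fill = 0) hands to the cloud functions. The counts are made up for illustration, and tapply() is swapped in for acast() so the snippet runs without extra packages. The last part assumes, as I read the wordcloud documentation, that commonality.cloud() works on per-book proportions and uses min as its default commonality measure; treat that as an assumption rather than a description of the package internals.

```r
# Toy counts standing in for cleaned_books %>% group_by(book) %>% count(word).
# The numbers are invented for illustration only.
counts <- data.frame(
  book = c("Emma", "Emma", "Persuasion", "Persuasion", "Persuasion"),
  word = c("miss", "emma", "miss", "anne", "captain"),
  n    = c(6, 8, 2, 5, 3),
  stringsAsFactors = FALSE
)

# Base-R stand-in for acast(word ~ book, value.var = "n", fill = 0):
# one row per word, one column per book, 0 where a word never occurs in a book.
term_matrix <- with(counts, tapply(n, list(word, book), sum, default = 0))
print(term_matrix)

# Column-wise proportions: each word's share of its book's total count.
shares <- sweep(term_matrix, 2, colSums(term_matrix), "/")

# Assumed commonality measure (min): a word scores above 0 only if
# it appears in every book.
common <- apply(shares, 1, min)
print(names(common)[common > 0])
```

In this toy matrix only "miss" occurs in both books, so it is the only word with a non-zero commonality score. The comparison cloud works from the same matrix but places each word under the book where it is most over-represented.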


Source: https://habr.com/ru/post/1689235/

