Building clusters with nominal data in R

Imagine we have a nominal variable with 7 categories (religion, for example). We would like to plot the individuals not along a line, but in clusters whose positions are chosen automatically so that the groups are well separated. Individuals within a group share the same answer, but they should not all be drawn on the same line, which is what happens when the data are plotted as if they were ordinal.

So, to summarize:

  • use the available plot area automatically

  • place the groups in no particular order, spread across the canvas

  • individuals remain visible; points do not overlap

  • ideally, individuals within a group would be held together by some (invisible) circle

Are there any packages designed for this purpose? What keywords do I need to look for?

Sample data:

    religion <- sample(1:7, 100, T)
    # No overlap here, but I would like to see the group part come out more.
    plot(religion)
2 answers

After assigning coordinates to the center of each group, you can use wordcloud::textplot to avoid overlapping labels.

    # Data
    n <- 100
    k <- 7
    religion <- sample(1:k, n, TRUE)
    names(religion) <- outer(LETTERS, LETTERS, paste0)[1:n]

    # Position of the groups
    x <- runif(k)
    y <- runif(k)

    # Plot
    library(wordcloud)
    textplot(
      x[religion], y[religion], names(religion),
      xlim=c(0,1), ylim=c(0,1),
      axes=FALSE, xlab="", ylab=""
    )


Alternatively, you can build a graph with a clique (or tree) for each group, and use one of the many graph layout algorithms in igraph.

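    library(igraph)
    # Adjacency matrix: TRUE where two individuals belong to the same group
    A <- outer( religion, religion, `==` )
    # Recent igraph versions name these functions
    # graph_from_adjacency_matrix() and mst()
    g <- graph.adjacency(A)
    plot(g)                         # one clique per group
    plot(minimum.spanning.tree(g))  # one tree per group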
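To come closer to the "invisible circle" wish from the question, the clique graph can also be drawn with a shaded region around each group. This is only a sketch under a few assumptions that are not in the answer above: it uses the newer igraph function names, drops self-loops, makes the graph undirected, picks the Fruchterman-Reingold layout, and chooses the seed, sizes and colors arbitrarily.

```r
library(igraph)

set.seed(1)                       # assumption: fixed seed for repeatability
n <- 100
k <- 7
religion <- sample(1:k, n, TRUE)

# Adjacency matrix: 1 when two individuals share a category, 0 otherwise.
A <- outer(religion, religion, "==") * 1
diag(A) <- 0                      # drop self-loops

g <- graph_from_adjacency_matrix(A, mode = "undirected")

# A force-directed layout pulls each clique into its own clump.
lay <- layout_with_fr(g)

# mark.groups takes a list of vertex-id vectors and draws a
# shaded region around each of them.
groups <- split(seq_len(n), religion)
plot(g, layout = lay, mark.groups = groups,
     vertex.size = 4, vertex.label = NA, vertex.color = religion,
     edge.color = "grey85")
```

Because every group is a clique and no edges run between groups, each group forms its own connected component, which is what lets the layout separate them.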



In the image that you linked, each point has three related numbers: the x and y coordinates and the group (color). If you have only one piece of information for each person, you can do something like this:

    set.seed(1)
    centers <- data.frame(religion=1:7, cx=runif(7), cy=runif(7))
    eps <- 0.04
    data <- within(merge(data.frame(religion=sample(1:7, 100, T)), centers), {
      x <- cx+rnorm(length(cx),sd=eps)
      y <- cy+rnorm(length(cy),sd=eps)
    })
    with(data, plot(x,y,col=religion, pch=16))

Notice that I create a random center for each group and then add a small random offset around that center for each observation. If you want to go down this path, you will have to experiment with the eps parameter and possibly set the centers manually, since random centers can land close together.
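One way to set the centers by hand is to space them evenly on a circle, so that no two groups can land on top of each other, and then draw a circle around each group. The following is a minimal base-R sketch, not part of the original answer: the circular arrangement, the eps value of 0.08, the 0.03 margin and the grey outline are all assumptions chosen for illustration.

```r
set.seed(1)
k <- 7                            # number of categories (from the question)
n <- 100                          # number of individuals
religion <- sample(1:k, n, TRUE)

# Assumption: group centres evenly spaced on a circle of radius 1.
angle <- 2 * pi * (seq_len(k) - 1) / k
cx <- cos(angle)
cy <- sin(angle)

# Jitter each individual around the centre of its group.
eps <- 0.08
x <- cx[religion] + rnorm(n, sd = eps)
y <- cy[religion] + rnorm(n, sd = eps)

# Radius per group: distance to its farthest member plus a small margin.
d <- sqrt((x - cx[religion])^2 + (y - cy[religion])^2)
r <- as.numeric(tapply(d, factor(religion, levels = 1:k), max)) + 0.03

plot(x, y, col = religion, pch = 16, asp = 1,
     xlim = c(-1.5, 1.5), ylim = c(-1.5, 1.5),
     axes = FALSE, xlab = "", ylab = "")
# inches = FALSE makes the radii use the same units as the x axis.
symbols(cx, cy, circles = r, inches = FALSE, add = TRUE, fg = "grey70")
```

Setting fg = NA in the symbols() call should leave the circles undrawn if they are only wanted as an invisible guide. Points within a group can still overlap each other here; only the groups are kept apart.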


Source: https://habr.com/ru/post/1499860/
