Error building Kohonen maps in R?

I read this blog post on R-bloggers and I am confused by the last part of the code, which I cannot figure out.

http://www.r-bloggers.com/self-organising-maps-for-customer-segmentation-using-r/

I tried to recreate this with my own data. I have 5 variables that follow an exponential distribution with 2755 points.

Everything works fine up to a point, and I can build the map that the author creates:

plot(som_model, type="codes") 


The section of code that I don't understand is:

    var <- 1
    var_unscaled <- aggregate(as.numeric(training[, var]),
                              by = list(som_model$unit.classif),
                              FUN = mean, simplify = TRUE)[, 2]
    plot(som_model, type = "property", property = var_unscaled,
         main = names(training)[var], palette.name = coolBlueHotRed)

As I understand it, this piece of code is supposed to plot one of the variables over the map to see how it is distributed, but this is where I ran into problems. When I run this section of code, I get a warning:

    Warning message:
    In bgcolors[!is.na(showcolors)] <- bgcol[showcolors[!is.na(showcolors)]] :
      number of items to replace is not a multiple of replacement length

and it produces a plot that just doesn't look right.

My current thinking is that it comes down to how the aggregate function has reordered the data. The length of var_unscaled is 789, while the lengths of training[, var] and som_model$unit.classif are both 2755. I tried plotting the unaggregated data instead; that gave no warning, but the plot was incomprehensible (as expected).

I think the length was reduced because unit.classif contains many duplicate unit numbers, so aggregating by it collapses them.

The questions are: should I worry about the warning? Is the resulting plot accurate? What exactly does the plot command expect in the property argument? Is there another way to "aggregate" the data?

+5
3 answers

I think you need to define the color palette first. If you add the definition

 coolBlueHotRed <- function(n, alpha = 1) {rainbow(n, end=4/6, alpha=alpha)[n:1]} 

and then try to draw a plot, for example

 plot(som_model, type = "count", palette.name = coolBlueHotRed) 

the plot should be drawn as expected.
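As a quick check of what this palette function returns, here is a minimal sketch using only base R. The call with n = 3 is just an illustration and is not taken from the blog post:

```r
# Palette function from this answer: takes a number of colours n and
# returns n colours running from blue (low values) to red (high values).
coolBlueHotRed <- function(n, alpha = 1) {
  rainbow(n, end = 4/6, alpha = alpha)[n:1]
}

# Illustrative call: ask for three colours and inspect them.
cols <- coolBlueHotRed(3)
print(cols)  # three colour strings, blue first and red last
```

Note that in the plot call the function itself is passed as palette.name, not a vector of colours produced by it.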

This link may help you: http://rgm3.lab.nig.ac.jp/RGM/R_rdfile?f=kohonen/man/plot.kohonen.Rd&d=R_CC

+8

I think that not all cells on your map have points inside them. You have a 30 by 30 map and about 2700 points, which is about 3 points per cell on average. It is therefore very likely that some cells have more than 3 points and some cells are empty.

The code in the R-bloggers post works well when every cell has at least one point in it.

To make it work with your data, try changing this part:

    var <- 1
    var_unscaled <- aggregate(as.numeric(training[, var]),
                              by = list(som_model$unit.classif),
                              FUN = mean, simplify = TRUE)[, 2]
    plot(som_model, type = "property", property = var_unscaled,
         main = names(training)[var], palette.name = coolBlueHotRed)

with this:

    var <- 1
    var_unscaled <- aggregate(as.numeric(data.temp[, data.classes][, var]),
                              by = list(som_model$unit.classif),
                              FUN = mean, simplify = T)
    v_u <- rep(0, max(var_unscaled$Group.1))
    v_u[var_unscaled$Group.1] <- var_unscaled$x
    plot(som_model, type = "property", property = v_u,
         main = colnames(data.temp[, data.classes])[var],
         palette.name = coolBlueHotRed)
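The effect can be reproduced without a SOM at all. The following is a minimal base-R sketch with made-up toy data; the names n_units, unit_classif, values and per_unit are hypothetical and only stand in for the map size, som_model$unit.classif and the training column. It shows that aggregate returns one row per occupied unit only, and how the result can be expanded back to one value per unit:

```r
# Hypothetical toy data: 6 observations assigned to a map with 5 units.
n_units      <- 5
unit_classif <- c(1, 1, 2, 4, 4, 4)      # units 3 and 5 get no observations
values       <- c(10, 20, 30, 40, 50, 60)

# aggregate() only returns rows for units that actually occur,
# so the result has 3 rows here, not 5.
agg <- aggregate(values, by = list(unit = unit_classif), FUN = mean)

# Expand to one entry per unit; empty units are marked NA in this sketch
# (an assumption of the sketch; the code above fills them with 0 instead).
per_unit <- rep(NA_real_, n_units)
per_unit[agg$unit] <- agg$x

print(nrow(agg))   # number of occupied units
print(per_unit)    # one value per unit, NA where the unit is empty
```

One difference from the code above: sizing the vector by the total number of units rather than by the largest occupied unit number also covers the case where the highest-numbered units are empty. Whether the plotting function accepts NA for empty units should be checked against its documentation.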

Hope this helps.

0

Just add these definitions to your script:

    coolBlueHotRed <- function(n, alpha = 1) {rainbow(n, end=4/6, alpha=alpha)[n:1]}
    pretty_palette <- c("#1f77b4","#ff7f0e","#2ca02c", "#d62728","#9467bd","#8c564b","#e377c2")
0

Source: https://habr.com/ru/post/1203099/

