I am running k-means clustering in R and would like to use NbClust to help determine the optimal number of clusters. My dataset df contains 636,688 rows and 7 columns.
When I run NbClust(df, min.nc = 2, max.nc = 3, method = "kmeans"), I get:
Error: cannot allocate vector of size 1510.1 Gb
In addition: Warning messages:
1: In dist(jeu, method = "euclidean") :
Reached total allocation of 32767Mb: see help(memory.size)
2: In dist(jeu, method = "euclidean") :
Reached total allocation of 32767Mb: see help(memory.size)
3: In dist(jeu, method = "euclidean") :
Reached total allocation of 32767Mb: see help(memory.size)
4: In dist(jeu, method = "euclidean") :
Reached total allocation of 32767Mb: see help(memory.size)
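For reference, the size in the error message is consistent with dist() trying to store every pairwise distance between the rows of df. A rough check, assuming one 8-byte double per pair and that R's "Gb" means 1024^3 bytes:

```r
# Number of observations reported in the question.
n <- 636688

# dist() keeps the lower triangle of the distance matrix:
# one value per unordered pair of rows.
n_pairs <- n * (n - 1) / 2

# Assumption: 8 bytes per double, sizes reported in units of 1024^3 bytes.
size_gb <- n_pairs * 8 / 1024^3

print(round(size_gb, 1))  # 1510.1, the figure in the error message
```

The requirement grows with the square of the row count, so it does not depend on the number of columns, and it is far above the 32767 Mb limit shown in the warnings.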
Here is my sessionInfo():
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats4 grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] clusterSim_0.43-3 fpc_2.1-6 flexmix_2.3-11 mclust_4.2 cluster_1.14.4 MASS_7.3-29
[7] flexclust_1.3-4 modeltools_0.2-21 lattice_0.20-23 NbClust_1.4 rattle_2.6.26
loaded via a namespace (and not attached):
[1] ade4_1.6-2 class_7.3-9 e1071_1.6-2 nnet_7.3-7 parallel_3.0.2 R2HTML_2.2.1 rgl_0.93.996 tools_3.0.2
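Is there a way around this? Would it help to run NbClust on a sample, say 1% of the rows, or is there an alternative to NbClust, such as cluster.stats from fpc, that copes with data of this size?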
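The 1% idea could be sketched roughly as follows. This is only an illustration under stated assumptions: the real df is not available here, so a synthetic table of the same shape stands in for it, and sample_rows and nbclust_on_sample are hypothetical helper names, not part of NbClust. The number of clusters suggested for a sample is not guaranteed to match the full data.

```r
set.seed(42)  # arbitrary seed so the sample is reproducible

# Stand-in for the real data (assumption): same shape as df in the question.
df <- as.data.frame(matrix(rnorm(636688 * 7), ncol = 7))

# Hypothetical helper: draw a random fraction of rows without replacement.
sample_rows <- function(data, frac = 0.01) {
  n_keep <- ceiling(frac * nrow(data))
  data[sample(nrow(data), n_keep), , drop = FALSE]
}

# Hypothetical helper: run NbClust on the sample with the settings
# used in the question. Requires the NbClust package to be installed.
nbclust_on_sample <- function(data, frac = 0.01) {
  smp <- as.matrix(sample_rows(data, frac))
  NbClust::NbClust(smp, distance = "euclidean",
                   min.nc = 2, max.nc = 3, method = "kmeans")
}

df_sample <- sample_rows(df, 0.01)

# Memory for the sample's pairwise distances, in units of 1024^3 bytes.
n_s <- nrow(df_sample)
sample_gb <- n_s * (n_s - 1) / 2 * 8 / 1024^3
```

Calling nbclust_on_sample(df) would then run NbClust on about 6,367 rows, whose distance matrix needs roughly 0.15 Gb instead of 1510.1 Gb. Repeating the call with different seeds gives some indication of how stable the suggested number of clusters is.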