Search for nonlinear correlations in R

I have about 90 variables stored in data[2:90]. I suspect that about 4 of them have a parabola-like (quadratic) relationship with data[1]. I want to determine which ones do. Is there a quick and easy way to do this?

I tried to build such a model (which I could do in a loop for each variable i = 2:90):

y <- data$AvgRating
x <- data$Hamming.distance
x2 <- x^2

quadratic.model = lm(y ~ x + x2)

And then look at the R^2 coefficient to get an idea of the correlation. Is there a better way to do this?
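The loop described above could be sketched roughly as follows. This is only an illustration on simulated data: the data frame layout (response in column 1, candidates in the remaining columns), the column names, and the helper `scan_one` are assumptions, not part of the original data. It fits `y ~ x + x^2` for each candidate and records both R^2 and the p-value of the squared term.

```r
# Simulated stand-in for the real data (hypothetical names):
# column 1 is the response, the other columns are candidate predictors.
set.seed(42)
n <- 200
data <- as.data.frame(matrix(rnorm(n * 6), ncol = 6))
names(data) <- c("y", paste0("v", 1:5))
data$y <- 2 * data$v2^2 + rnorm(n, sd = 0.5)  # only v2 has a parabolic effect

y <- data[[1]]
predictors <- names(data)[-1]

# Fit a quadratic model for one predictor and return R^2 plus the
# p-value of the squared term.
scan_one <- function(v) {
  x <- data[[v]]
  fit <- lm(y ~ x + I(x^2))
  s <- summary(fit)
  data.frame(variable    = v,
             r.squared   = s$r.squared,
             p.quadratic = coef(s)["I(x^2)", "Pr(>|t|)"])
}

results <- do.call(rbind, lapply(predictors, scan_one))
results <- results[order(results$p.quadratic), ]  # strongest curvature first
print(results)
```

With roughly 90 candidates, some small p-values will appear by chance, so it may be worth adjusting them (for example with `p.adjust(results$p.quadratic, method = "holm")`) before deciding which variables to keep.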

Maybe R can build a regression model with all 90 variables and select the ones that are significant on their own? Would that be possible? I can do this in JMP for linear regression, but I'm not sure I could perform non-linear regression in R for all the variables. So I tried to check manually which ones were correlated beforehand. It would be useful if there was a function for this.

2 answers

Another option would be to calculate the mutual information score between each pair of variables. For example, using the mutinformation function from the infotheo package, you can do:

set.seed(1)

library(infotheo)

# correlated vars (x & y correlated, z noise)
x <- seq(-10,10, by=0.5)
y <- x^2
z <- rnorm(length(x))

# list of vectors
raw_dat <- list(x, y, z)


# convert to a matrix, then discretize for mutual information
dat <- matrix(unlist(raw_dat), ncol=length(raw_dat))
dat <- discretize(dat)

mutinformation(dat)

Result

|   |        V1|        V2|        V3|
|:--|---------:|---------:|---------:|
|V1 | 1.0980124| 0.4809822| 0.0553146|
|V2 | 0.4809822| 1.0943907| 0.0413265|
|V3 | 0.0553146| 0.0413265| 1.0980124|

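mutinformation() returns the matrix of pairwise mutual information scores. discretize() is applied first because the estimator works on discrete data, so the continuous values have to be binned. In the result above, x and y (V1 and V2) score about 0.48, while the noise variable V3 scores about 0.05 against both.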
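For the original question, only the scores against the first column matter, so the full pairwise matrix is not needed. A minimal sketch, assuming the response sits in column 1 and using simulated data with hypothetical column names; it relies on `discretize()` and on the two-argument form `mutinformation(X, Y)` from infotheo:

```r
library(infotheo)

set.seed(1)
n <- 300
# Hypothetical stand-in data: 'a' drives the response quadratically,
# 'b' and 'c' are pure noise.
dat <- data.frame(y = numeric(n),
                  a = runif(n, -3, 3),
                  b = rnorm(n),
                  c = rnorm(n))
dat$y <- dat$a^2 + rnorm(n, sd = 0.3)

# Bin every column (equal-frequency bins by default)
disc <- discretize(dat)

# Mutual information between the response (column 1) and each candidate
mi <- sapply(2:ncol(disc), function(j) mutinformation(disc[[1]], disc[[j]]))
names(mi) <- names(dat)[-1]

# Highest score first
print(sort(mi, decreasing = TRUE))
```

The scores of noise variables are not exactly zero because the empirical estimator is biased upward on finite samples, so it is the ranking and the gap between scores that are informative rather than the absolute values.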


Source: https://habr.com/ru/post/1649764/

