What to use for multiple correlation?

I am trying to use Python to compute a multiple linear regression and the multiple correlation between a response array and a set of predictor arrays. I found a very simple example for the multiple linear regression, and that part is easy. But how do I calculate the multiple correlation with statsmodels, or with anything else as an alternative? I think I could use rpy and R, but I would prefer to stay in Python if possible.

Edit (clarification): Given the situation described here: http://sph.bu.edu/otlt/lamorte/EP713/Web_Pages/EP713_Regression/EP713_Regression3.html, I would also like to calculate the multiple correlation coefficients for the predictors, in addition to the regression coefficients and other regression parameters.

1 answer

You could do this with statsmodels and pandas. Something like this might get you started:

    import pandas
    import statsmodels.api as sm
    from statsmodels.formula.api import ols

    # Toy data: one row per subject
    data = pandas.DataFrame(
        [["A", 4, 0, 1, 27],
         ["B", 7, 1, 1, 29],
         ["C", 6, 1, 0, 23],
         ["D", 2, 0, 0, 20],
         ["etc.", 3, 0, 1, 21]],
        columns=["ID", "score", "male", "age20", "BMI"])

    # Pairwise correlations (drop the non-numeric ID column first)
    print(data.drop(columns="ID").corr())

    # Multiple linear regression of BMI on the three predictors
    model = ols("BMI ~ score + male + age20", data=data).fit()
    print(model.params)
    print(model.summary())

See the documentation:

http://statsmodels.sourceforge.net/devel/

http://pandas.pydata.org/

Edit: I am not familiar with the term "coefficient of multiple correlation", but I believe it is just the square root of the R-squared of the multiple regression model, no?

    print(model.rsquared ** 0.5)
    print(model.rsquared_adj ** 0.5)
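If you want to convince yourself of that, here is a minimal sketch using only NumPy (not statsmodels, so it is an independent cross-check rather than the library's own method). It assumes an ordinary least-squares fit with an intercept on the same toy data; the variable names are mine. For such a fit, the square root of R-squared should match the plain Pearson correlation between the observed and the fitted values, which is the usual definition of the multiple correlation coefficient.

```python
import numpy as np

# Same toy data as in the answer: BMI is the response.
score = np.array([4, 7, 6, 2, 3], dtype=float)
male = np.array([0, 1, 1, 0, 0], dtype=float)
age20 = np.array([1, 1, 0, 0, 1], dtype=float)
bmi = np.array([27, 29, 23, 20, 21], dtype=float)

# Design matrix with an explicit intercept column.
X = np.column_stack([np.ones_like(bmi), score, male, age20])

# Ordinary least-squares fit and fitted values.
coef, *_ = np.linalg.lstsq(X, bmi, rcond=None)
fitted = X @ coef

# R-squared from the residual and total sums of squares.
ss_res = np.sum((bmi - fitted) ** 2)
ss_tot = np.sum((bmi - bmi.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

# Multiple correlation coefficient, computed two ways.
multiple_r = np.sqrt(r_squared)
r_obs_fitted = np.corrcoef(bmi, fitted)[0, 1]

print(multiple_r, r_obs_fitted)
```

The two printed numbers should agree up to floating-point error. Note that with only five rows and four fitted parameters there is a single residual degree of freedom, so the value itself says little about real data.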

Is that what you need?


Source: https://habr.com/ru/post/1446924/

