Obtaining the Spark linear regression covariance matrix

I studied the Spark documentation, but still could not find how to get the covariance matrix after performing linear regression.

Given the input data training, I ran a very simple linear regression like this:

import org.apache.spark.ml.regression.LinearRegression

val lr = new LinearRegression()
val fit = lr.fit(training)

Obtaining regression parameters is as simple as fit.coefficients, but there seems to be no information on how to get the covariance matrix.

And just to clarify, I'm looking for a function similar to vcov in R. With it, I should be able to do something like vcov(fit) to get the covariance matrix. Any other methods that achieve this are also fine.


EDIT

How to obtain the covariance matrix from a linear regression is discussed in detail here. The error variance is easy to estimate, as it is provided by fit.summary.meanSquaredError. However, the (X'X)^-1 term is difficult to obtain. It would be interesting to see if this can be used to somehow calculate the covariance matrix.
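
A rough sketch of that idea, not a built-in Spark feature: compute X'X with RowMatrix.computeGramianMatrix, invert it on the driver with Breeze (which ships with Spark), and scale by the residual variance, i.e. sigma^2 * (X'X)^-1. The helper name vcov is hypothetical. The sketch assumes an unweighted, unregularized fit with an intercept, and few enough features that a (p+1) x (p+1) matrix fits on the driver.

```scala
import breeze.linalg.{inv, DenseMatrix => BDM}
import org.apache.spark.ml.linalg.{Vector => MLVector}
import org.apache.spark.ml.regression.LinearRegressionModel
import org.apache.spark.mllib.linalg.{Vectors => OldVectors}
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.sql.DataFrame

// Hypothetical helper (not part of Spark): sigma^2 * (X'X)^-1.
// Assumes fitIntercept = true, regParam = 0 and no weight column.
def vcov(fit: LinearRegressionModel, training: DataFrame): BDM[Double] = {
  val featuresCol = fit.getFeaturesCol
  // Design matrix rows: the features followed by a constant 1.0 for the intercept.
  val rows = training.select(featuresCol).rdd.map { row =>
    OldVectors.dense(row.getAs[MLVector](0).toArray :+ 1.0)
  }
  // X'X is computed on the cluster; the result is a small local matrix.
  val gram = new RowMatrix(rows).computeGramianMatrix()
  val xtx = new BDM(gram.numRows, gram.numCols, gram.toArray) // column-major
  // Unbiased residual variance: RSS / (n - number of estimated parameters).
  val n = fit.summary.numInstances.toDouble
  val sigma2 = fit.summary.meanSquaredError * n / (n - gram.numCols)
  inv(xtx) * sigma2
}
```

Calling vcov(fit, training) would then return the matrix with the features in their original order and the intercept in the last row and column.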

1 answer

Although the entire covariance matrix is collected on the driver, it cannot be obtained without writing your own solver. You can do this by copying WLS (Spark's weighted least squares solver) and adding extra getters.

The closest you can get without digging into the code is lrModel.summary.coefficientStandardErrors, which is based on the diagonal of the inverse of (A^T * W * A).
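
As a minimal sketch of what is available without a custom solver: squaring the standard errors gives the variances of the estimates, which is only the diagonal of the covariance matrix, so the off-diagonal covariances remain out of reach. The helper name coefficientVariances is hypothetical; per the Spark API docs, coefficientStandardErrors is only available with the "normal" solver, and when an intercept is fitted its standard error is the last element.

```scala
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.sql.DataFrame

// Hypothetical helper: variances of the estimates, i.e. only the diagonal
// of the covariance matrix (coefficients first, intercept last).
def coefficientVariances(training: DataFrame): Array[Double] = {
  // The "normal" solver is required for coefficientStandardErrors.
  val lrModel = new LinearRegression().setSolver("normal").fit(training)
  lrModel.summary.coefficientStandardErrors.map(se => se * se)
}
```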


Source: https://habr.com/ru/post/1692023/

