I studied the Spark documentation, but still could not find how to get the covariance matrix after performing linear regression.
Given the input data training, I ran a very simple linear regression similar to this:
import org.apache.spark.ml.regression.LinearRegression
val lr = new LinearRegression()
val fit = lr.fit(training)
Obtaining the regression parameters is as simple as fit.coefficients, but there seems to be no information on how to get the covariance matrix.
And just to clarify, I'm looking for a function similar to vcov in R. With this, I should be able to do something like vcov(fit) to get a covariance matrix. Any other methods that can help achieve this are also fine.
EDIT
How to obtain the covariance matrix from a linear regression is discussed in detail here. The error variance estimate is easy to obtain, as it is provided by fit.summary.meanSquaredError. However, the term (X'X)^-1 is difficult to obtain. It would be interesting to see whether this can be used to calculate the covariance matrix.
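To make the target concrete, here is a minimal sketch of the computation I have in mind, written in plain Scala without Spark: vcov = sigma^2 * (X'X)^-1, where sigma^2 = RSS / (n - p). The object VcovSketch and its helpers gram, invert and vcov are hypothetical names of my own, not Spark APIs. It assumes each row of X already carries a leading 1.0 for the intercept, and that the residual sum of squares can be recovered as meanSquaredError * n (my assumption about how Spark defines that metric). In Spark, the small p x p matrix X'X could presumably be computed in a distributed way, for example with RowMatrix.computeGramianMatrix(), and then inverted locally as below.

```scala
object VcovSketch {
  type Matrix = Array[Array[Double]]

  // X'X for a design matrix given as rows.
  // Assumption: each row already includes a leading 1.0 for the intercept.
  def gram(rows: Seq[Array[Double]]): Matrix = {
    val p = rows.head.length
    val g = Array.ofDim[Double](p, p)
    for (r <- rows; i <- 0 until p; j <- 0 until p) g(i)(j) += r(i) * r(j)
    g
  }

  // Gauss-Jordan inversion with partial pivoting; adequate for a small p x p matrix.
  def invert(m: Matrix): Matrix = {
    val p = m.length
    // Build the augmented matrix [m | I].
    val a = Array.tabulate(p, 2 * p) { (i, j) =>
      if (j < p) m(i)(j) else if (j - p == i) 1.0 else 0.0
    }
    for (col <- 0 until p) {
      // Pick the row with the largest absolute value in this column as pivot.
      val pivot = (col until p).maxBy(r => math.abs(a(r)(col)))
      require(math.abs(a(pivot)(col)) > 1e-12, "X'X is singular (collinear features?)")
      val tmp = a(col); a(col) = a(pivot); a(pivot) = tmp
      // Normalize the pivot row, then eliminate the column from all other rows.
      val d = a(col)(col)
      for (j <- 0 until 2 * p) a(col)(j) /= d
      for (r <- 0 until p if r != col) {
        val f = a(r)(col)
        for (j <- 0 until 2 * p) a(r)(j) -= f * a(col)(j)
      }
    }
    // The right half of the augmented matrix now holds the inverse.
    a.map(_.drop(p))
  }

  // vcov = sigma^2 * (X'X)^-1, with sigma^2 = RSS / (n - p).
  def vcov(rows: Seq[Array[Double]], rss: Double): Matrix = {
    val n = rows.length
    val p = rows.head.length
    val sigma2 = rss / (n - p)
    invert(gram(rows)).map(_.map(_ * sigma2))
  }
}
```

For example, VcovSketch.vcov(Seq(Array(1.0, 0.0), Array(1.0, 1.0), Array(1.0, 2.0)), 2.0) returns a 2 x 2 matrix whose diagonal holds the variances of the intercept and the slope. Dividing by n - p rather than n is a choice made to match the usual unbiased estimate; I have not verified that it matches what R's vcov would report for a Spark fit.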