I'm just starting to work with Mahout, and one thing that has puzzled me a lot is the lack of linear regression. Even logistic regression, which is much more complicated, has some support judging by what my searches turned up, but nothing at all mentions linear regression!
From what I understand, OLS is one of the easiest problems to solve: the linear regression model
Y = Xb + e
has the solution b = (X^T X)^(-1) X^T Y, where X^T is the transpose of X. If the matrix X^T X turns out to be singular (i.e. not invertible), it's fine to just show an error message, even though a solution still exists via a generalized inverse.
Computing both X^T X and X^T Y is just a matter of accumulating sums of products of the elements, which is probably about the easiest thing to do with MapReduce, as I understand it.
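To make concrete what I mean, here is a minimal sketch in plain Python. It is not Mahout code and does not use any Mahout API; the function names (map_row, combine, solve, ols) are made up for illustration, and the singularity tolerance is an arbitrary choice. Each row contributes its own products to X^T X and X^T Y (the "map" part), partial results are simply added together (the "reduce" part), and the small k-by-k system is then solved locally.

```python
from functools import reduce


def map_row(x, y):
    """Map step (hypothetical name): one observation's contribution to X^T X and X^T Y."""
    k = len(x)
    xtx = [[x[i] * x[j] for j in range(k)] for i in range(k)]
    xty = [x[i] * y for i in range(k)]
    return xtx, xty


def combine(a, b):
    """Reduce step (hypothetical name): element-wise sum of two partial results."""
    (xtx_a, xty_a), (xtx_b, xty_b) = a, b
    xtx = [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(xtx_a, xtx_b)]
    xty = [p + q for p, q in zip(xty_a, xty_b)]
    return xtx, xty


def solve(a, rhs, tol=1e-12):
    """Solve a * b = rhs by Gaussian elimination with partial pivoting.

    Raises ValueError when a pivot is (numerically) zero, i.e. when X^T X
    is singular. The tolerance is an assumed, arbitrary threshold.
    """
    n = len(a)
    m = [row[:] + [r] for row, r in zip(a, rhs)]  # augmented matrix [a | rhs]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            raise ValueError("X^T X is singular (not invertible)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution on the upper-triangular system.
    b = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * b[j] for j in range(i + 1, n))
        b[i] = (m[i][n] - s) / m[i][i]
    return b


def ols(rows):
    """rows: iterable of (x, y); x is a list of features (include 1.0 for an intercept)."""
    xtx, xty = reduce(combine, (map_row(x, y) for x, y in rows))
    return solve(xtx, xty)
```

For example, ols([([1.0, 0.0], 1.0), ([1.0, 1.0], 3.0), ([1.0, 2.0], 5.0), ([1.0, 3.0], 7.0)]) gives coefficients close to [1.0, 2.0], since the points lie on y = 1 + 2x. Because the partial sums can be added in any order, the same combine step could in principle serve as both combiner and reducer; only the final k-by-k solve has to happen in one place.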
(Which makes me wonder... is there a module that supports the matrix operations needed to compute the regression coefficients directly? That would really make a dedicated regression module unnecessary...)
Am I missing something that makes it difficult to compute regression in Mahout?