Why interaction attributes improve linear regression performance

I am working in Weka with a linear regression model. I noticed that multiplying two of the attributes in my dataset and adding the product as an additional attribute improves the performance of linear regression. However, I can't understand why! Why does the product of two attributes give better results?

1 answer

This is a sign that the function you are approximating is not linear in the original inputs, but it is linear in their product. Essentially, you have reinvented multivariate polynomial regression.

For example, suppose the function you are approximating is y = a × x² + b × x + c. A linear regression model fit on x alone will not give good results, but once you feed it both x² and x as attributes, it can recover the correct a and b.
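A minimal sketch of that point (using numpy and scikit-learn rather than Weka, purely for illustration, with made-up coefficients a=2, b=-3, c=1):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data from y = 2*x^2 - 3*x + 1 plus a little noise
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 100)
y = 2 * x**2 - 3 * x + 1 + rng.normal(0, 0.1, size=x.shape)

# Linear regression on x alone cannot capture the curvature
lin = LinearRegression().fit(x.reshape(-1, 1), y)
print("R^2 with x only:   ", lin.score(x.reshape(-1, 1), y))

# Adding x^2 as an extra attribute makes the target linear in the features
X_poly = np.column_stack([x, x**2])
poly = LinearRegression().fit(X_poly, y)
print("R^2 with x and x^2:", poly.score(X_poly, y))
print("learned (b, a) and c:", poly.coef_, poly.intercept_)  # roughly (-3, 2) and 1
```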

The same is true in the multivariate setting: the function may not be linear in x₁ and x₂ separately, but it may be linear in their product x₁ × x₂, which is what you call the "interaction attribute". (These are also known as cross-product features or feature conjunctions; this is what the polynomial kernel computes in an SVM, and it is why SVMs are more powerful learners than plain linear models.)
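The same idea with an interaction term, again just a sketch on synthetic data where the target happens to be exactly x₁ × x₂:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
x1 = rng.uniform(-2, 2, 500)
x2 = rng.uniform(-2, 2, 500)
y = x1 * x2 + rng.normal(0, 0.05, size=x1.shape)  # target depends only on the product

# Linear in x1 and x2 separately: poor fit
X = np.column_stack([x1, x2])
print("R^2 without interaction:", LinearRegression().fit(X, y).score(X, y))

# Add the interaction attribute x1*x2: the model is now linear in its inputs
X_int = np.column_stack([x1, x2, x1 * x2])
print("R^2 with interaction:   ", LinearRegression().fit(X_int, y).score(X_int, y))
```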


Source: https://habr.com/ru/post/1441651/

