After reading several guides, I was still confused about the meaning of model.matrix(~0+x), until I recently found this beautiful chapter in a book.
In mathematics, 0+a is equal to a, so an expression written as 0+a looks very strange. However, we are dealing with linear models here: a simple high-school equation such as y=ax+b, which describes the relationship between the predictor variable (x) and the observation (y).
Thus, we can think of ~0+x, or equivalently ~x+0, as an equation of the form y=ax+b in which adding 0 forces b to be zero. In other words, we are looking for a line y=ax that passes through the origin (no intercept). If we specified a model such as ~x+1, or simply ~x, the fitted equation would be allowed to contain a nonzero term b. Equivalently, we can remove b with the formula ~x-1 or ~-1+x, both of which mean "no intercept" (in the same way that a negative index excludes a row or column in R). However, something like ~x-2 or ~x+3 does not make sense.
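To make the effect of the 0 concrete, here is a minimal sketch that compares the design matrices R builds for these formulas. The values in x are made up purely for illustration.

```r
# Hypothetical predictor values, chosen only for illustration.
x <- c(1, 2, 3, 4)

# Default formula: an intercept column of ones plus the x column.
with_intercept <- model.matrix(~ x)

# Intercept suppressed: only the x column remains.
no_intercept <- model.matrix(~ 0 + x)

colnames(with_intercept)  # "(Intercept)" "x"
colnames(no_intercept)    # "x"

# The other spellings of "no intercept" should give the same columns.
all.equal(no_intercept, model.matrix(~ x + 0))
all.equal(no_intercept, model.matrix(~ x - 1))
```

The column of ones labelled (Intercept) is what carries the term b, so dropping it is what restricts the fit to a line through the origin.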
Thanks to @mnel for the helpful comment. Finally, why use ~ and not =? In standard mathematical notation, y~x means that y is equivalent to x, which is a somewhat weaker statement than y=x. When you set up a linear model you are not really saying y=x; rather, you are saying that y can be modelled as a linear function of x (y = ax+b, for example).
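As a rough illustration of "model y as a linear function of x", the sketch below fits both kinds of formula with lm(). The data are hypothetical and constructed to lie exactly on y = 2x + 1, so the estimates are easy to interpret.

```r
# Hypothetical data lying exactly on the line y = 2x + 1.
x <- c(1, 2, 3, 4)
y <- 2 * x + 1

# y ~ x asks lm() to estimate both a (slope) and b (intercept).
fit <- lm(y ~ x)
coef(fit)

# y ~ 0 + x forces b = 0, so only a slope is estimated.
fit_origin <- lm(y ~ 0 + x)
coef(fit_origin)
```

The first fit reports two coefficients, (Intercept) and x, while the second reports only x. Because these data do not pass through the origin, the slope of the second fit differs from 2.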