I suspect you are mistaken here: you can see where scikit-learn computes the negative gradient of the loss function and fits a base estimator to that negative gradient. It looks like the _update_terminal_region method is the one responsible for determining the step size, but nowhere do you see it solving the line-search minimization problem described in the documentation.
The reason you cannot find the line search is that, for the special case of decision tree regressors, which are just piecewise constant functions, the optimal solution is usually known in closed form. For example, if you look at the _update_terminal_region method of the LeastAbsoluteError loss, you will see that the leaves of the tree are assigned the weighted median of the difference between y and the predicted value, computed over the examples that fall into each leaf. This median is the known optimal solution.
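To make that concrete, here is a minimal sketch of how a leaf value can be obtained for the least-absolute-error case. The names weighted_median and lad_leaf_value are illustrative only, not scikit-learn's internal API, and sample weights are handled in the simplest possible way:

```python
import numpy as np

def weighted_median(values, weights):
    # Sort by value and return the first value whose cumulative weight
    # reaches half of the total weight.
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    cdf = np.cumsum(weights)
    return values[np.searchsorted(cdf, 0.5 * cdf[-1])]

def lad_leaf_value(y_leaf, pred_leaf, weight_leaf):
    # For least-absolute-error loss, the constant c that minimizes
    # sum_i w_i * |y_i - (pred_i + c)| over a leaf is the weighted
    # median of the residuals y_i - pred_i.
    return weighted_median(y_leaf - pred_leaf, weight_leaf)

# Example with uniform weights: the leaf value is a median residual.
y = np.array([3.0, 1.0, 7.0, 5.0])
pred = np.zeros(4)
w = np.ones(4)
print(lad_leaf_value(y, pred, w))  # -> 3.0
```

Because this closed-form minimizer exists, no iterative line search is needed for each leaf.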
To summarize what happens, at each gradient boosting iteration the following steps are performed (a minimal sketch of one iteration follows the list):
1. Compute the negative gradient of the loss function at the current prediction.
2. Fit a DecisionTreeRegressor to the negative gradient. This fitting produces a tree whose splits are good at decreasing the loss.
3. Replace the values in the leaves of the DecisionTreeRegressor with values that minimize the loss. These are usually computed with a simple closed-form formula that exploits the fact that a decision tree is just a piecewise constant function.
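Here is a hedged sketch of those three steps for the least-absolute-error loss. The function lad_boost_stage is a made-up name for illustration; it ignores sample weights and other details of scikit-learn's actual _update_terminal_region implementation:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def lad_boost_stage(X, y, pred, learning_rate=0.1, max_depth=3):
    # 1. Negative gradient of |y - pred| with respect to pred is sign(y - pred).
    residual = y - pred
    neg_grad = np.sign(residual)

    # 2. Fit a regression tree to the negative gradient; its splits are
    #    chosen so that the tree predicts the gradient well.
    tree = DecisionTreeRegressor(max_depth=max_depth)
    tree.fit(X, neg_grad)

    # 3. Overwrite each leaf value with the loss minimizer restricted to
    #    that leaf: for least-absolute-error this is the median of the
    #    residuals of the samples that fall into the leaf.
    leaf_ids = tree.apply(X)
    for leaf in np.unique(leaf_ids):
        in_leaf = leaf_ids == leaf
        tree.tree_.value[leaf, 0, 0] = np.median(residual[in_leaf])

    # Update the running prediction with a shrunken step.
    return pred + learning_rate * tree.predict(X), tree
```

Iterating this function, starting from an initial constant prediction, gives the basic boosting loop; step 3 is where the "step size" is effectively decided, which is why no explicit line search appears in the code.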
This approach should be at least as good as the line search described in the documentation, but I think that in some cases it may not be exactly identical to it.