Why does Monte Carlo tree search reset the tree?

I have a small and possibly stupid question about Monte Carlo tree search (MCTS). I understand most of it, but I looked at some implementations and noticed that after running MCTS for a given state and returning the best move, the tree is discarded. So for the next move, we have to run MCTS from scratch on the new state to get the next best move.

I'm just wondering why we don't keep some of the information from the old tree. It seems the old tree holds valuable information about the states it contains, especially since the best move is the one MCTS explored the most. Is there any particular reason we can't reuse this old information in some useful way?

1 answer

Some implementations do retain information.

For example, the AlphaGo Zero paper says:

The search tree is reused at subsequent time-steps: the child node corresponding to the played action becomes the new root node; the subtree below this child is retained along with all its statistics, while the remainder of the tree is discarded.
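The re-rooting step described in that quote can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Node` class, its fields (`visits`, `value_sum`, `children`, `parent`) and the `advance_root` helper are hypothetical names, not taken from AlphaGo Zero or from any particular library.

```python
class Node:
    """Hypothetical MCTS node holding the usual visit/value statistics."""

    def __init__(self, visits=0, value_sum=0.0, parent=None):
        self.visits = visits
        self.value_sum = value_sum
        self.parent = parent
        self.children = {}  # maps action -> Node


def advance_root(root, action):
    """Return the root to use for the next search after `action` is played.

    If the played action was already expanded, its child becomes the new
    root and keeps its whole subtree and statistics. Otherwise we fall
    back to an empty tree, as if starting from scratch.
    """
    child = root.children.get(action)
    if child is None:
        return Node()
    # Detach from the old root so the siblings and the old root are no
    # longer reachable from the new tree and can be garbage-collected.
    child.parent = None
    return child
```

Usage would look roughly like `root = advance_root(root, played_action)` before starting the next search. In a two-player game the same call is made again for the opponent's reply, so the tree follows the actual game state.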


Source: https://habr.com/ru/post/1689558/

