One way to solve this problem is to rethink how your binary search tree is constructed, containing aggregate totals. Instead of building a binary search tree, consider that each node is interpreted as follows:
- Each node stores a range of values ββintended for the node itself.
- The nodes in the left subtree represent a sample from the probability distribution only to the left of this range.
- The nodes in the right subtree represent a sample from the probability distribution to the right of this range.
For example, suppose our weights are 3, 2, 2, 2, 2, 1, and 1 for events A, B, C, D, E, F, and G. We are building this binary tree containing A, B, C, D, E, F and G:
D / \ BF / \ / \ ACEG
Now we annotate the tree with probabilities. Since A, C, E and G are all sheets, each of them gives a lot of probability:
D / \ BF / \ / \ ACEG 1 1 1 1
Now look at the tree for B. B has a weight of 2 selected, A has a weight of 3 selected, and C has a probability of 2 choices. If we normalize them to the range [0, 1), then A takes into account 3/7 probabilities, and B and C - an account for 2 / 7s. Thus, we have a node for B that everything that is in the range [0, 3/7) goes to the left subtree, everything in the range [3/7, 5/7) maps to B, and everything in the range [ 5/7, 1) is displayed in the right subtree:
D / \ BF [0, 3/7) / \ [5/7, 1) / \ ACEG 1 1 1 1
Similarly, let a process F. E has a weight of 2 selected, and F and G have a probability weight of 1. Thus, the subtree for E here takes into account 1/2 of the mass of probability, node F takes into account 1/4, and the subtree for G - 1 /4. This means that we can assign probabilities as
D / \ BF [0, 3/7) / \ [5/7, 1) [0, 1/2) / \ [3/4, 1) ACEG 1 1 1 1
Finally, look at the root. The total weight of the left subtree is 3 + 2 + 2 = 7. The total weight of the right subtree is 2 + 1 + 1 = 4. The weight of D itself is 2. Thus, the left subtree has a probability of 7/13 of being selected, D has a probability of 2 / 13 be selected, and the right subtree has a probability of 4/13 to be selected. Thus, we can refine the probabilities as
D [0, 7/13) / \ [9/13, 1) BF [0, 3/7) / \ [5/7, 1) [0, 1/2) / \ [3/4, 1) ACEG 1 1 1 1
To create a random value, you must repeat the following:
- Starting at the root:
- Select a uniformly random value in the range [0, 1].
- If it is in the range for the left subtree, go down into it.
- If it is in the range for the right subtree, go down into it.
- Otherwise, return the value corresponding to the current node value.
Probabilities themselves can be determined recursively when building a tree:
- The left and right probabilities are 0 for any leaf node.
- If the interior of the node itself has a weight W, its left tree has a total weight W L , and its right tree has a total weight W R , then the probability on the left is (W L ) / (W + W L + W R ), and the correct probability is (W R ) / (W + W L + W R ).
The reason this reformulation is useful is because it allows us to update probabilities in O (log n) time for each updated probability. In particular, think about which invariants will change if we update some node weight. For simplicity, suppose node is now a leaf. When we update the weight of the node leaf, the probabilities are still true for the node leaf, but they are incorrect for the node just above it, because the weight of one of these node subtrees has changed. Thus, we can (in O (1) times) recalculate the probabilities of the parent node using the same formula as above. But then the parent element node no longer has the correct values, because one of its subtree weights has changed, so we can also recalculate the probability there. This process is repeated right up to the root of the tree, and we do an O (1) calculation per level to correct the weights assigned to each edge. Assuming the tree is balanced, we have to do the O (log n) common work to update one probability. The logic is identical if node is not a leaf node; we just start somewhere in the tree.
In short, it gives
- O (n) time for building a tree (using the bottom-up approach),
- O (log n) to generate a random value and
- O (log n) time to update any value.
Hope this helps!