Efficiently simulate rolling weighted dice (or walking a weighted graph) with frequent updates

I have a dense weighted directed graph with about 20,000 nodes.

  • Given a node in the graph, I randomly select an adjacent node, with probability proportional to the relative edge weights.
  • After each choice, I get feedback on whether the choice was good or bad, and I update the graph. For example, after a bad selection, I reduce the weight of every edge pointing to the selected node.

I learned yesterday about the alias method for simulating rolls of a weighted die, which is exactly my selection problem (each node is one weighted die, and its faces correspond to the adjacent nodes). A single roll is highly efficient, but updating the weights is not; the alias method may not be suitable because I will update weights more often than I will roll!
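For reference, here is a minimal sketch of the alias method (Vose's variant) in Python. The names and structure are my own, not from any particular library; it illustrates the tension described above: the table build is O(k), so any weight change forces a full rebuild, while each roll afterwards is O(1).

```python
import random

def build_alias(weights):
    """Vose's alias method: O(k) preprocessing for O(1) sampling.

    Returns (prob, alias) tables for a die whose face i has
    relative weight weights[i]."""
    k = len(weights)
    total = sum(weights)
    scaled = [w * k / total for w in weights]
    prob, alias = [0.0] * k, [0] * k
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s], alias[s] = scaled[s], l
        # The "large" face donates the slack in column s.
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    for i in small + large:  # leftovers are numerically ~1
        prob[i] = 1.0
    return prob, alias

def roll(prob, alias):
    """One O(1) roll: pick a uniform column, then flip a biased coin."""
    i = random.randrange(len(prob))
    return i if random.random() < prob[i] else alias[i]
```

Any single weight change invalidates both tables, which is why the method fits "build once, roll many times" workloads rather than this one.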

Which data structure allows frequent updates, and which selection algorithm works best with it?


Some ideas / notes:

  • I can reduce update costs by recording each weight change and only actually updating the node / die when necessary (i.e., just before a roll). But then I would still recompute the alias data once per roll.
  • Alternatively, I could keep the graph as it is (so that updates are cheap) and abandon the alias method. I would compute the cumulative weights on the fly before each roll (binary search works here).
  • A further advantage of computing relative weights on the fly is that I could factor out a "global weight" for each node to reduce updates even more. Then a bad choice would lead to only two updates: the weight of the incoming edge that was used, and the global weight of the node (which scales all edges pointing to it).
  • Added: maybe there is a middle ground: some way of maintaining the local relative weights in a data structure (for example, a tree, or alias tables), and then merging them with the "global weights" on the fly at each roll.
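The second idea above — no precomputation, just cumulative weights plus a binary search at roll time — can be sketched in a few lines. This is my own illustration, with hypothetical parallel lists `neighbors` and `weights`:

```python
import bisect
import random

def pick_neighbor(neighbors, weights):
    """Select a neighbor with probability proportional to its weight.

    O(k) to build the prefix sums on each roll, O(log k) for the
    binary search; weight updates between rolls cost O(1) because
    nothing is cached."""
    prefix = []
    total = 0.0
    for w in weights:
        total += w
        prefix.append(total)
    v = random.uniform(0.0, total)
    # First prefix sum >= v identifies the chosen neighbor.
    return neighbors[bisect.bisect_left(prefix, v)]
```

The O(k) prefix-sum rebuild per roll is the price of making every update free, which is exactly the trade-off the question is weighing.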

The truth is that in practice I don't need to roll very often (no more than once a minute), so I don't strictly need the most efficient solution. But this is an interesting project, and I'm curious what the theoretically optimal solution is.

1 answer

I think you can do this with O(log k) complexity, where k is the number of faces on the die.

For one particular node, let p1, p2, ..., pk be the relative probabilities of its edges, and let p1 + p2 + ... + pk = p.

Build a tree structure with these relative probabilities as the leaves. Each non-leaf node stores the sum of its children's values. To roll the die, draw a random value between 0 and p and follow it down the tree. To update the relative probability of a face, change the corresponding leaf value and propagate the change up the tree.

Thus you can select an outcome with a single random number in the O(log k) steps needed to find the leaf matching that number, and updating one leaf takes O(log k) time to repair the tree.

This is a very terse description of the solution; let me know if you need a fuller one. I'm sure it works, but I'm not sure whether it is efficient enough for your case.

To summarize the algorithm:

  1. Only one random number is drawn, between 0 and p.
  2. O(log k) to "roll the die" (i.e., to find the next node), where k is the number of faces on the die.
  3. O(log k) to "update the die" of one node. If the selected node has m incoming edges, the total update cost is O(log k1) + O(log k2) + ... + O(log km), where k1, k2, ..., km are the degrees of the adjacent nodes.
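A minimal sketch of this tree as an array-based sum tree in Python (the class and method names are my own): leaves live at indices k..2k−1, every internal node holds the sum of its two children, and both `roll` and `update` walk one root-to-leaf path.

```python
import random

class SumTree:
    """Weighted die with O(log k) roll and O(log k) weight update."""

    def __init__(self, weights):
        self.k = len(weights)
        self.tree = [0.0] * (2 * self.k)
        for i, w in enumerate(weights):
            self.tree[self.k + i] = float(w)
        # Internal node i is the sum of its children 2i and 2i+1.
        for i in range(self.k - 1, 0, -1):
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]

    def update(self, face, weight):
        """Change one face's weight and repair sums up to the root."""
        i = self.k + face
        self.tree[i] = float(weight)
        i //= 2
        while i >= 1:
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]
            i //= 2

    def roll(self):
        """Draw v in [0, p) and descend: left if v falls in the left
        subtree's mass, otherwise subtract it and go right."""
        v = random.uniform(0.0, self.tree[1])
        i = 1
        while i < self.k:
            if v < self.tree[2 * i]:
                i = 2 * i
            else:
                v -= self.tree[2 * i]
                i = 2 * i + 1
        return i - self.k  # face index
```

The root `tree[1]` always holds the current total p, so no separate normalization pass is needed after updates.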

====Tree Example==== 

If the die has 4 faces with relative probabilities 1:50, 2:80, 3:20 and 4:70, build the tree as follows:

            220
           /   \
        130     90
        /  \   /  \
      50   80 20   70
       |    |  |    |
       1    2  3    4

Draw a random number v between 0 and 220. If v = 100: at the root, 100 < 130, so take the left branch. At the 130 node the left child is 50; since 100 ≥ 50, take the right branch and set v = 100 − 50 = 50. We have reached the leaf 80, so the output is face 2.

If v = 210: 210 ≥ 130, so go right and set v = 210 − 130 = 80. At the 90 node the left child is 20; since 80 ≥ 20, go right and set v = 80 − 20 = 60. We land on the leaf 70, so the output is face 4.

If face 4's weight changes from 70 to 60, change the leaf value from 70 to 60, its parent from 90 to 80, and the root from 220 to 210.
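The two walkthroughs above can be checked mechanically. Here is a small sketch of that exact descent (my own code, assuming a complete tree over a power-of-two number of faces, as in the example):

```python
def descend(weights, v):
    """Follow value v down a sum tree built on the fly from the leaves.

    Faces are numbered from 1, matching the example above."""
    # Build parent sums bottom-up: [50, 80, 20, 70] -> [130, 90] -> [220]
    levels = [list(map(float, weights))]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([prev[i] + prev[i + 1]
                       for i in range(0, len(prev), 2)])
    # Walk from the root: go left while v is below the left child's sum,
    # otherwise subtract that sum and go right.
    idx = 0
    for level in reversed(levels[:-1]):
        left = level[2 * idx]
        if v < left:
            idx = 2 * idx
        else:
            v -= left
            idx = 2 * idx + 1
    return idx + 1
```

With the example weights, `descend([50, 80, 20, 70], 100)` reproduces the first walkthrough (face 2) and `descend([50, 80, 20, 70], 210)` the second (face 4).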

====Lazy Update Variation====

When a weight changes, do not update the tree immediately. Instead, just mark the corresponding leaf as "dirty" and wait until you actually need to roll from that particular node.

When you do need to roll from a specific node and some of its weights are dirty, either (a) update the tree only along the paths from the dirty leaves, or (b) rebuild the whole tree. If t is the number of dirty weights and k the total number of weights: when t · log(k) < k, update only the dirty leaves at a cost of O(t · log(k)); otherwise rebuild the whole tree in O(k).
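A minimal sketch of the lazy idea (my own code and naming): weight changes only record the new value and set a dirty flag; the sampling structure is rebuilt just before the next roll, and only if something changed. For brevity this sketch rebuilds flat prefix sums in O(k) rather than patching a tree in O(t · log k), so it corresponds to option (b) above.

```python
import bisect
import random

class LazyDie:
    """Weighted die with O(1) updates and deferred rebuilds."""

    def __init__(self, weights):
        self.weights = list(map(float, weights))
        self.prefix = []
        self.dirty = True

    def set_weight(self, face, weight):
        # O(1): just record the change and mark the die dirty.
        self.weights[face] = float(weight)
        self.dirty = True

    def roll(self):
        if self.dirty:  # rebuild only when a roll actually happens
            total = 0.0
            self.prefix = []
            for w in self.weights:
                total += w
                self.prefix.append(total)
            self.dirty = False
        v = random.uniform(0.0, self.prefix[-1])
        return bisect.bisect_left(self.prefix, v)
```

Between rolls, any number of `set_weight` calls cost O(1) each; the O(k) rebuild is paid at most once per roll, which matches the question's "no more than once a minute" workload well.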


Source: https://habr.com/ru/post/905336/

