Let's say that I have a list of data: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, where n = 10 elements
I would like to randomly select k elements of this set to form a sublist, say k = 5.
In this case, I could get a sublist that looks like {9, 3, 5, 2, 7}
I could do this:
- Randomly determining the offset in the list between 0 and the current list size minus 1
- Adding this item to my sublist
- Removing this item from the source list
- Repeat until the desired size is found.
The problem is that as the original list grows, the offset and deletion times increase, and for any significantly large list (for example, more than 1,000,000 items), this algorithm takes quite a lot of time.
Is there a faster way to generate a random sequence from a list of data? The implementation of the random number generator should be rejected for this task, instead focusing on how the RNG result is used in the proposed algorithm.
Any thoughts?
I am now using the STL C ++ list
source
share