Data Structure Recommendations

Developing in Java, do I need a data structure to select N different random numbers from 0 to 999999?

I want to be able to quickly distribute N numbers and make sure that they are not repeated.

The main goal is not to use too much memory and maintain performance.

I am considering using BitSet, but I'm not sure what the consequences of memory are. Can someone tell me if the memory requirements of this class are related to the number of bits or the number of bits set? and what is the difficulty of setting / testing bits?

UPDATE: Thanks for all the answers.

I think I had this in my original wording of this Q, but deleted it when I first saw the BitSet class. In any case, I would like to add the following information: Currently I am considering N only a few thousand (most likely about 1000-2000) and a range from 0 to 999999. But I would like my choice to take into account the possibility of increasing the range to 8 digits (then there are from 0 to 99,999,999), keeping N approximately in the same range (possibly increasing it to 5K or 10K). Thus, the “values ​​used” are quite rare.

0
source share
2 answers

It depends on how big N .

With small N values, you can use the HashSet<Integer> to store numbers that you have already issued. This gives you an O(1) lookup and O(N) space usage.

A BitSet for the range 0-999999 will use approximately 125 Kbytes, regardless of the value of N With sufficiently large N values, this will be more space than a HashSet . I'm not sure if the value is N , where BitSet will use less space, but my guest will be between 10,000 and 20,000.


Can someone tell me if the BitSet memory requirements BitSet number of bits or the number of bits set?

The size is determined either by the largest bit that has ever been set, or by the nBits parameter if you use the BitSet(int nBits) constructor BitSet(int nBits) .

and what is the difficulty of setting / testing bits?

Testing B O(1) bits O(1) .

The setting bit B is the best O(1) and O(B) if you need to expand an array of bit set support. However, since the size of the support array is the next highest power of 2, the cost of expansion can usually be amortized over several BitSet operations.

+4
source

A BitSet will take up as much space as 1,000,000 Booleans, which is 125,000 bytes or approximately 122 kB, plus some minor overhead and room for growth. An array of actual numbers i.e. int[] , takes an N × 4B space, plus some overhead. Breakeven point

 4 × N = 125,000 N = 31250 

I am not familiar with the internal components of Java, but I suspect that it will not allocate more than two times the actual space, so you are using less than 250 KB of memory with a bitrate. In addition, the array makes it difficult to find duplicates when you need unique integers, so I would use the bitrate anyway and maybe convert it to an array at the end, if it is more convenient for further processing.

Setting / getting bits in BitSet will be of constant complexity, although several operations are required than getting one of boolean[] .

0
source

Source: https://habr.com/ru/post/1497746/


All Articles