What is the best technique for creating a random access data structure lazily?

In Haskell, I would like to generate a list of random integers of indefinite length. (However, less than 1 million.)

Iโ€™m unlikely to need all the elements of the list at once, so I would like to generate it lazily. However, after the generation, I will need to access the items in the random access list. So, I believe that the best method would be to copy an infinite list to an array. However, I donโ€™t know if arrays can be "swapped" with lists very easily - for example, as soon as I want to generate more list elements, I want to start where I left off but expand the array.

In any case, maybe I should solve for some tree structure log (n) instead of an array? What do you think?

Does Haskell have a tree structure for sequential items that can be accessed using a list-like API?

+4
source share
5 answers

If you have good PRNG, then the values โ€‹โ€‹that are generated by seeds that are close to each other should be independent, so you can just select the initial seed, then each cell i in the array will be prng(seed+i) .

This essentially just uses it as a hash function: P

If you do it this way, you donโ€™t even need an array, you can just have the function getRandomValue seed i = prng (seed+i) . Whether this is better than an array of lazy values โ€‹โ€‹will depend on the complexity of your PRNG, but in any case you will get O (1) access.

+7
source

This seems to be useful, but you will need to generate larger trees as the list grows: http://www.codeproject.com/KB/recipes/persistentdatastructures.aspx#RandomAccessLists (it is a log (n) structure, although). Maybe something like http://research.janelia.org/myers/Papers/applicative.stack.pdf , but I donโ€™t see how to do this efficiently (with a general structure) when the list is generated dynamically.

One possible approach would be to use a pseudo-random number generator that allows โ€œslippingโ€ or going forward steps n in sublinear time. One method (slow, but works in constant time) is to use block encryption in counter mode; see http://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator#Designs_based_on_cryptographic_primitives . See http://cs.berkeley.edu/~mhoemmen/cs194/Tutorials/prng.pdf for details.

+4
source

Not an answer, but a consideration: note that any standard technique for creating a list of random values โ€‹โ€‹will include threads through the seed. Therefore, if I create a lazy random infinite list and force the value of the 100th element, this forces not only the spine of the cells minus 0-99, but also their values.

In any case, I would recommend lazy persistent vectors: http://hackage.haskell.org/package/storablevector

You still have O (n) access, but you reduced it by the size factor of your chunk, so in practice it is much faster than a simple list. And you still get the modulo size, laziness properties that interest you.

+2
source

What about Data.Sequence.Seq ? It is not lazy, but supports O (1) from both ends, and lookups - O (log (min (i, ni))), which should be better than most other tree structures. You can save Seq and generate / add additional outputs as needed.

There is also a ListLike instance here if you prefer this API over the one provided by Seq .

0
source

You can only create a list of arrays. Therefore, if each array stores the elements M your sequence, access to element n is O(n/M) . This can be a good environment.

For instance:

 import Data.Array import Data.List fibsList :: (Num a) => [a] fibsList = 1 : 1 : zipWith (+) fibsList (tail fibsList) chunkSize :: (Num a, Ix a) => a chunkSize = 100000 fibsChunks :: (Num i, Ix i, Num e) => [Array ie] fibsChunks = mkChunks fibsList where mkChunks fs = listArray (0,chunkSize-1) xs : mkChunks ys where (xs,ys) = splitAt chunkSize fs lookupFib :: (Integral i, Ix i, Num e) => i -> e lookupFib n = fibsChunks `genericIndex` q ! r where (q,r) = n `divMod` chunkSize 
0
source

Source: https://habr.com/ru/post/1339122/


All Articles