Bruteforce with lazy pricing and memory consumption

I implemented a small bruteforce function using lazy evaluation to find the first feasible solution to the problem:

 import Data.Maybe bruteforce :: (a -> Bool) -> [a] -> Maybe a bruteforce f xs | null result = Nothing | otherwise = Just $ head result where result = mapMaybe bruteforce' xs -- test one instance bruteforce' x | fx = Just x | otherwise = Nothing generatorString :: Int -> [String] generatorString 0 = [""] generatorString deep = concatMap (\x -> map (\ys -> (x:ys)) nextgen) ['a'..'z'] where nextgen = generatorString (deep - 1) main :: IO () main = do putStrLn $ fromJust $ bruteforce ((==) "zabcde") (generatorString 6) 

also available as gist . yes, the test function is stupid, but that’s enough to show what the problem is. It works as expected, but the memory consumption is quite large:

 $ ghc --make -O2 brute.hs -o brute -rtsopts $ ./brute +RTS -s zabcde 24,752,866,120 bytes allocated in the heap 15,748,430,224 bytes copied during GC 581,741,352 bytes maximum residency (22 sample(s)) 18,464,056 bytes maximum slop 1719 MB total memory in use (0 MB lost due to fragmentation) [...] 

so I tried to get RTS to use less memory (i.e. most often call GC). 200 MB should be enough?

 $ ./brute +RTS -s -M200M Heap exhausted; Current maximum heap size is 209715200 bytes (200 MB); use `+RTS -M<size>' to increase it. 

well, no (I still would not like this approach ...)

Any pointers to how bruteforce can be rewritten to be more memory friendly?

+6
source share
4 answers

If memory consumption is your primary concern, stop sharing nextgen . This is a huge list.

So replace

 generatorString deep = concatMap (\x -> map (\ys -> (x:ys)) nextgen) ['a'..'z'] where nextgen = generatorString (deep - 1) 

from

 generatorString deep = concatMap (\x -> map (\ys -> (x:ys)) $ generatorString (deep - 1)) ['a'..'z'] 

and tell the compiler that you are not seriously sharing it:

 $ ghc -O2 -fno-full-laziness -rtsopts bruteforce 

It works in

  $ ./bruteforce +RTS -s zabcde 189,448,835,904 bytes allocated in the heap 18,301,350,520 bytes copied during GC 29,504 bytes maximum residency (16901 sample(s)) 37,248 bytes maximum slop 2 MB total memory in use (0 MB lost due to fragmentation) 

small resident memory. Of course, recalculation means that the overall distribution is much higher, and it takes much longer to calculate the result.

A better algorithm can reduce space and time consumption.

+10
source

You can skip all those Just and Nothing , I think ...

 import Data.Maybe (listToMaybe) bruteforce :: (a -> Bool) -> [a] -> Maybe a bruteforce f = listToMaybe . filter f 

It is probably just as good for memory as bruteforce ; any other problems arise due to the function f , which is roughly forced.


The generatorString function can be rewritten as follows:

 import Control.Monad (replicateM) generatorString :: Int -> [String] generatorString = flip replicateM ['a'..'z'] 

If you want me to explain how this works, let me know in the comments. What is improving is that it uses prefix sharing instead of suffix sharing. That is, it generates strings as follows:

 "aa" "ab" ... "az" "ba" 

... instead of:

 "aa" "ba" ... "za" "ab" "bb" 

This means that the prefix (in this simple example, only ('a':) and then ('b':) ) is common, instead of storing a list of suffixes and reusing it (list ["a", "b", "c", ..., "z"] in this example). You can imagine that as n increases, the list of suffixes will grow with 26^n , and the list of prefixes will grow with n . Of course, this is associated with the cost of building the entire current line at each iteration, but memory usage is much lower.

+7
source

I think the generator is too strict. It is assumed that the optimal generator will give as many results as possible with minimal information about the result of the recursive application.

Consider the following line.

 concatMap (\x -> map (\ys -> (x:ys)) nextgen) ['a'..'z'] 

Now let's check what happens if we only know that nextgen is not an empty list. To illustrate this, we will replace the nextgen variable nextgen expression undefined:undefined . The following equations illustrate the evaluation of the expression in question.

  concatMap (\x -> map (\ys -> (x:ys)) (undefined:undefined)) ['a'..'z'] = concat (map (\x -> map (\ys -> (x:ys)) (undefined:undefined)) ['a'..'z']) = concat (map (\ys -> ('a':ys)) (undefined:undefined) : map (\x -> map (\ys -> (x:ys)) (undefined:undefined)) ['b'..'z']) = concat (('a':undefined) : undefined) : map (\x -> map (\ys -> (x:ys)) (undefined:undefined)) ['b'..'z']) = ('a':undefined) : undefined 

Your particular application may already refuse many results by comparing the first character of the generated string and the search string. Therefore, we are looking for a generator that gives heads as quickly as possible.

Let's change the roles of static information (list of characters) and a recursive application. We get the following expression.

 concatMap (\ys -> map (:ys) ['a'..'z']) nextgen 

Now we make the same calculation as before.

  concatMap (\ys -> map (:ys) ['a'..'z']) (undefined:undefined) = concat (map (\ys -> map (:ys) ['a'..'z']) (undefined:undefined)) = concat (map (:undefined) ['a'...'z'] : map (\ys -> map (:ys) ['a'..'z']) undefined) = map (:undefined) ['a'...'z'] ++ concat (map (\ys -> map (:ys) ['a'..'z']) undefined 

The map (:undefined) ['a'...'z'] application map (:undefined) ['a'...'z'] already provides a list in which all chapters are defined. Thus, the test may already fail for most of these lines, only evaluating the recursive application to handle the normal form.

With this modified implementation, we get the following results.

 $ ./brute +RTS -s zabcde 4,165,170,696 bytes allocated in the heap 5,569,320 bytes copied during GC 29,848 bytes maximum residency (5 sample(s)) 26,560 bytes maximum slop 2 MB total memory in use (0 MB lost due to fragmentation) 

However, since this change is very specific to the application at hand, it may not apply to your actual application.

+6
source

Since no one seems to have mentioned this, I would like to point out that

bruteforce = Data.List.find

+1
source

Source: https://habr.com/ru/post/920009/


All Articles