Why does this Haskell program lose space when compiling with optimizations?

Consider the following toy program, which calculates all combinations of character substitutions in a word that are often used in passwords.

import Data.Char (isLower, toUpper) variants :: String -> [String] variants "" = [""] variants (c:s) = [c':s' | c' <- subst c, s' <- variants s] where subst 'a' = " aA@ " subst 'e' = "eE3" subst 'i' = "iI1" subst 'l' = "lL1" subst 'o' = "oO0" subst 's' = "sS$5" subst 'z' = "zZ2" subst x | isLower x = [x, toUpper x] subst x = [x] main :: IO () main = putStrLn $ show $ length $ variants "redistributables" 

I will compile it with and without optimization:

 $ ghc -fforce-recomp -Wall Test.hs -o test0 [1 of 1] Compiling Main ( Test.hs, Test.o ) Linking test0 ... $ ghc -fforce-recomp -O -Wall Test.hs -o test1 [1 of 1] Compiling Main ( Test.hs, Test.o ) Linking test1 ... 

Now test0 and test1 produce the same output, but test1 uses much more memory and spends most of its time collecting garbage:

 $ ./test0 +RTS -s 2>&1 | grep total 2 MB total memory in use (0 MB lost due to fragmentation) Productivity 93.2% of total user, 93.3% of total elapsed $ ./test1 +RTS -s 2>&1 | grep total 188 MB total memory in use (0 MB lost due to fragmentation) Productivity 15.0% of total user, 15.0% of total elapsed 

Why?

Im using GHC 7.4.1; I should probably use a newer compiler, but at the moment this is what I have, and the error probably still lies with me.

+6
source share
2 answers

Do you want to

 variants (c:s) = [c':s' | c' <- subst c, s' <- variants s] 

to compile into the outer loop and inner loop. But the GHC sees that the inner loop is in no way dependent on an external "loop counter". Therefore, a complete conversion of laziness raises the inner loop from the outer loop. One pretty effective trick is to hide the fact that the inner loop is independent. This is done by dividing the inner loop into a separate function that takes a dummy argument, and hiding fictitiousness, designating the function as NOINLINE . Then you can call the function using the outer loop counter, and the GHC usually refrains from messing with you.

+5
source

The trick is to cause the redistribution of suffixes instead of storing them in memory. It looks like

 powerset (x:xs) = map (x:) (powerset xs) ++ powerset xs 

where adding the where clause is harmful (or is it powerset (x:xs) = powerset xs ++ map (x:) (powerset xs) ...?).

In your case, the code to check is mapM subst or

 variants (c:cs) = variants cs >>= \s-> map (:s) (subst c) 

You can see that the latter works in the β€œopposite direction” from your list comprehension code, so maybe just

 variants (c:s) = [c':s' | s' <- variants s, c' <- subst c] 

will work too.

All this is equivalent, so this is a compiler. Hope someone can provide more details on this.

+3
source

Source: https://habr.com/ru/post/989642/


All Articles