Using referential transparency to pre-compute values in Haskell

Say we have a program like this:

list = [1..10000000]
main = print $ sum list

I want this to be compiled in such a way that the executable simply prints 50000005000000, without doing all that work at runtime.
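For reference, the expected constant follows from Gauss's closed form for the sum of consecutive integers, which is easy to verify once by hand (this snippet is my own illustration, not part of the question):

```haskell
-- Gauss's formula: sum [1..n] == n * (n + 1) `div` 2,
-- so for n = 10000000 the precomputed answer should be 50000005000000.
main :: IO ()
main = do
  let n = 10000000 :: Integer
  print (n * (n + 1) `div` 2)
```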

Basically, any expression that will necessarily be computed (perhaps strictness analysis can help here) could be precomputed at compile time (i.e. we use referential transparency to argue that it doesn't really matter when the value is calculated).

In short: "must be computed" + referential transparency = can be precomputed

It would be like running the program until we hit something that depends on the input (i.e. the core of the program, which is common to all inputs, would be pre-computed).

Is there an existing mechanism for achieving this today (in Haskell or any other language)? [Please don't point to something like C++ templates, since they lack referential transparency in the first place.]

If not, how hard is this problem? [What are the relevant technical (and theoretical) issues?]

+6
3 answers

The general-purpose answer for compile-time computation is Template Haskell. But for this specific use case, you can use the vector library with the LLVM backend, and GHC will optimize the sum down to a constant.

 sorghum:~% cat >test.hs
 import Data.Vector.Unboxed as V
 main = print (V.sum $ V.enumFromN (1 :: Int) 10000000)
 sorghum:~% ghc --make -O2 -fllvm test.hs
 [1 of 1] Compiling Main             ( test.hs, test.o )
 Linking test ...
 sorghum:~% time ./test
 50000005000000
 ./test  0.00s user 0.00s system 0% cpu 0.002 total
 sorghum:~% objdump -d test | grep 2d7988896b40
   40653d:	48 bb 40 6b 89 88 79	movabs $0x2d7988896b40,%rbx
   406547:	49 be 40 6b 89 88 79	movabs $0x2d7988896b40,%r14

(In case it's not immediately obvious, 0x2d7988896b40 is 50000005000000.)
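For completeness, the Template Haskell route mentioned above can be sketched like this. This is a minimal sketch: the module name Precompute and the helper sumTo are made up for illustration, and Template Haskell requires the splice and its definition to live in separate modules.

```haskell
-- Precompute.hs (hypothetical helper module)
{-# LANGUAGE TemplateHaskell #-}
module Precompute (sumTo) where

import Language.Haskell.TH (Q, Exp, litE, integerL)

-- Runs at compile time: the sum is forced while the splice is being
-- expanded, and only the resulting integer literal ends up in the binary.
sumTo :: Integer -> Q Exp
sumTo n = litE (integerL (sum [1 .. n]))
```

In a separate Main.hs one would then write main = print $(sumTo 10000000), which compiles down to print 50000005000000.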

+11

This is unsafe in the general case. The reason is that Haskell expressions may be pure, but they may also fail to terminate, and the compiler must always terminate. So the best you could do is "evaluate 1000 steps of this and use the result if it finishes."1 But if you add a limit like that, what happens when you compile a program meant to run on a supercomputer cluster with terabytes of RAM, and the compiler runs out of memory?
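A minimal illustration of the termination problem (the names ones and badSum are made up): the following definitions are pure and well-typed, yet fully evaluating them never finishes, so a compiler that tried to precompute badSum would itself hang.

```haskell
-- An infinite, pure value: every element is defined, but the list
-- as a whole has no finite normal form.
ones :: [Integer]
ones = 1 : ones

-- Pure and type-correct, but non-terminating if forced: precomputing
-- this at compile time would make the compiler loop forever.
badSum :: Integer
badSum = sum ones

main :: IO ()
main = print (take 5 ones)  -- laziness lets us observe a finite prefix
```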

You can add all sorts of restrictions, but in the end you reduce the optimization to a slow form of constant folding (especially for most programs, whose calculations depend on user input at runtime). And since sum [1..10000000] takes half a second, it's unlikely to fall under any reasonable limit.
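To see the gap in scale, compare what ordinary constant folding already handles with the sum in question (a sketch of my own; inspecting GHC's Core output at -O2, e.g. with -ddump-simpl, shows the small expression reduced to a literal):

```haskell
-- Small closed arithmetic: folded to the literal 10 by GHC's
-- simplifier at -O2.
small :: Int
small = 2 * 3 + 4

-- Far beyond any reasonable folding budget: evaluated at runtime.
big :: Int
big = sum [1 .. 10000000]

main :: IO ()
main = print (small, big)
```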

Of course, specific cases like this can often be optimized, and GHC does optimize many complicated expressions like this one. But the compiler cannot simply evaluate arbitrary expressions at compile time; it would have to be extremely limited, and it's debatable how useful it would be even under those restrictions. It's a compiler, not an interpreter :)

1 It would also significantly slow down the compilation of any program containing many infinite loops (which, since Haskell is non-strict, is more likely than you might think). Or, more commonly, any program containing many long-running calculations.

+10

Sounds like a job for supercompilation! The question reads like a description of it, and the discussion of non-termination mirrors the problems supercompiler developers face. The last time I looked at the GHC wiki, someone was working on a production-grade supercompiler, but I don't know what came of it.

0

Source: https://habr.com/ru/post/912871/

