Although I have a good implementation of LSFR C, I thought I would try the same in Haskell - just to see how this happens. What I came up with is currently two orders of magnitude slower than the C implementation, which asks the question: How to improve performance? Obviously, bit operations are a bottleneck, and the profiler confirms this.
Here's the basic Haskell code using lists and Data.Bits :
import Control.Monad (when) import Data.Bits (Bits, shift, testBit, xor, (.&.), (.|.)) import System.Environment (getArgs) import System.Exit (exitFailure, exitSuccess) tap :: [[Int]] tap = [ [], [], [], [3, 2], [4, 3], [5, 3], [6, 5], [7, 6], [8, 6, 5, 4], [9, 5], [10, 7], [11, 9], [12, 6, 4, 1], [13, 4, 3, 1], [14, 5, 3, 1], [15, 14], [16,15,13,4], [17, 14], [18, 11], [19, 6, 2, 1], [20, 17], [21, 19], [22, 21], [23, 18], [24,23,22,17], [25, 22], [26, 6, 2, 1], [27, 5, 2, 1], [28, 25], [29, 27], [30, 6, 4, 1], [31, 28], [32,22,2,1], [33,20], [34,27,2,1], [35,33], [36,25], [37,5,4,3,2,1],[38,6,5,1], [39,35], [40,38,21,19], [41,38], [42,41,20,19], [43,42,38,37], [44,43,18,17], [45,44,42,41], [46,45,26,25], [47,42], [48,47,21,20], [49,40], [50,49,24,23], [51,50,36,35], [52,49], [53,52,38,37], [54,53,18,17], [55,31], [56,55,35,34], [57,50], [58,39], [59,58,38,37], [60,59], [61,60,46,45], [62,61,6,5], [63,62] ] xor' :: [Bool] -> Bool xor' = foldr xor False mask :: (Num a, Bits a) => Int -> a mask len = shift 1 len - 1 advance :: Int -> [Int] -> Int -> Int advance len tap lfsr | d0 = shifted | otherwise = shifted .|. 1 where shifted = shift lfsr 1 .&. mask len d0 = xor' $ map (testBit lfsr) tap' tap' = map (subtract 1) tap main :: IO () main = do args <- getArgs when (null args) $ fail "Usage: lsfr <number-of-bits>" let len = read $ head args when (len < 8) $ fail "No need for LFSR" let out = last $ take (shift 1 len) $ iterate (advance len (tap!!len)) 0 if out == 0 then do putStr "OK\n" exitSuccess else do putStr "FAIL\n" exitFailure
It basically checks to see if the LSFR defined in tap :: [[Int]] is for any given maximum bit length. (More precisely, it simply checks to see if the LSFR reaches its initial state (zero) after 2 n iterations.)
According to the profiler, the most expensive line is the feedback bit d0 = xor' $ map (testBit lfsr) tap' .
What I have tried so far:
- use
Data.Array : Attempt thrown because no foldl / r - use
Data.Vector : slightly faster than the baseline
The compiler options I use are: -O2 , LTS Haskell 8.12 (ghc-8.0.2) .
Link to C ++ - the program can be found at gist.github.com .
Haskell (?) Code cannot be expected to run as fast as C code, but two orders of magnitude too much, there should be a better way to do bit-driving.
Update: results of applying optimizations proposed in answers
- The C ++ help program with input 28, compiled with LLVM 8.0.0, runs on 0.67 from my machine (same thing with clang 3.7 a bit slower, 0.68 s)
- Haskell source code runs about 100 times slower (due to space inefficiency, do not try to use it with inputs greater than 25)
- With the rewriting of @ thomas-m-dubuisson still using the default GHC backend, runtime comes down to 5.2s
- With the rewriting of @ thomas-m-dubuisson, now using the LLVM backend (option ghc
-O2 -fllvm ), the execution time is reduced to 1.7 s- Using the ghc option
-O2 -fllvm -optlc -mcpu=native results in 0.73s
- Replacing
iterate with iterate' for @cirdec does not matter when Thomas code is used (both with source code "native" and LLVM by default). However, it does matter when the underlying code is used.
So, we came from 100x to 8x to 1.09x, that is, only 9% slower than C!
Note LLVM backend for Ghc 8.0.2 requires LLVM 3.7. On Mac OS X, this means installing this version with brew , and then opt and llc symbolic binding. See 7.10. GHC Backends .