I am in the middle of porting David Bleigh's original C-implementation of highlighting the hidden Dirichlet distribution in Haskell, and I'm trying to decide whether to leave some of the low-level material in C. The next function, one example, is to approximate the second derivative of lgamma :
double trigamma(double x) { double p; int i; x=x+6; p=1/(x*x); p=(((((0.075757575757576*p-0.033333333333333)*p+0.0238095238095238) *p-0.033333333333333)*p+0.166666666666667)*p+1)/x+0.5*p; for (i=0; i<6 ;i++) { x=x-1; p=1/(x*x)+p; } return(p); }
I translated this into a more or less idiomatic Haskell as follows:
trigamma :: Double -> Double trigamma x = snd $ last $ take 7 $ iterate next (x' - 1, p') where x' = x + 6 p = 1 / x' ^ 2 p' = p / 2 + c / x' c = foldr1 (\ab -> (a + b * p)) [1, 1/6, -1/30, 1/42, -1/30, 5/66] next (x, p) = (x - 1, 1 / x ^ 2 + p)
The problem is that when I run both through Criterion , my Haskell version is six or seven times slower (I compile with -O2 on GHC 6.12.1). Some of these features are even worse.
I don't know anything about Haskell's performance, and I'm not really interested in digging through Core or something like that, since I can always just call a handful of C-functions with mathematical intensity through FFI.
But I'm curious if there are low-grade fruits that I miss - some kind of extension or library or annotation that I could use to speed up this numerical material without making it too ugly.
UPDATE: Here are two best solutions, thanks to Don Stewart and Yitz . I modified Yitz a bit to use Data.Vector .
invSq x = 1 / (x * x) computeP x = (((((5/66*p-1/30)*p+1/42)*p-1/30)*p+1/6)*p+1)/x+0.5*p where p = invSq x trigamma_d :: Double -> Double trigamma_d x = go 0 (x + 5) $ computeP $ x + 6 where go :: Int -> Double -> Double -> Double go !i !x !p | i >= 6 = p | otherwise = go (i+1) (x-1) (1 / (x*x) + p) trigamma_y :: Double -> Double trigamma_y x = V.foldl' (+) (computeP $ x + 6) $ V.map invSq $ V.enumFromN x 6
The performance of the two, apparently, is almost the same: one or the other gain in percentage or two, depending on the flags of the compiler.
As camccann said in Reddit , the moral of the story is βFor best results, use Don Stewart as a GHC code generator.β If you prohibit this decision, the safest bet seems to be to simply convert the C management structures directly to Haskell, although merging fusion can produce similar performance in a more idiomatic style.
I will probably end up using the Data.Vector approach in my code.