Hash table performance issues in Haskell

I read that hash tables in Haskell had performance problems (on Haskell-Cafe in 2006 and on the Flying Frog Consultancy blog in 2009), and since I like Haskell, it bothered me.

That was a year ago; what is the status now (June 2010)? Has the hash table problem been fixed in GHC?

+49
hashtable haskell ghc
Jun 17 '10 at 2:29
3 answers

The problem was that the garbage collector has to traverse mutable arrays of pointers ("boxed arrays") looking for pointers to data that might be ready to deallocate. Boxed mutable arrays are the main mechanism for implementing a hash table, so that particular structure triggered the GC traversal issue. This is common to many languages. The symptom is excessive garbage collection (up to 95% of time spent in GC).
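To make the issue concrete, here is a minimal sketch (values and bounds are illustrative) of a boxed mutable array, the kind of structure the GC previously had to rescan in full on every collection:

```haskell
import Data.Array.IO

main :: IO ()
main = do
  -- A boxed mutable array holds pointers to heap objects, so before
  -- the fix the GC had to walk the entire array on every collection
  -- looking for live references.
  arr <- newArray (0, 9) "empty" :: IO (IOArray Int String)
  writeArray arr 3 "updated"
  readArray arr 3 >>= putStrLn
```

A hash table built on such an array pays that scanning cost on every GC, which is where the 95%-in-GC symptom came from.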

The fix was to implement card marking in the GC for mutable arrays of pointers, which landed in late 2009. You should not see excessive GC when using mutable arrays of pointers in Haskell now. On simple benchmarks, hash table insertion for large hash tables improved by 10x.

Note that the GC walking issue does not affect purely functional structures, nor unboxed arrays (like most data parallel arrays, or vector-like arrays, in Haskell). Nor does it affect hash tables stored on the C heap (like judy). This means it didn't affect day-to-day Haskellers not using imperative hash tables.
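For contrast, a sketch of the unaffected case, using a mutable unboxed array from the array package (bounds and values here are illustrative):

```haskell
import Data.Array.IO

main :: IO ()
main = do
  -- An unboxed array stores raw Ints inline, with no pointers for
  -- the GC to chase, so it never suffered from the GC-walk problem.
  arr <- newArray (0, 9) 0 :: IO (IOUArray Int Int)
  mapM_ (\i -> writeArray arr i (i * i)) [0 .. 9]
  readArray arr 5 >>= print  -- prints 25
```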

If you are using hash tables in Haskell, you should not be observing any problem now. Here, for example, is a simple hash table program that inserts 10 million ints into a hash table. I'll benchmark it, since the original citation doesn't present any code or benchmarks.

    import Control.Monad
    import qualified Data.HashTable as H
    import System.Environment

    main = do
        [size] <- fmap (fmap read) getArgs
        m <- H.new (==) H.hashInt
        forM_ [1..size] $ \n -> H.insert m n n
        v <- H.lookup m 100
        print v

With GHC 6.10.2, before the patch, inserting 10M ints:

 $ time ./A 10000000 +RTS -s ... 47s. 

With GHC 6.13, after the fix:

 ./A 10000000 +RTS -s ... 8s 

Increasing the default heap area:

 ./A +RTS -s -A2G ... 2.3s 

Avoiding hash tables and using an IntMap:

    import Control.Monad
    import Data.List
    import qualified Data.IntMap as I
    import System.Environment

    main = do
        [size] <- fmap (fmap read) getArgs
        let k = foldl' (\m n -> I.insert n n m) I.empty [1..size]
        print $ I.lookup 100 k

And we get:

 $ time ./A 10000000 +RTS -s ... 6s 

Or, alternatively, using a judy array (a Haskell wrapper calling C code through the foreign function interface):

    import Control.Monad
    import Data.List
    import System.Environment
    import qualified Data.Judy as J

    main = do
        [size] <- fmap (fmap read) getArgs
        j <- J.new :: IO (J.JudyL Int)
        forM_ [1..size] $ \n -> J.insert (fromIntegral n) n j
        print =<< J.lookup 100 j

Running it:

 $ time ./A 10000000 +RTS -s ... 2.1s 

So, as you can see, the GC issue with hash tables has been fixed, and there have always been other libraries and data structures that were perfectly suitable. In short, this is not a problem.

Note: as of 2013, you should probably just use the hashtables package, which natively supports a range of mutable hash tables.
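A minimal sketch of the same insert/lookup exercise against the hashtables package (assuming its Data.HashTable.IO module; scaled down to a thousand keys for illustration):

```haskell
import qualified Data.HashTable.IO as H

main :: IO ()
main = do
  -- BasicHashTable is the simplest of the package's mutable tables;
  -- it also offers LinearHashTable and CuckooHashTable variants.
  ht <- H.new :: IO (H.BasicHashTable Int Int)
  mapM_ (\n -> H.insert ht n n) [1 .. 1000]
  H.lookup ht 100 >>= print  -- prints Just 100
```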

+132
Jun 17 '10 at 2:40

A question like this can really only be settled by experiment. But if you don't have the time or money to experiment, you have to ask other people what they think. When you do, you might consider the source, and whether the information has been confirmed or verified in any way.

Jon Harrop has made some interesting claims about Haskell. Let me suggest that you search on Google Groups and elsewhere for evidence of Harrop's expertise in Haskell, Lisp, and other functional languages. You could also read the work by Chris Okasaki and Andy Gill on Patricia trees in Haskell, and see how their expertise is regarded. You can also find whose claims, if any, have been checked by a third party. Then you can make up your own mind how seriously to take different people's claims about the performance of different functional languages.

Oh, and don't feed the troll.

P.S. It would be quite reasonable for you to do your own experiments, but perhaps not necessary, since the redoubtable Don Stewart presents some nice microbenchmarks in his excellent answer. Here's an addendum to Don's answer:

Addendum: Using Don Stewart's code on an AMD Phenom 9850 Black Edition clocked at 2.5GHz with 4GB of RAM in 32-bit mode, compiled with ghc -O,

  • With the default heap, IntMap is 40% faster than the hash table.
  • With a 2G heap, the hash table is 40% faster than IntMap.
  • Going to ten million elements with the default heap, IntMap is four times faster than the hash table (CPU time), or twice as fast in wall-clock time.

I'm a little surprised by these results, but reassured that functional data structures perform pretty well. And confirmed in my belief that it really pays to benchmark your code under the conditions in which it will actually be used.

+25
Jun 17 '10

In short, even with the fix in the latest GHC, Haskell still cannot provide a dictionary (mutable or immutable) that is competitively fast.

Haskell's hash tables were 32× slower than alternatives like C++ and .NET with GHC 6.10. That was partly due to a performance bug. But Simon Marlow's results show only a 5× performance improvement, which still leaves Haskell's hash tables many times slower than most alternatives.

Purely functional alternatives are also much slower than a decent hash table. For example, Haskell's IntMap is 10× slower than a .NET hash table.

Using F# 2010 and the latest Haskell Platform 2010.2.0.0 (released yesterday!) with GHC 6.12.3 on this 2.0GHz E5405 Xeon running 32-bit Windows Vista, inserting 20M int->int bindings into an empty hash table, we find that Haskell is still 29× slower than F# in real time and over 200× slower in CPU time, because Haskell burns all of the cores:

  GHC 6.12.3 Data.HashTable: 42.8s (new!)
  .NET hash table:            1.47s

If you only run short-lived microbenchmarks, you can disable the GHC garbage collector as Don Stewart suggests above. By requesting a nursery generation so large that this particular program will never fill it, he brought the time for the Haskell hash table down to 1.5s here. However, this completely undermines the whole point of having a nursery generation, and will massively degrade the performance of other code, because freshly allocated values will now always be cold in the cache (which is why the nursery generation is typically around the size of the L2 cache, orders of magnitude smaller than this).
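For reference, the trade-off above can be reproduced with GHC's RTS flags (program name and sizes here are illustrative):

```shell
# Build with RTS options enabled
ghc -O2 -rtsopts A.hs
# Huge nursery/allocation area (-A): almost no minor GCs,
# but freshly allocated values are cold in the cache
./A 10000000 +RTS -s -A2G
# Small nursery: roughly L2-cache sized, cache-friendly,
# but minor collections run far more often
./A 10000000 +RTS -s -A512k
```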

+6
Jun 18 '10 at 22:44


