Avoiding performance degradation in long-running parallel Haskell computations

I have an AWS instance on which I would like to run a number of tasks, each of them somewhat memory- and CPU-intensive. Ideally, I would like to collect timing information for each task. If I run them sequentially, the timings are accurate, but the whole run is slow. If I run them in parallel, everything finishes faster, but the individual tasks run slower, as reported by both wall-clock time and CPU time.
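For concreteness, the per-task timing is collected roughly like this (a minimal sketch; the helper name timed is mine, not part of the actual code):

import GHC.Clock (getMonotonicTime)   -- base >= 4.11
import System.CPUTime (getCPUTime)

-- Report both wall-clock and CPU time for a single action.
timed :: IO a -> IO (a, Double, Double)
timed act = do
  w0 <- getMonotonicTime
  c0 <- getCPUTime
  r  <- act
  c1 <- getCPUTime
  w1 <- getMonotonicTime
  let wallSecs = w1 - w0
      cpuSecs  = fromIntegral (c1 - c0) / 1e12   -- getCPUTime is in picoseconds
  return (r, wallSecs, cpuSecs)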

The slowdown gets worse as the number of threads approaches the number of processor cores.

A cursory investigation with ghc-events-analyze and +RTS -s suggests that the source of the slowdown is (unsurprisingly) GC pauses. Experimenting with RTS options shows that +RTS -qg -qb -qa -A256m (disable parallel GC, disable GC load balancing, pin OS threads to cores, and enlarge the allocation area) improves things, but does not eliminate the problem.
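One way to confirm this per task is to sample the RTS GC counters around each task. Below is a minimal sketch; the wrapper name withGCStats is mine, it requires running with +RTS -T (or -s), and the counters are process-wide, so with concurrent tasks the figures overlap.

import GHC.Stats (getRTSStats, getRTSStatsEnabled,
                  gc_elapsed_ns, mutator_elapsed_ns)

-- Sample the RTS counters before and after an action to see how much of
-- its elapsed time went to GC rather than to the mutator.
withGCStats :: String -> IO a -> IO a
withGCStats label act = do
  ok <- getRTSStatsEnabled
  if not ok
    then putStrLn "RTS stats not enabled; run with +RTS -T" >> act
    else do
      s0 <- getRTSStats
      r  <- act
      s1 <- getRTSStats
      let gcMs  = fromIntegral (gc_elapsed_ns s1 - gc_elapsed_ns s0) / 1e6 :: Double
          mutMs = fromIntegral (mutator_elapsed_ns s1 - mutator_elapsed_ns s0) / 1e6 :: Double
      putStrLn $ label ++ ": GC " ++ show gcMs ++ " ms, mutator " ++ show mutMs ++ " ms"
      return r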

I spawn the tasks with forkIO, and the threads are independent and pure apart from printing progress information. I use parallel-io to limit the number of tasks running at once, but when I briefly tried the more traditional approach of a fixed pool of worker threads and a task queue, I saw the same problem.
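For reference, the driving code looks roughly like this (a sketch using parallel-io's Control.Concurrent.ParallelIO.Local; the limit of 8 and the task list are placeholders):

import Control.Concurrent.ParallelIO.Local (withPool, parallel)

-- Run a list of IO tasks with at most n of them in flight at once.
runAll :: Int -> [IO a] -> IO [a]
runAll n tasks = withPool n $ \pool -> parallel pool tasks

-- e.g. runAll 8 (map computation paramSets), with `computation` as in the edit below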

Any suggestions for debugging?

EDIT:

@jberryman asked for an example. Each of the tasks is as follows:

{-# LANGUAGE BangPatterns #-}

import Control.DeepSeq (force)
import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)

computation params = do
  !x <- evaluate (force params)   -- fully evaluate the inputs before starting the clock
  print $ "Starting computation on " ++ show params
  t1 <- getCPUTime
  !y <- fmap force $ do
    ...some work with x ...
  t2 <- getCPUTime
  print $ "Finished computation on " ++ show params
  return (t2 - t1, y)

Since this is on AWS (and therefore Linux), one option is to run each task in its own process with forkProcess rather than in a thread. Each process then has its own heap and its own GC, so garbage collection in one task cannot pause the others, and the per-task timings stay accurate; the cost is that results have to be passed back between processes.
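A minimal sketch of that idea, assuming POSIX (the unix package); runTask and writeResult are hypothetical stand-ins for the real work and for whatever channel carries the result back (file, pipe, socket):

import System.Posix.Process (forkProcess, getProcessStatus)

-- Run one task in its own OS process, so it has its own heap and GC.
runIsolated :: (params -> IO result) -> (result -> IO ()) -> params -> IO ()
runIsolated runTask writeResult params = do
  pid <- forkProcess (runTask params >>= writeResult)
  _   <- getProcessStatus True False pid   -- wait for the child to exit
  return ()

Note that forkProcess under the threaded RTS has caveats (only the thread that calls it survives in the child), so spawning a separate worker executable with System.Process is a common alternative.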


Source: https://habr.com/ru/post/1659628/

