I have a Haskell program that generates ~280 MB of logging text data during a run inside the ST monad. This is where virtually all of the memory consumption goes (with logging disabled, the program allocates a total of 3 MB of real memory).
The problem is that I run out of memory. While the program runs, memory consumption exceeds 1.5 GB, and it finally runs out when it tries to write the log string to a file.
The log function takes a String and accumulates the log data into a string builder stored in an STRef in the environment:
    import qualified Data.ByteString.Lazy.Builder as BB
    ...
    myLogFunction s = do
        ...
        lift $ modifySTRef myStringBuilderRef (<> BB.stringUtf8 s)
I tried introducing strictness using bang patterns and modifySTRef', but this made memory consumption even worse.
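A minimal, self-contained sketch of this accumulation pattern might look as follows. It assumes a bare ST computation instead of the real environment, and `runLog` is a hypothetical name that is not part of the actual program. It imports `Data.ByteString.Builder`, which is the name recent bytestring versions use for the builder module.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)
import qualified Data.ByteString.Builder as BB
import qualified Data.ByteString.Lazy as BL

-- Hypothetical stand-in for the real program: append every message to a
-- Builder held in an STRef, then render the Builder once at the end.
runLog :: [String] -> BL.ByteString
runLog msgs = runST $ do
    ref <- newSTRef mempty
    -- modifySTRef' forces the new Builder only to weak head normal form.
    -- A Builder is a function, so (as far as I understand) the String s
    -- is still referenced by it until the Builder is actually run.
    mapM_ (\s -> modifySTRef' ref (<> BB.stringUtf8 s)) msgs
    BB.toLazyByteString <$> readSTRef ref
```

If that reading is right, the strict `modifySTRef'` on its own would not turn the `[Char]` data into packed bytes, which may be why adding strictness here did not help.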
I write the log string as recommended by the hPutBuilder documentation, like this:
    hSetBinaryMode h True
    hSetBuffering h $ BlockBuffering Nothing
    BB.hPutBuilder h trace
This consumes several additional GB of memory. I tried different buffering settings, and I also tried converting to a lazy ByteString first (slightly better).
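The two write paths described above can be sketched as follows. This is only an illustration under assumptions: `writeTrace` and `writeTraceLazy` are hypothetical names, and the file is opened with `withBinaryFile`, which is not necessarily how the handle `h` is obtained in the real program.

```haskell
import System.IO
import qualified Data.ByteString.Builder as BB
import qualified Data.ByteString.Lazy as BL

-- Variant 1: run the Builder directly into the handle's buffer.
-- withBinaryFile opens the handle in binary mode, so a separate
-- hSetBinaryMode call is not needed in this sketch.
writeTrace :: FilePath -> BB.Builder -> IO ()
writeTrace path trace = withBinaryFile path WriteMode $ \h -> do
    hSetBuffering h (BlockBuffering Nothing)
    BB.hPutBuilder h trace

-- Variant 2: render to a lazy ByteString first and write its chunks.
writeTraceLazy :: FilePath -> BB.Builder -> IO ()
writeTraceLazy path trace = withBinaryFile path WriteMode $ \h ->
    BL.hPut h (BB.toLazyByteString trace)
```

Both variants take the finished Builder as an argument, so neither changes how the log data is held in memory before the write starts.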
Qs:
How can I minimize memory consumption while the program is running? I would expect that, given a tight ByteString representation and the appropriate amount of strictness, I would need little more memory than the ~280 MB of actual log data that I store.
How can I write the result to a file without allocating memory? I don't understand why Haskell needs GBs of memory just to stream some resident data into a file.
Edit:
Here's a memory profile for a small run (~42 MB of log data). Total memory usage is 3 MB with logging disabled.
      15,632,058,700 bytes allocated in the heap
       4,168,127,708 bytes copied during GC
         343,530,916 bytes maximum residency (42 sample(s))
           7,149,352 bytes maximum slop
                 931 MB total memory in use (0 MB lost due to fragmentation)

                                        Tot time (elapsed)  Avg pause  Max pause
      Gen  0     29975 colls,     0 par    5.96s    6.15s     0.0002s    0.0104s
      Gen  1        42 colls,     0 par    6.01s    7.16s     0.1705s    1.5604s

      TASKS: 3 (1 bound, 2 peak workers (2 total), using -N1)

      SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

      INIT    time    0.00s  (  0.00s elapsed)
      MUT     time   32.38s  ( 33.87s elapsed)
      GC      time   11.97s  ( 13.31s elapsed)
      RP      time    0.00s  (  0.00s elapsed)
      PROF    time    0.00s  (  0.00s elapsed)
      EXIT    time    0.00s  (  0.00s elapsed)
      Total   time   44.35s  ( 47.18s elapsed)

      Alloc rate    482,749,347 bytes per MUT second

      Productivity  73.0% of total user, 68.6% of total elapsed
Edit:
I ran a memory profile for a small logging run:
profile http://imageshack.us/a/img14/9778/6a5o.png
I tried adding bang patterns, $!, deepseq / $!!, force, etc. in the relevant places, but it does not seem to make any difference. How do I force Haskell to actually take my string / printf expression etc. and put it into a tight ByteString, instead of keeping all those [Char] lists and unevaluated thunks around?
Edit:
Here's the actual full trace function:

    trace s = do
        enable <- asks envTraceEnable
        when (enable) $ do
            envtrace <- asks envTrace
            let b = B8.pack s
            lift $ b `seq` modifySTRef' envtrace (<> BB.byteString b)
Is this "strict" enough? Do I need to watch out for anything when I call this typeclass function inside my ReaderT / ST monad, so that it is actually called and not deferred in any way? For example, is the following fine?

    do
        trace $ printf "%i" myint
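To make the question reproducible, here is a self-contained sketch built around the trace function above. The `Env` record, the `M` type synonym and the `demo` driver are assumptions made for illustration; only the field names `envTraceEnable` and `envTrace` and the body of `trace` come from the code shown earlier. It uses `Data.ByteString.Builder` and `Control.Monad.Trans.Reader` from the bytestring and transformers packages.

```haskell
import Control.Monad (when)
import Control.Monad.ST (ST, runST)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Reader (ReaderT, asks, runReaderT)
import Data.STRef (STRef, modifySTRef', newSTRef, readSTRef)
import qualified Data.ByteString.Builder as BB
import qualified Data.ByteString.Char8 as B8
import qualified Data.ByteString.Lazy.Char8 as BL8
import Text.Printf (printf)

-- Hypothetical environment; the real one presumably has more fields.
data Env s = Env
    { envTraceEnable :: Bool
    , envTrace       :: STRef s BB.Builder
    }

type M s = ReaderT (Env s) (ST s)

trace :: String -> M s ()
trace s = do
    enable <- asks envTraceEnable
    when enable $ do
        envtrace <- asks envTrace
        -- B8.pack builds a strict ByteString, so forcing b with seq
        -- consumes the whole String. Note that B8.pack truncates
        -- characters to 8 bits.
        let b = B8.pack s
        lift $ b `seq` modifySTRef' envtrace (<> BB.byteString b)

-- Hypothetical driver: trace each number via printf, then render the log.
demo :: Bool -> [Int] -> BL8.ByteString
demo enable xs = runST $ do
    ref <- newSTRef mempty
    runReaderT (mapM_ (\i -> trace (printf "%i;" i)) xs) (Env enable ref)
    BB.toLazyByteString <$> readSTRef ref
```

In GHCi, `demo True [1, 2, 3]` renders the three formatted numbers into one lazy ByteString, and `demo False [1, 2, 3]` yields an empty one because tracing is switched off.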
Thanks!