Similar to this question: How to make fast data deserialization in Haskell
I have a big Map full of integers and text, which I serialize using cereal. The file is about 10 MB.
Every time I run my program, I deserialize the whole thing just so I can look up a few elements. Deserialization takes about 500 ms, which doesn't really matter, but I like to profile on Fridays.
It seems wasteful to always deserialize 100k to 1M elements when I only need a few of them.
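For reference, the save/load step looks roughly like this (a minimal sketch using cereal's Data.Serialize; `saveCache`/`loadCache` are my own names, and I've used String values to stick to instances cereal ships with):

```haskell
import qualified Data.ByteString as BS
import qualified Data.Map as M
import Data.Serialize (encode, decode)

-- Write the whole Map to disk in one go.
saveCache :: FilePath -> M.Map Int String -> IO ()
saveCache path m = BS.writeFile path (encode m)

-- Read the file and decode the whole Map -- this is the ~500 ms step,
-- even when only a handful of keys will actually be looked up.
loadCache :: FilePath -> IO (Either String (M.Map Int String))
loadCache path = decode <$> BS.readFile path
```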
I tried decodeLazy and also switched the map to Data.Map.Lazy (not quite understanding how a Map could be lazy, but fine, it's there), and this doesn't affect the time, except maybe making it a little slower.
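For what it's worth, Data.Map.Lazy is only lazy in the *values*: the spine (the tree structure and all the keys) is always strict, so decoding still has to rebuild the entire tree. A quick illustration of that, using containers alone with no serialization involved:

```haskell
import qualified Data.Map.Lazy as ML

main :: IO ()
main = do
  -- The values are lazy: 'undefined' is never forced here...
  let m = ML.fromList [(k, undefined :: Int) | k <- [1 .. 1000 :: Int]]
  -- ...but the spine is strict: all 1000 keys are built, so taking
  -- the size succeeds without touching a single value.
  print (ML.size m)
```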
I am wondering if there is something smarter that loads and decodes only what is needed. A database such as sqlite, of course, can be very large on disk, yet it only loads what it needs to satisfy a query. I'd like to find something similar, but without having to create a database schema.
Update
You know what would be great? Some merger of Mongo with Sqlite. Just like you might have a database of JSON documents backed by flat-file storage... and of course, someone did this: https://github.com/hamiltop/MongoLiteDB ... in Ruby :(
Thought mmap might help. Tried the mmap library and segfaulted GHCi for the first time ever. No idea how to even report that bug.
The bytestring-mmap library works, but no performance improvement. Just replacing this:
ser <- BL.readFile cacheFile
with this:
ser <- unsafeMMapFile cacheFile
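In context, the mmap variant looks like this (a sketch assuming bytestring-mmap's lazy `unsafeMMapFile` and cereal's `decodeLazy`; the `loadCache` name and the `Map Int String` type are just my placeholders):

```haskell
import qualified Data.Map as M
import Data.Serialize (decodeLazy)
import System.IO.Posix.MMap.Lazy (unsafeMMapFile)

loadCache :: FilePath -> IO (Either String (M.Map Int String))
loadCache path = do
  -- The pages are mapped instead of read up front, but decoding
  -- still walks the whole file to rebuild the Map, so the total
  -- time barely changes.
  ser <- unsafeMMapFile path
  return (decodeLazy ser)
```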
Update 2
keyvaluehash may be just the ticket. Performance seems really good, but the API is strange and the documentation is missing, so it will take some experimenting.
Update 3: I'm an idiot
Clearly, what I want here is not lazy deserialization of the Map. What I want is a key-value database, and there are several options available, such as dvm, tokyo-cabinet, and this levelDB thing I had never seen before.
Keyvaluehash looks like a native-Haskell key-value database, which I like, but I still have doubts about its quality. For example, you cannot query the database for a list of all keys or all values (the only real operations are readKey, writeKey and deleteKey), so if you need those, you have to store them somewhere else yourself. Another drawback is that you must specify a size when creating the database. I used a size of 20M so I'd have plenty of room, but the actual database it created takes up 266M. No idea why, since there isn't a word of documentation.
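One way around the no-key-enumeration limitation is to maintain the key list yourself in a side file, appending on every write. A sketch of the idea (the store's write operation is passed in as a plain function, since keyvaluehash's real signatures are undocumented; `writeWithIndex` and `allKeys` are my own names):

```haskell
import qualified Data.ByteString.Char8 as B

-- keyvaluehash only offers readKey/writeKey/deleteKey, so keep the
-- list of keys in a separate file, one key per line. The first
-- argument stands in for the store's write operation.
writeWithIndex :: (B.ByteString -> B.ByteString -> IO ())  -- store write
               -> FilePath                                 -- side file of keys
               -> B.ByteString -> B.ByteString -> IO ()
writeWithIndex writeK keyFile k v = do
  writeK k v
  B.appendFile keyFile (k `B.snoc` '\n')

-- Recover the full key list from the side file.
allKeys :: FilePath -> IO [B.ByteString]
allKeys keyFile = B.lines <$> B.readFile keyFile
```

This assumes newline-free keys; anything fancier would need real escaping or length-prefixed records.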