How to generalize reads from a URL and a file in Haskell

I am developing an application that fetches data from the Internet in chunks, each requested at a given offset. For testing purposes, I have a dump file in which each line corresponds to a separate chunk. I want to generalize the read operations over the URL and the dump file. I currently have the following functions:

    getChunk :: DataSourceMode -> Config -> Int -> Int -> IO FetchResult
    getChunk DSNormal   config ownerId' offset' = do ...
    getChunk DSFromFile config ownerId' offset' = do ...

The problem with the current implementation is that it re-reads the dump file on every call to getChunk, which is obviously inefficient. My first idea was to load the lines of the dump file into a list, but then it would not be easy to generalize that with reading from a URL. I believe that conduit or pipes could be used to build a source of chunks, but I am not familiar with these libraries. Should I use one of them, or is there a better solution?

1 answer

I ended up with conduits. The generic processFeed function acts as the sink, and data is fed into it either from postUrlSource or from Data.Conduit.Binary.sourceFile, depending on the mode.

    import Data.Conduit.Binary as CB (sourceFile, conduitFile, lines)

    processFeed :: MonadIO m => Config -> OwnerId -> (OwnerId -> [Post] -> IO ()) -> Sink BS.ByteString m FetchResult
    processFeed config ownerId' processFn = do ...

    postUrlSource :: MonadIO m => Config -> OwnerId -> Source (StateT FetchState (m)) BS.ByteString
    postUrlSource config ownerId' = do ...

    ...
    _ <- case (dsMode config) of
        DSFromFile -> do
            runResourceT $ CB.sourceFile dumpFile $= CB.lines $$ (processFeed config publicId' saveResult)
        DSNormal -> do
            let postsFromUrlConduit = (postUrlSource config publicId') $$ (processFeed config publicId' saveResult)
            fetchedPosts <- runStateT postsFromUrlConduit (FetchState 0 "")
            return $ fst fetchedPosts
    ...

StateT is used when the data is retrieved from a URL, so that each chunk is requested with a new offset. When reading from a file, no extra state is needed: the underlying monad is plain IO, and the source simply reads lines sequentially from the dump.
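To make the shape of this solution concrete, here is a minimal, self-contained sketch of one sink fed by two different sources. It is not the code from the answer: it uses the newer conduit API (ConduitT, .|, runConduit) rather than Source/Sink/$$, and the names countChunks, fakeFetch, offsetSource, runFromFile and runFromOffsets are hypothetical. fakeFetch in particular is only a stand-in for a real HTTP request, and the state is reduced to a bare Int offset.

```haskell
module Main where

import           Control.Monad.IO.Class     (MonadIO, liftIO)
import           Control.Monad.State.Strict (StateT, evalStateT, get, put)
import           Control.Monad.Trans.Class  (lift)
import qualified Data.ByteString.Char8      as BS
import           Data.Conduit               (ConduitT, runConduit, runConduitRes,
                                             yield, (.|))
import qualified Data.Conduit.Combinators   as CC
import qualified Data.Conduit.List          as CL

-- Hypothetical sink: prints every chunk and returns how many it saw.
-- It works in any MonadIO, so it can sit behind either source.
countChunks :: MonadIO m => ConduitT BS.ByteString o m Int
countChunks = CL.foldM step 0
  where
    step n chunk = do
      liftIO (BS.putStrLn chunk)
      return (n + 1)

-- Stand-in for an HTTP request (assumption): three chunks, then no more data.
fakeFetch :: Int -> IO (Maybe BS.ByteString)
fakeFetch offset
  | offset < 3 = return (Just (BS.pack ("chunk at offset " ++ show offset)))
  | otherwise  = return Nothing

-- Source that keeps the current offset in StateT and advances it per chunk.
offsetSource :: MonadIO m => ConduitT i BS.ByteString (StateT Int m) ()
offsetSource = do
  offset <- lift get
  result <- liftIO (fakeFetch offset)
  case result of
    Nothing    -> return ()
    Just chunk -> do
      lift (put (offset + 1))
      yield chunk
      offsetSource

-- File mode: the dump is opened once and streamed line by line.
runFromFile :: FilePath -> IO Int
runFromFile path =
  runConduitRes (CC.sourceFile path .| CC.linesUnboundedAscii .| countChunks)

-- URL mode: same sink, but the pipeline runs inside StateT for the offset.
runFromOffsets :: IO Int
runFromOffsets = evalStateT (runConduit (offsetSource .| countChunks)) 0

main :: IO ()
main = do
  writeFile "dump.txt" "first\nsecond\n"
  fromFile <- runFromFile "dump.txt"
  fromUrl  <- runFromOffsets
  print (fromFile, fromUrl)
```

Because the sink is polymorphic in the monad, the only mode-specific part is the choice of source and how the pipeline is run, which is the same split the case expression on dsMode makes above.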


Source: https://habr.com/ru/post/1200378/