I'm having trouble using the conduit library to split text into lines.
Unfortunately, the raw data I work with has inconsistent line endings: the same file contains both \r\n and \n sequences.
I found the lines function in Data.Conduit.Binary, but it splits on a single byte (\n, reasonably enough), which in some cases leaves me with a trailing \r.
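A minimal sketch of the behaviour described above, assuming conduit 1.3 or later (for runConduitPure and .|) and the conduit-extra package for Data.Conduit.Binary. The names sample and splitRaw are hypothetical and exist only for this illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.ByteString (ByteString)
import Data.Conduit (runConduitPure, (.|))
import qualified Data.Conduit.Binary as CB
import qualified Data.Conduit.List as CL

-- Hypothetical sample input that mixes \r\n and \n endings.
sample :: [ByteString]
sample = ["one\r\ntwo\nthree\r\n"]

-- Run CB.lines over in-memory chunks and collect the resulting lines.
splitRaw :: [ByteString] -> [ByteString]
splitRaw chunks =
  runConduitPure (CL.sourceList chunks .| CB.lines .| CL.consume)

-- splitRaw sample evaluates to ["one\r","two","three\r"]
```

Loading this in GHCi and evaluating splitRaw sample shows the leftover \r on every line that originally ended in \r\n.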
I understand why the current implementation works the way it does, and I'm fairly sure I could hack some kind of solution together, but the only way I can think of doing it is something like:
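    import Data.Conduit (ConduitT, await, yield)  -- ConduitT is the conduit >= 1.3 name
    import qualified Data.Text as T

    -- Accumulates input until the buffer ends in a line terminator.
    -- Only detects terminators at the very end of the buffer, so it
    -- effectively assumes upstream delivers one character at a time.
    lines' :: Monad m => ConduitT T.Text T.Text m ()
    lines' = loop T.empty
      where
        loop acc = do
          char <- await
          case char of
            -- End of input: flush a final unterminated line, if any.
            Nothing -> if T.null acc then return () else yield acc
            Just x ->
              case isOver (acc `T.append` x) of
                (True, y)  -> yield y >> loop T.empty  -- emit the line and keep going
                (False, y) -> loop y
        isOver n
          | T.takeEnd 2 n == _rLn = (True, T.dropEnd 2 n)
          | T.takeEnd 1 n == _Ln  = (True, T.dropEnd 1 n)
          | otherwise             = (False, n)
        _rLn = T.pack "\r\n"
        _Ln  = T.pack "\n"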
... which seems inelegant, kludgy and terribly slow.
Is there a better way to do this, one that splits on \n and also drops a trailing \r?
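One possible direction, offered as a sketch rather than a definitive answer: keep splitting on \n with Data.Conduit.Binary.lines and strip a single trailing \r from each line afterwards. This assumes conduit 1.3 or later, conduit-extra, and a bytestring version that provides unsnoc. The names stripCR, linesCRLF and splitLines are hypothetical. It works on ByteString, so any decoding to Text would have to happen downstream.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import Data.Conduit (ConduitT, runConduitPure, (.|))
import qualified Data.Conduit.Binary as CB
import qualified Data.Conduit.List as CL

-- Drop one trailing carriage return (byte 13), if present.
stripCR :: ByteString -> ByteString
stripCR bs = case B.unsnoc bs of
  Just (rest, 13) -> rest
  _               -> bs

-- Hypothetical name: split on \n, then remove the \r left over from \r\n.
linesCRLF :: Monad m => ConduitT ByteString ByteString m ()
linesCRLF = CB.lines .| CL.map stripCR

-- Pure helper for trying the conduit on in-memory chunks.
splitLines :: [ByteString] -> [ByteString]
splitLines chunks =
  runConduitPure (CL.sourceList chunks .| linesCRLF .| CL.consume)
```

Because CB.lines already reassembles lines that span chunk boundaries, a \r\n pair split across two chunks is still handled. The trade-off is that a line whose content genuinely ends in \r before a bare \n loses that byte as well.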