This is probably easiest as a fold over the decoded text stream:
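```haskell
{-# LANGUAGE BangPatterns #-}
import Pipes
import qualified Pipes.Prelude as P
import qualified Pipes.ByteString as PB
import qualified Pipes.Text.Encoding as PT
import qualified Control.Foldl as L
import qualified Control.Foldl.Text as LT

main :: IO ()
main = do
  -- Decode stdin as UTF-8 and count newline characters with a strict fold.
  -- `void` discards the undecoded leftovers that decodeUtf8 returns.
  n <- L.purely P.fold (LT.count '\n') $ void $ PT.decodeUtf8 PB.stdin
  print n
```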
It takes about 14% longer than `wc -l` on the file I created, which was just long lines of commas and digits. IO should be done with `Pipes.ByteString`, as the documentation says; the rest is assorted conveniences.
You can map an attoparsec parser over each line, with the lines separated by `view lines`, but keep in mind that an attoparsec parser can accumulate as much of the text as it likes, and that may not be a great idea over a 1 gigabyte chunk of text. If each line contains a repeated item (for example, numbers separated by words), you can use `Pipes.Attoparsec.parsed` to stream them.
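As a rough sketch of the `Pipes.Attoparsec.parsed` approach, assuming the input is nothing but integers separated by commas, spaces, or newlines (that format and the `number` parser are made up for illustration, so adjust them to your data), something like this would stream the numbers one at a time and sum them:

```haskell
import Pipes
import qualified Pipes.Prelude as P
import qualified Pipes.ByteString as PB
import qualified Pipes.Text.Encoding as PT
import qualified Pipes.Attoparsec as PA
import qualified Data.Attoparsec.Text as A
import System.IO (hPutStrLn, stderr)

-- Hypothetical item format: one (possibly signed) integer followed by any
-- run of separators. The input is assumed to start with a number.
number :: A.Parser Int
number = A.signed A.decimal <* A.skipWhile isSep
  where
    isSep c = c == ',' || c == ' ' || c == '\n' || c == '\r'

main :: IO ()
main = do
  -- Each successful run of `number` yields one Int downstream, so the
  -- parser never has to hold more than a single item's worth of text.
  let numbers = PA.parsed number (void (PT.decodeUtf8 PB.stdin))
  -- fold' returns the strict fold result together with the producer's
  -- return value, which reports whether parsing stopped on an error.
  (total, end) <- P.fold' (+) 0 id numbers
  print total
  case end of
    Right ()      -> return ()
    Left (err, _) -> hPutStrLn stderr ("parse error: " ++ show err)
```

The point of keeping the parser small is that it succeeds often: attoparsec only buffers input until the current `number` finishes, so memory use stays flat regardless of file size.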