I want to count the number of occurrences of each character in a large file. I know that in Haskell the counting has to be done strictly (which I tried to achieve with foldl'), but I still run out of memory. For comparison: the file is about 2 GB, and the machine has 100 GB of memory. The file contains few distinct characters, maybe 20. What am I doing wrong?
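import Data.List (foldl')
import System.Environment (getArgs)

-- Increment the count for d, or append (d,1) if d has not been seen yet.
ins :: [(Char,Int)] -> Char -> [(Char,Int)]
ins [] c = [(c,1)]
ins ((c,i):cs) d
  | c == d    = (c,i+1):cs
  | otherwise = (c,i) : ins cs d

main :: IO ()
main = do
  [file] <- getArgs
  txt <- readFile file
  print $ foldl' ins [] txt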
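One possible cause, offered as a sketch rather than a confirmed diagnosis: foldl' evaluates the accumulator only to weak head normal form, that is, to the first (:) cell of the list. The i+1 counts and the recursive ins cs d calls can therefore pile up as unevaluated thunks. The variant below forces both with seq. The names insStrict and countChars are hypothetical and introduced only for illustration; the code uses only base and has not been measured against the 2 GB file.

```haskell
import Data.List (foldl')
import System.Environment (getArgs)

-- Hypothetical strict variant of ins (assumption, not the original code):
-- the new count and the rebuilt tail are forced before the list cell is
-- returned, so no chain of (+1) thunks can accumulate between steps.
insStrict :: [(Char, Int)] -> Char -> [(Char, Int)]
insStrict [] c = [(c, 1)]
insStrict ((c, i) : cs) d
  | c == d    = let i' = i + 1 in i' `seq` ((c, i') : cs)
  | otherwise = let rest = insStrict cs d in rest `seq` ((c, i) : rest)

-- Fold over the input; foldl' forces the head cell at each step, and the
-- seq calls above take care of the counts and the rest of the spine.
countChars :: String -> [(Char, Int)]
countChars = foldl' insStrict []

main :: IO ()
main = do
  [file] <- getArgs
  txt <- readFile file
  print (countChars txt)
```

For example, countChars "abca" evaluates to [('a',2),('b',1),('c',1)]. The list structure is kept so the change stays close to the original; a strict map keyed by Char would be another option.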