I am trying to parse a string that may contain escaped characters. Here is an example:
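import Control.Applicative (many, (<|>))
import Data.Attoparsec.Text (Parser, anyChar, char, satisfy)
import qualified Data.Text as T

exampleParser :: Parser T.Text
exampleParser = T.pack <$> many (char '\\' *> escaped <|> anyChar)
  where
    escaped = satisfy (\c -> c `elem` ['\\', '"', '[', ']'])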
The parser above builds a String and then packs it into Text. Is there a way to parse a string with escapes like the ones above using the functions attoparsec provides for efficient string handling, such as string, scan, runScanner, takeWhile, ...?
Parsing something like "one \"two\" \[three\]" should produce one "two" [three].
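The expected behaviour can be checked by running the parser with parseOnly. This is a minimal sketch, assuming the parser is built on Data.Attoparsec.Text (the question does not show that import); the main function is only there for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Applicative (many, (<|>))
import Data.Attoparsec.Text (Parser, anyChar, char, parseOnly, satisfy)
import qualified Data.Text as T

-- The parser from the question: it collects one Char at a time into a
-- String and only then packs the result into Text.
exampleParser :: Parser T.Text
exampleParser = T.pack <$> many (char '\\' *> escaped <|> anyChar)
  where
    escaped = satisfy (`elem` ['\\', '"', '[', ']'])

main :: IO ()
main =
  -- The Haskell literal below encodes the input: one \"two\" \[three\]
  print (parseOnly exampleParser "one \\\"two\\\" \\[three\\]")
  -- prints: Right "one \"two\" [three]"
```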
Update
Thanks to @epsilonhalbe, I was able to come up with a generalized solution that fits my needs. Note that the following function does not look for matching delimiters such as [..], "..", (..), etc. Also, if it finds a backslash followed by a character that is not a valid escape, it treats the \ as a literal character.
takeEscapedWhile :: (Char -> Bool) -> (Char -> Bool) -> Parser Text
takeEscapedWhile isEscapable while = do
  x <- normal
  xs <- many escaped
  return $ T.concat (x:xs)
  where
    normal = Atto.takeWhile (\c -> c /= '\\' && while c)
    escaped = do
      x <- (char '\\' *> satisfy isEscapable) <|> char '\\'
      xs <- normal
      return $ T.cons x xs
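To illustrate how the function behaves, here is a small usage sketch. It assumes Atto is a qualified import of Data.Attoparsec.Text and repeats the definition so the example is self-contained. The isEscapable predicate and the three inputs are hypothetical choices made for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Applicative (many, (<|>))
import Data.Attoparsec.Text (Parser, char, parseOnly, satisfy)
import qualified Data.Attoparsec.Text as Atto
import Data.Text (Text)
import qualified Data.Text as T

-- Definition repeated from the update above.
takeEscapedWhile :: (Char -> Bool) -> (Char -> Bool) -> Parser Text
takeEscapedWhile isEscapable while = do
  x  <- normal
  xs <- many escaped
  return $ T.concat (x : xs)
  where
    -- A run of unescaped characters, consumed as one Text chunk.
    normal  = Atto.takeWhile (\c -> c /= '\\' && while c)
    -- A backslash, with or without a valid escapable character after it,
    -- followed by the next unescaped run.
    escaped = do
      x  <- (char '\\' *> satisfy isEscapable) <|> char '\\'
      xs <- normal
      return $ T.cons x xs

-- Hypothetical set of escapable characters, matching the question.
isEscapable :: Char -> Bool
isEscapable c = c `elem` ("\\\"[]" :: String)

main :: IO ()
main = do
  -- Valid escapes are unescaped.
  print (parseOnly (takeEscapedWhile isEscapable (const True)) "one \\\"two\\\" \\[three\\]")
  -- prints: Right "one \"two\" [three]"

  -- 'n' is not escapable, so the backslash is kept as a literal.
  print (parseOnly (takeEscapedWhile isEscapable (const True)) "a\\nb")
  -- prints: Right "a\\nb"

  -- The second predicate stops at an unescaped ']' but not at an escaped one.
  print (parseOnly (takeEscapedWhile isEscapable (/= ']')) "ab\\]cd]ef")
  -- prints: Right "ab]cd"
```

Each unescaped run is consumed with a single takeWhile call, so the result is assembled from Text chunks rather than from individual characters.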