Trees
You were right that you should probably use a tree to store data. I will copy how Data.Tree does Data.Tree :
data Tree a = Node a (Forest a) deriving (Show) type Forest a = [Tree a]
Building a tree
Now we want to take your weakly typed list of tuples and convert it to (slightly) stronger Tree from String s. Each time you need to convert a weakly typed value and check it before converting to a stronger type, you use Parser :
type YourData = [(Int, [String])] type Parser a = YourData -> Maybe (a, YourData)
A type synonym for YourData is the weak type that you are processing. A variable of type a is the value that you extract from the analysis. Our Parser type returns Maybe , because Parser may fail. To understand why the following input does not match a valid Tree , since it lacks level 1 of the tree:
[(0, ["val1"]), (2, ["val2"])]
If Parser succeeds, it also returns an unused input so that subsequent parsing steps can use it.
Now, oddly enough, the above type of Parser exactly matches the famous monad transformer stack:
StateT s Maybe a
This can be seen if you deploy the base implementation of StateT :
StateT s Maybe a ~ s -> Maybe (a, s)
This means that we can simply determine:
import Control.Monad.Trans.State.Strict type Parser a = StateT [(Int, [String])] Maybe a
If we do this, we will get a copy of Monad , Applicative and Alternative for our Parser type for free. This makes it easy to define parsers!
First we need to define a primitive parser that consumes a single tree node:
parseElement :: Int -> Parser String parseElement level = StateT $ \list -> case list of [] -> Nothing (level', strs):rest -> case strs of [str] -> if (level' == level) then Just (str, rest) else Nothing _ -> Nothing
This is the only non-trivial piece of code we need to write that, since it is complete, handles all of the following corner cases:
- the list is empty
- Your node has several meanings in it
- The number in the tuple does not match the expected depth
In the next part, everything becomes very elegant. Then we can define two mutually recursive parsers: one for parsing a Tree and one for parsing a Forest :
import Control.Applicative parseTree :: Int -> Parser (Tree String) parseTree level = Node <$> parseElement level <*> parseForest (level + 1) parseForest :: Int -> Parser (Forest String) parseForest level = many (parseTree level)
The first parser uses the Applicative style, as StateT provided us with an Applicative instance for free. However, instead, I could use an instance of StateT Monad to give code that is more readable for an imperative programmer:
parseTree :: Int -> Parser (Tree String) parseTree level = do str <- parseElement level forest <- parseForest (level + 1) return $ Node str forest
But what about the many function? What does it do? Let's look at its type:
many :: (Alternative f) => fa -> f [a]
What is required is that it returns a value and implements Applicative , and instead calls it again to return a list of values ββinstead. When we defined our Parser type in State terms, we got an Alternative instance for free, so we can use the many function to convert what one Tree (i.e. parseTree ) parseTree into something that a Forest (i.e. parseForest ).
To use our Parser , we simply rename the existing StateT function to make its purpose understandable:
runParser :: Parser a β [(Int, [String])] β Perhaps runParser = evalStateT
Then we just run it!
>>> runParser (parseForest 0) [(0,["day 1"]),(1,["Person 1"]),(2,["Bill 1"]),(1,["Person 2"]),(2,["Bill 2"])] Just [Node "day 1" [Node "Person 1" [Node "Bill 1" []],Node "Person 2" [Node "Bill 2" []]]]
This is just magic! Let's see what happens if we give it an invalid input:
>>> runParser (parseForest 0) [(0, ["val1"]), (2, ["val2"])] Just [Node "val1" []]
He succeeds on the part of the entrance! We can actually indicate that it should consume the entire input by defining a parser that matches the end of the input:
eof :: Parser () eof = StateT $ \list -> case list of [] -> Just ((), []) _ -> Nothing
Now try:
>>> runParser (parseForest 0 >> eof) [(0, ["val1"]), (2, ["val2"])] Nothing
Perfect!
Tree smoothing
To answer your second question, we solve the problem again using mutually recursive functions:
flattenForest :: Forest a -> [[a]] flattenForest forest = concatMap flattenTree forest flattenTree :: Tree a -> [[a]] flattenTree (Node a forest) = case forest of [] -> [[a]] _ -> map (a:) (flattenForest forest)
Give it a try!
>>> flattenForest [Node "day 1" [Node "Person 1" [Node "Bill 1" []],Node "Person 2" [Node "Bill 2" []]]] [["day 1","Person 1","Bill 1"],["day 1","Person 2","Bill 2"]]
Now, technically, I didn't have to use mutually recursive functions. I could do one recursive function. I just followed the Tree type definition from Data.Tree .
Conclusion
That way, theoretically, I could cut the code even further by skipping the intermediate Tree type and just parsing the smoothing result directly, but I decided that you could use the Tree view for other purposes.
The key selects the following items from it:
- Explore Haskell abstractions to simplify your code.
- Always record common functions
- Learn to use recursion effectively.
If you do this, you will write reliable and elegant code that exactly matches this problem.
application
Here is the final code that includes everything I said:
import Control.Applicative import Control.Monad.Trans.State.Strict import Data.Tree type YourType = [(Int, [String])] type Parser a = StateT [(Int, [String])] Maybe a runParser :: Parser a -> [(Int, [String])] -> Maybe a runParser = evalStateT parseElement :: Int -> Parser String parseElement level = StateT $ \list -> case list of [] -> Nothing (level', strs):rest -> case strs of [str] -> if (level' == level) then Just (str, rest) else Nothing _ -> Nothing parseTree :: Int -> Parser (Tree String) parseTree level = Node <$> parseElement level <*> parseForest (level + 1) parseForest :: Int -> Parser (Forest String) parseForest level = many (parseTree level) eof :: Parser () eof = StateT $ \list -> case list of [] -> Just ((), []) _ -> Nothing flattenForest :: Forest a -> [[a]] flattenForest forest = concatMap flattenTree forest flattenTree :: Tree a -> [[a]] flattenTree (Node a forest) = case forest of [] -> [[a]] _ -> map (a:) (flattenForest forest)