How does "node binding" to this circularly linked list in Haskell work?

Question

How does "node binding" to this circularly linked list in Haskell work?

I am studying Haskell and reading Tying the Knot on how to build a circularly linked list. In code

data DList a = DLNode (DList a) a (DList a) mkDList :: [a] -> DList a mkDList [] = error "must have at least one element" mkDList xs = let (first,last) = go last xs first in first where go :: DList a -> [a] -> DList a -> (DList a, DList a) go prev [] next = (next,prev) go prev (x:xs) next = let this = DLNode prev x rest (rest,last) = go this xs next in (this,last)

I am trying to understand the call, where they associate the last element with the first through a “little trick” (!) Called node binding:

 mkDList xs = let (first,last) = go last xs first

But I can hardly see how it works. Why did you initially call? And for the comment in the article, how did the first result from “go” “come back”?

Thanks!

+5

haskell lazy-evaluation

user1840624 Jan 6 '17 at 16:26

source share

2 answers

chepner · Answer 1 · 2017-01-06T18:12:11+0000

Since Haskell is lazy, values are evaluated to a strict value. We can go through a simple example with equational reasoning to see where this happens.

Start with the simplest example: a list of one item.

 mkDList [1] == let (first, last) = go last [1] first in first

You can't seem to call go because you don't know that last and first are equal so far. However, you can think of them as invaluable black boxes: it doesn't matter what they are, just so you can continue with them the equatorial reasoning.

 -- Just plug last and first into the definition of go -- last2 is just a renaming of the argument for clarity go last [1] first == let this = DLNode last 1 rest (rest, last2) = go this [] first in (this, last2)

Try evaluating the next go call in the same way.

 go this [] first == (first, this)

Conveniently, we did not need to introduce any new black boxes; go just slightly repackages the original arguments.

OK, now we can go back to where we came and replace the recursive call with go our estimate.

 go last [1] first == let this = DLNode last 1 rest (rest, last2) = (first, this) in (this, last2)

And we will bring this back to our original equation using mkDList :

 mkDList [1] == let (first, last) = let this = DLNode last 1 rest (rest, last2) = (first, this) in (this, last2) in first

This does not look too useful, but remember that we will not name mkDList ; we simply used equatorial reasoning to simplify our definition a bit. In particular, there are no recursive calls to go , only one let expression nested in another.

Since Haskell is lazy, we do not need to evaluate this until it is absolutely necessary, for example, when we try to match the template with the return value of mkDlist [1] :

 let (DLNode pxn) = mkDList [1] in x

To appreciate this expression, we just need to ask the following questions:

"What is the meaning of x ?" Answer: first you need to map the template to mkDlist [1] .
"What is the meaning of mkDList ?" Answer: first .
"What is the meaning of first ?" Answer: this .
"What is the meaning of this ?" Answer: DLNode last 1 rest

At this point, you will have enough information to see that x == 1 and last and rest do not need to be evaluated further. You could, however, combine the pattern again to see that, for example, p , and find that

 p == last == last2 == this == DLNode last 1 rest

and

 n == rest == first == this == DLNode last 1 rest

Magic is a call of type (first, last) = go last xs first , which actually does not need values for its arguments; he just needs placeholders to keep track of which first and last values will eventually get when they are evaluated. These placeholders are called "thunks" and they are fragments of unreasonable code. They allow us to refer to boxes that we haven’t filled anything yet, and we can pass empty cells to go safe, knowing that someone will fill them before anyone else tries to look into them. (Actually, go never does this, it just keeps passing them until someone outside mkDList tries to look at them.)

Will ness · Answer 2 · 2017-01-06T18:12:14+0000

First, we can try simple input to see what happens there:

 mkDList [1] = first where (first,last) := go last [1] first = let { prev=last; next=first; -- prev = last x=1; xs=[] } -- next = first in go prev (x:xs) next = let this := (DLNode prev 1 rest) -- a node is constructed, with -- the two pointers still pointing into the unknown (rest,last2) := go this [] next = (next,this) -- rest := next -- last2 := this in (this,last2) -- first := this -- last := last2

let in Haskell is recursive: the same name can appear on both the left and right sides of the equation sign and will refer to the same object.

First, go is called with last , [1] and first . Both last and first do not yet refer to any value; they exist only as an identity, a kind of "named pointers" for still empty boxes, boxes that are not yet filled with values.

Passing into the courage of go , both names are “smoothed out”, and then the final value is returned (this, last2) ; then the template (first, last) corresponds to this value. Thus, last finally gets its value, although it was used as a named name inside go invocations. This is what is called node binding: imagine the arrow going from last to go calls, returning to it from the depths; and the same with first ; thus creating equivalence chains:

  first := this = (DLNode prev 1 rest) last := last2 := this prev = last rest := next = first

The above follows a somewhat imperative view of Haskell's semantics as a single-assignment language. The operator := used as a pseudo-code to provide a visual key to the fact that the value is calculated using the expression on the right and is "passed" to the variables in the template to the left of the equal sign (when this template matches the calculated value).

Actually, the name "next" does not fit, because we just go through the first node to the end to use it as the last node of the next node:

  mkDList xs@ (_:_) = first where (first,last) = go last xs first go :: DList a -> [a] -> DList a -> (DList a, DList a) go prev (x:xs) first = (this, last) -- (this , last ) where this := DLNode prev x rest ( rest, last) := go this xs first go prev [] first = (first, -- first --> rest of the last node prev)

This can be "described" by the equivalent definition of Prolog:

 mkDList( [X|XS], First) :- % mkDList( in, out) go( Last, [X|XS], First, First, Last). % go( in, in, in, out, out) go( Prev, [X|XS], First, This, Last) :- This = dlnode(Prev, X, Rest), go( This, XS, First, Rest, Last). % fill the Rest, return Last go( Prev, [], First, First, Prev).

Really,

 ?- mkDList([1,2,3],X). X = dlnode(_S1, 1, _S2), % where _S2 = dlnode(X, 2, _S1), _S1 = dlnode(_S2, 3, X).

How does "node binding" to this circularly linked list in Haskell work?

More articles: