The updated question clarifies that the OP wants a formalized definition of overlapping patterns. Here, “overlap” is implied in the sense that the GHC uses when it issues its warnings: that is, when it detects that the case branch is unreachable because its template does not match with anything that is not being processed by the previous branch.
A possible formal definition may indeed follow this intuition. That is, for any pattern p , you can first define a set of values (notation) [[p]] corresponding to p . (For this, it is important to know the type of variables involved in p - [[p]] , it depends on the environment of the Gamma type.) Then we can say that in the sequence of patterns
q0 q1 ... qn p
the pattern p overlaps if if [[p]] , as a set, is included in [[q0]] union ... union [[qn]] .
The above definition is unlikely to apply, although it does not immediately lead to an overlap verification algorithm. Indeed, the calculation of [[p]] impossible, since it is an infinite set in general.
To define an algorithm, I would try to define a representation for a set of terms “not yet consistent” with any q0 .. qn pattern. As an example, suppose we are working with logical lists:
Remaining: _ (that is, any list) q0 = [] Remaining: _:_ (any non empty list) q1 = (True:xs) Remaining: False:_ p = (True:False:ys) Remaining: False:_
Here the "remaining" set has not changed, so the last template overlaps.
As another example:
Remaining: _ q0 = True:[] Remaining: [] , False:_ , True:_:_ q1 = False:xs Remaining: [], True:_:_ q2 = True:False:xs Remaining: [], True:True:_ q3 = [] Remaining: True:True:_ p = True:xs Remaining: nothing -- not overlapping (and exhaustive as well!)
As you can see, at each step we compare each of the remaining samples with the sample at hand. This creates a new set of remaining samples (maybe not). A collection of all these patterns forms the new remaining set.
To do this, note that it is important to know the list of constructors for each type. This is because when comparing with True you should know that another case of False remains. Similarly, if you compare with [] , there remains one more case of _:_ . Roughly speaking, when comparing with the constructor K , all other constructors of the same type remain.
The above examples are not an algorithm yet, but hopefully you can start with you.
All this, of course, ignores the protective covers (which make the overlap insoluble), protective masks, GADT (which can further clarify the remaining set in rather subtle ways).