Prevent propagation of type variable in Haskell

Parameterization of type variables is good, but not scalable. As an example of what might happen, http://oleg.fi/gists/posts/2017-04-26-indexed-poptics.html gives an abstraction containing 9 type variables. I am working on a framework for software transformations that are parameterized by a programming language and may well have tens or hundreds of parameters in the future.

So, the main question: I have a data type T that is parameterized over N types. How to write a function over T without writing variables of type N every time I use it?

Here are some approaches that I looked at, none of which are feasible:

Parameterize type variables * -> *

 data V = Var1 | Var2 | Var3 | Var4 myfunc :: forall (v :: V -> *). Constraints v => v Var1 -> v Var2 myfunc = ... 

So, now instead of parameterizing more than 4 variables of type Var1, Var2, Var3, Var4 I only need to parameterize one variable of the type of the form V -> * .

This works, except that in this example, myfunc cannot be called because v cannot be output. You will need to change it to Proxy v -> v Var1 -> v Var2 . And then, every time you want to use myfunc with difference variables, you will need to define a separate GADT with its own template. Something like that:

 data MyV a where MyVar1 :: Int -> MyV Var1 MyVar2 :: String -> MyV Var2 MyVar3 :: Bool -> MyV Var3 MyVar4 :: [Int] -> MyV Var4 

Needless to say, this is very unsatisfactory.

This approach is exactly what is taken by the partitioned parts of the compdata library . This is pretty good, because this template template exactly matches the regular data type that you will write anyway.

Parameterize a type variable of type (*,*,*,...) or record type

 {-# LANGUAGE TypeInType #-} import Data.Kind type Vars = (*,*,*,*) myfunc :: forall (v :: Vars). ... myfunc = ... 

This does not work because, as far as I can tell, there is no way to destroy a variable of type v form (*,*,*,*) . I can create an instance (Int, String, Bool, [Int]) , but I cannot extract from it the components Int, String, etc. The best I can do is write a type family:

 type family Fst (v :: Vars) where Fst '(a,b,c,d) = a type family Snd .... myfunc :: forall (v :: Vars). Fst Vars -> Snd Vars 

However, this has the same problem as the previous solution: it cannot output v unless you pass separately Proxy v . I also tried to add a type Extensional v = (v ~ '(Fst v, Snd v, Third v, Fourth v)) , but that didn't help.

Using Existentially Quantized Variables

 data HasFourTypeVariables abcd = .... data IOnlyCareAboutTwo ab = forall c d. IOnlyCareAboutTwo (HasFourTypeVariables abcd) 

This does not work. Here's what it looks like when you tried to use it:

 update :: IOnlyCareAboutTwo ab -> IOnlyCareAboutTwo ab update = ... useUpdate :: HasFourTypeVariables abcd -> HasFourTypeVariables abcd useUpdate x = case update (IOnlyCareAboutTwo x) of IOnlyCareAboutTwo y -> y 

This is not typecheck, because typechecker does not know that the input and output of update have the same existential witnesses.

Use a backpack

The backpack so far looks like the best rival. Depending on the signature of the module with 5 types and associated operations / restrictions, as in each operation, there are 5 universally limited variable variables, except that they do not need to be written.

The backpack is still fairly new and not yet integrated with the Stack; I have no experience yet. In addition, it, apparently, was created for parameterization for all packages, and not for less functionality; I got the impression that he does not support the explicit instantiation that is required here.

Have a long list of type variables and agree with it

The solution I am considering :(


A bit of background

This problem threatens to arise in my work on multilingual software conversion by expanding https://arxiv.org/pdf/1707.04600.pdf .

On my system, the term in the programming language is of type Term fl , where f is the signature for the programming language (of the form (*->*)->(*->*) - see the CDT document for an explanation), and l is sorting. Thus, an operator in C can be of type Term CSig StmtL , while a Java expression can be of type Term Java ExpL .

So far, these are just two type variables. But, as I push the system toward greater generality and the ability to abstract from deeper and deeper semantic properties, the number of type variables can explode. Here are some examples of how:

  • I want to keep annotations on AST nodes. Some, such as the node label and its original origin, are worthy ideas for each tree, but others, such as character resolution or a sign of whether this subtree has been changed, are desirable on some trees, but not on others. Therefore, I would like my views to be able to flexibly add annotations to some views, not others, and be able to write statements that only care about whether a subset of these annotations is present. How to do it? One or more type variables for annotations, as well as many HasSymbolResolutionAnnotation a restrictions.

  • After much experience and many hours of reflection, I decided that volatile AST is actually a pretty good idea. Then I would like to be able to write operators that can work with both clean and variable ASTs. I haven't figured out how to do this yet, but you can bet that it will add at least one type variable to my type.

  • Inside this CSig or JavaSig signature, CSig can be many language nodes, such as Add. For many simple analyzes and transformations, it’s enough to simply say that “this language has a complement” and is inserted into the general “Add” node. But for more complex questions, the question may arise whether the addition in your language can overflow and how, whether the (+) operator is primitive or can be redefined, like in C ++ or Haskell, and what restrictions it imposes on the surrounding types. Now, instead of the “Add” node, your language may have Add MonomorphicAddition NonOverridable OverflowWrapsToNegative node, and your analytic signs define the transfer functions for languages ​​with Add ab OverflowWrapsToNegative node. There is no limit to the number of operator options that you can encode as follows. And while there are many useful things that can be said about such a parameterized operator that refers only to a couple of its parameters, it would be desirable to consider them in this way.

Hope this helps explain why this is a problem.

+5
source share
2 answers

This answer was inspired by Trees That Grow . It is assumed that there is only a limited number of “options” for the data type, which uses only a small subset of all possible instances of the N parameters.

 {-# language DataKinds #-} {-# language TypeFamilies #-} {-# language KindSignatures #-} {-# language PolyKinds #-} {-# language FlexibleContexts #-} 

We define our data type as follows:

 data MyType (v::Variant) = MyType (Xa v) (Xb v) (Xc v) Int data Variant = V1 | V2 type family Xa (v::Variant) :: * where Xa V1 = Int Xa V2 = Bool type family Xb (v::Variant) :: * where Xb V1 = Bool Xb V2 = Char type family Xc (v::Variant) :: * where Xc V1 = String Xc V2 = Maybe Int 

There are two “options” for the data type. Each changing field has its own (presumably boiler) type family, which maps a variant to the actual field type in this variant.

Here's a simple function that works for all data type options:

 getInt :: MyType (v :: Variant) -> Int getInt (MyType _ _ _ i) = i 

Using -XConstraintKinds we can define a constraint shared by all fields:

 {-# language ConstraintKinds #-} import GHC.Exts (Constraint) type MyForAll (p :: * -> Constraint) (v::Variant) = (p (Xa v),p (Xb v),p (Xc v)) 

We can use it to define functions of type

 myShow :: MyForAll Show (v :: Variant) => MyType v -> String myShow (MyType abci) = show a ++ show b ++ show c ++ show i 

We can also include -XTypeApplications to indicate an option:

 λ :t MyType MyType :: forall {v :: Variant}. Xa v -> Xb v -> Xc v -> Int -> MyType v λ :set -XTypeApplications λ :t MyType @V1 MyType @V1 :: Int -> Bool -> String -> Int -> MyType 'V1 
+3
source

If you want to group several type variables together, you can do this. At the value level, you just use the notation, and you can do it at the type level too. Create a record type and then a raised version that you can use to group types. Access to the record becomes a bit awkward, as the syntax of the value level record selector does not rise.

Here is an example that should clarify what I mean.

 {-# LANGUAGE StandaloneDeriving, TypeInType, UndecidableInstances #-} module RecordTyVars where import Data.Kind -- The normal way, with 3 type variables. data OExpr sym lit op = OVar sym | OLit lit | OPrimOp op [OExpr sym lit op] deriving (Show) oe :: OExpr String Integer Op oe = OPrimOp Add [OVar "x", OLit 1] data Op = Add | Sub deriving (Show) -------- -- Record that when lifted will contain the types. data ExprTypes = Types Type Type Type -- Record access functions, since the record syntax doesn't lift. type family SymType (r :: ExprTypes) :: * where SymType ('Types sym lit op) = sym type family LitType (r :: ExprTypes) :: * where LitType ('Types sym lit op) = lit type family OpType (r :: ExprTypes) :: * where OpType ('Types sym lit op) = op -- Using the record of types data Expr r = Var (SymType r) | Lit (LitType r) | PrimOp (OpType r) [Expr r] -- Must use standalone deriving when thing the going gets tough. deriving instance (Show (SymType r), Show (LitType r), Show (OpType r)) => Show (Expr r) e :: Expr ('Types String Integer Op) e = PrimOp Add [Var "x", Lit 1] 
+2
source

Source: https://habr.com/ru/post/1272609/


All Articles