TL;DR:
Do not create a data type where invalid data is possible:
T :: Nucleotide RNA is possible, which makes no sense biologically, so you can end up evaluating r2d T (a run-time crash that you could have prevented at compile time).
Please note that Chris Drost's answer deserves respect for being a good answer to the technical question as asked.
Problem
I noticed a potential source of failure: your r2d function is not total, because r2d T is undefined. I realized this is because you do not intend to have T :: Nucleotide RNA (or U :: Nucleotide DNA). This is a problem, because any time r2d T gets evaluated by accident (for example through a user-generated error), your entire program will crash.
This is a design flaw in your type. The main point of the type system is to make invalid data impossible, but your code allows T :: Nucleotide RNA and even T :: Nucleotide [Bool] .
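To make the flaw concrete, here is a minimal sketch. The type below is only an assumed reconstruction of the phantom type from the question (the original definition may differ), and the names nonsense and alsoNonsense are made up for illustration.

```haskell
-- Assumed reconstruction of the question's phantom type; the real one may differ.
data RNA
data DNA

data Nucleotide a = A | C | G | T | U deriving (Eq, Show)

-- Partial: there is no equation for T, so r2d T fails at run time.
r2d :: Nucleotide RNA -> Nucleotide DNA
r2d A = A
r2d C = C
r2d G = G
r2d U = T

-- Both of these type-check, although neither makes biological sense.
nonsense :: Nucleotide RNA
nonsense = T

alsoNonsense :: Nucleotide [Bool]
alsoNonsense = T
```

The compiler accepts all of this, and r2d nonsense only fails once it is evaluated.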
Straightforward solution
Unfortunately, the solution is to use more boring, less slick types in which DNA's C and RNA's C are different constructors. You can still use derived Enum instances to convert between them without much typing.
    data DNA = A | C | G | T deriving (Eq, Show, Read, Enum)
    data RNA = A' | C' | G' | U' deriving (Eq, Show, Read, Enum)

    r2d :: RNA -> DNA
    r2d = toEnum . fromEnum

    d2r :: DNA -> RNA
    d2r = toEnum . fromEnum
toEnum . fromEnum :: (Enum a, Enum b) => a -> b works by converting from the first Enum type to Int, then from Int to the other enumeration type. This relies on the constructors of both types being listed in corresponding order.
Now r2d T is simply a type error, so the program will not compile if you write it, whereas with the phantom type it compiles and then crashes at run time if the user manages to enter invalid data.
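As a small worked example of the conversion, the sketch below repeats the two types and adds a helper, strandToRna, which is a hypothetical name introduced here only to show the conversion applied to a whole list.

```haskell
data DNA = A | C | G | T deriving (Eq, Show, Read, Enum)
data RNA = A' | C' | G' | U' deriving (Eq, Show, Read, Enum)

r2d :: RNA -> DNA
r2d = toEnum . fromEnum

d2r :: DNA -> RNA
d2r = toEnum . fromEnum

-- Hypothetical helper: convert every nucleotide in a strand.
strandToRna :: [DNA] -> [RNA]
strandToRna = map d2r

-- T is the fourth constructor of DNA, so fromEnum T is 3,
-- and toEnum 3 at type RNA is the fourth constructor, U'.
--
-- Writing  r2d T  is rejected by the compiler, because T is a DNA
-- value and r2d expects an RNA value.
```

Because the mapping goes through constructor positions, reordering one of the two data declarations would silently change the conversion, so it is worth keeping the declarations next to each other.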
Must we distinguish between RNA C and DNA C?
(No...)
You may feel that it is wrong to distinguish between C and C', since they are the same from a biological / chemical point of view. There is a possible compromise in which you keep a phantom type with constructors A | C | G | TU and show the data differently depending on context:
    {-# LANGUAGE FlexibleInstances #-}

    data Nucleotide a = A | C | G | TU deriving (Eq, Enum)
    data RNA = RNA
    data DNA = DNA

    instance Show (Nucleotide DNA) where
      show A  = "A"
      show C  = "C"
      show G  = "G"
      show TU = "T"

    instance Show (Nucleotide RNA) where
      show A  = "A"
      show C  = "C"
      show G  = "G"
      show TU = "U"

    r2d :: Nucleotide RNA -> Nucleotide DNA
    r2d = toEnum . fromEnum

    d2r :: Nucleotide DNA -> Nucleotide RNA
    d2r = toEnum . fromEnum
Slick, but ...
Sometimes creating a more complex type simply increases the number of extensions you need, whereas if you can tolerate a few ' marks you end up with something that has fewer potential problems.
It seems to me that you would be better off with my first solution plus custom instances for Show RNA and Read RNA, so that the user does not need to put ' at the end of each letter.
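One way such custom instances might look is sketched below. This is only an illustration, not the only possible design: it adds a Bounded instance (not in the code above) so the parser can enumerate all constructors, and it uses the Prelude's lex to pull off one token.

```haskell
data RNA = A' | C' | G' | U' deriving (Eq, Enum, Bounded)

-- Show the nucleotides without the trailing prime.
instance Show RNA where
  show A' = "A"
  show C' = "C"
  show G' = "G"
  show U' = "U"

-- Read the same unprimed letters back: take one lexeme from the input
-- and accept it if it matches what some constructor shows as.
instance Read RNA where
  readsPrec _ input =
    [ (n, rest)
    | (tok, rest) <- lex input
    , n <- [minBound .. maxBound]
    , show n == tok
    ]
```

With these instances, show U' gives "U" and reading "G" at type RNA gives G', while "T" has no parse at type RNA.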
Always avoid runtime errors if you can
Please note that read is not a total function (i.e. it is a cause of program crashes). You are better off using readMay from the safe package, so that you can recover gracefully and give your user a polite error message and the chance to fix the input rather than crashing. For large amounts of complex structured data, where read or readMay is uselessly slow, write a parser using Parsec or the like.
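A minimal sketch of the safe-reading idea follows. To keep it dependency-free it uses readMaybe from Text.Read in base instead of readMay from safe; both return a Maybe rather than crashing. The function name parseDNA and the error message wording are made up for illustration.

```haskell
import Text.Read (readMaybe)

data DNA = A | C | G | T deriving (Eq, Show, Read, Enum)

-- Hypothetical helper: parse one nucleotide, reporting bad input
-- as a Left value instead of crashing the program.
parseDNA :: String -> Either String DNA
parseDNA s = case readMaybe s of
  Just n  -> Right n
  Nothing -> Left ("not a DNA nucleotide: " ++ show s)
```

The caller can then pattern match on the result and ask the user to try again when it gets a Left.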