A somewhat failed adventure in Haskell abstraction
I usually blog about weird and wonderful things you can do in Haskell. Today I'm going to talk about something very plain and not wonderful at all.
If you want to try out the code below, use these Haskell extensions:
{-# LANGUAGE TypeFamilies, MultiParamTypeClasses, OverloadedStrings,
FlexibleInstances, TypeSynonymInstances, ScopedTypeVariables,
FunctionalDependencies, RecordWildCards, FlexibleContexts,
GeneralizedNewtypeDeriving #-}
The simple problem
We want to define a type for a person which has a few fields and operations. Like this
module Person(Person(..), display) where
data Person = Person {
firstName :: String,
lastName :: String,
height :: Double
}
display :: Person -> String
display p = firstName p ++ " " ++ lastName p ++ " " ++ show (height p + 1)
Very simple. To use it we can just import the module and the write something like
print $ display $ Person { firstName = "Manuel", lastName = "Peyton Jones", height = 255 }
But being efficiancy conscious I'm not happy with using
String and
Double.
I'd like to experiment with different types for these. Maybe I should use
ByteString and
Int instead?
Simple enough, let's abstract out the types and operations into a different module.
module Ops(XString, XDouble, (+++), xshow) where
import Data.String
newtype XString = XString String deriving (Eq, Show, IsString)
newtype XDouble = XDouble Double deriving (Eq, Show, Num)
(+++) :: XString -> XString -> XString
XString x +++ XString y = XString (x ++ y)
xshow :: XDouble -> XString
xshow (XDouble x) = XString (show x)
module Person(Person(..), display) where
import Ops
data Person = Person {
firstName :: XString,
lastName :: XString,
height :: XDouble
}
display :: Person -> XString
display p = firstName p +++ " " +++ lastName p +++ " " +++ show (height p + 1)
There, problems solved. By changing the import in the
Person module you can try out different types for
XString and
XDouble.
No, this is not problem solved. To try out different implementations I need to edit the Person module. That's not abstraction, that's obstruction. It should be possible to write the code for the Person module once and for all once you decided to abstract, and then never change it again.
I also didn't really want to necessarily have newtype in my module. Maybe I'd want this:
module Ops(XString, XDouble, (+++), xshow) where
type XString = String
type XDouble = Double
(+++) :: XString -> XString -> XString
(+++) = (++)
xshow :: XDouble -> XString
xshow = show
You can define Ops that way, but then the implementation of Ops may leak into the Person module. What you really want is to type check Person against the signature of Ops, like
interface Ops where
type XString
type XDouble
(+++) :: XString -> XString -> XString
xshow :: XDouble -> XString
And later supply the actual implementation. Alas, Haskell doesn't allow this.
In ML (SML or O'Caml) this would be solved by using a functor. The Person module would be a functor that takes the Ops module as an argument and yields a new module. And then you can just plug and play with different Ops implementations. This is where ML shines and Haskell sucks.
Type classes
But the defenders of Haskell superiority say, Haskell has type classes, that's the way to abstract! So let's make Ops into a type class. Let's do old style with multiple parameters first. Since Ops defines two types it will correspond to having two type parameters to the class.
class (IsString xstring, Num xdouble) => Ops xstring xdouble where
(+++) :: xstring -> xstring -> xstring
xshow :: xdouble -> xshow
Ok, so how do we have to rewrite the Person module?
data Person xstring xdouble = Person {
firstName :: xstring,
lastName :: xstring,
height :: xdouble
}
display :: (Ops xstring xdouble) => Person xstring xdouble -> xstring
display p = firstName p +++ " " +++ lastName p +++ " " +++ xshow (height p + 1)
An implementation is provided by an instance declaration:
instance Ops String Double where
(+++) = (++)
xshow = show
We see the major flaw in this approch at once. The
Person data type now has two parameters. This might be bearable, but imagine a more complicated example where Ops contains 15 types. And every time you add a field with a new type to Person you have to update every single place in the program that mentions the Person type. That's not abstraction.
But in fact, it's even worse than that. The definition of display might look plausible, but it's full of ambiguities. Compiling it gives lots of errors of this kind:
Could not deduce (Ops xstring xdouble)
from the context (Ops xstring xdouble4)
Well, we can remove the type signature and let GHC figure it out. The we get this
display :: (Ops xstring xdouble,
Ops xstring xdouble3,
Ops xstring xdouble2,
Ops xstring xdouble1,
Ops xstring xdouble4) =>
Person xstring xdouble4 -> xstring
And this function can, of course, never be used because most of the type variables do not occur outside the context so they will never be determined. I don't even know how to put explicit types in the function to make it work.
Well, it's common knowledge that multi-parameter type classes without functional dependencies is asking for trouble. So can we add some functional dependencies? Sure, if we use
class (IsString xstring, Num xdouble) => Ops xstring xdouble | xstring -> xdouble where
then things work beautifully. Until we decide that another instance that would be interesting to try is
instance Ops String Int
which is not valid with the FD present.
So we can't have functional dependencies if we want to have flexibilty with the instances. So what is it that goes wrong without the FDs? It's that all the uses (+++) and xshow are not tied together, they could potentially have different types. Let's try and be sneaky and tie them together:
display :: (Ops xstring xdouble) => Person xstring xdouble -> xstring
display p =
let (++++) = (+++); xxshow = xshow
in firstName p ++++ " " ++++ lastName p ++++ " " ++++ xxshow (height p + 1)
This only generates one error message, because there's still nothing that says the the two operations come from the same instance. We need to make the tie even closer.
class (IsString xstring, Num xdouble) => Ops xstring xdouble where
ops :: (xstring -> xstring -> xstring, xdouble -> xstring)
instance Ops String Double where
ops = ((++), show)
display :: (Ops xstring xdouble) => Person xstring xdouble -> xstring
display p =
let ((+++), xshow) = ops
in firstName p +++ " " +++ lastName p +++ " " +++ xshow (height p + 1)
This actually works!
We can make it neater looking.
class (IsString xstring, Num xdouble) => Ops xstring xdouble where
ops :: DOps xstring xdouble
data DOps xstring xdouble = DOps {
(+++) :: xstring -> xstring -> xstring,
xshow :: xdouble -> xstring
}
instance Ops String Double where
ops = DOps (++) show
display :: (Ops xstring xdouble) => Person xstring xdouble -> xstring
display p =
let DOps{..} = ops
in firstName p +++ " " +++ lastName p +++ " " +++ xshow (height p + 1)
We have basically packaged up the dictionary and unpack it ourselves to get access to the operations. It's not pleasent, but it works.
But as I already said, the multiparameter type class version isn't really a good solution to the problem even if it works; it introduces too many parameters to the Person record.
Associated types
The new and shiny way of doing type classes is to use associated types instead of FDs. So let's give that a try. So what should the associated types be in the class. The associated type is supposed to be the one that can be computed from the main one. But we have two types that are on equal footing, so there is no main one. We can remedy that by introducing an artificial third type that is the main one, it can then determine the other two.
class (IsString (XString t), Num (XDouble t)) => Ops t where
type XString t :: *
type XDouble t :: *
(+++) :: XString t -> XString t -> XString t
xshow :: XDouble t -> XString t
data Person t = Person {
firstName :: XString t,
lastName :: XString t,
height :: XDouble t
}
That looks pretty neat. Note how the Person record has one parameter and no matter how many new associated type we add it will still only have one parameter. One parameter is reasonable, the Person record is after all parameterized over what kind of Ops we are providing.
Let's do an instance. It will need the extra type that is somehow the name of the instance.
data Basic = Basic
instance Ops Basic where
type XString Basic = String
type XDouble Basic = Double
(+++) = (++)
xshow = show
Now what about the
display function? Alas, now it breaks down again. The
display function is full of type errors again. And the reason is similar to the multiparameter version; there's nothing that ties the operations together.
We can play the same trick as with DOps above, but for some reason it doesn't work this time. The type comes out as
display :: (XString t ~ XString a,
XDouble t ~ XDouble a,
Ops a,
Num (XDouble t)) =>
Person t -> XString a
I have no clue why. I find associated types very hard to get a grip on.
OK, multi-parameter type classes made things work, but had too many type parameters. And associated types is the other way around. You can try combining them, but it didn't get me anywhere closer.
Associated data types
OK, I won't admit defeat yet. There's still associated data types. They are easier to deal with than associated types, because the type function is guaranteed to be injective.
class (IsString (XString t), Num (XDouble t)) => Ops t where
data XString t :: *
data XDouble t :: *
(+++) :: XString t -> XString t -> XString t
xshow :: XDouble t -> XString t
data Basic = Basic
instance Ops Basic where
newtype XString Basic = XSB String deriving (Eq, Ord, Show)
newtype XDouble Basic = XDB Double deriving (Eq, Ord, Show)
XSB x +++ XSB y = XSB (x ++ y)
xshow (XDB x) = XSB (show x)
instance Num (XDouble Basic) where
XDB x + XDB y = XDB (x+y)
fromInteger = XDB . fromInteger
instance IsString (XString Basic) where
fromString = XSB
At last, this actually works! But it's at a price. We can no longer use the types we want in the instance declaration, instead we are forced to invent new types. Using this approach the original multi-parameter version could have been made to work as well.
Normally the GeneralizedNewtypeDeriving language extension makes it relatively painless to introduce a newtype that has all the instances of the underlying type. But due to a bug in ghc you can't use this extension for associated newtypes. So we have to make manual instance declarations which makes this approach very tedious.
Conclusion
I have found no way of doing what I want. My request is very simple, I want to be able to abstract over the actual implementation of a module, where the module contains types, values, and instances.
Haskell normally excels in abstraction, but here I have found no natural way of doing what I want. Perhaps I'm just not clever enough to figure out how, but that is a failure of Haskell too. It should not take any cleverness to do something as simple as this. In ML this is the most natural thing in the world to do.
Associated types are not a replacement for a proper module system. They let you do some things, but others just don't work.
I'd be happy to see anyone doing this in Haskell in a simple way.
Labels: Haskell, Modules, overloading