Sunday, July 01, 2007

Massive overload In my last post I had a little DSL embedded in Haskell. Defining the Fibonacci function looked like this:
mdo
fib <- fun \$ \ n -> cond (n .< 2) 1 (fib(n-1) + fib(n-2))
return fib
fib n = if n < 2 then 1 else fib(n-1) + fib(n-2)
Why do they look different? Could I make my DSL look exactly like Haskell? First, lets look at the parts that look the same, like fib(n-1). In Haskell, with n of type Int, this just an expression and it has some value (depending on n). In the DSL, with n of type M Int the value of this expression is in fact an abstract syntax tree, namely (something like) M (App (Func (FuncNo 0)) [App (FPrimOp I_Sub) [Arg 0, Con (ILit 1)]]). The reason we can make the same expression mean different things depending on the type is that numerical operators like - are overloaded, and so they do different things for Int and M Int. So what about <, can we overload that too? The < operator is already overloaded and has type (Ord a) => a -> a -> Bool. But the .< operator in the DSL returns M Bool, so there's no way we can use <; the return type is fixed. What to do? Well, there's nothing sacred with the operator <, it's just another name, but defined in the Prelude. We can make our own if we hide the Prelude. So let's do that, but carefully. We want to be able to still use < as before, and for our M Bool return type. So we need to overload the return type as well as the argument type. Overloaded Booleans Before we do that, let's define a class that is analogous to Num, but for Boolean values. We'll need this class soon. The Bool type has a few important functions and values, let's put those in the class.
b -> b
not :: b -> b

instance Boolean Bool where
false = False
true = True
(&&) = (P.&&)
(||) = (P.||)
not = P.not
First we import the Prelude explicitly, this allows us to hide some things we want to redefine. The we define the Boolean class, which gives new meaning to some prelude functions. Finally, we make an instance saying that the new (&&), (||), and not behave as the old ones for good oldfashioned Bool. After this definition we can use the Boolean operators just as before, they are just overloaded now, but for Bool they resolve to their old meaning. Overloaded comparison So back to comparison, Eq first.
class (Boolean b) => Eq a b | a -> b where
(==), (/=) :: a -> a -> b
x /= y  =  not (x == y)
The new Eq (we need to hide the Prelude one) is now more general; the return type is overloaded as well. Too avoid ambiguities we have a functional dependency. The type of the values we compare will determine the type of the Boolean result. (This isn't necessary, but without this we might need many type annotations.) Nothing is a member of this class, so we need to add the types we need. We really should all types that have equality, but let's do two samples.
instance Eq Int Bool where
(==) = (P.==)
(/=) = (P./=)
instance Eq Double Bool where
(==) = (P.==)
(/=) = (P./=)
And since the whole point of this was to allow special equality for our M values, let's look at that too.
instance Eq (M Int) (M Bool) where
(==) = binOp I_EQ
(/=) = binOp I_NE
A new (simplified) Ord looks similar, both the class and the instances.
class (Eq a b) => Ord a b | a -> b where
(<), (<=), (>), (>=) :: a -> a -> b
Overloaded conditionals So now for the next challange. The Haskell Fibonacci function uses if, and the DSL uses a special function cond. Well, now we are stuck. These is no way at all to overload the special if syntax in Haskell; it's wired in. This could be viewed as a design mistake in Haskell, but so it is. So could we use the cond function in the Haskell version too? Well, not the original cond, it has type M Bool -> M Int -> M Int -> M Int. We need to overload that too.
class (Boolean b) => Cond a b | a -> b where
cond :: b -> a -> a -> a
Again, we add a functional dependency. And again, it will save some ambiguities. The obvious instances:
instance Cond Int Bool where
cond x y z = if x then y else z

instance Cond (M Int) (M Bool) where
cond = terOp I_Cond
instance Cond (M Bool) (M Bool) where
cond = terOp I_Cond
terOp op (M c) (M t) (M e) = M \$ App (FPrimOp op) [c, t, e]
Oh, and the instance for M Bool that I skipped before.
instance Boolean (M Bool) where
false  = M \$ Con \$ LInt 0
true   = M \$ Con \$ LInt 1
x && y = cond x y false
x || y = cond x true y
not x  = cond x false true
After this effort we can write the body of the Fibonacci function the same way for both type Int and type M Int, namely
cond (n < 2) 1 (fib(n-1) + fib(n-2))
And what about all these new classes and instances. how do we package them to make them easy to use?
module MyPrelude(module Prelude, Boolean(..), Eq(..), Ord(..), Cond(..)) where
import qualified Prelude as P
import Prelude hiding (Eq(..), Ord(..), (&&), (||), not)

...
We hide the stuff we want to override, and re-export the rest of the Prelude. Any module that wants to use this new Prelude now has to say:
module M where
import Prelude()
import MyPrelude
...

Labels: , sorear said...

Notice that you are not restricted to simple hindley-milner stuff with this; with your functional dependencies and a few options to disable the GHC footguard, you can embed arbitrary syntax-directed type rules in the GHC type system. Uniqueness types, recursive types, associated type synonyms, subtyping, explicit polymorphism a la F, etc etc. (But you probably already knew this)

Sunday, July 1, 2007 at 5:20:00 AM GMT+1 augustss said...

My ambition here is more towards generating code rather than embedding a fancy type system. But that would be fun too.

Sunday, July 1, 2007 at 10:00:00 AM GMT+1 sigfpe said...

Really minor comment. On your line "fib <- fun \$ \ n -> cond (n .< 2) 1 (fib(n-1) + fib(n-2))" the "<-" is represented in the HTML with a verbatim '<' character instead of using the ampersand prefixed HTML. Makes it come out wrong on Haskell Planet.

Monday, July 2, 2007 at 6:07:00 PM GMT+1