Sunday, March 09, 2008

Simple reflections of higher order

In a recent blog post by Twan van Laarhoven he showed how to reflect Haskell expressions so that we can actually see them symbolically. This has now been included in lambdabot on the Haskell IRC (source). Let's play with it for a moment:
*SimpleReflect> x+y
x + y
*SimpleReflect> foldr f z [1,2,3]
f 1 (f 2 (f 3 z))
*SimpleReflect> let swap (x,y) = (y, x)
*SimpleReflect> swap (a,b)
(b,a)
*SimpleReflect> map swap [(a,b),(c,d)]
[(b,a),(d,c)]
*SimpleReflect> :t x
x :: Expr
That's very cool! Read Twan's post to find out how it works. All we need to know is on the last line, x :: Expr. So Expr is a special type with special instances. Let's try something else
*SimpleReflect> \ x -> x + y

:1:0:
    No instance for (Show (Expr -> Expr))
    ...
Oh, that's annoying, we can't print functions. But why can't we? If we made up some variable of type Expr and then applied the function we'd get something we could print. Let's add some code to Twan's module. I've just randomly picked the variable a. (To make this compile you need the extension language FlexibleInstances.)
instance Show (Expr -> Expr) where
    show f = "\\ " ++ show a ++ " -> " ++ show (f a)
And try again with the example
*SimpleReflect> \ x -> x + y
\ a -> a + y
Pretty smooth. Except it doesn't really work.
*SimpleReflect> \ x -> x + a
\ a -> a + a
Those two functions are not the same. We can't just arbitrarily pick the variable a since it might be free in the expression.

So we need to pick a variable that is not free in the expression. How could we do that? No matter what we pick it could be used. And we have no idea what is being used. Or do we? If we just turn the function in to text we could look at the string, and pick some variable that is unused in that string. This is really a gruesome hack, but who cares?

How do we print the function? Just invent some variable, I use _, turn it into an expression and tokenize the string. We will then have a string of tokens not to use. To find a variable to use, just pick an infinite supply of variables, remove the ones being used, and then pick one of the remaining ones.

instance Show (Expr -> Expr) where
    show f = "\\ " ++ show v ++ " -> " ++ show (f v)
      where v = var (head vars)
            vars = supply \\ tokenize (show $ f $ var "_")
            supply = [ "x" ++ show i | i <- [1..]]
            tokenize "" = []
            tokenize s = case lex s of (x,s') : _ -> x : tokenize s'
And try to fool it:
*SimpleReflect> \ x -> x + y
\ x1 -> x1 + y
*SimpleReflect> \ x -> x + var "x1"
\ x2 -> x2 + x1
OK, what about multiple arguments?
*SimpleReflect> \ x y -> x + y + z

:1:0:
    No instance for (Show (Expr -> Expr -> Expr))
    ...
Well, yeah, that's true. There is no such instance. But wait, why is the instance we have for the type Expr->Expr? No particular reason, that's just what I wrote. In fact it works equally well for Expr->r as long as we can show r, because that's the only thing we do with f v. So we change the first line of the instance:
instance (Show r) => Show (Expr -> r) where
And now
*SimpleReflect> \ x y -> x + y + z
\ x2 -> \ x1 -> x2 + x1 + z
*SimpleReflect> foldr (.) id [f::Expr->Expr,g,h] -- a little type help needed
\ x1 -> f (g (h x1))
*SimpleReflect> scanr (.) id [(*2),f::Expr->Expr,(+1)]
[\ x1 -> f (x1 + 1) * 2,\ x1 -> f (x1 + 1),\ x1 -> x1 + 1,\ x1 -> x1]
Well, that wasn't too hard. So let's try another example.
*SimpleReflect> \ (x,y) -> x+y+z

:1:0:
    No instance for (Show ((Expr, Expr) -> Expr))
    ...
Hmmm, yes, the argument must be an Expr. That's annoying. We need to generalize the argument type. We want be able to use Expr and pair of them etc. Time for a type class. What does it need to do? It has to invent expressions and never reuse variables doing so. So when we invent an Expr we need to consume one variable and leave the rest for others to consume. This sounds like a state monad. So we're going to use a state monad where the state is the (infinite) list of strings that are available for making variables.
instance (Show a, ExprArg a, Show r) => Show (a -> r) where
    show f = "\\ " ++ show v ++ " -> " ++ show (f v)
      where v = evalState exprArg vars
            dummy = evalState exprArg $ repeat "_"
            vars = supply \\ tokenize (show $ f dummy)
            supply = [ "x" ++ show i | i <- [1..]]
            tokenize "" = []
            tokenize s = case lex s of (x,s') : _ -> x : tokenize s'

class ExprArg a where
    exprArg :: State [String] a

instance ExprArg Expr where
    exprArg = do v:vs <- get; put vs; return (var v)
Using this we're back where we were before, but now we can make some more instances.
instance ExprArg () where
    exprArg = return ()

instance (ExprArg a, ExprArg b) => ExprArg (a, b) where
    exprArg = liftM2 (,) exprArg exprArg

instance (ExprArg a, ExprArg b, ExprArg c) => ExprArg (a, b, c) where
    exprArg = liftM3 (,,) exprArg exprArg exprArg
And finally:
*SimpleReflect> \ (x, y) -> x + y + z
\ (x1,x2) -> x1 + x2 + z
*SimpleReflect> curry f :: Expr -> Expr -> Expr
\ x2 -> \ x1 -> f (x2,x1)
*SimpleReflect> uncurry f :: (Expr, Expr) -> Expr
\ (x1,x2) -> f x1 x2
*SimpleReflect> \ () -> 1
\ () -> 1
The last example is a curiosity since it does not involve Expr at all. Well, that's enough for a Sunday night hack.

3 comments:

  1. That's very cool. I think you've just written the core of a general serialization library there...

    ReplyDelete
  2. It's not that useful since it doesn't work for all types.

    ReplyDelete
  3. That's very cool. I think you've just written the core of a general serialization library there...

    ReplyDelete