## Friday, June 29, 2007

Generating more code with Harpy After a slight detour into expression representation I'm back to generating code again. To recap, here's how I'm going to do my embedded expressions in Haskell. For now I'll stick to ugly looking boolean operators. Fixing that is a whole other story.
module Exp(Exp(..), Lit(..), Fun(..), PrimOp(..)) where

data Exp
= Con Lit
| Arg Int
| App Fun [Exp]
deriving (Eq, Ord, Show)

data Lit
= LInt  Int
deriving (Eq, Ord, Show)

data Fun
= FPrimOp PrimOp
deriving (Eq, Ord, Show)

data PrimOp
= I_Add | I_Sub | I_Mul | I_Quot | I_Rem | I_Neg
| I_EQ | I_NE | I_LT | I_LE | I_GT | I_GE
| I_Cond
deriving (Eq, Ord, Show)
-----------------------------------------------------
module MExp(M(..)) where
import Exp

newtype M a = M { unM :: Exp }
deriving (Show)
-----------------------------------------------------
module M(M, cond,
(.==), (./=), (.<), (.<=), (.>), (.>=),
false, true, (.&&), (.||)
) where
import Exp
import MExp

instance Eq (M a)
instance Ord (M a)

instance Num (M Int) where
x + y  =  binOp I_Add x y
x - y  =  binOp I_Sub x y
x * y  =  binOp I_Mul x y
negate x =  unOp I_Neg x
fromInteger i = M \$ Con \$ LInt \$ fromInteger i

instance Enum (M Int)
instance Real (M Int) where

instance Integral (M Int) where
quot x y = binOp I_Quot x y
rem x y = binOp I_Rem x y

binOp op (M x) (M y) = M \$ App (FPrimOp op) [x, y]
unOp op (M x) = M \$ App (FPrimOp op) [x]

--------

infix 4 ./=, .==, .<, .<=, .>, .>=
(.==), (./=), (.<), (.<=), (.>), (.>=) :: M Int -> M Int -> M Bool
(.==) = binOp I_EQ
(./=) = binOp I_NE
(.<)  = binOp I_LT
(.<=) = binOp I_LE
(.>)  = binOp I_GT
(.>=) = binOp I_GE

cond :: M Bool -> M Int -> M Int -> M Int
cond (M c) (M t) (M e) = M \$ App (FPrimOp I_Cond) [c, t, e]

condB :: M Bool -> M Bool -> M Bool -> M Bool
condB (M c) (M t) (M e) = M \$ App (FPrimOp I_Cond) [c, t, e]

false, true :: M Bool
false = M \$ Con \$ LInt 0
true  = M \$ Con \$ LInt 1

infixr 3 .&&
infixr 2 .||
(.&&), (.||) :: M Bool -> M Bool -> M Bool
x .&& y = condB x y false
x .|| y = condB x true y
So the Exp type is the internal representation of expressions. It's just constants, variables, and some primitive operations. The M (for machine) type is the phantom type that the DSL user will see. The module M contains all the user visible functions. Note that it handles booleans by represented the C way, by an int that is 0 or 1. There's a conditional function, cond, and a corresponding primitive that will serve as our if. So now we need to generate code for these. This is all very similar to what I did in a previous posting, I have just refactored some of the code generation. The invariant (that I arbitrarily decided on) is that each block of code that implements a primitive operation will may clobber EAX but must leave all other registers intact. This is mostly just a chunk of rather boring code. Division is causing problems as usual, since the IA32 instructions don't allow you to specify a destination register. One point worth a look is the code generation for cond. It has to test a boolean and then select one of two code blocks, but that is just what you'd expect.
module CodeGen(cgExp, CGen, cgPrologue, cgEpilogue, compileFun) where
import Prelude hiding(and, or)
import Data.Maybe(fromJust)
import Foreign
import Harpy.X86Assembler

import Exp
import MExp

type StackDepth = Int
type Env = ()
type CGen a = CodeGen Env StackDepth a

addDepth :: StackDepth -> CGen ()
d <- getState
setState (d+i)

cgExp :: Exp -> CGen ()
cgExp (Con l) = cgLit l
cgExp (Arg a) = cgArg a
cgExp (App (FPrimOp I_Cond) [c, t, e]) = cgCond c t e
cgExp (App f es) = do mapM_ cgExp es; cgFun f (length es)

cgFun (FPrimOp p) _ = cgPrimOp p

cgPrimOp I_Sub = twoOp sub
cgPrimOp I_Mul = twoOp (imul InPlace)
cgPrimOp I_Quot = twoOp (cgQuotRem eax)
cgPrimOp I_Rem = twoOp (cgQuotRem edx)
cgPrimOp I_Neg = oneOp neg
cgPrimOp I_EQ = cmpOp sete
cgPrimOp I_NE = cmpOp setnz
cgPrimOp I_LT = cmpOp setl
cgPrimOp I_LE = cmpOp setle
cgPrimOp I_GT = cmpOp setg
cgPrimOp I_GE = cmpOp setge

cgCond c t e = do
cgExp c
popReg eax
test eax eax
l1 <- newLabel
l2 <- newLabel
jz l1
cgExp t
addDepth (-1)               -- pretend last cgExp didn't push anything
jmp l2
l1 @@ cgExp e
l2 @@ return ()

cmpOp op = twoOp \$ \ r1 r2 -> do
cmp r1 r2
op (reg32ToReg8 r1)
and r1 (1 :: Word32)

cgLit (LInt i) = do
mov eax (fromIntegral i :: Word32)
pushReg eax

cgArg :: Int -> CGen ()
cgArg n = do
d <- getState
let o = 4 * (d + n)
mov eax (Disp (fromIntegral o), esp)
pushReg eax

cgQuotRem res r1 r2 = do
push edx                    -- save temp reg
push r2                     -- push second operand (in case it's edx)
xmov  eax r1                -- first operand must be in eax
mov  edx eax
sar  edx (31 :: Word8)      -- sign extend
idiv (Disp 0, esp)          -- divide by second operand (now on stack)
add  esp (4 :: Word32)      -- remove second operand
xmov  r1 res                -- put result in r1
pop  edx                    -- restore edx

-- Do a register to register move, but do generate no-ops
xmov r1 r2 = if r1 == r2 then return () else mov r1 r2

reg32ToReg8 (Reg32 r) = Reg8 r

--------

twoOp op = do
pop ebx
pop ecx
op ecx ebx
push ecx

oneOp op = do
pop ebx
op ebx
push ebx

pushReg r = do
push r

popReg r = do
pop r

--------

compileFun :: (M Int -> M Int) -> CGen ()
compileFun f = do
ensureBufferSize 1000
setState startDepth
cgPrologue
cgExp \$ unM \$ f \$ M \$ Arg 0
cgEpilogue

----

savedRegs = [ebx, ecx]

startDepth = length savedRegs + 1

-- Push all register we'll be using, except eax.
cgPrologue :: CGen ()
cgPrologue = do
mapM_ push savedRegs

-- Pop return value to eax, restore regs, and return
cgEpilogue :: CGen ()
cgEpilogue = do
popReg  eax
mapM_ pop (reverse savedRegs)
ret
Given all this, we can generate code for a function and call it from Haskell code. But that's rather boring since we have only primitive functions and no way to call DSL functions from within the DSL. So we have no recursion. And as we all know, recursion is where the fun starts. So we need to add expressions and code generation for calling functions. First we extend the Exp module.
data Fun
= FPrimOp PrimOp
| Func FuncNo

newtype FuncNo = FuncNo Int
The idea being that our DSL functions will be represented by a FuncNo which is an Int. Next, we will extend the code generation. While generating code we will keep an environment that maps function numbers to the corresponding labels to call.
type Env = [(FuncNo, Label)]
...
cgFun (Func f) n = do
e <- getEnv
call \$ fromJust \$ lookup f e
add esp (fromIntegral (4 * n) :: Word32)
pushReg eax
To call a function we look up its label and generate a call instruction to that label. Then we pop off the arguments that we pushed before the call, and finally we push EAX (where the return value is). Simple, huh? But where did the environment come from? Well, it's time to be a little inventive and introduce a monad. Functions are represented by unique integers in the expression type, so we'll need a state monad to generate them.
module Gen where
import Exp
import MExp
import CodeGen

data GenState = GenState { funcs :: [(FuncNo, CGen ())], nextFunc :: Int }

startGenState :: GenState
startGenState = GenState { funcs = [], nextFunc = 0 }

type G a = State GenState a

fun :: MInt2MInt -> G MInt2MInt
fun f = do
s <- get
let n = nextFunc s
fno = FuncNo n
put GenState{ funcs = (fno, compileFun f) : funcs s,
nextFunc = n + 1 }
return \$ \ x -> M \$ App (Func fno) [unM x]

runG :: G MInt2MInt -> CGen ()
runG g = do
let (ret, s) = runState g startGenState
funs = reverse \$ funcs s
funMap <- mapM (\ (fno, _) -> do l <- newLabel; return (fno, l)) funs
withEnv funMap \$ do
compileFun ret
zipWithM_ (\ (f, l) (_, g) -> l @@ g) funMap funs
This code actually has some interesting points. First, the monad, G, keeps a list of defined functions. The list has the function number and the CodeGen block that will generate code for it. Second, the fun function is the one that creates a new function number. It also calls compileFun to get a CodeGen block that will generate code for the function. Note that no code is generated at this point. The G monad is just a simple state monad, not the IO based codeGen monad. Also note how fun returns an expression that is of the same type as the argument, but it now uses an App to call the function. Finally, the runG function generates all the code. It uses runState to obtain the list of defined function and the result to return. Now code generation starts. For each function we generate a new label. Pairing up the function number and the label forms our environment, funMap. With this environment installed we start code generation for the functions, first the return value, then all the defined function; each with its label attached. Well, that's it. Let's wrap it all up.
module Compile(compileIO, disasmIO, module M, module Gen) where
import Convert
import Gen
import M

type MInt = M Int

compileIO :: G (MInt -> MInt) -> IO (Int -> Int)
compileIO f = fmap flex \$ compileIOW32 f
where flex g = fromIntegral . g . fromIntegral

compileIOW32 :: G (MInt -> MInt) -> IO (Word32 -> Word32)
compileIOW32 = conv_Word32ToWord32 [] undefined . runG

disasmIO :: G (MInt -> MInt) -> IO String
disasmIO = disasm [] undefined . runG
And, of course, a test. Here we hit a snag. We want recursion, but monadic bindings (do) are not recursive. Luckily, ghc does implement recursive do called mdo (why that name, I have no idea) for any monad that is in the class MonadFix. And the state monad is, so we are in luck.
{-# OPTIONS_GHC -fglasgow-exts #-}
module Main where
import Compile

main = do
let g = mdo
fib <- fun \$ \ n -> cond (n .< 2) 1 (fib(n-1) + fib(n-2))
return \$ fib
test <- compileIO g
print (test 40)
It even runs and produces the right answer, 165580141. Running time for this example is about 4.0s on my machine. Running fib compiled with 'ghc -O' takes about 3.4s, so we're in the same ballpark. Oh, and if anyone wonders what the code looks like for this example, here it is. This is really, really bad code.
003d6cd0  53                            push   ebx
003d6cd1  51                            push   ecx
003d6cd2  8b 44 24 0c                   mov    eax,dword ptr [esp+12]
003d6cd6  50                            push   eax
003d6cd7  e8 08 00 00 00                call   [3d6ce4H]
003d6cdc  83 c4 04                      add    esp,4H
003d6cdf  50                            push   eax
003d6ce0  58                            pop    eax
003d6ce1  59                            pop    ecx
003d6ce2  5b                            pop    ebx
003d6ce3  c3                            ret
003d6ce4  53                            push   ebx
003d6ce5  51                            push   ecx
003d6ce6  8b 44 24 0c                   mov    eax,dword ptr [esp+12]
003d6cea  50                            push   eax
003d6ceb  b8 02 00 00 00                mov    eax,2H
003d6cf0  50                            push   eax
003d6cf1  5b                            pop    ebx
003d6cf2  59                            pop    ecx
003d6cf3  3b cb                         cmp    ecx,ebx
003d6cf5  0f 9c c1                      setl   cl
003d6cf8  83 e1 01                      and    ecx,1H
003d6cfb  51                            push   ecx
003d6cfc  58                            pop    eax
003d6cfd  85 c0                         test   eax,eax
003d6cff  0f 84 0b 00 00 00             je     [3d6d10H]
003d6d05  b8 01 00 00 00                mov    eax,1H
003d6d0a  50                            push   eax
003d6d0b  e9 37 00 00 00                jmp    [3d6d47H]
003d6d10  8b 44 24 0c                   mov    eax,dword ptr [esp+12]
003d6d14  50                            push   eax
003d6d15  b8 01 00 00 00                mov    eax,1H
003d6d1a  50                            push   eax
003d6d1b  5b                            pop    ebx
003d6d1c  59                            pop    ecx
003d6d1d  2b cb                         sub    ecx,ebx
003d6d1f  51                            push   ecx
003d6d20  e8 bf ff ff ff                call   [3d6ce4H]
003d6d25  83 c4 04                      add    esp,4H
003d6d28  50                            push   eax
003d6d29  8b 44 24 10                   mov    eax,dword ptr [esp+16]
003d6d2d  50                            push   eax
003d6d2e  b8 02 00 00 00                mov    eax,2H
003d6d33  50                            push   eax
003d6d34  5b                            pop    ebx
003d6d35  59                            pop    ecx
003d6d36  2b cb                         sub    ecx,ebx
003d6d38  51                            push   ecx
003d6d39  e8 a6 ff ff ff                call   [3d6ce4H]
003d6d3e  83 c4 04                      add    esp,4H
003d6d41  50                            push   eax
003d6d42  5b                            pop    ebx
003d6d43  59                            pop    ecx
003d6d46  51                            push   ecx
003d6d47  58                            pop    eax
003d6d48  59                            pop    ecx
003d6d49  5b                            pop    ebx
003d6d4a  c3                            ret

Labels: ,

Levent Erkok said...

Here we hit a snag. We want recursion, but monadic bindings (do) are not recursive. Luckily, ghc does implement recursive do called mdo (why that name, I have no idea) for any monad that is in the class MonadFix.

The name mdo was motivated by the least-fixed-point operator μ in domain theory; where people use the notation:

μx. E

to mean

fix (λx. E)

(where x can appear free in E).

Motivated by this, we used to call the recursive version of the do-notation the "μdo-notation" informally. The closest ASCII rendering of μdo was mdo. When Jeff Lewis and I implemented it in Hugs for the first time (in 2000), we used the mdo keyword. Later on, I started adding it to GHC (late 2001), and Simon PJ finished it up, where the name mdo was used following suit with Hugs.

Incidentally, the original name of the corresponding class was MonadRec, but that was changed to MonadFix during the GHC implementation. I guess nobody had a better suggestion for mdo, so it just stuck.

The "grand vision" at that time was that mdo would replace do eventually, i.e., we would have just one syntax instead of two (just like we don't have a separate let and letrec). But that never happened, so it looks like we're stuck with this forever.

I vaguely remember a suggestion along the lines of using the keyword dorec. Thankfully, that one never saw the light of day.. :-)

-Levent.

Thursday, February 21, 2008 at 6:02:00 AM GMT