Monday, June 25, 2007

Playing with Harpy Recently there was an announcement of the Harpy package for Haskell. Harpy is a very cool package that lets you generate machine code from a Haskell program into a memory buffer. Using the Haskell FFI interface you can then turn the machine code into a Haskell function and call it. The way that you generate machine code from Harpy looks very much like assembly code. Oh, btw, apologies to those who are not using x86 machines, this blog post is very machine dependent, since Harpy generate x86 machine code. A small Harpy example First, some sample code (stolen from the Harpy tutorial). It might look like assembly code, but it's actually Haskell.
asm_fac = do
    loopTest  <- newLabel
    loopStart <- newLabel
    ensureBufferSize 160
    push ecx
    mov  ecx (Disp 8, esp)
    mov  eax (1 :: Word32)
    jmp  loopTest
    loopStart @@ mul ecx
    sub  ecx (1 :: Word32)
    loopTest @@ cmp ecx (0 :: Word32)
    jne  loopStart
    pop  ecx
    ret
Just some comments: This is (obviously) monadic code. The newLabel operation creates a new label that later has to be defined and that can be used as a jump target. The ensureBufferSize is a tedious function to make sure there is enough space to emit the instructions. I hope the next version of Harpy will not require this, since it's really the computers job to count bytes, not mine (and it's not hard to do). Apart from that, most of the code should be obvious for anyone who has done assembly programming on an x86. To generate code and then convert the generated code we need some utility functions.
type Importer f = FunPtr f -> f
foreign import ccall safe "dynamic" import_Word32ToWord32 :: Importer (Word32 -> Word32)

conv_Word32ToWord32 :: e -> s -> CodeGen e s a -> IO (Word32 -> Word32)
conv_Word32ToWord32 = conv import_Word32ToWord32

conv :: (Importer f) -> e -> s -> CodeGen e s a -> IO f
conv imp e s cg = do
    let cg' = do cg; getEntryPoint
    (_, r) <- runCodeGen cg' e s
    case r of
        Left msg -> error (show msg)
        Right addr -> return $ imp $ castPtrToFunPtr addr
The important function is conv_Word32ToWord32 which takes a block like asm_fac and turns it into a Haskell function. The e and s arguments we can ignore for now, they are for more advanced uses of the machine code generator. Note that there is no type safety here. The asm_fac block has nothing that says what kind of function it might implement, so it's all up to us to use it correctly. (Once you use FFI Haskell isn't any safer than C.) Finally, to use it all:
main = do
    fac <- conv_Word32ToWord32 () () asm_fac
    print (fac 5, fac 10)
Pretty nifty, huh? Calling conv_Word32ToWord32 with the asm_fac argument will emit machine code for the factorial function into a memory buffer and then wrap it up so it can be called as a regular Haskell function. Running the program will print (120,3628800) as expected. Final thoughts Harpy is really cool and uses Haskell type classes in a clever way to make Harpy "assembly" code look almost like normal assembly code. But there is room for improvement. Currently you have no control over the buffer handling; Harpy uses mallocBytes internally. I would like to see the buffer handling (mallocBytes, peek, the IO monad) abstracted out. That way I don't have to generate code into a memory buffer if I don't want to. I could generate code in, e.g., a list. Or I could not generate code at all, but just count bytes. Or I could allocate buffers with something better than mallocBytes. On the whole, I think Harpy is great for experimenting with code generation. More about that soon.

No comments:

Post a Comment