<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-25541020</id><updated>2012-01-25T05:30:07.753Z</updated><category term='Benchmark'/><category term='BASIC'/><category term='Haskell'/><category term='Dependent types'/><category term='LLVM'/><category term='Modules'/><category term='OCaml'/><category term='Code generation'/><category term='Compilation'/><category term='DSL'/><category term='Lambda calculus'/><category term='overloading'/><title type='text'>Things that amuse me</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>38</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-25541020.post-2698087297602282303</id><published>2011-07-09T18:54:00.000+01:00</published><updated>2011-07-09T18:54:06.533+01:00</updated><title type='text'></title><content type='html'>&lt;h2&gt;
Impredicative polymorphism, a use case&lt;/h2&gt;
In a recent &lt;a href="http://stackoverflow.com/questions/6622524/why-is-haskell-sometimes-referred-to-as-best-imperative-language/6622857#6622857"&gt;
question on stackoverflow&lt;/a&gt; I made a comment about how Haskell can
be considered a good imperative language because you can define
abstractions to make it convenient.  When I was going to make my point
by implementing a simple example of it I found that what I wanted to
do no longer works in ghc-7.0.4.  Here's a simple example of what I
wanted to do (which works in ghc-6.12.3).  It's a simple library that
makes Haskell code look a bit like Python.&lt;br /&gt;
&lt;br /&gt;
&lt;pre&gt;{-# LANGUAGE ExtendedDefaultRules, TupleSections #-}
module Main where
import qualified Prelude
import Boa

last_elt(xs) = def $: do
    assert xs "Empty list"
    lst &amp;lt;- var xs              -- Create a new variable, lst
    ret &amp;lt;- var (xs.head)
    while lst $: do
        ret #= lst.head        -- Assign variable ret
        lst #= lst.tail
    return ret

first_elt(xs) = def $: do
    l &amp;lt;- var xs
    l.reverse                  -- Destructive reverse
    return (last_elt(l))

factorial(n) = def $: do
    assert (n&amp;lt;=0) "Negative factorial"
    ret &amp;lt;- var 1
    i &amp;lt;- var n
    while i $: do
        ret *= i
        i -= 1
    return ret

test = def $: do
    print "Hello"
    print ("factorial 10 =", factorial(10))

main = do
    test
    l &amp;lt;- varc [1, 2, 3]
    print ("first and last:",)
    print (first_elt(l),)
    print (last_elt(l))
&lt;/pre&gt;
&lt;br /&gt;
On the whole it's pretty boring, except for one thing.
In imperative languages there is (usually) a disctinction between
l-values and r-values.  An l-value represent a variable, i.e.,
something that can be assigned, whereas an r-value is simply a value.
So in Python, in the statement &lt;tt&gt;x = x + 1&lt;/tt&gt; the &lt;tt&gt;x&lt;/tt&gt; on
the left is an l-value (it's being assigned), whereas on the right
it's an r-value.  You can use the same notation for both in most
imperative languages.

If you want to do the same in Haskell you have (at least) two choices.
First you can unify the concepts of l-value and r-value and have a
runtime test that you only try to assign variables.  So, e.g., &lt;tt&gt;5 =
x&lt;/tt&gt; would type check but have a runtime failure.  I will not dwell
further on this since it's not very Haskelly.&lt;br /&gt;
Instead, we want to have l-values and r-values being statically type
checked.  Here's the interesting bit of the types for a simple
example.

&lt;br /&gt;
&lt;pre&gt;&amp;nbsp;&lt;/pre&gt;
&lt;pre&gt;data LValue a
data RValue a
instance (Num a) =&amp;gt; Num (RValue a)

class LR lr
instance LR RValue
instance LR LValue

var :: RValue a -&amp;gt; IO (forall lr . (LR lr) =&amp;gt; lr a)
(#=) :: LValue a -&amp;gt; RValue a -&amp;gt; IO ()

foo = do
    x &amp;lt;- var 42
    x #= x + 1
&lt;/pre&gt;
&lt;br /&gt;
We have two type constructors &lt;tt&gt;LValue&lt;/tt&gt; and &lt;tt&gt;RValue&lt;/tt&gt;
representing l-values and r-values of some type &lt;tt&gt;a&lt;/tt&gt;.  The
r-values is an instance of &lt;tt&gt;Num&lt;/tt&gt;.  Furthermore, the class
&lt;tt&gt;LR&lt;/tt&gt; where the type is either &lt;tt&gt;LValue&lt;/tt&gt; or
&lt;tt&gt;RValue&lt;/tt&gt;.&lt;br /&gt;
The &lt;tt&gt;var&lt;/tt&gt; function creates a new variable given a value.
The return type of &lt;tt&gt;var&lt;/tt&gt; is the interesting part.  It says that
the return value is polymorphic; it can be used either as an l-value
or as r-value.  Just as we want.&lt;br /&gt;
The assignment operator,
&lt;tt&gt;(#=)&lt;/tt&gt;, takes an l-value, an r-value, and returns nothing.&lt;br /&gt;
&lt;br /&gt;
So in the example we expect &lt;tt&gt;x&lt;/tt&gt; to have type
&lt;tt&gt;forall lr . (LR lr) =&amp;gt; lr a&lt;/tt&gt;, in which case the assignment
will type check.&lt;br /&gt;
If we try to compile this we get

&lt;br /&gt;
&lt;pre&gt;    Illegal polymorphic or qualified type:
      forall (lr :: * -&amp;gt; *). LR lr =&amp;gt; lr a
    Perhaps you intended to use -XImpredicativeTypes
    In the type signature for `var':
      var :: RValue a -&amp;gt; IO (forall lr. LR lr =&amp;gt; lr a)
&lt;/pre&gt;
&lt;br /&gt;
The problem is that universal quantification is not normally allowed
as the argument to a type constructor.  This requires the
impredicative polymorphism extension to ghc.  If we turn it on it
compiles fine in ghc-6.12.3.&lt;br /&gt;
But, with ghc-7.0.4 we get
&lt;br /&gt;
&lt;pre&gt;    Couldn't match expected type `LValue a0'
                with actual type `forall (lr :: * -&amp;gt; *). LR lr =&amp;gt; lr a1'
    In the first argument of `(#=)', namely `x'
    In the expression: x #= x + 1
&lt;/pre&gt;
&lt;br /&gt;
I can't really explain the rational behind the change in the ghc type
system (Simon say it's simpler now), but I feel this has really made
ghc worse for defining DSELs.  When you define a DSEL you usually use
the &lt;tt&gt;do&lt;/tt&gt;-notation for the binding construct in the embedded
language.  This is the only programmable binding construct (by
overloading &lt;tt&gt;(&amp;gt;&amp;gt;=)&lt;/tt&gt;), so there is little choice.  With the
change in ghc-7 it means you can no longer make DSELs with a
polymorhic binding construct (like I wanted here), because the binder
seems to be monomorphic now&lt;br /&gt;
&lt;br /&gt;
Please Simon, can we have polymorphic do bindings back?&lt;br /&gt;
 &lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-2698087297602282303?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/2698087297602282303/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=2698087297602282303' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/2698087297602282303'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/2698087297602282303'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2011/07/impredicative-polymorphism-use-case-in.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-992454843212471185</id><published>2011-05-02T19:28:00.001+01:00</published><updated>2011-05-02T19:30:08.146+01:00</updated><title type='text'></title><content type='html'>&lt;h2&gt;


More points for lazy evaluation&lt;/h2&gt;
In a &lt;a href="http://existentialtype.wordpress.com/2011/04/24/the-real-point-of-laziness/"&gt;recent blog post&lt;/a&gt;&amp;nbsp;Bob Harper shows one use of laziness, but I think he
misses the real import points of laziness.
So I will briefly enumerate what I think are the important points of
lazyness (excuse me for not coming up with any kind of pun about points).
&lt;br /&gt;
&lt;br /&gt;
First, I'd like to say that I don't think strict or lazy matters that
much in practice; they both work fine.  I have programmed in regular
non-strict Haskell as well as strict Haskell, and most things carry
over well to a strict Haskell. &amp;nbsp;Furthermore, when I say strict language I mean a strict language with at least non-termination as an effect; total languages is another matter.&lt;br /&gt;
&lt;h3&gt;


Lazy bindings&lt;/h3&gt;
I like to tell Haskell beginners that any subexpression can be named and "pulled out", modulo name capture, of course.
For instance
&lt;br /&gt;
&lt;pre&gt;    ... e ...
&lt;/pre&gt;
is the same as
&lt;br /&gt;
&lt;pre&gt;    let x = e
    in  ... x ...
&lt;/pre&gt;
&lt;br /&gt;
The key thing is that
&lt;br /&gt;
&lt;pre&gt;    ... e ... e ...
&lt;/pre&gt;
is the same as
&lt;br /&gt;
&lt;pre&gt;    let x = e
    in  ... x ... x ...
&lt;/pre&gt;
so that common subexpressions can be given a name.&lt;br /&gt;
&lt;br /&gt;
This is (in general) just wrong in a strict language, of course.
Just take the simple example
&lt;br /&gt;
&lt;pre&gt;    if c then error "BOO!" else 0
&lt;/pre&gt;
Which is not the same as
&lt;br /&gt;
&lt;pre&gt;    let x = error "BOO!"
    in  if c then x else 0
&lt;/pre&gt;
&lt;br /&gt;
In this case you can easily fix the problem by delaying the
computation with a lambda (a common theme).
&lt;br /&gt;
&lt;pre&gt;    let x () = error "BOO!"
    in  if c then x () else 0
&lt;/pre&gt;
&lt;br /&gt;
But for a slightly more complicated example this simple technique goes
wrong.  Consider
&lt;br /&gt;
&lt;pre&gt;    map (\ a -&amp;gt; a + expensive) xs
&lt;/pre&gt;
where &lt;tt&gt;expensive&lt;/tt&gt; does not depend on &lt;tt&gt;a&lt;/tt&gt;.  In this case
you want to move the expensive computation out of the loop (cf. loop
invariant code motion in imperative languages).  Like so
&lt;br /&gt;
&lt;pre&gt;    let x = expensive
    in  map (\ a -&amp;gt; a + x) xs
&lt;/pre&gt;
In a lazy language &lt;tt&gt;x&lt;/tt&gt; will be evaluated exactly zero times or
once, just as we want.  Using the delaying trick doesn't work here:&lt;br /&gt;
&lt;pre&gt;    let x () = expensive
    in  map (\ a -&amp;gt; a + x ()) xs
&lt;/pre&gt;
since &lt;tt&gt;expensive&lt;/tt&gt; will get evaluated once for every list
element.
&lt;br /&gt;
&lt;br /&gt;
This is easy enough to remedy, by introducing an abstraction for lazy
computations (which will contain an assignment to compute the value
just once).  The signature for the abstract type of lazy values is
something like
&lt;br /&gt;
&lt;pre&gt;    data Lazy a
    delay :: (() -&amp;gt; a) -&amp;gt; Lazy a
    force :: Lazy a -&amp;gt; a
&lt;/pre&gt;
Note that the &lt;tt&gt;delay&lt;/tt&gt; needs to take a function to avoid the
&lt;tt&gt;a&lt;/tt&gt; being evaluated early.
&lt;br /&gt;
(This is probably what Bob would name a benign effect and is easily programmed using&amp;nbsp;&lt;span class="Apple-style-span" style="font-family: monospace; white-space: pre;"&gt;unsafePerformIO&lt;/span&gt;, which means it needs careful consideration.)&lt;br /&gt;
&lt;br /&gt;
And so we get
&lt;br /&gt;
&lt;pre&gt;    let x = delay (\ () -&amp;gt; expensive)
    in  map (\ a -&amp;gt; a + force x) xs
&lt;/pre&gt;
This isn't exactly pretty, but it works fine.  In a language with
macros the ugliness can be hidden better.

&lt;br /&gt;
&lt;h3&gt;


Lazy functions&lt;/h3&gt;
Even strict languages like ML and C have some lazy functions even if
they don't call them that, like SML's&amp;nbsp;&lt;em&gt;if&lt;/em&gt;, &lt;em&gt;andthen&lt;/em&gt;, and
&lt;em&gt;orelse&lt;/em&gt;.  You really need the &lt;i&gt;if&lt;/i&gt; construct to evaluate the
condition and then one of the branches depending on the condition.
But what if I want to define my own type with the same kind of
functions?
In ML I can't, in C I would have to use a macro.
&lt;br /&gt;
&lt;br /&gt;
The ability to define new functions that can be used as control
constructs is especially important when you want to design embedded
domain specific languages.  Take the simple example of the
&lt;tt&gt;when&lt;/tt&gt; (i.e., one-arm if) function in Haskell.
&lt;br /&gt;
&lt;pre&gt;    when :: (Monad m) =&amp;gt; Bool -&amp;gt; m () -&amp;gt; m ()
&lt;/pre&gt;
A quite common use of this function in monadic code is to check for
argument preconditions in a function, like
&lt;br /&gt;
&lt;pre&gt;    f x = do
        when (x &amp;lt; 0) $
            error "x must be &amp;gt;= 0"
        ...
&lt;/pre&gt;
If the &lt;tt&gt;when&lt;/tt&gt; function is strict this is really bad, of course,
since the call to error will happen before the &lt;tt&gt;when&lt;/tt&gt; is
called.
&lt;br /&gt;
&lt;br /&gt;
Again, one can work around this by using lazy values, like
&lt;br /&gt;
&lt;pre&gt;    myAnd :: MyBool -&amp;gt; Lazy MyBool -&amp;gt; MyBool
    ...
    ... myAnd x (delay (\ () -&amp;gt; y)) ...
&lt;/pre&gt;
But in my opinion, this is too ugly to even consider.  The intent of
the function is obscured by the extra noise to make the second
argument lazy.
&lt;br /&gt;
&lt;br /&gt;
I think every language needs a mechanism for defining some form of
call-by-name functions.  And many languages have this in the form of
macros (maybe built in like in Lisp, or as a preprocessor like C).
&lt;br /&gt;
If a language cannot define lazy function it simply lacks in
abstraction power.  I think any kind of fragment of the language
should be nameable and reusable.  (Haskell lacks this ability for
patterns and contexts; a wart in the design.)  So if you notice a
repeated expression pattern, like
&lt;br /&gt;
&lt;pre&gt;    if c then t else False
&lt;/pre&gt;
and cannot give this a name, like
&lt;br /&gt;
&lt;pre&gt;    and c t = if c then t else False
&lt;/pre&gt;
and then use it with the same effect as the orginal expression, well,
then your language is lacking.
&lt;br /&gt;
&lt;br /&gt;
For some language constructs the solution adopted by Smalltalk (and
later Ruby), i.e., a very lightweight way on constructing closures is
acceptable.  So, for instance, I could accept writing
&lt;br /&gt;
&lt;pre&gt;    ... myAnd x {y} ...&lt;/pre&gt;
&lt;br /&gt;
(In SML you could make something using functors, but it's just too ugly to contemplate.)&lt;br /&gt;
&lt;h3&gt;


&lt;/h3&gt;
&lt;h3&gt;


Lazy constructors&lt;/h3&gt;
Lazy constructors were sort of covered in what Bob claimed to be the point of laziness, so I'll just mention them for completeness. &amp;nbsp;Sometimes you need them, but in my experience it's not very often.&lt;br /&gt;
&lt;h3&gt;


Cyclic data structures&lt;/h3&gt;
&lt;div&gt;
This is related to the last point.&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
Sometimes you really want cyclic data structures.  An example are the
Haskell data types in &lt;tt&gt;Data.Data&lt;/tt&gt; that describe data types and
constructors.  A data type descriptor needs to contain a list of its
constructors and a constructor descriptor needs to contain the data type
descriptor.
&lt;br /&gt;
In Haskell this can be described very naturally by having the two
descriptors reference each other.
&lt;br /&gt;
In SML this is not possible.  You will have to break the cycle by
somthing like a reference (or a function).
&lt;br /&gt;
In OCaml you can define cyclic data structures in a similar fashion to
Haskell, so this isn't really a problem with strict languages, but
rather a feature that you can have if you like.
Of course, being able to define cyclic data leads to non-standard
elements in your data types, like
&lt;br /&gt;
&lt;pre&gt;    data Nat = Zero | Succ Zero
    omega :: Nat
    omega = Succ omega
&lt;/pre&gt;
So having the ability to define cyclic data structures is a double
edged sword.
&lt;br /&gt;
I find the lack of a simple way to define cyclic data a minor nuisance only.


&lt;br /&gt;
&lt;h3&gt;


Reuse&lt;/h3&gt;
I've saved my biggest gripe of strict evaluation for last.
&lt;em&gt;Strict evaluation is fundamentally flawed for function reuse.&lt;/em&gt;&lt;br /&gt;
&lt;em&gt;&lt;/em&gt;
What do I mean?  I will illustrate with and example.
&lt;br /&gt;
Consider the &lt;tt&gt;any&lt;/tt&gt; function is Haskell:
&lt;br /&gt;
&lt;pre&gt;any :: (a -&amp;gt; Bool) -&amp;gt; [a] -&amp;gt; Bool
any p = or . map p

&lt;span class="Apple-style-span" style="font-family: Times; white-space: normal;"&gt;It's quite natural to express the &lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Times; white-space: normal;"&gt;&lt;tt&gt;any&lt;/tt&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Times; white-space: normal;"&gt; function by reusing the
&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Times; white-space: normal;"&gt;&lt;tt&gt;map&lt;/tt&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Times; white-space: normal;"&gt; and &lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Times; white-space: normal;"&gt;&lt;tt&gt;or&lt;/tt&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Times; white-space: normal;"&gt; functions.  Unfortunately, it doesn't
behave like we would wish in a strict language.  The &lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Times; white-space: normal;"&gt;&lt;tt&gt;any&lt;/tt&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Times; white-space: normal;"&gt;&amp;nbsp;function should scan the list from the head forwards and as soon as an
element that fulfills the predicate is found it should return true and stop
scanning the list.  In a strict language this would not happen, since
the predicate will be applied to every element before the &lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Times; white-space: normal;"&gt;&lt;tt&gt;or&lt;/tt&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Times; white-space: normal;"&gt;
examines the elements.&lt;/span&gt;&lt;/pre&gt;
So we are forced to manually fuse the two functions, doing so we get:
&lt;br /&gt;
&lt;pre&gt;any :: (a -&amp;gt; Bool) -&amp;gt; [a] -&amp;gt; Bool
any p = foldr False (\ x r -&amp;gt; p x || r)

&lt;span class="Apple-style-span" style="font-family: Times; white-space: normal;"&gt;&lt;pre style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;or :: [Bool] -&amp;gt; Bool
or = foldr False (||)
&lt;/pre&gt;
&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Times; white-space: normal;"&gt;&lt;pre style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;/pre&gt;
&lt;/span&gt;&lt;pre&gt;&lt;/pre&gt;
foldr :: (a -&amp;gt; b -&amp;gt; b) -&amp;gt; b -&amp;gt; [a] -&amp;gt; b
foldr f z [] = z
foldr f z (x:xs) = f x (foldr f z xs)
&lt;/pre&gt;
&lt;pre&gt;&lt;/pre&gt;
&lt;br /&gt;
But the misery doesn't end here.  This still doesn't do the right
thing, because the strict language&amp;nbsp;will recurse all the way down
the list since it will call&amp;nbsp;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;foldr&lt;/span&gt;&amp;nbsp;before&amp;nbsp;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;f&lt;/span&gt;.  So we either have to fuse again, or invent a new version of
&lt;tt&gt;foldr&lt;/tt&gt; that delays the recursive call.&lt;br /&gt;
One more fusion gets us to
&lt;br /&gt;
&lt;pre&gt;any p []     = False
any p (y:ys) = y || any p ys
&lt;/pre&gt;
&lt;pre&gt;&lt;/pre&gt;
So where's the function reuse?  Nowhere in sight.
&lt;br /&gt;
&lt;br /&gt;
With strict evaluation you can no longer with a straight face tell
people:  don't use recursion, reuse the recursion patterns in map,
filter, foldr, etc.  It simply doesn't work (in general).&lt;br /&gt;
&lt;br /&gt;
Using macros doesn't really save us this time, because of the
recursive definitions.  I don't really know of any way to fix this
problem short of making all (most?) functions lazy, because the
problem is pervasive. &amp;nbsp;I.e., in the example it would not be enough to fix&amp;nbsp;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;foldr&lt;/span&gt;; all the functions involved need to be lazy to get the desired semantics.&lt;br /&gt;
&lt;br /&gt;
I find this the biggest annoyance with strict evaluation, but at the
same time it's just an annoyance, because you can always rewrite the
code to make it work.  But strict evaluation really, fundamentally
stops you from reusing functions the way you can with laziness.
&lt;br /&gt;
&lt;br /&gt;
As an aside, the definition of &lt;tt&gt;any&lt;/tt&gt; doesn't work in SML for
another reason as well, namely the value restriction.  But that's just
a language wart, like the monomorphism restriction in Haskell.

&lt;br /&gt;
&lt;h3&gt;


Complexity&lt;/h3&gt;
Another complaint Bob Harper had about lazy evaluation is the
difficulty of finding out the complexity of functions.  I totally
agree that the space complexity is very messy in a lazy language.
For (sequential) time complexity I don't see any need to worry.
&lt;br /&gt;
If  strict a function has O(&lt;i&gt;f&lt;/i&gt;(&lt;i&gt;n&lt;/i&gt;)) complexity in a strict language then
it has complexity O(&lt;i&gt;f&lt;/i&gt;(&lt;i&gt;n&lt;/i&gt;)) in a lazy language as well. &amp;nbsp;Why worry? :)

&lt;br /&gt;
&lt;h3&gt;


Summing up&lt;/h3&gt;
One of the most important principles in all software design is DRY,
i.e., Don't Repeat Yourself.  This means that common patterns that we
can find in a program should be abstractable so you can reuse them.
In a lazy language you can use functions for abstracting control
structures, which a strict language does not permit (at least not in a
convenient way).  This can be mitigated by providing some other
abstraction mechanism, like macros (hopefully some kind of sane macros).
&lt;br /&gt;
For the reasons above, I find it more convenient to program in a lazy
language than a strict one.  But a strict language with the ability to define lazy bindings, lazy functions, and lazy constructors is good enough that I find it usable.
&lt;br /&gt;
&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-992454843212471185?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/992454843212471185/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=992454843212471185' title='32 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/992454843212471185'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/992454843212471185'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2011/05/more-points-for-lazy-evaluation-in.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>32</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-1085569182406070563</id><published>2011-04-20T23:25:00.000+01:00</published><updated>2011-04-20T23:25:53.743+01:00</updated><title type='text'></title><content type='html'>&lt;h2&gt;

Ugly memoization&lt;/h2&gt;
Here's a problem that I recently ran into.  I have a function taking a
string and computing some value.  I call this function a lot, but a
lot of the time the argument has occurred before.  The function is
reasonably expensive, about 10 us.  Only about 1/5 of the calls to the
function has a new argument.

&lt;br /&gt;
&lt;br /&gt;
So naturally I want to memoize the function.  Luckily Hackage has a
couple packages for memoization.  I found
&lt;a href="http://hackage.haskell.org/package/data-memocombinators"&gt;data-memocombinators&lt;/a&gt;
and
&lt;a href="http://hackage.haskell.org/package/MemoTrie"&gt;MemoTrie&lt;/a&gt;
and decided to try them.

The basic idea with memoization is that you have a function like
&lt;br /&gt;
&lt;br /&gt;
&lt;pre&gt;  memo :: (a-&amp;gt;b) -&amp;gt; (a-&amp;gt;b)
&lt;/pre&gt;
&lt;pre&gt;
&lt;/pre&gt;
I.e., you give a function to memo and you get a new function of the
same type back.  This new function behaves like the original one, but
it remembers every time it is used and the next time it gets the same
argument it will just return the remembered result.
This is only safe in a pure language, but luckily Haskell is pure.&lt;br /&gt;
In an imperative language you can use a mutable memo table that stores
all the argument-result pairs and updates the memo table each time the
function is used.  But how is it even possible to implement that in a
pure language?  The idea is to lazily construct the whole memo table
in the call to memo, and it will then be lazily filled in.&lt;br /&gt;
Assume that all values of the argument type &lt;em&gt;a&lt;/em&gt; can be
enumerated by the method &lt;tt&gt;enumerate&lt;/tt&gt;, we could then write
memo like this:
&lt;br /&gt;
&lt;pre&gt;
&lt;/pre&gt;
&lt;pre&gt;  memo f =
      let table = [ (x, f x) | x &amp;lt;- enumerate ]
      in  \ y -&amp;gt; let Just r = lookup y table in r
&lt;/pre&gt;
&lt;pre&gt;
&lt;/pre&gt;
Note how the memo table is constructed given just f, and this memo
table is then used in the returned function.&lt;br /&gt;
The type of this function would be something like
&lt;br /&gt;
&lt;pre&gt;
&lt;/pre&gt;
&lt;pre&gt;  memo (Enumerate a, Eq a) =&amp;gt; (a-&amp;gt;b) -&amp;gt; (a-&amp;gt;b)
&lt;/pre&gt;
&lt;pre&gt;
&lt;/pre&gt;
assuming that the class &lt;tt&gt;Enumerate&lt;/tt&gt; has the magic method &lt;tt&gt;enumerate&lt;/tt&gt;.&lt;br /&gt;
&lt;br /&gt;
This just a very simplified example, if you tried to use this it would
be terrible because the returned function does linear lookup in a
list.  Instead we want some kind of search tree, which is what the two
packages I mention implement.  The MemoTrie package does this in a
really beautiful way, I recommend reading &lt;a href="http://conal.net/blog/posts/elegant-memoization-with-functional-memo-tries/"&gt;Conal's blog post&lt;/a&gt; about it.&lt;br /&gt;
OK, enough preliminaries.
I used &lt;a href="http://hackage.haskell.org/package/criterion"&gt;criterion&lt;/a&gt; to perform the benchmarking, and I tried with no
memoization (&lt;em&gt;none&lt;/em&gt;), memo-combinators (&lt;em&gt;comb&lt;/em&gt;), and
MemoTrie (&lt;em&gt;beau&lt;/em&gt;).  I had a test function taking about 10us,
and then i called this functions with different number of repeated
arguments: 1, 2, 5, and 10.  I.e., 5 means that each argument occurred
5 times as the memoized function was called.

&lt;br /&gt;
&lt;br /&gt;
&lt;table border="4"&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;th&gt;&lt;/th&gt; &lt;td&gt;1&lt;/td&gt; &lt;td&gt;2&lt;/td&gt; &lt;td&gt;5&lt;/td&gt; &lt;td&gt;10&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;none &lt;/td&gt; &lt;td&gt;10.7&lt;/td&gt; &lt;td&gt;10.7&lt;/td&gt; &lt;td&gt;10.7&lt;/td&gt; &lt;td&gt;10.7&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;comb &lt;/td&gt; &lt;td&gt;62.6&lt;/td&gt; &lt;td&gt;52.2&lt;/td&gt; &lt;td&gt;45.8&lt;/td&gt; &lt;td&gt;43.4&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;beau &lt;/td&gt; &lt;td&gt;27.6&lt;/td&gt; &lt;td&gt;17.0&lt;/td&gt; &lt;td&gt;10.4&lt;/td&gt; &lt;td&gt;8.1&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;br /&gt;
So with no memoization the time per call was 10.7 us all the time, no
surprise there.  With the memo combinators it was much slower than no
memoization; the overhead for looking something up is bigger than the
cost of computing the result.  So that was a failure.  The MemoTrie
does better, at about an argument repetition of five it starts to
break even, and at ten it's a little faster to memoize.&lt;br /&gt;
&lt;br /&gt;
Since I estimated my repetition factor in the real code to be about
five even the fastest memoization would not be any better then
recomputation.  So now what?  Give up?  Of course not!  It's time to
get dirty.&lt;br /&gt;
&lt;br /&gt;
Once you know a function can be implemented in a pure way, there's no
harm in implementing the same function in an impure way as long as it
presents the pure interface.  So lets write the memo function the way
it would be done in, e.g., Scheme or ML.  We will use a reference to
hold a memo table that gets updated on each call.  Here's the code,
with the type that the function gets.
&lt;br /&gt;
&lt;pre&gt;
&lt;/pre&gt;
&lt;pre&gt;import Data.IORef
import qualified Data.Map as M

memoIO :: (Ord a) =&amp;gt; (a -&amp;gt; b) -&amp;gt; IO (a -&amp;gt; IO b)
memoIO f = do
  v &amp;lt;- newIORef M.empty
  let f' x = do
        m &amp;lt;- readIORef v
        case M.lookup x m of
          Nothing -&amp;gt; do let { r = f x }; writeIORef v (M.insert x r m); return r
          Just r  -&amp;gt; return r
  return f'
&lt;/pre&gt;
&lt;pre&gt;
&lt;/pre&gt;
The &lt;tt&gt;memoIO&lt;/tt&gt; allocated a reference with an empty memo table.
We then define a new function, &lt;tt&gt;f'&lt;/tt&gt;, which when it's called
with get the memo table and look up the argument.  If the argument is
in the table then we just return the result, if it's not then we
compute the result, store it in the table, and return it.
Good old imperative programming (see below why this code is not good
imperative code).&lt;br /&gt;
&lt;br /&gt;
But, horror, now the type is all wrong, there's IO in two places.
The function we want to implement is actually pure.  So what to do?
Well, if you have a function involving the IO type, but you can prove
it is actually pure, then (and only then) you are allowed to use
&lt;tt&gt;unsafePerformIO&lt;/tt&gt;.&lt;br /&gt;
&lt;br /&gt;
I'll wave my hands instead of a proof (but more later), and here we go
&lt;br /&gt;
&lt;pre&gt;
&lt;/pre&gt;
&lt;pre&gt;  memo :: (Ord a) =&amp;gt; (a -&amp;gt; b) -&amp;gt; (a -&amp;gt; b)
  memo f = let f' = unsafePerformIO (memoIO f) in \ x -&amp;gt; unsafePerformIO (f' x)
&lt;/pre&gt;
&lt;pre&gt;
&lt;/pre&gt;
Wow, two &lt;tt&gt;unsafePerformIO&lt;/tt&gt; on the same line.  It doesn't get
much less safe than that.&lt;br /&gt;
Let's benchmark again:

&lt;br /&gt;
&lt;br /&gt;
&lt;table border="4"&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;th&gt;&lt;/th&gt; &lt;td&gt;1&lt;/td&gt; &lt;td&gt;2&lt;/td&gt; &lt;td&gt;5&lt;/td&gt; &lt;td&gt;10&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;none &lt;/td&gt; &lt;td&gt;10.7&lt;/td&gt; &lt;td&gt;10.7&lt;/td&gt; &lt;td&gt;10.7&lt;/td&gt; &lt;td&gt;10.7&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;comb &lt;/td&gt; &lt;td&gt;62.6&lt;/td&gt; &lt;td&gt;52.2&lt;/td&gt; &lt;td&gt;45.8&lt;/td&gt; &lt;td&gt;43.4&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;beau &lt;/td&gt; &lt;td&gt;27.6&lt;/td&gt; &lt;td&gt;17.0&lt;/td&gt; &lt;td&gt;10.4&lt;/td&gt; &lt;td&gt;8.1&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;ugly &lt;/td&gt; &lt;td&gt;13.9&lt;/td&gt; &lt;td&gt;7.7&lt;/td&gt; &lt;td&gt;3.9&lt;/td&gt; &lt;td&gt;2.7&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;br /&gt;
Not too shabby, using the ugly memoization is actually a win already
at two, and just a small overhead if the argument occurs once. &amp;nbsp;We
have a winner!&lt;br /&gt;
&lt;br /&gt;
No so fast, there's

&lt;br /&gt;
&lt;h3&gt;

A snag&lt;/h3&gt;
My real code can actually be multi-threaded, so the memo function had
better work in a multi-threaded setting.  Well, it doesn't.  There's
no guarantee about &lt;tt&gt;readIORef&lt;/tt&gt; and &lt;tt&gt;writeIORef&lt;/tt&gt; when
doing multi-threading.&lt;br /&gt;
So we have to rewrite it.  Actually, the code I first wrote is the one
below; I hardly ever use IORef because I want it to work with
multi-threading.

&lt;br /&gt;
&lt;pre&gt;
&lt;/pre&gt;
&lt;pre&gt;memoIO f = do
  v &amp;lt;- newMVar M.empty
  let f' x = do
        m &amp;lt;- takeMVar v
        case M.lookup x m of
          Nothing -&amp;gt; do let { r = f x }; putMVar v (M.insert x r m); return r
          Just r  -&amp;gt; do                  putMVar v m;                return r
  return f'
&lt;/pre&gt;
&lt;pre&gt;
&lt;/pre&gt;
So now we use an &lt;tt&gt;MVar&lt;/tt&gt; instead.  This makes it thread safe.
Only one thread can execute between the &lt;tt&gt;takeMVar&lt;/tt&gt; and the
&lt;tt&gt;putMVar&lt;/tt&gt;.  This guarantees than only one thread can update the
memo table at a time.  If two threads try at the same time one has to
wait a little.  How long?  The time it takes for the lookup, plus some
small constant.  Remember that Haskell is lazy, the the &lt;tt&gt;(f x)&lt;/tt&gt;
is not actually computed with the lock held, which is good.&lt;br /&gt;
So I think this is a perfectly reasonable &lt;tt&gt;memoIO&lt;/tt&gt;.  And we can
do the same unsafe trick as before and make it pure.  Performance of
this version is the same as with the &lt;tt&gt;IORef&lt;/tt&gt;.&amp;nbsp;
&lt;br /&gt;
&lt;br /&gt;
Ahhhh, bliss.

But wait, there's

&lt;br /&gt;
&lt;h3&gt;

Another snag&lt;/h3&gt;
That might look reasonable, but in fact the &lt;tt&gt;memo&lt;/tt&gt; function is
broken now.  It appears to work, but here's a simple use that fails
&lt;br /&gt;
&lt;pre&gt;
&lt;/pre&gt;
&lt;pre&gt;  sid :: String -&amp;gt;; String
  sid = memo id

  fcn s = sid (sid s)
&lt;/pre&gt;
&lt;pre&gt;
&lt;/pre&gt;
What will happen here?  The outer call to &lt;tt&gt;sid&lt;/tt&gt; will execute
the &lt;tt&gt;takeMVar&lt;/tt&gt; and then do the lookup.  Doing the lookup with
evaluate the argument, &lt;tt&gt;x&lt;/tt&gt;.  But this argument is another call
to &lt;tt&gt;sid&lt;/tt&gt;, this will try to execute the &lt;tt&gt;takeMVar&lt;/tt&gt;.
Disaster has struck, deadlock.&lt;br /&gt;
&lt;br /&gt;
What happened here?  The introduction of &lt;tt&gt;unsafePerformIO&lt;/tt&gt;
ruined the sequencing guaranteed by the IO monad that would have
prevented the deadlock if we had used &lt;tt&gt;memoIO&lt;/tt&gt;.  I got what I
deserved for using &lt;tt&gt;unsafePerformIO&lt;/tt&gt;.&lt;br /&gt;
&lt;br /&gt;
Can it be repaired?  Well, we could make sure &lt;tt&gt;x&lt;/tt&gt; is fully
evaluated before grabbing the lock.  I settled for a different repair,
where the lock is held in a shorter portion of the code.
&lt;br /&gt;
&lt;pre&gt;
&lt;/pre&gt;
&lt;pre&gt;memoIO f = do
  v &amp;lt;- newMVar M.empty
  let f' x = do
        m &amp;lt;- readMVar v
        case M.lookup x m of
          Nothing -&amp;gt; do let { r = f x }; m &amp;lt;- takeMVar v; putMVar v (M.insert x r m); return r
          Just r  -&amp;gt; return r
  return f'
&lt;/pre&gt;
&lt;pre&gt;
&lt;/pre&gt;
This solution has its own problem.  It's now possible for several threads
to compute &lt;tt&gt;(f x)&lt;/tt&gt; for the same &lt;tt&gt;x&lt;/tt&gt; and the result of
all but one of those will be lost by overwriting the table.  This is a
price I'm willing to pay for this application.

&lt;br /&gt;
&lt;h3&gt;

Moral&lt;/h3&gt;
Yes, you can use imperative programming to implement pure functions.
But the onus is on you to prove that it is safe.  This is not as easy
as you might think.  I believe my final version is correct (with the
multiple computation caveat), but I'm not 100% sure.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-1085569182406070563?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/1085569182406070563/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=1085569182406070563' title='24 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/1085569182406070563'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/1085569182406070563'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2011/04/ugly-memoization-heres-problem-that-i.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>24</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-2165437207763037673</id><published>2011-04-10T19:51:00.002+01:00</published><updated>2011-04-10T19:51:44.670+01:00</updated><title type='text'></title><content type='html'>Phew!  Cleaned out a lot of spam comments in my blog.  Hopefully my new settings will prevent the crazy onslaught of spammers.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-2165437207763037673?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/2165437207763037673/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=2165437207763037673' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/2165437207763037673'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/2165437207763037673'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2011/04/phew-cleaned-out-lot-of-spam-comments.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-4910575514980456234</id><published>2009-06-10T22:35:00.002+01:00</published><updated>2009-06-10T23:11:21.318+01:00</updated><title type='text'></title><content type='html'>&lt;h2&gt;More LLVM&lt;/h2&gt;

Recently someone asked me on #haskell if you could use the Haskell LLVM bindings to compile some abstract syntax to a Haskell function.  Naturally I said yes, but then I realized I had only done it for a boring language with just one type.  I had no doubt that it could be done for more complicated languages with multiple types, but it might not be totally obvious how.  So I decided to write a simple compiler, and this blog post is the result.

First, a simple example:
&lt;pre&gt;
    main = do
        let f :: Double -&gt; Double
            Just f = compile "\\ (x::Double) -&gt; if x == 0 then 0 else 1/(x*x)"
        print (f 2, f 3, f 0)
&lt;/pre&gt;
Running this program produces (as expected)
&lt;pre&gt;
    (0.25,0.1111111111111111,0.0)
&lt;/pre&gt;
What has happened is that the string has been parsed to an abstract syntax tree, translated into LLVM code, then to machine code, and finally turned back into a Haskell callable function.  Many things can go wrong along the way, like syntax and type errors, so &lt;tt&gt;compile&lt;/tt&gt; returns a &lt;tt&gt;Maybe&lt;/tt&gt; type to indicate if things went right or wrong.  (A more serious version of the &lt;tt&gt;compile&lt;/tt&gt; function would return an error message when something has gone wrong.)

The definition of the compilation function is simple and illustrates the flow of the compiler
&lt;pre&gt;
compile :: (Translate a) =&gt; String -&gt; Maybe a
compile = fmap translate . toTFun &amp;lt;=&amp;lt; mParseUFun
&lt;/pre&gt;
The context &lt;tt&gt;Translate&lt;/tt&gt; is there to limit the types that can actually be translated; it's a necessary evil and exactly what types are allowed depends on how advanced we make the compiler.  Had we ignored the &lt;tt&gt;Maybe&lt;/tt&gt; type the definitions would have been
&lt;pre&gt;
compile = translate . toTFun . mParseUFun
&lt;/pre&gt;
which says, first parse to the type &lt;tt&gt;UFun&lt;/tt&gt; (untyped expressions), then type check and turn it into the type &lt;tt&gt;TFun a&lt;/tt&gt;, and finally translate &lt;tt&gt;TFun a&lt;/tt&gt; into an &lt;tt&gt;a&lt;/tt&gt; by LLVM compilation.

Let's see how this all works.

&lt;h3&gt;The UExp module&lt;/h3&gt;
The first step is to just define an abstract syntax for the expressions that we want to handle.  I'm only allowing leading lambdas (this a very first order language), so there's a distiction between the top level &lt;tt&gt;UFun&lt;/tt&gt; type and the expression type &lt;tt&gt;UExp&lt;/tt&gt;.  The U prefix indicates that this version of the syntax is not yet type checked.

The definition is pretty boring, but here it is:
&lt;pre&gt;
{-# OPTIONS_GHC -fno-warn-unused-binds #-}
{-# LANGUAGE RecordWildCards #-}
module UExp(Id, UFun(..), UTyp(..), UExp(..), parseUFun, showOp, mParseUFun) where
import Data.Maybe
import Data.List
import Data.Function
import Text.ParserCombinators.Parsec
import Text.ParserCombinators.Parsec.Expr
import Text.ParserCombinators.Parsec.Token
import Text.ParserCombinators.Parsec.Language

type Id = String

data UFun = UFun [(Id, UTyp)] UExp

data UTyp = UTBol | UTDbl

data UExp
    = UDbl Double        -- ^ Double literal
    | UBol Bool          -- ^ Bool literal
    | UVar Id            -- ^ Variable
    | UApp Id [UExp]     -- ^ Function application
    | ULet Id UExp UExp  -- ^ Local binding
&lt;/pre&gt;

Naturally, we want to be able to show the expressions, if nothing else so for debugging.  So I make a &lt;tt&gt;Show&lt;/tt&gt; instance that shows them in a nice way respecting operator precedences etc.  There's nothing exciting going on, the large number of lines is just to cover operator printing.
&lt;pre&gt;
instance Show UFun where
    showsPrec p (UFun [] e) = showsPrec p e
    showsPrec p (UFun vts e) = showParen (p&gt;0) (showString "\\ " . foldr (.) (showString "-&gt; ") (map f vts) . showsPrec 0 e)
      where f (v, t) = showParen True (showString v . showString " :: " . showsPrec 0 t) . showString " "

instance Show UTyp where
    showsPrec _ UTDbl = showString "Double"
    showsPrec _ UTBol = showString "Bool"

instance Show UExp where
    showsPrec p (UDbl d) = showsPrec p d
    showsPrec p (UBol b) = showsPrec p b
    showsPrec _ (UVar i) = showString i
    showsPrec p (UApp "if" [c, t, e]) =
      showParen (p&gt;0) (showString "if " . showsPrec 0 c . showString " then " . showsPrec 0 t . showString " else " . showsPrec 0 e)
    showsPrec p (UApp op [a, b]) = showOp p op a b
    showsPrec _ (UApp op _) = error $ "Uxp.show " ++ op
    showsPrec p (ULet i e b) =
      showParen (p&gt;0) (showString "let " . showString i . showString " = " . showsPrec 0 e . showString " in " . showsPrec 0 b)

showOp :: (Show a, Show b) =&gt; Int -&gt; String -&gt; a -&gt; b -&gt; String -&gt; String
showOp q sop a b = showParen (q&gt;mp) (showsPrec lp a . showString sop . showsPrec rp b)
  where (lp,mp,rp) = case lookup sop ops of
                    Just (p, AssocLeft)  -&gt; (p,   p, p+1)
                    Just (p, AssocRight) -&gt; (p+1, p, p)
                    Just (p, AssocNone)  -&gt; (p+1, p, p+1)
                    Nothing              -&gt; (9,   9, 10)

ops :: [(String, (Int, Assoc))]
ops = [("+",  (6, AssocLeft)),
       ("-",  (6, AssocLeft)),
       ("*",  (7, AssocLeft)),
       ("/",  (7, AssocLeft)),
       ("==", (4, AssocNone)),
       ("&amp;lt;=", (4, AssocNone)),
       ("&amp;&amp;", (3, AssocRight)),
       ("||", (2, AssocRight))
      ]
&lt;/pre&gt;

We also want to be able to parse, so I'm using &lt;tt&gt;Parsec&lt;/tt&gt; to parse the string and produce an AST.  Again, there's nothing interesting going on.  I use the Haskell lexical analysis provided by &lt;tt&gt;Parsec&lt;/tt&gt;.  This is available as a &lt;tt&gt;TokenParser&lt;/tt&gt; record, which can be conveniently opened with the &lt;tt&gt;RecordWildcard&lt;/tt&gt; notation &lt;tt&gt;TokenParser{..}&lt;/tt&gt;.
&lt;pre&gt;
parseUFun :: SourceName -&gt; String -&gt; Either ParseError UFun
parseUFun = parse $ do f &amp;lt;- pFun; eof; return f
  where TokenParser{..} = haskell
        pFun = do
            vts &amp;lt;- between (reservedOp "\\")
                           (reservedOp "-&gt;")
                           (many $ parens $ do v &amp;lt;- identifier; reservedOp "::"; t &amp;lt;- pTyp; return (v, t))
               &amp;lt;|&gt; return []
            e &amp;lt;- pExp
            return $ UFun vts e
        pTyp = choice [do reserved "Bool"; return UTBol, do reserved "Double"; return UTDbl]
        pExp = choice [pIf, pLet, pOExp]
        pIf = do reserved "if"; c &amp;lt;- pExp; reserved "then"; t &amp;lt;- pExp; reserved "else"; e &amp;lt;- pExp; return $ UApp "if" [c, t, e]
        pLet = do reserved "let"; i &amp;lt;- identifier; reservedOp "="; e &amp;lt;- pExp; reserved "in"; b &amp;lt;- pExp; return $ ULet i e b
        pOExp = buildExpressionParser opTable pAExp
        pAExp = choice [pDbl, pVar, parens pExp]
        pVar = fmap eVar identifier
        pDbl = fmap (either (UDbl . fromInteger) UDbl) naturalOrFloat
        eVar i = if i == "False" then UBol False else if i == "True" then UBol True else UVar i

        opTable = reverse . map (map mkOp) . groupBy ((==) `on` prec) . sortBy (compare `on` prec) $ ops
          where mkOp (s, (_, a)) = Infix (do reservedOp s; return $ \ x y -&gt; UApp s [x, y]) a
                prec = fst . snd

mParseUFun :: String -&gt; Maybe UFun
mParseUFun = either (const Nothing) Just . (parseUFun "")
&lt;/pre&gt;

The parser is packaged up in &lt;tt&gt;mParseUFun&lt;/tt&gt; which returns an AST if it all worked.


&lt;h3&gt;The TExp module&lt;/h3&gt;
Since the LLVM API is typed it's much easier to translate a typed abstract syntax tree than an untyped abstract syntax tree.  The &lt;tt&gt;TExp&lt;/tt&gt; module contains the definition of the typed AST and the type checker that converts to it.

There are many ways to formulate type safe abstract syntax trees.  I've chosen to use GADTs.  I've also picked to represent variables (still) by identifiers, which means that the syntax tree is not necessarily type safe.  Furthermore, I've chosen a very limited way to represent function application since this is all I need for this example.  The variantions on this are endless.

&lt;pre&gt;
{-# LANGUAGE GADTs, ExistentialQuantification, PatternGuards #-}
module TExp(Id,
            TFun(..), TTyp(..), TExp(..), DblOp(..), BolOp(..), CmpOp(..),
            Equal(..), test,
            Type(..),
            AFun(..), extractFun,
            typeCheck, toTFun) where
import Data.Maybe
import Control.Monad
import UExp

data TFun a where
    TBody :: TExp a                 -&gt; TFun a
    TLam  :: Id -&gt; TTyp a -&gt; TFun b -&gt; TFun (a-&gt;b)

data TTyp a where
    TTBol ::                     TTyp Bool
    TTDbl ::                     TTyp Double
    TTArr :: TTyp a -&gt; TTyp b -&gt; TTyp (a-&gt;b)

data TExp a where
    TDbl   :: Double                                            -&gt; TExp Double
    TBol   :: Bool                                              -&gt; TExp Bool
    TDblOp :: DblOp     -&gt; TExp Double -&gt; TExp Double           -&gt; TExp Double
    TBolOp :: BolOp     -&gt; TExp Bool   -&gt; TExp Bool             -&gt; TExp Bool
    TCmpOp :: CmpOp     -&gt; TExp Double -&gt; TExp Double           -&gt; TExp Bool
    TIf    :: TExp Bool -&gt; TExp a      -&gt; TExp a                -&gt; TExp a
    TLet   :: Id        -&gt; TTyp a      -&gt; TExp a      -&gt; TExp b -&gt; TExp b
    TVar   :: Id                                                -&gt; TExp a

data DblOp = DAdd | DSub | DMul | DDiv
    deriving (Eq, Show)

data BolOp = BAnd | BOr
    deriving (Eq, Show)

data CmpOp = CEq | CLe
    deriving (Eq, Show)
&lt;/pre&gt;
So for instance, &lt;tt&gt;UApp "+" [UVar "x", UDbl 2.2]&lt;/tt&gt; will be represented by &lt;tt&gt;TDblOp DAdd (TVar "x") (TDbl 2.2)&lt;/tt&gt; which has type &lt;tt&gt;TExp Double&lt;/tt&gt;.  So the type of the expression is now accurately reflected in the type of the syntax tree.  Even the &lt;tt&gt;UTyp&lt;/tt&gt; type now has a typed equivalent where the real type is reflected.

For completeness, here's some code for pretty printing etc.
&lt;pre&gt;
{-# LANGUAGE GADTs, ExistentialQuantification, PatternGuards #-}
module TExp(Id,
            TFun(..), TTyp(..), TExp(..), DblOp(..), BolOp(..), CmpOp(..),
            Equal(..), test,
            Type(..),
            AFun(..), extractFun,
            typeCheck, toTFun) where
import Data.Maybe
import Control.Monad
import UExp

instance Show (TFun a) where
    showsPrec p (TBody e) = showsPrec p e
    showsPrec p (TLam i t e) = showParen (p&gt;0)
      (showString "\\ " . showParen True (showString i . showString " :: " . showsPrec 0 t) . showString " -&gt; " . showsPrec 0 e)

instance Show (TTyp a) where
    showsPrec _ TTBol = showString "Bool"
    showsPrec _ TTDbl = showString "Double"
    showsPrec p (TTArr a b) = showParen (p&gt;5) (showsPrec 6 a . showString " -&gt; " . showsPrec 5 b)

instance Show (TExp a) where
    showsPrec p (TDbl d) = showsPrec p d
    showsPrec p (TBol b) = showsPrec p b
    showsPrec _ (TVar i) = showString i
    showsPrec p (TDblOp op a b) = showOp p (fromJust $ lookup op [(DMul, "*"), (DAdd, "+"), (DSub, "-"), (DDiv, "/")]) a b
    showsPrec p (TBolOp op a b) = showOp p (fromJust $ lookup op [(BAnd, "&amp;&amp;"), (BOr, "||")]) a b
    showsPrec p (TCmpOp op a b) = showOp p (fromJust $ lookup op [(CEq, "=="), (CLe, "&amp;lt;=")]) a b
    showsPrec p (TIf c t e) = showParen (p&gt;0) (showString "if " . showsPrec 0 c . showString " then " . showsPrec 0 t . showString " else " . showsPrec 0 e)
    showsPrec p (TLet i _ e b) =
      showParen (p&gt;0) (showString "let " . showString i . showString " = " . showsPrec 0 e . showString " in " . showsPrec 0 b)

&lt;/pre&gt;

The aim of the type checker is to transform from the &lt;tt&gt;UExp&lt;/tt&gt; type to the &lt;tt&gt;TExp&lt;/tt&gt; type, so basically
&lt;pre&gt;
typeCheckExp :: UExp -&gt; TExp a
&lt;/pre&gt;
But things can go wrong, so it's impossible to always return a &lt;tt&gt;TExp&lt;/tt&gt;, so let's use a &lt;tt&gt;Maybe&lt;/tt&gt; type:
&lt;pre&gt;
typeCheckExp :: UExp -&gt; Maybe (TExp a)
&lt;/pre&gt;
But wait!  This type is totally wrong.  Why?  Because it promises that given a &lt;tt&gt;UExp&lt;/tt&gt; the type checker can return &lt;b&gt;any&lt;/b&gt; type, i.e., writing out the (normally implicit) quantifier the type is:
&lt;pre&gt;
typeCheckExp :: forall a . UExp -&gt; Maybe (TExp a)
&lt;/pre&gt;
But this is not the case, the type checker will figure out a type and return an expression with this specific type, so the type we really want is
&lt;pre&gt;
typeCheckExp :: exists a . UExp -&gt; Maybe (TExp a)
&lt;/pre&gt;
Haskell doesn't allow this type to be written this way; we need to package up the existential type in a data type.  Like so:
&lt;pre&gt;
data ATExp = forall a . TExp a ::: TTyp a

data AFun = forall a . AFun (TFun a) (TTyp a)
&lt;/pre&gt;
It might look funny that the existential type is written with a forall, but it makes sense when looking at the type of the constructor function (but not when doing pattern matching).

Now we can attempt a couple of cases of the type checker:
&lt;pre&gt;
typeCheckExp :: UExp -&gt; Maybe ATExp
typeCheckExp (UDbl d) =
    return $ TDbl d ::: TTDbl
typeCheckExp (UBol b) =
    return $ TBol b ::: TTBol
&lt;/pre&gt;
They look quite nice, and they actually work.  So what about something more complicated, like arithmetic?
&lt;pre&gt;
typeCheckExp (UApp op [a, b]) | Just dop &lt;- lookup op [("+", DAdd), ("-", DSub), ("*", DMul), ("/", DDiv)] = do
    a' ::: TTDbl &amp;lt;- typeCheckExp a
    b' ::: TTDbl &amp;lt;- typeCheckExp b
    return $ TDblOp dop a' b' ::: TTDbl
&lt;/pre&gt;
First we conveniently look up the operator among the arithmetic operators, then we recursively call the type checker for the operands.  We do this in the &lt;tt&gt;Maybe&lt;/tt&gt; monad.  If the type checking a subterm fails that's automatically propagated, and furthermore, if the type checking of a subterm does not yield a &lt;tt&gt;TTDbl&lt;/tt&gt; type then this will cause the pattern matching to fail, and this will generate a &lt;tt&gt;Nothing&lt;/tt&gt; in the maybe monad, so we used failing pattern matching to our advantage here.

The interesting case is checking &lt;tt&gt;UIf&lt;/tt&gt;, because here both arms have to have the same type, but we don't know which one.  Here's an attempt:
&lt;pre&gt;
typeCheckExp (UApp "if" [c,t,e]) = do
    c' ::: TTBol &amp;lt;- typeCheckExp c
    t' ::: tt    &amp;lt;- typeCheckExp t
    e' ::: te    &amp;lt;- typeCheckExp e
    guard (tt == te)
    return $ TIf c' t' e' ::: tt
&lt;/pre&gt;
But this doesn't type check.  The guard ensures that the two arms have the same type, but that's something we know, but the Haskell type checker doesn't.  So it rejects the &lt;tt&gt;TIf&lt;/tt&gt;, because ut can't see that both arms have the same type.  We need to be trickier in doing the equality test so that it reflects the equality on the type level.  There's a standard trick for this, namely this type:
&lt;pre&gt;
data Equal a b where
    Eq :: Equal a a
&lt;/pre&gt;
If you ever have a value (which must be &lt;tt&gt;Eq&lt;/tt&gt;) of type &lt;tt&gt;Equal foo bar&lt;/tt&gt; then the type checker will know that &lt;tt&gt;foo&lt;/tt&gt; and &lt;tt&gt;bar&lt;/tt&gt; are actually the same type.  So let's code equality for &lt;tt&gt;TTyp&lt;/tt&gt;.
&lt;pre&gt;
test :: TTyp a -&gt; TTyp b -&gt; Maybe (Equal a b)
test TTBol TTBol = return Eq
test TTDbl TTDbl = return Eq
test (TTArr a b) (TTArr a' b') = do
    Eq &amp;lt;- test a a'
    Eq &amp;lt;- test b b'
    return Eq
test _ _ = mzero
&lt;/pre&gt;
This code is worth pondering for a while, it's actually rather clever (I take no credit for it; I stole it from Tim Sheard).  Why does even the first clause type check?  Because &lt;tt&gt;TTBol&lt;/tt&gt; has type &lt;tt&gt;TTyp Bool&lt;/tt&gt;, so both the type variables (a and b) must be &lt;tt&gt;TTBool&lt;/tt&gt; in the first clause, which means that &lt;tt&gt;Eq :: Equal TBol TBol&lt;/tt&gt; is what we're returning.

Equipped with this equality we can try type checking again.
&lt;pre&gt;
typeCheckExp (UApp "if" [c,t,e]) = do
    c' ::: TTBol &amp;lt;- typeCheckExp c
    t' ::: tt    &amp;lt;- typeCheckExp t
    e' ::: te    &amp;lt;- typeCheckExp e
    Eq &amp;lt;- test tt te
    return $ TIf c' t' e' ::: tt
&lt;/pre&gt;
And amazingly this actually works!  (A tribute to the hard working ghc implementors.)

One (rather large) fly is left in the ointment.  What about variables?  What do we do when we type check &lt;tt&gt;UVar&lt;/tt&gt;?  We must check that there's a bound variable with the right type around.  So the type checker needs to be extended with an environment where variables can be looked up.  It's mostly straight forward.  The environment simply maps a variable to &lt;tt&gt;ATExp&lt;/tt&gt;.  So here's the complete type checker as it's actually defined.
&lt;pre&gt;
type Env = [(Id, ATExp)]

typeCheckExp :: Env -&gt; UExp -&gt; Maybe ATExp
typeCheckExp _ (UDbl d) =
    return $ TDbl d ::: TTDbl
typeCheckExp _ (UBol b) =
    return $ TBol b ::: TTBol
typeCheckExp r (UApp op [a, b]) | Just dop &amp;lt;- lookup op [("+", DAdd), ("-", DSub), ("*", DMul), ("/", DDiv)] = do
    a' ::: TTDbl &amp;lt;- typeCheckExp r a
    b' ::: TTDbl &amp;lt;- typeCheckExp r b
    return $ TDblOp dop a' b' ::: TTDbl
typeCheckExp r (UApp op [a, b]) | Just bop &amp;lt;- lookup op [("&amp;&amp;", BAnd), ("||", BOr)] = do
    a' ::: TTBol &amp;lt;- typeCheckExp r a
    b' ::: TTBol &amp;lt;- typeCheckExp r b
    return $ TBolOp bop a' b' ::: TTBol
typeCheckExp r (UApp op [a, b]) | Just cop &amp;lt;- lookup op [("==", CEq), ("&amp;lt;=", CLe)] = do
    a' ::: TTDbl &amp;lt;- typeCheckExp r a
    b' ::: TTDbl &amp;lt;- typeCheckExp r b
    return $ TCmpOp cop a' b' ::: TTBol
typeCheckExp r (UApp "if" [c,t,e]) = do
    c' ::: TTBol &amp;lt;- typeCheckExp r c
    t' ::: tt    &amp;lt;- typeCheckExp r t
    e' ::: te    &amp;lt;- typeCheckExp r e
    Eq &amp;lt;- test tt te
    return $ TIf c' t' e' ::: tt
typeCheckExp r (ULet i e b) = do
    e' ::: te &amp;lt;- typeCheckExp r e
    b' ::: tb &amp;lt;- typeCheckExp ((i, TVar i ::: te) : r) b
    return $ TLet i te e' b' ::: tb
typeCheckExp r (UVar i) =
    lookup i r
typeCheckExp _ _ =
    mzero
&lt;/pre&gt;

Note the &lt;tt&gt;ULet&lt;/tt&gt; case which extends the environment.  First we type check the expression that's being bound, and then add a variable to the environment and type check the body.

Finally we need to type check the top level:
&lt;pre&gt;
typeCheck :: UFun -&gt; Maybe AFun
typeCheck = typeCheckFun []

typeCheckFun :: Env -&gt; UFun -&gt; Maybe AFun
typeCheckFun n (UFun [] b) = do
    e ::: t &amp;lt;- typeCheckExp n b
    return $ AFun (TBody e) t
typeCheckFun n (UFun ((x, typ):vts) b) =
    case typ of
    UTBol -&gt; f TTBol
    UTDbl -&gt; f TTDbl
  where f t = do AFun e r &amp;lt;- typeCheckFun ((x, TVar x ::: t) : n) (UFun vts b); return $ AFun (TLam x t e) (TTArr t r)
&lt;/pre&gt;
When encountering the expression we just type check it, and for an argument we add a variable with the right type to the environment.

A small test in ghci:
&lt;pre&gt;
TExp UExp&gt; mParseUFun "\\ (x::Double) -&gt; x+1" &gt;&gt;= typeCheck
Just (\ (x :: Double) -&gt; x+1.0 :: Double -&gt; Double)
&lt;/pre&gt;

To be able to extract a function from &lt;tt&gt;ATFun&lt;/tt&gt; we need some small utilties.
&lt;pre&gt;
class Type a where
    theType :: TTyp a
instance Type Double where
    theType = TTDbl
instance Type Bool where
    theType = TTBol
instance (Type a, Type b) =&gt; Type (a-&gt;b) where
    theType = TTArr theType theType

extractFun :: (Type a) =&gt; AFun -&gt; Maybe (TFun a)
extractFun = extract theType

extract :: TTyp a -&gt; AFun -&gt; Maybe (TFun a)
extract s (AFun e t) = do
    Eq &amp;lt;- test t s
    return e

toTFun :: (Type a) =&gt; UFun -&gt; Maybe (TFun a)
toTFun = extractFun &amp;lt;=&amp;lt; typeCheck
&lt;/pre&gt;
The class &lt;tt&gt;Type&lt;/tt&gt; allows us to construct the &lt;tt&gt;TTyp&lt;/tt&gt; corresponding to a Haskell type via overloading.  Using this and the &lt;tt&gt;test&lt;/tt&gt; function we can then extract a &lt;tt&gt;TFun&lt;/tt&gt; at any type we like.  If we try to extract at the wrong type we'll just get &lt;tt&gt;Nothing&lt;/tt&gt; and at the right type we get &lt;tt&gt;Just&lt;/tt&gt;.

&lt;h3&gt;The Compiler module&lt;/h3&gt;
Now all we need to do is to write a function &lt;tt&gt;translate&lt;/tt&gt; that translates a &lt;tt&gt;TFun a&lt;/tt&gt; into the corresponding &lt;tt&gt;a&lt;/tt&gt;.  Naturally, using LLVM.

Let's start with some simple cases in translating literals to LLVM code.
&lt;pre&gt;
compileExp :: TExp a -&gt; CodeGenFunction r (Value a)
compileExp (TDbl d) = return $ valueOf d
compileExp (TBol b) = return $ valueOf b
&lt;/pre&gt;

The &lt;tt&gt;valueOf&lt;/tt&gt; function is simply the one that lifts a Haskell value into an LLVM value.  Note how nice the GADT works out here and we handle both Double and Bool with any need to compromise type safety.

What about arithmetic?  Equally easy.
&lt;pre&gt;
compileExp r (TDblOp op e1 e2) = bind2 (dblOp op) (compileExp r e1) (compileExp r e2)

dblOp :: DblOp -&gt; Value Double -&gt; Value Double -&gt; CodeGenFunction r (Value Double)
dblOp DAdd = add
dblOp DSub = sub
dblOp DMul = mul
dblOp DDiv = fdiv

-- This should be in Control.Monad
bind2 :: (Monad m) =&gt; (a -&gt; b -&gt; m c) -&gt; m a -&gt; m b -&gt; m c
bind2 f m1 m2 = do
    x1 &amp;lt;- m1
    x2 &amp;lt;- m2
    f x1 x2
&lt;/pre&gt;

And we can just carry on:
&lt;pre&gt;
compileExp (TBolOp op e1 e2) = bind2 (bolOp op) (compileExp e1) (compileExp e2)
compileExp (TCmpOp op e1 e2) = bind2 (cmpOp op) (compileExp e1) (compileExp e2)
compileExp (TIf b t e) = mkIf (compileExp b) (compileExp t) (compileExp e)

bolOp :: BolOp -&gt; Value Bool -&gt; Value Bool -&gt; CodeGenFunction r (Value Bool)
bolOp BAnd = and
bolOp BOr  = or

cmpOp :: CmpOp -&gt; Value Double -&gt; Value Double -&gt; CodeGenFunction r (Value Bool)
cmpOp CEq = fcmp FPOEQ
cmpOp CLe = fcmp FPOLE
&lt;/pre&gt;
(The &lt;tt&gt;&amp;&amp;&lt;/tt&gt; and &lt;tt&gt;||&lt;/tt&gt; are not short circuiting in this implementation.  It would be easy to change.)

It's rather amazing that despite these different branches producing and consuming different types it all works out.  It's perfectly type safe and free from coercions.  This is the beauty of GADTs.

Oh, yeah, &lt;tt&gt;mkIf&lt;/tt&gt;.  It's just a piece of mess to create some basic blocks, test, and jump.
&lt;pre&gt;
mkIf :: (IsFirstClass a) =&gt;
        CodeGenFunction r (Value Bool) -&gt; CodeGenFunction r (Value a) -&gt; CodeGenFunction r (Value a) -&gt; CodeGenFunction r (Value a)
mkIf mb mt me = do
    b &amp;lt;- mb
    tb &amp;lt;- newBasicBlock
    eb &amp;lt;- newBasicBlock
    jb &amp;lt;- newBasicBlock
    condBr b tb eb
    defineBasicBlock tb
    t &amp;lt;- mt
    br jb
    defineBasicBlock eb
    e &amp;lt;- me
    br jb
    defineBasicBlock jb
    phi [(t, tb), (e, eb)]
&lt;/pre&gt;

OK, so was lying.  The &lt;tt&gt;translate&lt;/tt&gt; function is not quite as easy as that.  Just as with type checking we need an environment because of variables.  It's easy to add though, and here's the real code.

&lt;pre&gt;
compileExp :: (Type a, IsFirstClass a) =&gt; Env -&gt; TExp a -&gt; CodeGenFunction r (Value a)
compileExp _ (TDbl d) = return $ valueOf d
compileExp _ (TBol b) = return $ valueOf b
compileExp r (TDblOp op e1 e2) = bind2 (dblOp op) (compileExp r e1) (compileExp r e2)
compileExp r (TBolOp op e1 e2) = bind2 (bolOp op) (compileExp r e1) (compileExp r e2)
compileExp r (TCmpOp op e1 e2) = bind2 (cmpOp op) (compileExp r e1) (compileExp r e2)
compileExp r (TIf b t e) = mkIf (compileExp r b) (compileExp r t) (compileExp r e)
compileExp r (TLet i t e b) = do
    e' &amp;lt;- compileExp' r t e
    compileExp ((i, AValue e' t):r) b
compileExp r (TVar i) = return $ fromJust $ castAValue theType =&amp;lt;&amp;lt; lookup i r   -- lookup cannot fail on type checked code

compileExp' :: Env -&gt; TTyp a -&gt; TExp a -&gt; CodeGenFunction r (Value a)
compileExp' r TTDbl e = compileExp r e
compileExp' r TTBol e = compileExp r e
compileExp' _ _ _ = error $ "compileExp': functions not allowed yet"

data AValue = forall a . AValue (Value a) (TTyp a)

castAValue :: TTyp a -&gt; AValue -&gt; Maybe (Value a)
castAValue t (AValue v s) = do
    Eq &amp;lt;- test t s
    return v

type Env = [(Id, AValue)]
&lt;/pre&gt;
Exactly as for the type checking environment we stick the code generation in an environment, and use &lt;tt&gt;castAValue&lt;/tt&gt; project it out of the existential container.  The &lt;tt&gt;fromJust&lt;/tt&gt; call in the &lt;tt&gt;TVar&lt;/tt&gt; case cannot fail on type checked code, but with my string based variable representation I have no evidence of this in the &lt;tt&gt;TExp&lt;/tt&gt; so there's actually a cast in the variable case that can fail if scope and type checking has not been performed.  The &lt;tt&gt;compileExp'&lt;/tt&gt; is placate the type checker and help it with some evidence about that we are only binding base values.

The rest of the code generation module is just house keeping.  It's a little ugly, but not terrible.
&lt;pre&gt;
-- | Compile a TFun into the corresponding LLVM code.
compileFunction :: (Translate a) =&gt;
                   TFun a -&gt; CodeGenModule (Function (RetIO a))
compileFunction = createFunction ExternalLinkage . compileFun []

class Compile a where
    type CG a
    type RetIO a
    type Returns a
    compileFun :: Env -&gt; TFun a -&gt; CG a

instance Compile Double where
    type CG Double = CodeGenFunction Double ()
    type RetIO Double = IO Double
    type Returns Double = Double
    compileFun r (TBody e) = compileExp r e &gt;&gt;= ret
    -- TLam is not well typed

instance Compile Bool where
    type CG Bool = CodeGenFunction Bool ()
    type RetIO Bool = IO Bool
    type Returns Bool = Bool
    compileFun r (TBody e) = compileExp r e &gt;&gt;= ret
    -- TLam is not well typed

instance (Type a, Compile b) =&gt; Compile (a -&gt; b) where
    type CG (a-&gt;b) = Value a -&gt; CG b
    type RetIO (a-&gt;b) = a -&gt; RetIO b
    type Returns (a-&gt;b) = Returns b
    -- TBody is not well typed
    compileFun r (TLam i t e) = \ x -&gt; compileFun ((i, AValue x t):r) e
&lt;/pre&gt;
The verbosity and large number of type functions in this section has convinced me that I need to simplify some of the types and classes involved in the LLVM code generation.


To convert and LLVM module we call the JIT.  This produces a function that returns a value in the IO monad (to be on the safe side) so we need to get rid of the IO, and finally we can get rid of the top level IO, because externally what we are doing is really pure (in some sense).

&lt;pre&gt;
translate :: (Translate a) =&gt; TFun a -&gt; a
translate = unsafePerformIO . fmap unsafePurify . simpleFunction . compileFunction
&lt;/pre&gt;

The &lt;tt&gt;Translate&lt;/tt&gt; context is just an abbreviation for a big context enforced by the LLVM functions.  It looks horrendous, but the type checker figured it out for me and I just pasted it in.

&lt;pre&gt;
{-# LANGUAGE TypeFamilies, FlexibleContexts, ExistentialQuantification, FlexibleInstances, UndecidableInstances #-}
module Compile(Translate, translate) where
import Data.Maybe
import Prelude hiding (and, or)
import TExp
import LLVM.Core hiding (CmpOp)
import LLVM.ExecutionEngine
import System.IO.Unsafe(unsafePerformIO)

class    (Type a,
          Unsafe (RetIO a) a,
          FunctionArgs (RetIO a) (CG a) (CodeGenFunction (Returns a) ()),
          IsFunction (RetIO a),
          Compile a,
          Translatable (RetIO a)) =&gt;
    Translate a
instance (Type a,
          Unsafe (RetIO a) a,
          FunctionArgs (RetIO a) (CG a) (CodeGenFunction (Returns a) ()),
          IsFunction (RetIO a),
          Compile a,
          Translatable (RetIO a)) =&gt;
    Translate a
&lt;/pre&gt;

&lt;h3&gt;Conclusion&lt;/h3&gt;
And that concludes the three parts of the compiler.  In about 400 lines of code we can compile a small subset of Haskell expressions to (efficient) machine code.  After type checking the rest of the processing is done in a type safe manner (except for a cast in &lt;tt&gt;TVar&lt;/tt&gt;) which is the intention of the high level LLVM interface.

Oh, and if you instrument the code generator a little you can peek at the machine code being produced.  For instance, for this input to &lt;tt&gt;compile&lt;/tt&gt;
&lt;pre&gt;
\ (x::Double) -&gt;
let y = x*(x-1) in
let z = x/y + 1 in
if y &amp;lt;= 0 then 0 else 1/(y-z)
&lt;/pre&gt;
we get
&lt;pre&gt;
__fun1:
 subl $12, %esp
 movsd LCPI1_0, %xmm0
 movsd 16(%esp), %xmm1
 movapd %xmm1, %xmm2
 subsd %xmm0, %xmm2
 mulsd %xmm1, %xmm2
 pxor %xmm3, %xmm3
 ucomisd %xmm2, %xmm3
 jae LBB1_3
LBB1_1:
 divsd %xmm2, %xmm1
 addsd %xmm0, %xmm1
 subsd %xmm1, %xmm2
 movsd LCPI1_0, %xmm0
 divsd %xmm2, %xmm0
LBB1_2:
 movsd %xmm0, (%esp)
 fldl (%esp)
 addl $12, %esp
 ret
LBB1_3:
 pxor %xmm0, %xmm0
 jmp LBB1_2
&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-4910575514980456234?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/4910575514980456234/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=4910575514980456234' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/4910575514980456234'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/4910575514980456234'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2009/06/more-llvm-recently-someone-asked-me-on.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-3941171818143196666</id><published>2009-02-07T20:31:00.003Z</published><updated>2009-02-09T11:43:08.996Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='BASIC'/><category scheme='http://www.blogger.com/atom/ns#' term='DSL'/><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>&lt;h2&gt;More BASIC&lt;/h2&gt;
Not that anybody should care, but I've reimplemented by BASIC.
&lt;p&gt;
Here's a simple program.
&lt;pre&gt;
{-# LANGUAGE ExtendedDefaultRules, OverloadedStrings #-}
import BASIC

main = runBASIC $ do
    10 GOSUB 1000
    20 PRINT "* Welcome to HiLo *"
    30 GOSUB 1000

    100 LET I := INT(100 * RND(0))
    200 PRINT "Guess my number:"
    210 INPUT X
    220 LET S := SGN(I-X)
    230 IF S &amp;lt;&gt; 0 THEN 300

    240 FOR X := 1 TO 5
    250   PRINT X*X;" You won!"
    260 NEXT X
    270 STOP

    300 IF S &amp;lt;&gt; 1 THEN 400
    310 PRINT "Your guess ";X;" is too low."
    320 GOTO 200

    400 PRINT "Your guess ";X;" is too high."
    410 GOTO 200

    1000 PRINT "*******************"
    1010 RETURN

    9999 END
&lt;/pre&gt;
In some ways this is a step backwards, since it requires some language extensions in Main.  But I wanted to be able to use semicolon in the print statement.
&lt;P&gt;
But there it is, an exciting game!
&lt;pre&gt;
*******************
* Welcome to HiLo *
*******************
Guess my number:
50
Your guess 50 is too high.
Guess my number:
25
Your guess 25 is too low.
Guess my number:
37
Your guess 37 is too low.
Guess my number:
44
Your guess 44 is too low.
Guess my number:
47
Your guess 47 is too low.
Guess my number:
48
1 You won!
4 You won!
9 You won!
16 You won!
25 You won!
&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-3941171818143196666?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/3941171818143196666/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=3941171818143196666' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/3941171818143196666'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/3941171818143196666'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2009/02/more-basic-not-that-anybody-should-care.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-5253528735726768410</id><published>2009-02-06T16:50:00.002Z</published><updated>2009-02-06T16:58:09.117Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='BASIC'/><category scheme='http://www.blogger.com/atom/ns#' term='Benchmark'/><category scheme='http://www.blogger.com/atom/ns#' term='DSL'/><category scheme='http://www.blogger.com/atom/ns#' term='LLVM'/><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>&lt;h2&gt;Is Haskell fast?&lt;/h2&gt;
Let's do a simple benchmark comparing Haskell to C.
My benchmark computes an approximation to infinity by adding up 1/n.
Here is the C code:
&lt;pre&gt;
#include &amp;lt;stdio.h&gt;

int
main(int argc, char **argv)
{
  double i, s;
  s = 0;
  for (i = 1; i &amp;lt; 100000000; i++)
    s += 1/i;
  printf("Almost infinity is %g\n", s);
}
&lt;/pre&gt;
And running it
&lt;pre&gt;
Lennarts-Computer% gcc -O3 inf.c -o inf
Lennarts-Computer% time ./inf
Almost infinity is 18.9979
1.585u 0.009s 0:01.62 97.5%     0+0k 0+0io 0pf+0w
&lt;/pre&gt;

And now the Haskell code:
&lt;pre&gt;
import BASIC

main = runBASIC' $ do

    10 LET I =: 1
    20 LET S =: 0
    30 LET S =: S + 1/I
    40 LET I =: I + 1
    50 IF I &amp;lt;&gt; 100000000 THEN 30
    60 PRINT "Almost infinity is"
    70 PRINT S
    80 END
&lt;/pre&gt;
And running it:
&lt;pre&gt;
Lennarts-Computer% ghc --make Main.hs
[4 of 4] Compiling Main             ( Main.hs, Main.o )
Linking Main ...
Lennarts-Computer% ./Main
Almost infinity is
18.9979
CPU time:   1.57s
&lt;/pre&gt;
As you can see it's about the same time.  In fact the assembly code for the loops look pretty much the same.  Here's the Haskell one:
&lt;pre&gt;
LBB1_1: ## _L4
        movsd   LCPI1_0, %xmm2
        movapd  %xmm1, %xmm3
        addsd   %xmm2, %xmm3
        ucomisd LCPI1_1, %xmm3
        divsd   %xmm1, %xmm2
        addsd   %xmm2, %xmm0
        movapd  %xmm3, %xmm1
        jne     LBB1_1  ## _L4
&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-5253528735726768410?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/5253528735726768410/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=5253528735726768410' title='11 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/5253528735726768410'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/5253528735726768410'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2009/02/is-haskell-fast-lets-do-simple.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>11</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-7024343775777574640</id><published>2009-02-06T11:12:00.002Z</published><updated>2009-02-06T11:18:54.985Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='BASIC'/><category scheme='http://www.blogger.com/atom/ns#' term='DSL'/><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>&lt;h2&gt;Regression&lt;/h2&gt;

They say that as you get older you regress back towards childhood.  So I present you with today's Haskell program (the idea shamelessly stolen from JoshTriplett from #haskell on IRC):
&lt;pre&gt;
import BASIC

main = runBASIC $ do

    10 LET X =: 1
    20 PRINT "Hello BASIC world!"
    30 LET X =: X + 1
    40 IF X &lt;&gt; 11 THEN 20
    50 END
&lt;/pre&gt;
Yes, it runs.  (I'm sorry about the &lt;tt&gt;=:&lt;/tt&gt; instead of &lt;tt&gt;=&lt;/tt&gt;, but some things are just too wired into Haskell to change.)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-7024343775777574640?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/7024343775777574640/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=7024343775777574640' title='13 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/7024343775777574640'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/7024343775777574640'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2009/02/regression-they-say-that-as-you-get.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>13</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-6096513306422692728</id><published>2009-01-21T01:07:00.004Z</published><updated>2009-01-21T08:32:12.079Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='LLVM'/><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='Code generation'/><title type='text'></title><content type='html'>&lt;h2&gt;A performance update&lt;/h2&gt;
I've continued playing with the LLVM.  I discovered that when generating code for the &lt;tt&gt;normcdf&lt;/tt&gt; and Black-Scholes functions I did not tell LLVM that the functions that were called (&lt;tt&gt;exp&lt;/tt&gt; etc.) are actually pure functions.  That meant that the LLVM didn't perform CSE properly.
&lt;p&gt;
So here are some updated numbers for computing an option prices for 10,000,000 options:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Pure Haskell: 8.7s&lt;/li&gt;
&lt;li&gt;LLVM: 2.0s&lt;/li&gt;
&lt;/ul&gt;
Just as a reminder, the code for &lt;tt&gt;normcdf&lt;/tt&gt; looks like this:
&lt;pre&gt;
normcdf x = x %&lt; 0 ?? (1 - w, w)
  where w = 1.0 - 1.0 / sqrt (2.0 * pi) * exp(-l*l / 2.0) * poly k
        k = 1.0 / (1.0 + 0.2316419 * l)
        l = abs x
        poly = horner coeff 
        coeff = [0.0,0.31938153,-0.356563782,1.781477937,-1.821255978,1.330274429] 
&lt;/pre&gt;

A noteworthy thing is that exactly the same code can be used both for the pure Haskell and the LLVM code generation; it's just a matter of overloading.
&lt;h3&gt;Vectors&lt;/h3&gt;
An very cool aspect of the LLVM is that it has vector instructions.  On the x86 these translate into using the SSE extensions to the processor and can speed up computations by doing things in parallel.
&lt;p&gt;
Again, by using overloading, the exact same code can be used to compute over vectors of numbers as with individual numbers.
&lt;p&gt;
So what about performance?  I used four element vectors of 32 bit floating point numbers and got these results:
&lt;ul&gt;
&lt;li&gt;Pure Haskell: 8.7s&lt;/li&gt;
&lt;li&gt;LLVM, scalar: 2.0s&lt;/li&gt;
&lt;li&gt;LLVM, vector: 1.1s&lt;/li&gt;
&lt;li&gt;C, gcc -O3: 2.5s&lt;/li&gt;
&lt;/ul&gt;
Some caveats if you try out vectors in the LLVM.
&lt;ul&gt;
&lt;li&gt;Only on MacOS does the LLVM package give you fast primitive functions, because that's the only platform that seems to have this as a standard.&lt;/li&gt;
&lt;li&gt;The vector version of floating point comparison has not been implemented in the LLVM yet.&lt;/li&gt;
&lt;li&gt;Do not use two element vectors of type 32 bit floats.  This will generate code that is wrong on the x86.  I sent in a bug report about this, but was told that it is a feature and not a bug.  (I kid you not.)  To make the code right you have to manually insert EMMS instructions.&lt;/li&gt;
&lt;li&gt;The GHC FFI is broken for all operations that allocate memory for a &lt;tt&gt;Storable&lt;/tt&gt;, e.g., &lt;tt&gt;alloca&lt;/tt&gt;, &lt;tt&gt;with&lt;/tt&gt;, &lt;tt&gt;withArray&lt;/tt&gt; etc.  These operations do not take the alignment into account when allocating.  This means that, e.g., a vector of four floats may end up on 8 byte alignment instead of 16.  This generates a segfault.
&lt;/ul&gt;
On the whole, I'm pretty happy with the LLVM performance now; about 8 times faster than ghc on this example.
&lt;p&gt;
[Edit:] Added point about broken FFI.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-6096513306422692728?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/6096513306422692728/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=6096513306422692728' title='12 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/6096513306422692728'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/6096513306422692728'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2009/01/performance-update-ive-continued.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>12</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-3502914505601036632</id><published>2009-01-10T15:33:00.003Z</published><updated>2009-01-10T16:22:35.886Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='LLVM'/><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='Code generation'/><title type='text'></title><content type='html'>&lt;h2&gt;LLVM arithmetic&lt;/h2&gt;
So we want to compute &lt;i&gt;x&lt;/i&gt;&lt;sup&gt;2&lt;/sup&gt;-5&lt;i&gt;x&lt;/i&gt;+6 using the Haskell LLVM bindings.  It would look some something like this.
&lt;pre&gt;
    xsq &amp;lt;- mul x x
    x5  &amp;lt;- mul 5 x
    r1  &amp;lt;- sub xsq x5
    r   &amp;lt;- add r1 6
&lt;/pre&gt;
Not very readable, it would be nicer to write
&lt;pre&gt;
    r   &amp;lt;- x^2 - 5*x + 6
&lt;/pre&gt;
But, e.g., the &lt;tt&gt;add&lt;/tt&gt; instruction has the (simplified) type &lt;tt&gt;Value a -&gt; Value a -&gt; CodeGenFunction r (Value a)&lt;/tt&gt;, where &lt;tt&gt;CodeGenFunction&lt;/tt&gt; is the monad for generating code for a function.  (BTW, the &lt;tt&gt;r&lt;/tt&gt; type variable is used to keep track of the return type of the function, used by the &lt;tt&gt;ret&lt;/tt&gt; instruction.)
&lt;p&gt;
We can't change the return type of &lt;tt&gt;add&lt;/tt&gt;, but we can change the argument type.
&lt;pre&gt;
type TValue r a = CodeGenFunction r (Value a)
add' :: TValue r a -&gt; TValue r a -&gt; TValue r a
add' x y = do x' &lt;- x; y' &lt;- y; add x' y'
&lt;/pre&gt;
Now the type fits what the &lt;tt&gt;Num&lt;/tt&gt; class wants.  And we can make an instance declaration.
&lt;pre&gt;
instance (Num a) =&gt; Num (TValue r a) where
    (+) = add'
    (-) = sub'
    (*) = mul'
    fromInteger = return . valueOf . fromInteger
&lt;/pre&gt;
We are getting close, the only little thing is that the &lt;tt&gt;x&lt;/tt&gt; in our original LLVM code has type &lt;tt&gt;Value a&lt;/tt&gt; rather than &lt;tt&gt;TValue r a&lt;/tt&gt;, but &lt;tt&gt;return&lt;/tt&gt; takes care of that.  So:
&lt;pre&gt;
    let x' = return x
    r &amp;lt;- x'^2 - 5*x' + 6
&lt;/pre&gt;
And a quick look at the generated LLVM code (for Double) shows us that all is well.
&lt;pre&gt;
; x in %0
 %1 = mul double %0, %0
 %2 = mul double 5.000000e+00, %0
 %3 = sub double %1, %2
 %4 = add double %3, 6.000000e+00
&lt;/pre&gt;
&lt;p&gt;
All kinds of numeric instances and some other goodies are available in the &lt;tt&gt;LLVM.Util.Arithmetic&lt;/tt&gt; module.  Here is a complete Fibonacci (again) using this.
&lt;pre&gt;
import Data.Int
import LLVM.Core
import LLVM.ExecutionEngine
import LLVM.Util.Arithmetic

mFib :: CodeGenModule (Function (Int32 -&gt; IO Int32))
mFib = recursiveFunction $ \ rfib n -&gt;
    n %&amp;lt; 2 ? (1, rfib (n-1) + rfib (n-2))

main :: IO ()
main = do
    let fib = unsafeGenerateFunction mFib
    print (fib 22)
&lt;/pre&gt;
The &lt;tt&gt;%&amp;lt;&lt;/tt&gt; operator compares values returning a &lt;tt&gt;TValue r Bool&lt;/tt&gt;.  The &lt;tt&gt;c ? (t, e)&lt;/tt&gt; is a conditional expression, like C's &lt;tt&gt;c ? t : e&lt;/tt&gt;.
&lt;p&gt;
The type given to mFib is not the most general one, of course.  The most general one can accept any numeric type for argument and result.  Here it is:
&lt;pre&gt;
mFib :: (Num a, Cmp a, IsConst a,
         Num b, Cmp b, IsConst b, FunctionRet b) =&gt;
        CodeGenModule (Function (a -&gt; IO b))
mFib = recursiveFunction $ \ rfib n -&gt;
    n %&amp;lt; 2 ? (1, rfib (n-1) + rfib (n-2))
&lt;/pre&gt;
It's impossible to generate code for &lt;tt&gt;mFib&lt;/tt&gt; directly; code can only be generated for monomorphic types.  So a type signature is needed when generating code to force a monomorphic type.
&lt;pre&gt;
main = do
    let fib :: Int32 -&gt; Double
        fib = unsafeGenerateFunction mFib
        fib' :: Int16 -&gt; Int64
        fib' = unsafeGenerateFunction mFib
    print (fib 22, fib' 22)
&lt;/pre&gt;
&lt;h3&gt;Another example&lt;/h3&gt;
Let's try a more complex example.  To pick something mathematical to have lots of formulae we'll do the Cumulative Distribution Function.  For the precision of a Float it can be coded like this in Haskell (normal Haskell, no LLVM):
&lt;pre&gt;
normcdf x = if x &lt; 0 then 1 - w else w
  where w = 1 - 1 / sqrt (2 * pi) * exp(-l*l / 2) * poly k
        k = 1 / (1 + 0.2316419 * l)
        l = abs x
        poly = horner coeff 
        coeff = [0.0,0.31938153,-0.356563782,1.781477937,-1.821255978,1.330274429] 

horner coeff base = foldr1 multAdd coeff
  where multAdd x y = y*base + x
&lt;/pre&gt;
We cannot use this directly, it has type &lt;tt&gt;normcdf :: (Floating a, Ord a) =&gt; a -&gt; a&lt;/tt&gt;.  The &lt;tt&gt;Ord&lt;/tt&gt; context is a problem, because there are no instance of &lt;tt&gt;Ord&lt;/tt&gt; for LLVM types.  The comparison is the root of the problem, since it returns a &lt;tt&gt;Bool&lt;/tt&gt; rather than a &lt;tt&gt;TValue r Bool&lt;/tt&gt;.
&lt;p&gt;
It's possible to hide the Prelude and overload the comparisons, but you cannot overload the &lt;tt&gt;if&lt;/tt&gt; construct.  So a little rewriting is necessary regardless.  Let's just bite the bullet and change the first line.
&lt;pre&gt;
normcdf x = x %&amp;lt; 0 ? (1 - w, w)
&lt;/pre&gt;
And with that change, all we need to do is
&lt;pre&gt;
mNormCDF = createFunction ExternalLinkage $ arithFunction $ normcdf

main :: IO ()
main = do
    writeFunction "CDF.bc" (mNormCDF :: CodeGenModule (Function (Float -&gt; IO Float)))
&lt;/pre&gt;
So what happened here?  Looking at &lt;tt&gt;normcdf&lt;/tt&gt; it contains a things that the LLVM cannot handle, like lists.  But all the list operations happen when the Haskell code runs and nothing of that is left in the LLVM code.
&lt;p&gt;
If you optimize and generate code for &lt;tt&gt;normcdf&lt;/tt&gt; it looks like this (leaving out constants and half the code):
&lt;pre&gt;
__fun1:
        subl    $28, %esp
        pxor    %xmm0, %xmm0
        ucomiss 32(%esp), %xmm0
        jbe     LBB1_2
        movss   32(%esp), %xmm0
        mulss   %xmm0, %xmm0
        divss   LCPI1_0, %xmm0
        movss   %xmm0, (%esp)
        call    _expf
        fstps   24(%esp)
        movss   32(%esp), %xmm0
        mulss   LCPI1_1, %xmm0
        movss   %xmm0, 8(%esp)
        movss   LCPI1_2, %xmm0
        movss   8(%esp), %xmm1
        addss   %xmm0, %xmm1
        movss   %xmm1, 8(%esp)
        movaps  %xmm0, %xmm1
        divss   8(%esp), %xmm1
        movaps  %xmm1, %xmm2
        mulss   LCPI1_3, %xmm2
        addss   LCPI1_4, %xmm2
        mulss   %xmm1, %xmm2
        addss   LCPI1_5, %xmm2
        mulss   %xmm1, %xmm2
        addss   LCPI1_6, %xmm2
        mulss   %xmm1, %xmm2
        addss   LCPI1_7, %xmm2
        mulss   %xmm1, %xmm2
        pxor    %xmm1, %xmm1
        addss   %xmm1, %xmm2
        movss   24(%esp), %xmm1
        mulss   LCPI1_8, %xmm1
        mulss   %xmm2, %xmm1
        addss   %xmm0, %xmm1
        subss   %xmm1, %xmm0
        movss   %xmm0, 20(%esp)
        flds    20(%esp)
        addl    $28, %esp
        ret
LBB1_2:
        ...
&lt;/pre&gt;
And that looks quite nice, in my opinion.
&lt;h3&gt;Black-Scholes&lt;/h3&gt;
I work at a bank these days, so let's do the most famous formula in financial maths, the Black-Scholes formula for pricing options.  Here's a Haskell rendition of it.
&lt;pre&gt;
blackscholes iscall s x t r v = if iscall then call else put
  where call = s * normcdf d1 - x*exp (-r*t) * normcdf d2
        put  = x * exp (-r*t) * normcdf (-d2) - s * normcdf (-d1)
        d1 = (log(s/x) + (r+v*v/2)*t) / (v*sqrt t)
        d2 = d1 - v*sqrt t
&lt;/pre&gt;
Again, it uses an &lt;tt&gt;if&lt;/tt&gt;, so it needs a little fix.
&lt;pre&gt;
blackscholes iscall s x t r v  = iscall ? (call, put)
&lt;/pre&gt;
With that fix, code can be generated directly from &lt;tt&gt;blackscholes&lt;/tt&gt;.  The calls to &lt;tt&gt;normcdf&lt;/tt&gt; are expanded inline, but it's easy to make some small changes to the code so that it actually does function calls.
&lt;h3&gt;Some figures&lt;/h3&gt;
To test the speed of the generated code I ran &lt;tt&gt;blackscholes&lt;/tt&gt; over a portfolio of 10,000,000 options and summed their value.  The time excludes the time to set up the portfolio array, because that is done in Haskell.  I also ran the code in pure Haskell on a list with 10,000,000 elements.
&lt;pre&gt;
Unoptimized LLVM   17.5s
Optimized LLVM      8.2s
Pure Haskell        9.3s
&lt;/pre&gt;
The only surprise here is how well pure Haskell (with -O2) performs.  This is a very good example for Haskell though, because the list gets fused away and everything is strict.  A lot of the time is spent in &lt;tt&gt;log&lt;/tt&gt; and &lt;tt&gt;exp&lt;/tt&gt; in this code, so perhaps the similar results are not so surprising after all.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-3502914505601036632?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/3502914505601036632/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=3502914505601036632' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/3502914505601036632'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/3502914505601036632'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2009/01/llvm-arithmetic-so-we-want-to-compute-x.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-2319446660716223785</id><published>2009-01-07T16:14:00.005Z</published><updated>2009-01-10T15:57:37.551Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='DSL'/><category scheme='http://www.blogger.com/atom/ns#' term='LLVM'/><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='Code generation'/><title type='text'></title><content type='html'>&lt;h2&gt;LLVM&lt;/h2&gt;
The &lt;a href="http://llvm.org/"&gt;LLVM&lt;/a&gt;, Low Level Virtual Machine, is a really cool compiler infrastructure project with many participants.  The idea is that if you want to make a new high quality compiler you just have to generate LLVM code, and then there are lots of optimizations and code generators available to get fast code.
&lt;p&gt;
There are different ways to generate input to the LLVM tools.  You can generate a text file with LLVM code and feed it to the tools, or you can use bindings for some programming language and programmatically build the LLVM code.  The original bindings from the LLVM project is for C++, but they also provide C bindings.  On top of the C bindings you can easily interface to other languages; for instance O'Caml and Haskell.
&lt;p&gt;
There are also diffent things you can do to LLVM code you have build programmatically.  You can transform it, you can write to a file, you can run an interpreter on it, or execute it with a JIT compiler.

&lt;h3&gt;Haskell LLVM bindings&lt;/h3&gt;
There is a Haskell binding to the LLVM.  It has two layers.  You can either work on the C API level and have ample opportunity to shoot your own limbs to pieces, or you can use the high level interface which is mostly safe.
&lt;p&gt;
Bryan O'Sullivan did all the hard work of taking the C header files and producing the corresponding Haskell FFI files.  He also made a first stab at the high level interface, which I have since change a lot (for better or for worse).

&lt;h3&gt;An example&lt;/h3&gt;
Let's do an example.  We'll write the LLVM code for this function
&lt;pre&gt;
  f x y z = (x + y) * z
&lt;/pre&gt;
In Haskell this function is polymorphic, but when generating machine code we have to settle for a type.  Let's pick &lt;tt&gt;Int32&lt;/tt&gt;.  (The Haskell &lt;tt&gt;Int&lt;/tt&gt; type cannot be used in talking to LLVM; it doesn't a well defined size.)  Here is how it looks:
&lt;pre&gt;
mAddMul :: CodeGenModule (Function (Int32 -&gt; Int32 -&gt; Int32 -&gt; IO Int32))
mAddMul = 
  createFunction ExternalLinkage $ \ x y z -&gt; do
    t &lt;- add x y
    r &lt;- mul t z
    ret r
&lt;/pre&gt;
For comparison, the LLVM code in text for for this would be:
&lt;pre&gt;
define i32 @_fun1(i32, i32, i32) {
        %3 = add i32 %0, %1
        %4 = mul i32 %3, %2
        ret i32 %4
}
&lt;/pre&gt;
So what does the Haskell code say? The &lt;tt&gt;mAddMul&lt;/tt&gt; definition is something in the &lt;tt&gt;CodeGenModule&lt;/tt&gt; monad, and it generates a &lt;tt&gt;Function&lt;/tt&gt; of type &lt;tt&gt;Int32 -&gt; Int32 -&gt; Int32 -&gt; IO Int32&lt;/tt&gt;.  That last is the type of &lt;tt&gt;f&lt;/tt&gt; above, except for that &lt;tt&gt;IO&lt;/tt&gt;.  Why the &lt;tt&gt;IO&lt;/tt&gt;?  The Haskell LLVM bindings forces all defined functions to return something in the IO monad, because there are no restriction on what can happen in the LLVM code; it might very well do IO.  So to be on the safe side, there's always an IO on the type.  If we know the function is harmless, we can use &lt;tt&gt;unsafePerformIO&lt;/tt&gt; to get rid of it.
&lt;p&gt;
So the code does a &lt;tt&gt;createFunction&lt;/tt&gt; which does what the name suggests.  The &lt;tt&gt;ExternalLinkage&lt;/tt&gt; argument says that this function will be available outside the module it's in, the obvious opposite being &lt;tt&gt;InternalLinkage&lt;/tt&gt;.  Using &lt;tt&gt;InternalLinkage&lt;/tt&gt; is like saying &lt;tt&gt;static&lt;/tt&gt; on the top level in C.  In this examples it doesn't really matter which we pick.
&lt;p&gt;
The function has three arguments &lt;tt&gt;x y z&lt;/tt&gt;.  The last argument to &lt;tt&gt;createFunction&lt;/tt&gt; should be a lambda expression with the right number of arguments, i.e., the number of arguments should agree with the type.  We the use monadic syntax to generate an &lt;tt&gt;add&lt;/tt&gt;, &lt;tt&gt;mul&lt;/tt&gt;, and &lt;tt&gt;ret&lt;/tt&gt; instruction.
&lt;p&gt;
The code looks like assembly code, which is the level that LLVM is at.  It's a somewhat peculiar assembly code, because it's on SSA (Static Single Assignment) form.  More about that later.
&lt;p&gt;
So what can we do with this function?  Well, we can generate machine code for it and call it.
&lt;pre&gt;
main = do
    addMul &lt;- simpleFunction mAddMul
    a &lt;- addMul 2 3 4
    print a
&lt;/pre&gt;
In this code &lt;tt&gt;addMul&lt;/tt&gt; has type &lt;tt&gt;Int32 -&gt; Int32 -&gt; Int32 -&gt; IO Int32&lt;/tt&gt;, so it has to be called in the IO monad.  Since this is a pure function, we can make the type pure, i.e., &lt;tt&gt;Int32 -&gt; Int32 -&gt; Int32 -&gt; Int32&lt;/tt&gt;.
&lt;pre&gt;
main = do
    addMul &lt;- simpleFunction mAddMul
    let addMul' = unsafePurify addMul
    print (addMul' 2 3 4)
&lt;/pre&gt;
The &lt;tt&gt;unsafePurify&lt;/tt&gt; functions is simply an extension of &lt;tt&gt;unsafePerformIO&lt;/tt&gt; that drops the IO on the result of a function.
&lt;p&gt;
So that was pretty easy.  To make a function, just specify the LLVM code using the LLVM DSEL that the Haskell bindings provides.
&lt;h3&gt;Fibonacci&lt;/h3&gt;
No FP example is complete without the Fibonacci function, so here it is.
&lt;pre&gt;
mFib :: CodeGenModule (Function (Word32 -&gt; IO Word32))
mFib = do
    fib &lt;- newFunction ExternalLinkage
    defineFunction fib $ \ arg -&gt; do
        -- Create the two basic blocks.
        recurse &lt;- newBasicBlock
        exit &lt;- newBasicBlock

        -- Test if arg &gt; 2
        test &lt;- icmp IntUGT arg (2::Word32)
        condBr test recurse exit

        -- Just return 1 if not &gt; 2
        defineBasicBlock exit
        ret (1::Word32)

        -- Recurse if &gt; 2, using the cumbersome plus to add the results.
        defineBasicBlock recurse
        x1 &lt;- sub arg (1::Word32)
        fibx1 &lt;- call fib x1
        x2 &lt;- sub arg (2::Word32)
        fibx2 &lt;- call fib x2
        r &lt;- add fibx1 fibx2
        ret r
    return fib
&lt;/pre&gt;
Instead of using &lt;tt&gt;createFunction&lt;/tt&gt; to create the function we're using &lt;tt&gt;newFunction&lt;/tt&gt; and &lt;tt&gt;defineFunction&lt;/tt&gt;.  The former is a shorthand for the latter two together.  But splitting making the function and actually defining it means that we can refer to the function before it's been defined.  We need this since &lt;tt&gt;fib&lt;/tt&gt; is recursive.
&lt;p&gt;
Every instruction in the LLVM code belongs to a basic block.  A basic block is a sequence of non-jump instructions (&lt;tt&gt;call&lt;/tt&gt; is allowed in the LLVM) ending with some kind of jump.  It is always entered at the top only.  The top of each basic block can be thought of as a label that you can jump to, and those are the only places that you can jump to.
&lt;p&gt;
The code for fib starts with a test if the argument is Unsigned Greater Than 2.  The &lt;tt&gt;condBr&lt;/tt&gt; instruction branches to &lt;tt&gt;recurse&lt;/tt&gt; if &lt;tt&gt;test&lt;/tt&gt; is true otherwise to &lt;tt&gt;exit&lt;/tt&gt;.  To be able to refer to the two branch labels (i.e., basic blocks) before they are defined we create them with &lt;tt&gt;newBasicBlock&lt;/tt&gt; and then later define them with &lt;tt&gt;defineBasicBlock&lt;/tt&gt;.  The &lt;tt&gt;defineBasicBlock&lt;/tt&gt; simply starts a new basic block that runs to the next basic block start, or to the end of the function.  The type system does &lt;b&gt;not&lt;/b&gt; check that the basic block ends with a branch (I can't figure out how to do that without making the rest of the code more cumbersome).
&lt;p&gt;
In the false branch we simply return 1, and in the true branch we make the two usual recursive calls, add the results, and return the sum.
&lt;p&gt;
As you can see a few type annotations are necessary on constants.  In my opinion they are quite annoying, because if you write anything different from &lt;tt&gt;::Word32&lt;/tt&gt; in those annotations there will be a type error.  This means that in principle the compiler has all the information, it's just too "stupid" to use it.
&lt;p&gt;
The performance you get from this Fibonacci function is decent, but in fact worse than GHC with -O2 gives.  Even with full optimization turned on for the LLVM code it's still not as fast as GHC for this function.
&lt;p&gt;
[Edit: Added assembly] Here is the assembly code for Fibonacci.  Note how there is only one recursive call.  The other call has been transformed into a loop.
&lt;pre&gt;
_fib:
 pushl %edi
 pushl %esi
 subl $4, %esp
 movl 16(%esp), %esi
 cmpl $2, %esi
 jbe LBB1_4
LBB1_1:
 movl $1, %edi
 .align 4,0x90
LBB1_2:
 leal -1(%esi), %eax
 movl %eax, (%esp)
 call _fib
 addl %edi, %eax
 addl $4294967294, %esi
 cmpl $2, %esi
 movl %eax, %edi
 ja LBB1_2
LBB1_3:
 addl $4, %esp
 popl %esi
 popl %edi
 ret
LBB1_4:
 movl $1, %eax
 jmp LBB1_3
&lt;/pre&gt;
&lt;h3&gt;Hello, World!&lt;/h3&gt;
The code for printing "Hello, World!":
&lt;pre&gt;
import Data.Word
import LLVM.Core
import LLVM.ExecutionEngine

bldGreet :: CodeGenModule (Function (IO ()))
bldGreet = do
    puts &lt;- newNamedFunction ExternalLinkage "puts" :: TFunction (Ptr Word8 -&gt; IO Word32)
    greetz &lt;- createStringNul "Hello, World!"
    func &lt;- createFunction ExternalLinkage $ do
      tmp &lt;- getElementPtr greetz (0::Word32, (0::Word32, ()))
      call puts tmp -- Throw away return value.
      ret ()
    return func

main :: IO ()
main = do
    greet &lt;- simpleFunction bldGreet
    greet
&lt;/pre&gt;
To get access to the C function &lt;tt&gt;puts&lt;/tt&gt; we simply declare it and rely on the linker to link it in.  The &lt;tt&gt;greetz&lt;/tt&gt; variable has type pointer to array of characters.  So to get a pointer to the first character we have to use the rather complicated &lt;tt&gt;getElementPtr&lt;/tt&gt; instruction.  See &lt;a href="http://llvm.org/docs/GetElementPtr.html"&gt;FAQ about it&lt;/a&gt;.

&lt;h3&gt;Phi instructions&lt;/h3&gt;
Let's do the following simple C function
&lt;pre&gt;
int f(int x)
{
  if (x &lt; 0) x = -x;
  return (x+1);
}
&lt;/pre&gt;
Let's try to write some corresponding LLVM code:
&lt;pre&gt;
  createFunction ExternalLinkage $ \ x -&gt; do
    xneg &lt;- newBasicBlock
    xpos &lt;- newBasicBlock
    t &lt;- icmp IntSLT x (0::Int32)
    condBr t xneg xpos

    defineBasicBlock xneg
    x' &lt;- sub (0::Int32) x
    br xpos

    defineBasicBlock xpos
    r1 &lt;- add ??? (1::Int32)
    ret r1
&lt;/pre&gt;
But what should we put at &lt;tt&gt;???&lt;/tt&gt;?  When jumping from the &lt;tt&gt;condBr&lt;/tt&gt; the value is in &lt;tt&gt;x&lt;/tt&gt;, but when jumping from the negation block the value is in &lt;tt&gt;x'&lt;/tt&gt;.  And this is how SSA works.  Every instruction puts the value in a new "register", so this situation is unavoidable.  This is why SSA (and thus LLVM) form has &lt;tt&gt;phi&lt;/tt&gt; instructions.  This is a pseudo-instruction to tell the code generator what registers should be merged at the entry of a basic block.  So the real code looks like this:
&lt;pre&gt;
mAbs1 :: CodeGenModule (Function (Int32 -&gt; IO Int32))
mAbs1 = 
  createFunction ExternalLinkage $ \ x -&gt; do
    top &lt;- getCurrentBasicBlock
    xneg &lt;- newBasicBlock
    xpos &lt;- newBasicBlock
    t &lt;- icmp IntSLT x (0::Int32)
    condBr t xneg xpos

    defineBasicBlock xneg
    x' &lt;- sub (0::Int32) x
    br xpos

    defineBasicBlock xpos
    r &lt;- phi [(x, top), (x', xneg)]
    r1 &lt;- add r (1::Int32)
    ret r1
&lt;/pre&gt;
The &lt;tt&gt;phi&lt;/tt&gt; instruction takes a list of registers to merge, and paired up with each register is the basic block that the jump comes from.  Since the first basic block in a function is created implicitely we have to get it with &lt;tt&gt;getCurrentBasicBlock&lt;/tt&gt; which returns the current basic block.
&lt;p&gt;
If, like me, you have a perverse interest in the machine code that gets generated here is the optimized code for that function on for x86:
&lt;pre&gt;
__fun1:
        movl    4(%esp), %eax
        movl    %eax, %ecx
        sarl    $31, %ecx
        addl    %ecx, %eax
        xorl    %ecx, %eax
        incl    %eax
        ret
&lt;/pre&gt;
Note how the conditional jump has cleverly been replaced by some non-jumping instructions.  I think this code is as good as it gets.
&lt;p&gt;

&lt;h3&gt;Loops and arrays&lt;/h3&gt;
Let's do a some simple array code, the dot product of two vectors.  The function takes a length and pointers to two vectors.  It sums the elementwise product of the vectors.  Here's the C code:
&lt;pre&gt;
double
dotProd(unsigned int len, double *aPtr, double *bPtr)
{
    unsigned int i;
    double s;

    s = 0;
    for (i = 0; i != len; i++)
        s += aPtr[i] * bPtr[i];
    return s;
}
&lt;/pre&gt;
The corresponding LLVM code is much more complicated and has some new twists.
&lt;pre&gt;
import Data.Word
import Foreign.Marshal.Array
import LLVM.Core
import LLVM.ExecutionEngine

mDotProd :: CodeGenModule (Function (Word32 -&gt; Ptr Double -&gt; Ptr Double -&gt; IO Double))
mDotProd =
  createFunction ExternalLinkage $ \ size aPtr bPtr -&gt; do
    top &lt;- getCurrentBasicBlock
    loop &lt;- newBasicBlock
    body &lt;- newBasicBlock
    exit &lt;- newBasicBlock

    -- Enter loop, must use a br since control flow joins at the loop bb.
    br loop

    -- The loop control.
    defineBasicBlock loop
    i &lt;- phi [(valueOf (0 :: Word32), top)]  -- i starts as 0, when entered from top bb
    s &lt;- phi [(valueOf 0, top)]  -- s starts as 0, when entered from top bb
    t &lt;- icmp IntNE i size       -- check for loop termination
    condBr t body exit

    -- Define the loop body
    defineBasicBlock body

    ap &lt;- getElementPtr aPtr (i, ()) -- index into aPtr
    bp &lt;- getElementPtr bPtr (i, ()) -- index into bPtr
    a &lt;- load ap                 -- load element from a vector
    b &lt;- load bp                 -- load element from b vector
    ab &lt;- mul a b                -- multiply them
    s' &lt;- add s ab               -- accumulate sum

    i' &lt;- add i (valueOf (1 :: Word32)) -- Increment loop index

    addPhiInputs i [(i', body)]  -- Control flow reaches loop bb from body bb
    addPhiInputs s [(s', body)]
    br loop                      -- And loop

    defineBasicBlock exit
    ret (s :: Value Double)      -- Return sum

main = do
    ioDotProd &lt;- simpleFunction mDotProd
    let dotProd a b =
         unsafePurify $
         withArrayLen a $ \ aLen aPtr -&gt;
         withArrayLen b $ \ bLen bPtr -&gt;
         ioDotProd (fromIntegral (aLen `min` bLen)) aPtr bPtr

    let a = [1,2,3]
        b = [4,5,6]
    print $ dotProd a b
    print $ sum $ zipWith (*) a b
&lt;/pre&gt;
First we have to set up the looping machinery.  There a four basic blocks involved: the implicit basic block that is created at the start of every function, &lt;tt&gt;top&lt;/tt&gt;; the top of the loop, &lt;tt&gt;loop&lt;/tt&gt;; the body of the loop, &lt;tt&gt;body&lt;/tt&gt;; and finally the block with the return from the function, &lt;tt&gt;exit&lt;/tt&gt;.
&lt;p&gt;
There are two "registers", the loop index &lt;tt&gt;i&lt;/tt&gt; and the running sum &lt;tt&gt;s&lt;/tt&gt; that arrive from two different basic blocks at the top of the loop.  When entering the loop from the first time they should be 0.  That's what the &lt;tt&gt;phi&lt;/tt&gt; instruction specifies.  The &lt;tt&gt;valueOf&lt;/tt&gt; function simply turns a constant into an LLVM value.  It's worth noting that the initial values for the two variables are constant rather than registers.  The control flow also reached the basic block &lt;tt&gt;loop&lt;/tt&gt; from the end of &lt;tt&gt;body&lt;/tt&gt;, but we don't have the names of those registers in scope yet, so we can't put them in the &lt;tt&gt;phi&lt;/tt&gt; instruction.  Instead, we have to use &lt;tt&gt;addPhiInputs&lt;/tt&gt; to add more phi inputs later (when the registers are in scope).
&lt;p&gt;
The most mysterious instruction in the LLVM is &lt;tt&gt;getElementPtr&lt;/tt&gt;.  It simply does address arithmetic, so it really does something quite simple.  But it can perform several levels of address arithmetic when addressing through multilevel arrays and structs.  In can take several indicies, but since here we simply want to add the index variable to a pointer the usage is pretty simple.  Doing &lt;tt&gt;getElementPtr aPtr (i, ())&lt;/tt&gt; corresponds to &lt;tt&gt;aPtr + i&lt;/tt&gt; in C.
&lt;p&gt;
To test this function we need pointers to two vectors.  The FFI function &lt;tt&gt;withArrayLen&lt;/tt&gt; temporarily allocates the vector and fills it with elements from the list.
&lt;p&gt;
The essential part of the function looks like this in optimized x86 code:
&lt;pre&gt;
        pxor    %xmm0, %xmm0
        xorl    %esi, %esi
        .align  4,0x90
LBB1_2:
        movsd   (%edx,%esi,8), %xmm1
        mulsd   (%ecx,%esi,8), %xmm1
        incl    %esi
        cmpl    %eax, %esi
        addsd   %xmm1, %xmm0
        jne     LBB1_2
&lt;/pre&gt;
Which is pretty good.  Improving this would have to use SSD vector instructions.  This is possible using the LLVM vector type, but I'll leave that for now.
&lt;h3&gt;Abstraction&lt;/h3&gt;
The loop structure in &lt;tt&gt;dotProd&lt;/tt&gt; is pretty common, so we would like to abstract it out for reuse.  The creation of basic blocks and phi instructions is rather fiddly so it would be nice to do this once and not worry about it again.
&lt;p&gt;
What are the parts of the loop?  Well, let's just do a simple "for" loop that loops from a lower index (inclusive) to an upper index (exclusive) and executes the loop body for each iteration.  So there should be three arguments to the loop function: lower bound, upper bound and loop body.  What is the loop body?  Since the LLVM is using SSA the loop body can't really update the loop state variables.  Instead it's like a pure functional language where you have to express it as a state transformation.  So the loop body will take the old state and return a new state.  It's also useful to pass the loop index to the loop body.  Now when we've introduced the notion of a loop state we also need to have an initial value for the loop state as an argument to the loop function.
&lt;p&gt;
Let's start out easy and let the state to be updated in the loop be a single value.  In &lt;tt&gt;dotProd&lt;/tt&gt; it's simply the running sum (&lt;tt&gt;s&lt;/tt&gt;).
&lt;pre&gt;
forLoop low high start incr = do
    top &lt;- getCurrentBasicBlock
    loop &lt;- newBasicBlock
    body &lt;- newBasicBlock
    exit &lt;- newBasicBlock

    br loop

    defineBasicBlock loop
    i &lt;- phi [(low, top)]
    state &lt;- phi [(start, top)]
    t &lt;- icmp IntNE i high
    condBr t body exit

    defineBasicBlock body

    state' &lt;- incr i state
    i' &lt;- add i (valueOf 1)

    body' &lt;- getCurrentBasicBlock
    addPhiInputs i [(i', body')]
    addPhiInputs state [(state', body')]
    br loop
    defineBasicBlock exit

    return state
&lt;/pre&gt;
The &lt;tt&gt;low&lt;/tt&gt; and &lt;tt&gt;high&lt;/tt&gt; arguments are simply the loop bounds, &lt;tt&gt;start&lt;/tt&gt; is the start value for the loop state variable, and finally &lt;tt&gt;incr&lt;/tt&gt; is invoked in the loop body to get the new value for the state variable.  Note that the &lt;tt&gt;incr&lt;/tt&gt; can contain new basic blocks so there's no guarantee we're in the same basic block after &lt;tt&gt;incr&lt;/tt&gt; has been called.  That's why there is a call to &lt;tt&gt;getCurrentBasicBlock&lt;/tt&gt; before adding to the phi instructions.
&lt;p&gt;
So the original loop in &lt;tt&gt;dotProd&lt;/tt&gt; can now be written
&lt;pre&gt;
    s &lt;- forLoop 0 size 0 $ \ i s -&gt; do
      ap &lt;- getElementPtr aPtr (i, ()) -- index into aPtr
      bp &lt;- getElementPtr bPtr (i, ()) -- index into bPtr
      a &lt;- load ap                 -- load element from a vector
      b &lt;- load bp                 -- load element from b vector
      ab &lt;- mul a b                -- multiply them
      s' &lt;- add s ab               -- accumulate sum
      return s'
&lt;/pre&gt;
So that wasn't too bad.  But what if the loop needs multiple state variables?  Or none?  The tricky bit is handling the phi instructions since the number of instructions needed depends on how many state variables we have.  So let's creat a class for types that can be state variables.  This way we can use tuples for multiple state variables.  The class needs two methods, the generalization of &lt;tt&gt;phi&lt;/tt&gt; and the generalization of &lt;tt&gt;addPhiInputs&lt;/tt&gt;.
&lt;pre&gt;
class Phi a where
    phis :: BasicBlock -&gt; a -&gt; CodeGenFunction r a
    addPhis :: BasicBlock -&gt; a -&gt; a -&gt; CodeGenFunction r ()
&lt;/pre&gt;
A simple instance is when we have no state variables.
&lt;pre&gt;
instance Phi () where
    phis _ _ = return ()
    addPhis _ _ _ = return ()
&lt;/pre&gt;
We also need to handle the case with a single state variable.  All LLVM values are encapsulated in the &lt;tt&gt;Value&lt;/tt&gt; type, so this is the one we create an instance for.
&lt;pre&gt;
instance (IsFirstClass a) =&gt; Phi (Value a) where
    phis bb a = do
        a' &lt;- phi [(a, bb)]
        return a'
    addPhis bb a a' = do
        addPhiInputs a [(a', bb)]
&lt;/pre&gt;
Finally, here's the instance for pair.  Other tuples can be done in the same way (or we could just use nested pairs).
&lt;pre&gt;
instance (Phi a, Phi b) =&gt; Phi (a, b) where
    phis bb (a, b) = do
        a' &lt;- phis bb a
        b' &lt;- phis bb b
        return (a', b')
    addPhis bb (a, b) (a', b') = do
        addPhis bb a a'
        addPhis bb b b'
&lt;/pre&gt;
Using this new class the looping function becomes
&lt;pre&gt;
forLoop :: forall i a r . (Phi a, Num i, IsConst i, IsInteger i, IsFirstClass i) =&gt;
           Value i -&gt; Value i -&gt; a -&gt; (Value i -&gt; a -&gt; CodeGenFunction r a) -&gt; CodeGenFunction r a
forLoop low high start incr = do
    top &lt;- getCurrentBasicBlock
    loop &lt;- newBasicBlock
    body &lt;- newBasicBlock
    exit &lt;- newBasicBlock

    br loop

    defineBasicBlock loop
    i &lt;- phi [(low, top)]
    vars &lt;- phis top start
    t &lt;- icmp IntNE i high
    condBr t body exit

    defineBasicBlock body

    vars' &lt;- incr i vars
    i' &lt;- add i (valueOf 1 :: Value i)

    body' &lt;- getCurrentBasicBlock
    addPhis body' vars vars'
    addPhiInputs i [(i', body')]
    br loop
    defineBasicBlock exit

    return vars
&lt;/pre&gt;
&lt;h3&gt;File operations&lt;/h3&gt;
The Haskell bindings provide two convenient functions - &lt;tt&gt;writeBitcodeToFile&lt;/tt&gt; and &lt;tt&gt;readBitcodeFromFile&lt;/tt&gt; - for writing and reading modules in the LLVM binary format.
&lt;p&gt;
A simple example:
&lt;pre&gt;
import Data.Int
import LLVM.Core

mIncr :: CodeGenModule (Function (Int32 -&gt; IO Int32))
mIncr = 
  createNamedFunction ExternalLinkage "incr" $ \ x -&gt; do
    r &lt;- add x (1 :: Int32)
    ret r

main = do
    m &lt;- newModule
    defineModule m mIncr
    writeBitcodeToFile "incr.bc" m
&lt;/pre&gt;
Running this will produce the file &lt;tt&gt;incr.bc&lt;/tt&gt; which can be processed with the usual LLVM tools.  E.g.
&lt;pre&gt;
$ llvm-dis &lt; incr.bc  # to look at the LLVM code
$ opt -std-compile-opts incr.bc -f -o incrO.bc # run optimizer
$ llvm-dis &lt; incrO.bc  # to look at the optimized LLVM code
$ llc incrO.bc # generate assembly code
$ cat incrO.s  # look at assembly code
&lt;/pre&gt;
Reading a module file is equally easy, but what can you do with a module you have read?  It could contain anything.  To extract things from a module there is a function &lt;tt&gt;getModuleValues&lt;/tt&gt; which returns a list of name-value pairs of all externally visible functions and global variables.  The values all have type &lt;tt&gt;ModuleValue&lt;/tt&gt;.  To convert a &lt;tt&gt;ModuleValue&lt;/tt&gt; to a regular &lt;tt&gt;Value&lt;/tt&gt; you have to use &lt;tt&gt;castModuleValue&lt;/tt&gt;.  This is a safe conversion function that makes a dynamic type test to make sure the types match (think of &lt;tt&gt;ModuleValue&lt;/tt&gt; as &lt;tt&gt;Dynamic&lt;/tt&gt; and &lt;tt&gt;castModuleValue&lt;/tt&gt; as &lt;tt&gt;fromDynamic&lt;/tt&gt;).
&lt;p&gt;
Here's an example:
&lt;pre&gt;
import Data.Int
import LLVM.Core
import LLVM.ExecutionEngine

main = do
    m &lt;- readBitcodeFromFile "incr.bc"
    ee &lt;- createModuleProviderForExistingModule m &gt;&gt;= createExecutionEngine
    funcs &lt;- getModuleValues m
    let ioincr :: Function (Int32 -&gt; IO Int32)
        Just ioincr = lookup "incr" funcs &gt;&gt;= castModuleValue
        incr = unsafePurify $ generateFunction ee ioincr

    print (incr 41)
&lt;/pre&gt;
This post is getting rather long, so I'll let this be the last example for today.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-2319446660716223785?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/2319446660716223785/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=2319446660716223785' title='17 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/2319446660716223785'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/2319446660716223785'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2009/01/llvm-llvm-low-level-virtual-machine-is.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>17</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-4185981415808905008</id><published>2008-12-10T15:27:00.002Z</published><updated>2008-12-10T15:32:48.321Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='overloading'/><category scheme='http://www.blogger.com/atom/ns#' term='Modules'/><category scheme='http://www.blogger.com/atom/ns#' term='OCaml'/><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>&lt;h2&gt;The OCaml code again&lt;/h2&gt;
I'm posting a slight variation of the OCaml code that I think better shows what I like about the ML version.
&lt;pre&gt;
module MkPerson(O: sig 
                     type xString
                     type xDouble
                     val opConcat : xString -&gt; xString -&gt; xString
                     val opShow : xDouble -&gt; xString
                   end) =
struct
  open O
  type person = Person of (xString * xString * xDouble)
  let display (Person (firstName, lastName, height)) = 
    opConcat firstName (opConcat lastName (opShow height))
end

module BasicPerson = MkPerson(struct
                                type xString = string
                                type xDouble = float
                                let opConcat = (^)
                                let opShow = string_of_float
                              end)

open BasicPerson

let _ = 
  let p = Person ("Stefan", "Wehr", 184.0)
  in display p
&lt;/pre&gt;
Note how the &lt;tt&gt;open O&lt;/tt&gt; opens the argument to the &lt;tt&gt;MkPerson&lt;/tt&gt; functor and all the values and types from the argument is in scope in the rest of the module.  There's no need to change lots of code in &lt;tt&gt;MkPerson&lt;/tt&gt;.
&lt;p&gt;
Similarely, the &lt;tt&gt;open BasicPerson&lt;/tt&gt; makes the operations from that module avaiable, so that &lt;tt&gt;Person&lt;/tt&gt; and &lt;tt&gt;display&lt;/tt&gt; can be used in a simple way.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-4185981415808905008?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/4185981415808905008/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=4185981415808905008' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/4185981415808905008'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/4185981415808905008'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2008/12/ocaml-code-again-im-posting-slight.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-5744056186368813968</id><published>2008-12-10T12:34:00.002Z</published><updated>2008-12-10T15:26:25.511Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='overloading'/><category scheme='http://www.blogger.com/atom/ns#' term='Modules'/><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>&lt;h2&gt;Abstracting on, suggested solutions&lt;/h2&gt;
I guess I should be more constructive than just whining about how Haskell doesn't always do what I want.
I do have some suggestions on how to fix things.
&lt;h3&gt;Explicit type applications&lt;/h3&gt;
Let's look at a simple example again:
&lt;pre&gt;
f :: forall a . a -&gt; a
f = \ x -&gt; x

b :: Bool
b = f True
&lt;/pre&gt;
The way I like to think of this (and what happens in ghc) is that this is shorthand for something more explicit, namely the F&lt;sub&gt;w&lt;/sub&gt; version of the same thing.  In F&lt;sub&gt;w&lt;/sub&gt; all type abstraction and type application are explicit.  Let's look at the explicit version (which is no longer Haskell).
&lt;pre&gt;
f :: forall (a::*) . a -&gt; a
f = /\ (a::*) -&gt; \ (x::a) -&gt; x

b :: Bool
b = f @Bool True
&lt;/pre&gt;
I'm using &lt;tt&gt;/\&lt;/tt&gt; for type abstraction and &lt;tt&gt;expr @type&lt;/tt&gt; for type application.  Furthermore each binder is annotated with its type.  This is what ghc translates the code to internally, this process involves figuring out what all the type abstractions and applications should be.
&lt;p&gt;
Not something a little more complicated (from my previous post)
&lt;pre&gt;
class C a b where
    x :: a
    y :: b

f :: (C a b) =&gt; a -&gt; [a]
f z = [x, x, z]
&lt;/pre&gt;
The type of &lt;tt&gt;x&lt;/tt&gt; is
&lt;pre&gt;
x :: forall a b . (C a b) =&gt; a
&lt;/pre&gt;
So whenever &lt;tt&gt;x&lt;/tt&gt; occurs two type applications have to be inserted (there's also a dictionary to insert, but I'll ignore that).
The decorated term for &lt;tt&gt;f&lt;/tt&gt; (ignoring the context)
&lt;pre&gt;
f :: forall a b . (C a b) =&gt; a -&gt; [a]
f = /\ (a::*) (b::*) -&gt; \ (z::a) -&gt; [ x @a @b1, x @a @b2, z]
&lt;/pre&gt;
The reason for the ambiguity in type checking is that the type check cannot figure out that the &lt;tt&gt;b&lt;/tt&gt; is in any way connected to &lt;tt&gt;b1&lt;/tt&gt; and &lt;tt&gt;b2&lt;/tt&gt;.  Because it isn't.  And there's currently no way we can connect them.
&lt;p&gt;
So I suggest that it should be possible to use explicit type application in Haskell when you want to.  The code would look like this
&lt;pre&gt;
f :: forall a b . (C a b) =&gt; a -&gt; [a]
f z = [ x @a @b, x @a @b, z]
&lt;/pre&gt;
The order of the variables in the &lt;tt&gt;forall&lt;/tt&gt; determines the order in which the type abstractions come, and thus determines where to put the type applications.
&lt;h3&gt;Something like &lt;tt&gt;abstype&lt;/tt&gt;&lt;/h3&gt;
Back to my original problem with abstraction.  What about if this was allowed:
&lt;pre&gt;
class Ops t where
    data XString t :: *
    (+++) :: XString t -&gt; XString t -&gt; XString t

instance Ops Basic where
    type XString Basic = String
    (+++) = (++)
&lt;/pre&gt;
So the class declaration says I'm going to use data types (which was my final try and which works very nicely).
But in the instance I provide a type synonym instead.  This would be like using a &lt;tt&gt;newtype&lt;/tt&gt; in the instance, but without having to use the newtype constructor everywhere.  The fact that it's not a real data type is only visible inside the instance declaration.  The compiler could in fact make a &lt;tt&gt;newtype&lt;/tt&gt; and insert all the coercions.
This is, of course, just a variation of the &lt;tt&gt;abstype&lt;/tt&gt; suggestion by Wehr and Chakravarty.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-5744056186368813968?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/5744056186368813968/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=5744056186368813968' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/5744056186368813968'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/5744056186368813968'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2008/12/abstracting-on-suggested-solutions-i.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-6971407551389533807</id><published>2008-12-10T10:55:00.002Z</published><updated>2008-12-10T15:26:25.512Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='overloading'/><category scheme='http://www.blogger.com/atom/ns#' term='Modules'/><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>&lt;h2&gt;The abstraction continues&lt;/h2&gt;
I got several comments to my lament about my &lt;a href="http://augustss.blogspot.com/2008/12/somewhat-failed-adventure-in-haskell.html"&gt;attempts at abstraction&lt;/a&gt; in my previous blog post.  Two of the comments involve adding an extra argument to &lt;tt&gt;display&lt;/tt&gt;.  I dont regard this as an acceptable solution; the changes to the code should not be that intrusive.  Adding an argument to a function is a change that ripples through the code to many places and not just the implementation of &lt;tt&gt;display&lt;/tt&gt;.
&lt;p&gt;
Reiner Pope succeeded where I failed.  He split up the operations in Ops into two classes and presto, it works.
&lt;pre&gt;
data Person t = Person {
    firstName :: XString t,
    lastName :: XString t,
    height :: XDouble t
    }

class (Show s, IsString s) =&gt; IsXString s where
    (+++) :: s -&gt; s -&gt; s
class (Num d, IsXString s) =&gt; IsXDouble s d where
    xshow :: d -&gt; s

class (IsXDouble (XString t) (XDouble t)) =&gt; Ops t where
    type XString t :: *
    type XDouble t :: *
instance IsXString String where
    (+++) = (++)
instance IsXDouble String Double where
    xshow = show

data Basic = Basic

instance Ops Basic where
    type XString Basic = String
    type XDouble Basic = Double

display :: Ops t =&gt; Person t -&gt; XString t
display p = firstName p +++ " " +++ lastName p +++ " " +++ xshow (height p + 1)
&lt;/pre&gt;
That's neat, but a little fiddly if there are many types involved.

&lt;h3&gt;Another problem&lt;/h3&gt;
Armed with this solution I write another function.
&lt;pre&gt;
incSpace :: (Ops t) =&gt; XDouble t -&gt; XString t
incSpace x = xshow x +++ " "
&lt;/pre&gt;
It typechecks fine.  But as far as I can figure out there is &lt;b&gt;no&lt;/b&gt; way to use this function.
Let's see what ghci says:
&lt;pre&gt;
&gt; :t incSpace (1 :: XDouble Basic) :: XString Basic

&lt;interactive&gt;:1:0:
    Couldn't match expected type `[Char]'
           against inferred type `XString t'
    In the expression: incSpace (1 :: XDouble Basic) :: XString Basic

&lt;interactive&gt;:1:10:
    Couldn't match expected type `XDouble t'
           against inferred type `Double'
    In the first argument of `incSpace', namely `(1 :: XDouble Basic)'
    In the expression: incSpace (1 :: XDouble Basic) :: XString Basic
&lt;/pre&gt;
Despite my best efforts at providing types it doesn't work.  The reason being that saying, e.g., &lt;tt&gt;(1 :: XDouble Basic)&lt;/tt&gt; is the same as saying &lt;tt&gt;(1 :: Double)&lt;/tt&gt;.  And that doesn't match &lt;tt&gt;XDouble t&lt;/tt&gt;.  At least not to the typecheckers knowledge.
&lt;p&gt;
In the example of &lt;tt&gt;display&lt;/tt&gt; things work because the parameter &lt;tt&gt;t&lt;/tt&gt; occurs in &lt;tt&gt;Person t&lt;/tt&gt; which is a real type and not a type family.  If a type variable only occurs in type family types you are out of luck.  There's no way to convey the information what that type variable should be (as far as i know).  You can "solve" the problem by adding &lt;tt&gt;t&lt;/tt&gt; as an argument to &lt;tt&gt;incSpace&lt;/tt&gt;, but again, I don't see that as a solution.
&lt;p&gt;
In the paper &lt;a href="http://www.cse.unsw.edu.au/~chak/papers/WC08.html"&gt;ML Modules and Haskell Type Classes: A Constructive Comparison&lt;/a&gt; Wehr and Chakravarty introduce a notion of abstract associated types.  That might solve this problem.  I really want &lt;tt&gt;XDouble&lt;/tt&gt; and &lt;tt&gt;XString&lt;/tt&gt; to appear as abstract types (or associated data types) outside of the instance declaration.  Only inside the instance declaration where I provide implementations for the operations do I really care what the type is.
&lt;h3&gt;A reflection on type signatures&lt;/h3&gt;
If I write
&lt;pre&gt;
f x = x
&lt;/pre&gt;
Haskell can deduce that the type is &lt;tt&gt;f :: a -&gt; a&lt;/tt&gt;.
&lt;p&gt;
If I instead write
&lt;pre&gt;
f :: Int -&gt; Int
f x = x
&lt;/pre&gt;
Haskell happily uses this type.  The type checker does &lt;b&gt;not&lt;/b&gt; complain as to say "Sorry dude, but you're wrong, the type is more general than what you wrote.".  I think that's nice and polite.
&lt;p&gt;
Now a different example.
&lt;pre&gt;
class C a b where
    x :: a
    y :: b

f z = [x, x, z]
&lt;/pre&gt;
What does ghc have to say about the type of &lt;tt&gt;f&lt;/tt&gt;?
&lt;pre&gt;
f :: (C a b, C a b1) =&gt; a -&gt; [a]
&lt;/pre&gt;
OK, that's reasonable; the two occurences of &lt;tt&gt;x&lt;/tt&gt; could have different contexts.  But I don't want them to.  Let's add a type signature.
&lt;pre&gt;
f :: (C a b) =&gt; a -&gt; [a]
f z = [x, x, z]
&lt;/pre&gt;
What does ghc have to say?
&lt;pre&gt;
Blog2.hs:9:7:
    Could not deduce (C a b) from the context (C a b2)
      arising from a use of `x' at Blog2.hs:9:7
    Possible fix:
      add (C a b) to the context of the type signature for `f'
    In the expression: x
    In the expression: [x, x, z]
    In the definition of `f': f z = [x, x, z]

Blog2.hs:9:10:
    Could not deduce (C a b1) from the context (C a b2)
      arising from a use of `x' at Blog2.hs:9:10
    Possible fix:
      add (C a b1) to the context of the type signature for `f'
    In the expression: x
    In the expression: [x, x, z]
    In the definition of `f': f z = [x, x, z]
&lt;/pre&gt;
Which is ghc's way of say "Dude, I see your context, but I'm not going to use it because I'm more clever than you and can figure out a better type."
Rude, is what I say.
&lt;p&gt;
I gave a context, but there is nothing to link the &lt;tt&gt;b&lt;/tt&gt; in my context to what ghc internally figures out that the type of the two occuerences of &lt;tt&gt;x&lt;/tt&gt; should.  I wish I could tell the type checker, "This is the only context you'll ever going to have, use it if you can."  Alas, this is not how things work.
&lt;h3&gt;A little ML&lt;/h3&gt;
Stefan Wehr provided the ML version of the code that I only aluded to
&lt;pre&gt;
module MkPerson(O: sig 
                     type xString
                     type xDouble
                     val opConcat : xString -&gt; xString -&gt; xString
                     val opShow : xDouble -&gt; xString
                   end) =
struct
  type person = Person of (O.xString * O.xString * O.xDouble)
  let display (Person (firstName, lastName, height)) = 
    O.opConcat firstName (O.opConcat lastName (O.opShow height))
end

module BasicPerson = MkPerson(struct
                                type xString = string
                                type xDouble = float
                                let opConcat = (^)
                                let opShow = string_of_float
                              end)

let _ = 
  let p = BasicPerson.Person ("Stefan", "Wehr", 184.0)
  in BasicPerson.display p
&lt;/pre&gt;
In this case, I think this is the natural way of expressing the abstraction I want.  Of course, this ML code has some shortcomings too.  Since string literals in ML are not overloaded the cannot be used neatly in the display function like I could in the Haskell version, but that's a minor point.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-6971407551389533807?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/6971407551389533807/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=6971407551389533807' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/6971407551389533807'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/6971407551389533807'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2008/12/abstraction-continues-i-got-several.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-788298342897012865</id><published>2008-12-10T00:10:00.002Z</published><updated>2008-12-10T15:26:34.573Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='overloading'/><category scheme='http://www.blogger.com/atom/ns#' term='Modules'/><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>&lt;h2&gt;A somewhat failed adventure in Haskell abstraction&lt;/h2&gt;
I usually blog about weird and wonderful things you can do in Haskell.  Today I'm going to talk about something very plain and not wonderful at all.
&lt;p&gt;
If you want to try out the code below, use these Haskell extensions:
&lt;/p&gt;&lt;pre&gt;
{-# LANGUAGE TypeFamilies, MultiParamTypeClasses, OverloadedStrings,
   FlexibleInstances, TypeSynonymInstances, ScopedTypeVariables,
   FunctionalDependencies, RecordWildCards, FlexibleContexts,
   GeneralizedNewtypeDeriving #-}
&lt;/pre&gt;
&lt;h3&gt;The simple problem&lt;/h3&gt;
We want to define a type for a person which has a few fields and operations.  Like this
&lt;pre&gt;
module Person(Person(..), display) where

data Person = Person {
   firstName :: String,
   lastName  :: String,
   height    :: Double
   }

display :: Person -&gt; String
display p = firstName p ++ " " ++ lastName p ++ " " ++ show (height p + 1)
&lt;/pre&gt;
Very simple.  To use it we can just import the module and the write something like
&lt;pre&gt;
 print $ display $ Person { firstName = "Manuel", lastName = "Peyton Jones", height = 255 }
&lt;/pre&gt;
But being efficiancy conscious I'm not happy with using &lt;tt&gt;String&lt;/tt&gt; and &lt;tt&gt;Double&lt;/tt&gt;.
I'd like to experiment with different types for these.  Maybe I should use &lt;tt&gt;ByteString&lt;/tt&gt; and &lt;tt&gt;Int&lt;/tt&gt; instead?
&lt;p&gt;
Simple enough, let's abstract out the types and operations into a different module.
&lt;/p&gt;&lt;pre&gt;
module Ops(XString, XDouble, (+++), xshow) where
import Data.String
newtype XString = XString String deriving (Eq, Show, IsString)
newtype XDouble = XDouble Double deriving (Eq, Show, Num)

(+++) :: XString -&gt; XString -&gt; XString
XString x +++ XString y = XString (x ++ y)

xshow :: XDouble -&gt; XString
xshow (XDouble x) = XString (show x)

module Person(Person(..), display) where
import Ops

data Person = Person {
   firstName :: XString,
   lastName  :: XString,
   height    :: XDouble
   }

display :: Person -&gt; XString
display p = firstName p +++ " " +++ lastName p +++ " " +++ show (height p + 1)
&lt;/pre&gt;
There, problems solved.  By changing the import in the &lt;tt&gt;Person&lt;/tt&gt; module you can try out different types for &lt;tt&gt;XString&lt;/tt&gt; and &lt;tt&gt;XDouble&lt;/tt&gt;.
&lt;p&gt;
No, this is not problem solved.  To try out different implementations I need to &lt;it&gt;edit&lt;/it&gt; the &lt;tt&gt;Person&lt;/tt&gt; module.  That's not abstraction, that's obstruction.  It should be possible to write the code for the &lt;tt&gt;Person&lt;/tt&gt; module once and for all once you decided to abstract, and then never change it again.
&lt;/p&gt;&lt;p&gt;
I also didn't really want to necessarily have &lt;tt&gt;newtype&lt;/tt&gt; in my module.  Maybe I'd want this:
&lt;/p&gt;&lt;pre&gt;
module Ops(XString, XDouble, (+++), xshow) where
type XString = String
type XDouble = Double

(+++) :: XString -&gt; XString -&gt; XString
(+++) = (++)

xshow :: XDouble -&gt; XString
xshow = show
&lt;/pre&gt;
You can define Ops that way, but then the implementation of Ops may leak into the Person module.  What you really want is to type check Person against the signature of Ops, like
&lt;pre&gt;
interface Ops where
type XString
type XDouble
(+++) :: XString -&gt; XString -&gt; XString
xshow :: XDouble -&gt; XString
&lt;/pre&gt;
And later supply the actual implementation.  Alas, Haskell doesn't allow this.
&lt;p&gt;
In ML (SML or O'Caml) this would be solved by using a functor.  The Person module would be a functor that takes the Ops module as an argument and yields a new module.  And then you can just plug and play with different Ops implementations.  This is where ML shines and Haskell sucks.
&lt;/p&gt;&lt;h3&gt;Type classes&lt;/h3&gt;
But the defenders of Haskell superiority say, Haskell has type classes, that's the way to abstract!  So let's make Ops into a type class.  Let's do old style with multiple parameters first.  Since Ops defines two types it will correspond to having two type parameters to the class.
&lt;pre&gt;
class (IsString xstring, Num xdouble) =&gt; Ops xstring xdouble where
   (+++) :: xstring -&gt; xstring -&gt; xstring
   xshow :: xdouble -&gt; xshow
&lt;/pre&gt;
Ok, so how do we have to rewrite the Person module?
&lt;pre&gt;
data Person xstring xdouble = Person {
   firstName :: xstring,
   lastName  :: xstring,
   height    :: xdouble
   }

display :: (Ops xstring xdouble) =&gt; Person xstring xdouble -&gt; xstring
display p = firstName p +++ " " +++ lastName p +++ " " +++ xshow (height p + 1)
&lt;/pre&gt;
An implementation is provided by an instance declaration:
&lt;pre&gt;
instance Ops String Double where
   (+++) = (++)
   xshow = show
&lt;/pre&gt;
We see the major flaw in this approch at once.  The &lt;tt&gt;Person&lt;/tt&gt; data type now has two parameters.  This might be bearable, but imagine a more complicated example where Ops contains 15 types.  And every time you add a field with a new type to Person you have to update every single place in the program that mentions the Person type.  That's not abstraction.
&lt;p&gt;
But in fact, it's even worse than that.  The definition of &lt;tt&gt;display&lt;/tt&gt; might look plausible, but it's full of ambiguities.  Compiling it gives lots of errors of this kind:
&lt;/p&gt;&lt;pre&gt;
   Could not deduce (Ops xstring xdouble)
     from the context (Ops xstring xdouble4)
&lt;/pre&gt;
Well, we can remove the type signature and let GHC figure it out.  The we get this
&lt;pre&gt;
display :: (Ops xstring xdouble,
           Ops xstring xdouble3,
           Ops xstring xdouble2,
           Ops xstring xdouble1,
           Ops xstring xdouble4) =&gt;
          Person xstring xdouble4 -&gt; xstring
&lt;/pre&gt;
And this function can, of course, never be used because most of the type variables do not occur outside the context so they will never be determined.  I don't even know how to put explicit types in the function to make it work.
&lt;p&gt;
Well, it's common knowledge that multi-parameter type classes without functional dependencies is asking for trouble.  So can we add some functional dependencies?  Sure, if we use
&lt;/p&gt;&lt;pre&gt;
class (IsString xstring, Num xdouble) =&gt; Ops xstring xdouble | xstring -&gt; xdouble where
&lt;/pre&gt;
then things work beautifully.  Until we decide that another instance that would be interesting to try is
&lt;pre&gt;
instance Ops String Int
&lt;/pre&gt;
which is not valid with the FD present.
&lt;p&gt;
So we can't have functional dependencies if we want to have flexibilty with the instances.  So what is it that goes wrong without the FDs?  It's that all the uses &lt;tt&gt;(+++)&lt;/tt&gt; and &lt;tt&gt;xshow&lt;/tt&gt; are not tied together, they could potentially have different types.  Let's try and be sneaky and tie them together:
&lt;/p&gt;&lt;pre&gt;
display :: (Ops xstring xdouble) =&gt; Person xstring xdouble -&gt; xstring
display p =
   let (++++) = (+++); xxshow = xshow
   in  firstName p ++++ " " ++++ lastName p ++++ " " ++++ xxshow (height p + 1)
&lt;/pre&gt;
This only generates one error message, because there's still nothing that says the the two operations come from the same instance.  We need to make the tie even closer.
&lt;pre&gt;
class (IsString xstring, Num xdouble) =&gt; Ops xstring xdouble where
   ops :: (xstring -&gt; xstring -&gt; xstring, xdouble -&gt; xstring)
instance Ops String Double where
   ops = ((++), show)

display :: (Ops xstring xdouble) =&gt; Person xstring xdouble -&gt; xstring
display p =
   let ((+++), xshow) = ops
   in  firstName p +++ " " +++ lastName p +++ " " +++ xshow (height p + 1)
&lt;/pre&gt;
This actually works!
We can make it neater looking.
&lt;pre&gt;
class (IsString xstring, Num xdouble) =&gt; Ops xstring xdouble where
   ops :: DOps xstring xdouble

data DOps xstring xdouble = DOps {
   (+++) :: xstring -&gt; xstring -&gt; xstring,
   xshow :: xdouble -&gt; xstring
   }

instance Ops String Double where
   ops = DOps (++) show

display :: (Ops xstring xdouble) =&gt; Person xstring xdouble -&gt; xstring
display p =
   let DOps{..} = ops
   in  firstName p +++ " " +++ lastName p +++ " " +++ xshow (height p + 1)
&lt;/pre&gt;
We have basically packaged up the dictionary and unpack it ourselves to get access to the operations.  It's not pleasent, but it works.
&lt;p&gt;
But as I already said, the multiparameter type class version isn't really a good solution to the problem even if it works; it introduces too many parameters to the Person record.
&lt;/p&gt;&lt;h3&gt;Associated types&lt;/h3&gt;
The new and shiny way of doing type classes is to use associated types instead of FDs.  So let's give that a try.  So what should the associated types be in the class.  The associated type is supposed to be the one that can be computed from the main one.  But we have two types that are on equal footing, so there is no main one.  We can remedy that by introducing an artificial third type that is the main one, it can then determine the other two.
&lt;pre&gt;
class (IsString (XString t), Num (XDouble t)) =&gt; Ops t where
   type XString t :: *
   type XDouble t :: *
   (+++) :: XString t -&gt; XString t -&gt; XString t
   xshow :: XDouble t -&gt; XString t

data Person t = Person {
   firstName :: XString t,
   lastName  :: XString t,
   height    :: XDouble t
   }
&lt;/pre&gt;
That looks pretty neat.  Note how the Person record has one parameter and no matter how many new associated type we add it will still only have one parameter.  One parameter is reasonable, the Person record is after all parameterized over what kind of Ops we are providing.
&lt;p&gt;
Let's do an instance.  It will need the extra type that is somehow the name of the instance.
&lt;/p&gt;&lt;pre&gt;
data Basic = Basic

instance Ops Basic where
   type XString Basic = String
   type XDouble Basic = Double
   (+++) = (++)
   xshow = show
&lt;/pre&gt;
Now what about the &lt;tt&gt;display&lt;/tt&gt; function?  Alas, now it breaks down again.  The &lt;tt&gt;display&lt;/tt&gt; function is full of type errors again.  And the reason is similar to the multiparameter version; there's nothing that ties the operations together.
&lt;p&gt;
We can play the same trick as with DOps above, but for some reason it doesn't work this time.  The type comes out as
&lt;/p&gt;&lt;pre&gt;
display :: (XString t ~ XString a,
           XDouble t ~ XDouble a,
           Ops a,
           Num (XDouble t)) =&gt;
          Person t -&gt; XString a
&lt;/pre&gt;
I have no clue why.  I find associated types very hard to get a grip on.
&lt;p&gt;
OK, multi-parameter type classes made things work, but had too many type parameters.  And associated types is the other way around.  You can try combining them, but it didn't get me anywhere closer.
&lt;/p&gt;&lt;h3&gt;Associated data types&lt;/h3&gt;
OK, I won't admit defeat yet.  There's still associated data types.  They are easier to deal with than associated types, because the type function is guaranteed to be injective.
&lt;pre&gt;
class (IsString (XString t), Num (XDouble t)) =&gt; Ops t where
   data XString t :: *
   data XDouble t :: *
   (+++) :: XString t -&gt; XString t -&gt; XString t
   xshow :: XDouble t -&gt; XString t

data Basic = Basic

instance Ops Basic where
   newtype XString Basic = XSB String deriving (Eq, Ord, Show)
   newtype XDouble Basic = XDB Double deriving (Eq, Ord, Show)
   XSB x +++ XSB y = XSB (x ++ y)
   xshow (XDB x) = XSB (show x)
instance Num (XDouble Basic) where
   XDB x + XDB y = XDB (x+y)
   fromInteger = XDB . fromInteger
instance IsString (XString Basic) where
   fromString = XSB
&lt;/pre&gt;
At last, this actually works!  But it's at a price.  We can no longer use the types we want in the instance declaration, instead we are forced to invent new types.  Using this approach the original multi-parameter version could have been made to work as well.
&lt;p&gt;
Normally the GeneralizedNewtypeDeriving language extension makes it relatively painless to introduce a newtype that has all the instances of the underlying type.  But due to a bug in ghc you can't use this extension for associated newtypes.  So we have to make manual instance declarations which makes this approach very tedious.
&lt;/p&gt;&lt;h2&gt;Conclusion&lt;/h2&gt;
I have found no way of doing what I want.  My request is very simple, I want to be able to abstract over the actual implementation of a module, where the module contains types, values, and instances.
&lt;p&gt;
Haskell normally excels in abstraction, but here I have found no natural way of doing what I want.  Perhaps I'm just not clever enough to figure out how, but that is a failure of Haskell too.  It should not take any cleverness to do something as simple as this.  In ML this is the most natural thing in the world to do.
&lt;/p&gt;&lt;p&gt;
Associated types are &lt;b&gt;not&lt;/b&gt; a replacement for a proper module system.  They let you do some things, but others just don't work.
&lt;/p&gt;&lt;p&gt;I'd be happy to see anyone doing this in Haskell in a simple way.&lt;/p&gt;&lt;p&gt;
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-788298342897012865?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/788298342897012865/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=788298342897012865' title='13 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/788298342897012865'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/788298342897012865'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2008/12/somewhat-failed-adventure-in-haskell.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>13</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-2069749915560147065</id><published>2008-07-03T12:38:00.018+01:00</published><updated>2008-07-05T00:52:49.730+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>&lt;h2&gt;Lost and found&lt;/h2&gt;

If I write &lt;tt&gt;10^8&lt;/tt&gt; in Haskell, how many multiplications will be used to compute the power?  A stupid question?  Well, for this example, but if I was computing &lt;tt&gt;x^8&lt;/tt&gt; and x has 100000 digits then I'd care.

So how can I find out?  I can look at the definition of the exponentiation operator.  Here it is, from the Haskell report and GHC 6.8:
&lt;pre&gt;
(^)             :: (Num a, Integral b) =&gt; a -&gt; b -&gt; a
_ ^ 0           =  1
x ^ n | n &gt; 0   =  f x (n-1) x
   where f _ 0 y = y
         f a d y = g a d
           where
             g b i | even i  = g (b*b) (i `quot` 2)
                   | otherwise = f b (i-1) (b*y)
_ ^ _           = error "Prelude.^: negative exponent"
&lt;/pre&gt;

It's a bit involved, but decipherable.  Another way would be to insert some kind of debug trace message in the multiplication.

&lt;h3&gt;Traced values&lt;/h3&gt;
I'd like to show a different way.  Here's a ghci session:
&lt;pre&gt;
Prelude&gt; :m +Debug.Traced
Prelude Debug.Traced&gt; let x = 10 :: Traced AsValue Integer
Prelude Debug.Traced&gt; let y = x ^ 8
Prelude Debug.Traced&gt; y
100000000
Prelude Debug.Traced&gt; :t y
y :: Traced AsValue Integer
Prelude Debug.Traced&gt; asExp y
10 * 10 * (10 * 10) * (10 * 10 * (10 * 10))
Prelude Debug.Traced&gt; asSharedExp y
let _2 = 10 * 10; in  _2 * _2 * (_2 * (10 * 10))
Prelude Debug.Traced&gt; :t asSharedExp y
asSharedExp y :: Traced AsExp Integer
Prelude Debug.Traced&gt;
&lt;/pre&gt;
So what's going on?  The value of x is &lt;tt&gt;Traced Integer&lt;/tt&gt;, which means that there's some magic going on.  The variable can be used as usual, for instance in computing &lt;tt&gt;x^8&lt;/tt&gt;.  A traced value can also be shown as an expression, which is what &lt;tt&gt;showAsExp&lt;/tt&gt; does.

So a traced value is somewhat like the symbolic values I had in an earlier post, but in addition to having a symbolic representation they also have a normal value.  But the output from &lt;tt&gt;showAsExp&lt;/tt&gt; doesn't really help in answering how many multiplications there are, since the shown expression has no sharing; it is totally flattened.  The &lt;tt&gt;showAsShared&lt;/tt&gt; function is the black magic here, it recovers the sharing, and we can see what happened.  What we see is that there are actually five (5) multiplications involved in computing 10^8.  This shows that the definition of exponentiation is suboptimal, since it can be done with three multiplications (three repeated squarings).

The &lt;tt&gt;showAsShared&lt;/tt&gt; really does have some &lt;b&gt;black&lt;/b&gt; magic.  It recovers information that is not part of the Haskell semantics, so from that we can conclude that it must contain the powerful incantation &lt;tt&gt;unsafePerformIO&lt;/tt&gt; somewhere.  How does it reveal implementation secrets?  Look at this:
&lt;pre&gt;
Prelude Debug.Traced&gt; asSharedExp $ let a = 1 + 2 in a * a
let _1 = 1 + 2; in  _1 * _1
Prelude Debug.Traced&gt; asSharedExp $ (1+2) * (1+2)
(1 + 2) * (1 + 2)
Prelude Debug.Traced&gt; 
&lt;/pre&gt;
The let expression and the expression where the variable has been expanded are semantically equal in Haskell, so no (pure) function can possibly be able to give different results for them.
&lt;p&gt;
OK, so how does it work?  I'll show a simplified version of the Traced module here that only deals with one traced type, but it can be extended.  The (soon to be available) hackage package contains the extended version.
&lt;p&gt;
In the &lt;tt&gt;Traced&lt;/tt&gt; type we need to represent expressions.  We only need constants and function applications.
&lt;pre&gt;
data Traced a
    = Con a
    | Apply a String [Traced a]
&lt;/pre&gt;
The function application contains the value, the name of the function, and the arguments the function was applied to.
&lt;p&gt;
In the exported interface from the module we want to be able to convert to and from the Traced type.  Nothing exciting here.
&lt;pre&gt;
traced :: a -&gt; Traced a
traced = Con

unTraced :: Traced a -&gt; a
unTraced (Con x) = x
unTraced (Apply x _ _) = x
&lt;/pre&gt;

We want to show a traced value the same way we show the underlying value.
&lt;pre&gt;
instance (Show a) =&gt; Show (Traced a) where
    show = show . unTraced
&lt;/pre&gt;

Comparing for equality, we simply compare the underlying values.
&lt;pre&gt;
instance (Eq a) =&gt; Eq (Traced a) where
    x == y  =  unTraced x == unTraced y
&lt;/pre&gt;

And we'll make traced numbers be an instance of &lt;tt&gt;Num&lt;/tt&gt;.  All the functions (except &lt;tt&gt;fromInteger&lt;/tt&gt;) build apply nodes.
&lt;pre&gt;
instance (Num a) =&gt; Num (Traced a) where
    (+) = apply2 "+" (+)
    (-) = apply2 "-" (-)
    (*) = apply2 "*" (*)
    negate = apply1 "-" negate
    abs = apply1 "abs" abs
    signum = apply1 "signum" signum
    fromInteger = traced . fromInteger

apply1 s f x = Apply (f (unTraced x)) s [x]
apply2 s f x y = Apply (f (unTraced x) (unTraced y)) s [x, y]
&lt;/pre&gt;

A fancier version of this module could make the Traced type an applicative functor etc., but that's not really so important.
&lt;p&gt;
Finally, we want to be able to show a traced expression as an expression tree instead of a value.
&lt;pre
showAsExp :: (Show t) =&gt; Traced t -&gt; String
showAsExp (Con x) = show x
showAsExp (Apply _ s [x,y]) | not (isAlpha (head s)) =
    "(" ++ showAsExp x ++ " " ++ s ++ " " ++ showAsExp y ++ ")"
showAsExp (Apply _ s xs) =
    "(" ++ concat (intersperse " " $ s : map showAsExp xs) ++ ")"
&lt;/pre&gt;
We only export what is necessary, so the module header should be
&lt;pre&gt;
module Traced(Traced, traced, unTraced, showAsExp) where
&lt;/pre&gt;

A quick test:
&lt;pre&gt;
Traced&gt; putStrLn $ showAsExp $ 10^8
(((10 * 10) * (10 * 10)) * ((10 * 10) * (10 * 10)))
Traced&gt; 
&lt;/pre&gt;

And now we need the black magic to recover the sharing.  We would like to have a unique label in each node of the expression tree.  If we only had that we could see when two things referred to the same subexpression, and use the label to refer to it instead of the value.  If we were doing this in, &lt;i&gt;e.g.&lt;/i&gt;., Java we could use object identity for this purpose.  If we were doing it in C, we'd just compare pointers to the structs containing the expressions.  But we're doing it in Haskell and none of this is available.  It's not unavailable because Haskell wants to make our lives difficult, quite the contrary.  Languages that allow pointer comparisons (object identity) must introduce an extra level of indirection in the semantics to explain how this is possible.  So now it's not enough to know we have the number 5, we need to know that this is the number 5 at location 1000.  And that's not the same as the number 5 at location 1010.  The numbers contained in the locations might be the same, but the locations are not interchangable since they could, &lt;i&gt;e.g.&lt;/i&gt;, be mutated differently in the future.
&lt;p&gt;
So is everything lost in Haskell?  Not at all.  GHC implements a library of stable names, which is (at first approximation) the same as the address of something in memory.
The API to &lt;tt&gt;System.Mem.StableName&lt;/tt&gt; is very simple.
&lt;pre&gt;
data StableName a
makeStableName :: a -&gt; IO (StableName a)
hashStableName :: StableName a -&gt; Int
&lt;/pre&gt;
The &lt;tt&gt;makeStableName&lt;/tt&gt; function is like the &amp;amp; operator (address of) in C.  It returns the "address" of something.  So &lt;tt&gt;StableName&lt;/tt&gt; is like a C pointer type.  And the &lt;tt&gt;hashStableName&lt;/tt&gt; function converts the "address" to an int, &lt;i&gt;i.e.&lt;/i&gt;, like casting it to int in C.  (In the simplified code below we'll assume that two stable names never hash to the same Int, although this is not absolutely guaranteed.)
&lt;p&gt;
How come this interface is OK?  For instance, calling &lt;tt&gt;makeStableName&lt;/tt&gt; on semantically equal values can yield different results if the values happen to be stored in different parts of memory.  It's ok, because the returned value is in the IO monad.  In the IO monad anything can happen, so it's perfectly reasonable that the same argument yields different results in different calls.
&lt;p&gt;
Despite the name and the documentation the GHC stable names have a stability flaw.  The stable name changes when an unevaluated expression is evaluated.  It's annoying, but not a major flaw.  But once evaluated the stable names is guaranteed to remain the same.  (The implementation of stable names is not as simple as taking the address of an object, since the GC can move objects around.)
&lt;p&gt;
So now we have a way to get the identity of each node in the expression tree built by the Traced type.  The plan is to traverse the expression tree.  For each node we'll use the "address" of it as it's name, and remember that we've seen this node.  As we traverse the tree we build a list of nodes we've seen.  This list then corresponds to let bindings we'd like to display as the final result.
As we traverse the nodes we'll also replace each node with a reference to its name instead, so we can see the sharing in the result.
&lt;p&gt;
To be able to represent the expression with rediscovered sharing we need to extend the Traced type.  We need variable references and let bindings.  In fact, we'll only generate a top level let binding, but we include it in the data type anyway.
&lt;pre&gt;
data Traced a
    ...
    | Var a Name
    | Let [(Name, Traced a)] (Traced a)
type Name = String
&lt;/pre&gt;
We store the value in the &lt;tt&gt;Var&lt;/tt&gt; constructor to make &lt;tt&gt;unTraced&lt;/tt&gt; possible.  New cases:
&lt;pre&gt;
...
unTraced (Var x _) = x
unTraced (Let _ e) = unTraced e
&lt;/pre&gt;
And we want to show the new constructors:
&lt;pre&gt;
...
showAsExp (Var _ n) = n
showAsExp (Let bs e) = "let " ++ concatMap bind bs ++ "in " ++ showAsExp e
  where bind (n, e) = n ++ " = " ++ showAsExp e ++ "; "
&lt;/pre&gt;
To rediscover the sharing we need to keep some state.  We need a mapping from the node address to the &lt;tt&gt;Var&lt;/tt&gt; that should replace it, and we need to accumulate the bindings, &lt;i&gt;i.e.&lt;/i&gt;, the pairs of node name and expression.  So we need a state monad to keep track of the state.  We also need to be able to call &lt;tt&gt;makeStableName&lt;/tt&gt; in the IO monad, so we need IO as well.  We'll do this by using the state transformer monad on top of IO.  So the function that discovers sharing will take a traced value and return a new traced value, all in this monad.  So we have the type:
&lt;pre&gt;
type TState = (M.IntMap (Traced a), [(Name, Traced a)])
share :: Traced a -&gt; StateT TState IO (Traced a)
&lt;/pre&gt;
Assuming some imports
&lt;pre&gt;
import Control.Monad.State
import qualified Data.IntMap as M
import System.Mem.StableName
&lt;/pre&gt;
Now the body:
&lt;pre&gt;
share e@(Con _) = return e
share e@(Apply v s xs) = do
    (sm, bs) &lt;- get
    sn &lt;- liftIO $ makeStableName e
    let h = hashStableName sn
    case M.lookup h sm of
        Just ie -&gt; return ie
        Nothing -&gt; do
            let n = "_" ++ show h
                ie = Var v n
            put (M.insert h ie sm, bs)
            xs' &lt;- mapM share xs
            modify $ \ (sm', bs') -&gt; (sm', (n, Apply v s xs') : bs')
            return ie
&lt;/pre&gt;
Constants are easy, we don't bother sharing them (we could, but it's not that interesting).
For an apply node, we get its stable name (it's already evaluated, so it won't change) and hash it.
We then grab the map (it's an IntMap; a fast map from Int to anything) from the state and look up the node.  If the node is found we just return the expression from the map.  If it's not in the map, we invent a name (using the hash value) and insert a &lt;tt&gt;Var&lt;/tt&gt; node in the map so we'll never process this node again.  We then recursively traverse all the children of the apply node, and rebuild a new apply node with those new children.  This constitutes a binding and we stick it in the accumulated list of bindings.  Finally we return the new &lt;tt&gt;Var&lt;/tt&gt; node.
&lt;p&gt;
At the top level we need to call &lt;tt&gt;share&lt;/tt&gt; and then build a let expression.  The bindings are put in the list with the top node first in the list, so it looks nicer to reverse it.
&lt;pre&gt;
shareIO :: Traced a -&gt; IO (Traced a)
shareIO e = do
    (v, (_, bs)) &lt;- runStateT (share e) (M.empty, [])
    return $ Let (reverse bs) v
&lt;/pre&gt;

We could leave it there, but to make it more convenient to use, we'll do an &lt;tt&gt;unsafePerformIO&lt;/tt&gt; to hide the use of IO.  This is not unsafe in the sense that it will make our program crash, but it is unsafe in the sense that it ruins the Haskell semantics.  But since this whole exercise is to make a debugging/tracing tool this is a legitimate use, in my opinion.
&lt;pre&gt;
reShare :: Traced a -&gt; Traced a
reShare = unsafePerformIO . shareIO

showShared :: (Show a) =&gt; Traced a -&gt; String
showShared = showAsExp . reShare
&lt;/pre&gt;
And let's test it:
&lt;pre&gt;
Traced&gt; putStrLn $ showShared $ 10^8
let _5 = (10 * 10); _7 = (_5 * _5); _11 = (10 * 10);
 _1 = (_5 * _11); _8 = (_7 * _1); in _8
Traced&gt; 
&lt;/pre&gt;
In my first example it looked a little prettier since the full implementation only shows bindings for nodes that are actually shared.  This is a simple transformation to add.  But looking at this, we can still see that there are five multiplications.
&lt;p&gt;
&lt;h3&gt;Hacking with types&lt;/h3&gt;
It's a little annoying that we have both &lt;tt&gt;show&lt;/tt&gt; and &lt;tt&gt;showAsExp&lt;/tt&gt; to show traced values.  It would be nicer if we could always use &lt;tt&gt;show&lt;/tt&gt; and make the type determine how it is showed.  We could invent a &lt;tt&gt;newtype&lt;/tt&gt; to wrap around &lt;tt&gt;Traced&lt;/tt&gt; and print the new type in a different way.  But then this new type would not be compatible with the old one, which is a little annoying.
&lt;p&gt;
So we're going down a different road instead, we'll have a the &lt;tt&gt;Traced&lt;/tt&gt; type have two parameters.  The first one being a phantom type; just being used to determine how to show the traced value.  We will rename the old type to &lt;tt&gt;TracedT&lt;/tt&gt;, and it will only be used internally in the module.
&lt;pre&gt;
newtype Traced t a = T { unT :: TracedT a }

data TracedT a
    = Con a
    | Apply a String [TracedT a]
    | Var a Name
    | Let [(Name, TracedT a)] (TracedT a)
&lt;/pre&gt;
Some minor changes are needed to the code to accomodate for this change.
&lt;pre&gt;
traced :: a -&gt; Traced t a
traced = T . Con

unTraced :: Traced t a -&gt; a
unTraced = unTracedT . unT

unTracedT :: TracedT a -&gt; a
...

apply1 s f x = T $ Apply (f (unTraced x)) s [unT x]
apply2 s f x y = T $ Apply (f (unTraced x) (unTraced y)) s [unT x, unT y]
&lt;/pre&gt;
And now for the fun part, the &lt;tt&gt;show&lt;/tt&gt; function.  We could do something like this:
&lt;pre&gt;
data AsValue
data AsExp
instance Show (Traced AsValue a) ...
instance Show (Traced AsExp a) ...
&lt;/pre&gt;
This has problems, first it's not Haskell-98, but more importantly we can't print something of type &lt;tt&gt;Traced t a&lt;/tt&gt;, because t is too general.  We'd like to print as value by default, and be able to override this.  There's only one defaulting mechanism in Haskell: the numeric defaulting.  So as a somewhat disgusting hack, let's use the numeric defaulting to our advantage.  We'll print &lt;tt&gt;Traced Integer a&lt;/tt&gt; as values and &lt;tt&gt;Traced Double a&lt;/tt&gt; as expressions.
&lt;pre&gt;
instance (Num t, Show a) =&gt; Show (Traced t a) where
    show e = if doExp e then showAsExp (unT e) else show (unT e)

doExp :: (Num t) =&gt; Traced t a -&gt; Bool
doExp x = '.' `elem` show (f x)
  where f :: (Num t) =&gt; Traced t a -&gt; t
        f _ = 0
&lt;/pre&gt;
We distinguish between Integer and Double by how 0 is printed; for Double it has a '.' in the string.
&lt;p&gt;
A small utility to force printing as an expression.
&lt;pre&gt;
asExp :: Traced t a -&gt; Traced Double a
asExp = T . unT
&lt;/pre&gt;
Let's try it in ghci:
&lt;pre&gt;
Traced&gt; reShare $ 10^8
100000000
Traced&gt; asExp $ reShare $ 10^8
let _8 = (10 * 10); _9 = (_8 * _8); _5 = (10 * 10);
 _7 = (_8 * _5); _10 = (_9 * _7); in _10
Traced&gt; reShare $ (asExp 10)^8
let _7 = (10 * 10); _8 = (_7 * _7); _1 = (10 * 10);
 _5 = (_7 * _1); _9 = (_8 * _5); in _9
Traced&gt;
&lt;/pre&gt;
In the first expression the numeric default made t be Integer, so we got a value.  In the second and third case we forced t to be Double.
&lt;p&gt;
Some other examples:
&lt;pre&gt;
Traced&gt; let fac n = if n == 0 then 1 else n * fac(n-1)
Traced&gt; asExp $ reShare $ fac 3
let _19 = (3 - 1); _24 = (_19 - 1); _18 = (_24 * 1);
 _20 = (_19 * _18); _21 = (3 * _20); in _21

Traced&gt; let slowFib n = if n &lt; 2 then 1 else slowFib(n-1) + slowFib(n-2)
Traced&gt; asExp $ reShare $ slowFib 5
let _18 = (1 + 1); _19 = (_18 + 1); _23 = (1 + 1); _20 = (_19 + _23);
 _25 = (1 + 1); _17 = (_25 + 1); _21 = (_20 + _17); in _21

Traced&gt; let fastFib n = fst $ iterate (\ (x,y) -&gt; (y,x+y)) (1,1) !! n
Traced&gt; asExp $ reShare $ fastFib 5
let _20 = (1 + 1); _21 = (1 + _20); _19 = (_20 + _21); _23 = (_21 + _19); in _23
Traced&gt; 
&lt;/pre&gt;
Well, that's enough abuse of the type system for one day.

&lt;h3&gt;More examples&lt;/h3&gt;
The full library for traced values contains some more functionality, like naming values and functions that operate on traced booleans.
&lt;p&gt;

A named value, and a symbolic named value:
&lt;pre&gt;
Prelude Debug.Traced&gt; asSharedExp $ (named "x" 10)^8
let x = 10; _2 = x * x; in  _2 * _2 * (_2 * (x * x))
Prelude Debug.Traced&gt; asSharedExp $ (unknown "x")^8
let _2 = x * x; in  _2 * _2 * (_2 * (x * x))
&lt;/pre&gt;

For a normal definition of &lt;tt&gt;fac&lt;/tt&gt; we only see the arithmetic:
&lt;pre&gt;
Prelude Debug.Traced&gt; let fac n = if n == 0 then 1 else n * fac (n-1)
Prelude Debug.Traced&gt; fac 5
120
Prelude Debug.Traced&gt; asSharedExp $ fac 5
let _2 = 5 - 1;
    _4 = _2 - 1;
    _6 = _4 - 1;
in  5 * (_2 * (_4 * (_6 * ((_6 - 1) * 1))))
&lt;/pre&gt;

By using traced booleans we can see exactly what's going on:
&lt;pre&gt;
Prelude Debug.Traced&gt; let facT n = ifT (n %== 0) 1 (n * facT (n-1))
Prelude Debug.Traced&gt; facT 5
120
Prelude Debug.Traced&gt; asSharedExp $ facT 5
let _6 = 5 - 1;
    _11 = _6 - 1;
    _16 = _11 - 1;
    _21 = _16 - 1;
in  ifT (5 == 0) ...
        (5 * ifT (_6 == 0) ...
                 (_6 * ifT (_11 == 0) ...
                           (_11 * ifT (_16 == 0) ...
                                      (_16 * ifT (_21 == 0) ... (_21 * ifT (_21 - 1 == 0) 1 ...)))))
&lt;/pre&gt;

We can also lift a regular function to a traced function:
&lt;pre&gt;
Prelude Debug.Traced&gt; let fac' = liftFun "fac'" fac :: Traced t Integer -&gt; Traced t Integer
Prelude Debug.Traced&gt; let a = fac' 5 + fac' 10
Prelude Debug.Traced&gt; a
3628920
Prelude Debug.Traced&gt; asSharedExp a
fac' 5 + fac' 10
Prelude Debug.Traced&gt;
&lt;/pre&gt;

&lt;h3&gt;And what about exponentiation?&lt;/h3&gt;
The new exponentiation function in GHC looks like this:
&lt;pre&gt;
(^) :: (Num a, Integral b) =&gt; a -&gt; b -&gt; a
x0 ^ y0 | y0 &lt; 0    = error "Negative exponent"
        | y0 == 0   = 1
        | otherwise = f x0 y0
    where f x y | even y    = f (x * x) (y `quot` 2)
                | y == 1    = x
                | otherwise = g (x * x) ((y - 1) `quot` 2) x
          g x y z | even y = g (x * x) (y `quot` 2) z
                  | y == 1 = x * z
                  | otherwise = g (x * x) ((y - 1) `quot` 2) (x * z)
&lt;/pre&gt;
And if we try this definition we get
&lt;pre&gt;
Debug.Traced&gt; asSharedExp $ 10 ^ 8
let _2 = 10 * 10; _1 = _2 * _2; in  _1 * _1
Debug.Traced&gt; 
&lt;/pre&gt;
Victory!
&lt;p&gt;
Edit: Package available in &lt;a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/traced-2008.7.4"&gt;hackage&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-2069749915560147065?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/2069749915560147065/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=2069749915560147065' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/2069749915560147065'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/2069749915560147065'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2008/07/lost-and-found-if-i-write-108-in.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-5880367048512311013</id><published>2008-03-09T21:45:00.007Z</published><updated>2008-07-04T15:27:21.384+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>&lt;h2&gt;Simple reflections of higher order&lt;/h2&gt;

In a &lt;a href="http://twan.home.fmf.nl/blog/haskell/simple-reflection-of-expressions.details"&gt;recent blog post&lt;/a&gt; by Twan van Laarhoven he showed how to reflect Haskell expressions so that we can actually see them symbolically. This has now been included in lambdabot on the Haskell IRC (&lt;a href"http://code.haskell.org/lambdabot/scripts/SimpleReflect.hs"&gt;source&lt;/a&gt;).

Let's play with it for a moment:
&lt;pre&gt;
*SimpleReflect&gt; x+y
x + y
*SimpleReflect&gt; foldr f z [1,2,3]
f 1 (f 2 (f 3 z))
*SimpleReflect&gt; let swap (x,y) = (y, x)
*SimpleReflect&gt; swap (a,b)
(b,a)
*SimpleReflect&gt; map swap [(a,b),(c,d)]
[(b,a),(d,c)]
*SimpleReflect&gt; :t x
x :: Expr
&lt;/pre&gt;
That's very cool!  Read Twan's post to find out how it works.  All we need to know is on the last line, &lt;tt&gt;x :: Expr&lt;/tt&gt;.  So &lt;tt&gt;Expr&lt;/tt&gt; is a special type with special instances.

Let's try something else
&lt;pre&gt;
*SimpleReflect&gt; \ x -&gt; x + y

&lt;interactive&gt;:1:0:
    No instance for (Show (Expr -&gt; Expr))
    ...
&lt;/pre&gt;

Oh, that's annoying, we can't print functions.  But why can't we?  If we made up some variable of type &lt;tt&gt;Expr&lt;/tt&gt; and then applied the function we'd get something we could print.

Let's add some code to Twan's module.  I've just randomly picked the variable &lt;tt&gt;a&lt;/tt&gt;.  (To make this compile you need the extension language FlexibleInstances.)
&lt;pre&gt;
instance Show (Expr -&gt; Expr) where
    show f = "\\ " ++ show a ++ " -&gt; " ++ show (f a)
&lt;/pre&gt;

And try again with the example
&lt;pre&gt;
*SimpleReflect&gt; \ x -&gt; x + y
\ a -&gt; a + y
&lt;/pre&gt;
Pretty smooth.  Except it doesn't really work.
&lt;pre&gt;
*SimpleReflect&gt; \ x -&gt; x + a
\ a -&gt; a + a
&lt;/pre&gt;
Those two functions are not the same.  We can't just arbitrarily pick the variable &lt;tt&gt;a&lt;/tt&gt; since it might be free in the expression.
&lt;p&gt;
So we need to pick a variable that is not free in the expression.  How could we do that?  No matter what we pick it could be used.  And we have no idea what is being used.  Or do we?  If we just turn the function in to text we could look at the string, and pick some variable that is unused in that string.  This is really a gruesome hack, but who cares?
&lt;p&gt;
How do we print the function?  Just invent some variable, I use _, turn it into an expression and tokenize the string.  We will then have a string of tokens not to use.  To find a variable to use, just pick an infinite supply of variables, remove the ones being used, and then pick one of the remaining ones.
&lt;pre&gt;
instance Show (Expr -&gt; Expr) where
    show f = "\\ " ++ show v ++ " -&gt; " ++ show (f v)
      where v = var (head vars)
            vars = supply \\ tokenize (show $ f $ var "_")
            supply = [ "x" ++ show i | i &lt;- [1..]]
            tokenize "" = []
            tokenize s = case lex s of (x,s') : _ -&gt; x : tokenize s'
&lt;/pre&gt;

And try to fool it:
&lt;pre&gt;
*SimpleReflect&gt; \ x -&gt; x + y
\ x1 -&gt; x1 + y
*SimpleReflect&gt; \ x -&gt; x + var "x1"
\ x2 -&gt; x2 + x1
&lt;/pre&gt;

OK, what about multiple arguments?
&lt;pre&gt;
*SimpleReflect&gt; \ x y -&gt; x + y + z

&lt;interactive&gt;:1:0:
    No instance for (Show (Expr -&gt; Expr -&gt; Expr))
    ...
&lt;/pre&gt;
Well, yeah, that's true.  There is no such instance.  But wait, why is the instance we have for the type &lt;tt&gt;Expr-&gt;Expr&lt;/tt&gt;?  No particular reason, that's just what I wrote.  In fact it works equally well for &lt;tt&gt;Expr-&gt;r&lt;/tt&gt; as long as we can show &lt;tt&gt;r&lt;/tt&gt;, because that's the only thing we do with &lt;tt&gt;f v&lt;/tt&gt;.

So we change the first line of the instance:
&lt;pre&gt;
instance (Show r) =&gt; Show (Expr -&gt; r) where
&lt;/pre&gt;

And now
&lt;pre&gt;
*SimpleReflect&gt; \ x y -&gt; x + y + z
\ x2 -&gt; \ x1 -&gt; x2 + x1 + z
*SimpleReflect&gt; foldr (.) id [f::Expr-&gt;Expr,g,h] -- a little type help needed
\ x1 -&gt; f (g (h x1))
*SimpleReflect&gt; scanr (.) id [(*2),f::Expr-&gt;Expr,(+1)]
[\ x1 -&gt; f (x1 + 1) * 2,\ x1 -&gt; f (x1 + 1),\ x1 -&gt; x1 + 1,\ x1 -&gt; x1]
&lt;/pre&gt;

Well, that wasn't too hard.  So let's try another example.
&lt;pre&gt;
*SimpleReflect&gt; \ (x,y) -&gt; x+y+z

&lt;interactive&gt;:1:0:
    No instance for (Show ((Expr, Expr) -&gt; Expr))
    ...
&lt;/pre&gt;

Hmmm, yes, the argument must be an &lt;tt&gt;Expr&lt;/tt&gt;.  That's annoying.  We need to generalize the argument type.  We want be able to use &lt;tt&gt;Expr&lt;/tt&gt; and pair of them etc.  Time for a type class.  What does it need to do?  It has to invent expressions and never reuse variables doing so.  So when we invent an &lt;tt&gt;Expr&lt;/tt&gt; we need to consume one variable and leave the rest for others to consume.  This sounds like a state monad.  So we're going to use a state monad where the state is the (infinite) list of strings that are available for making variables.

&lt;pre&gt;
instance (Show a, ExprArg a, Show r) =&gt; Show (a -&gt; r) where
    show f = "\\ " ++ show v ++ " -&gt; " ++ show (f v)
      where v = evalState exprArg vars
            dummy = evalState exprArg $ repeat "_"
            vars = supply \\ tokenize (show $ f dummy)
            supply = [ "x" ++ show i | i &lt;- [1..]]
            tokenize "" = []
            tokenize s = case lex s of (x,s') : _ -&gt; x : tokenize s'

class ExprArg a where
    exprArg :: State [String] a

instance ExprArg Expr where
    exprArg = do v:vs &amp;lt;- get; put vs; return (var v)
&lt;/pre&gt;

Using this we're back where we were before, but now we can make some more instances.
&lt;pre&gt;
instance ExprArg () where
    exprArg = return ()

instance (ExprArg a, ExprArg b) =&gt; ExprArg (a, b) where
    exprArg = liftM2 (,) exprArg exprArg

instance (ExprArg a, ExprArg b, ExprArg c) =&gt; ExprArg (a, b, c) where
    exprArg = liftM3 (,,) exprArg exprArg exprArg
&lt;/pre&gt;

And finally:
&lt;pre&gt;
*SimpleReflect&gt; \ (x, y) -&gt; x + y + z
\ (x1,x2) -&gt; x1 + x2 + z
*SimpleReflect&gt; curry f :: Expr -&gt; Expr -&gt; Expr
\ x2 -&gt; \ x1 -&gt; f (x2,x1)
*SimpleReflect&gt; uncurry f :: (Expr, Expr) -&gt; Expr
\ (x1,x2) -&gt; f x1 x2
*SimpleReflect&gt; \ () -&gt; 1
\ () -&gt; 1
&lt;/pre&gt;
The last example is a curiosity since it does not involve &lt;tt&gt;Expr&lt;/tt&gt; at all.

Well, that's enough for a Sunday night hack.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-5880367048512311013?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/5880367048512311013/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=5880367048512311013' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/5880367048512311013'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/5880367048512311013'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2008/03/in-recent-blog-post-by-twan-van.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-7429676325682374995</id><published>2007-11-09T18:26:00.000Z</published><updated>2007-11-09T23:43:05.863Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Lambda calculus'/><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>&lt;h3&gt;Some lambda calculus examples&lt;/h3&gt;

&lt;h4&gt;Syntax&lt;/h4&gt;
In a previous blog entry I described a simple evaluator and type checker for the lambda cube, i.e., various forms of lambda calculus.
&lt;p&gt;
Here I'm going to show some examples of code in pure typed &amp;lambda;-calculus.  All the examples are typable in F&lt;sub&gt;&amp;omega;&lt;/sub&gt;; the full lambda cube is not necessary.
&lt;p&gt;
Before doing any examples we'd better have some syntax that is not too painful, because writing &amp;lambda;-expression in raw Haskell is tedious.  The syntax for variables, * kind, and application is easy, I'll just use Haskell syntax.  For &amp;lambda;-expressions the Haskell syntax doesn't allow explicit type annotations, but various Haskell compiler implement that extension, so I'll just pick that.  So a &amp;lambda; will be written, "&lt;tt&gt;\ (var::type) -&gt; expr&lt;/tt&gt;".  And as in Haskell I'll allow multiple variables between the "&lt;tt&gt;\&lt;/tt&gt;" and "&lt;tt&gt;-&gt;&lt;/tt&gt;"; it's just a shorthand for multiple lambdas.
&lt;p&gt;
So what about the dependent function type?  The syntax &lt;i&gt;(x:t)&amp;rarr;u&lt;/i&gt; suggests &lt;tt&gt;(x::t)-&gt;u&lt;/tt&gt;, so I'll use that.  And when the variable doesn't occur we'll write &lt;tt&gt;t-&gt;u&lt;/tt&gt; as usual.
For type variables Haskell (well, not Haskell 98, but extensions) uses &lt;tt&gt;forall (a::*) . t&lt;/tt&gt;, so I'll allow that too.
&lt;p&gt;
An example, the identity function:
&lt;pre&gt;
\ (a::*) (x::a) -&gt; x
&lt;/pre&gt;
with type
&lt;pre&gt;
forall (a::*) . a-&gt;a
&lt;/pre&gt;
And using it
&lt;pre&gt;
id Int 5
&lt;/pre&gt;

Writing a pretty printer and parser for this is pretty straight forward so I'll skip that and just point you to the &lt;a href="http://darcs.augustsson.net/Darcs/LambdaCube"&gt;code&lt;/a&gt;.  BTW, instead of using Parsec for the parser like everyone else I used ReadP.  The ReadP library is very nice, partly because the alternative operator is actually commutative (unlike Parsec).  But the error messages suck.

&lt;h4&gt;Enter &lt;tt&gt;let&lt;/tt&gt;&lt;/h4&gt;
Now if we want to use, say, the identity function more than once we need to name it.  There is a mechanism for that, namely &amp;lambda;.  But it looks awkward.  Look:
&lt;pre&gt;
(\ (id :: forall (a::*) . a-&gt;a) -&gt; ... id ... id ... id ...) (\ (a::*) (x::a) -&gt; x)
&lt;/pre&gt;
What makes it awkward is that the name, &lt;tt&gt;id&lt;/tt&gt;, is far away from the body, &lt;tt&gt;\ (a::*) (x::a) -&gt; x&lt;/tt&gt;.  From Haskell we are more used the &lt;tt&gt;let&lt;/tt&gt; and &lt;tt&gt;where&lt;/tt&gt; expressions.
So let's add that &lt;tt&gt;let&lt;/tt&gt;.  Instead of what we have above we'll write
&lt;pre&gt;
let id :: forall (a::*) . a-&gt;a = \ (a::*) (x::a) -&gt; x
in  ... id ... id ... id ...
&lt;/pre&gt;
The &lt;tt&gt;let&lt;/tt&gt; construct could be just be syntactic suger for a lambda and an application, but I've decided to add it as a constructor to the expression type instead.
&lt;pre&gt;
    | Let Sym Type Expr Expr
&lt;/pre&gt;
Adding a new constructor means that we have to modify all the functions operating on &lt;tt&gt;Expr&lt;/tt&gt;, and it's extra much work because &lt;tt&gt;Let&lt;/tt&gt; is a variable binding construct.
For substitution we just cheat a little and use the expansion into an application and &amp;lambda;
&lt;pre&gt;
    sub (Let i t e b) = let App (Lam i' t' b') e' = sub (App (Lam i t b) e)
                        in  Let i' t' e' b'
&lt;/pre&gt;
And for normal form we use the same trick.
&lt;pre&gt;
    spine (Let i t e b) as = spine (App (Lam i t b) e) as
&lt;/pre&gt;
And for now, let's extend the type checker in the same way.
&lt;p&gt;
&lt;br&gt;
&lt;p&gt;
To make definitions a little less verbose I'll allow multiple bindings.  E.g.
&lt;pre&gt;
let x :: Int = g a; y :: Int = f x in h x y
&lt;/pre&gt;
It's just multiple nested &lt;tt&gt;Let&lt;/tt&gt;s in the obvious way.

Another shorthand.  I'll allow the identity function to be written
&lt;pre&gt;
let id (a::*) (x::a) :: a = x
in ...
&lt;/pre&gt;
The translation is pretty easy
&lt;pre&gt;
let f (x1::t1) ... (xn::tn) :: t = b in e
&lt;/pre&gt;
is
&lt;pre&gt;
let f :: forall (x1::t1) ... (xn::tn) . t = \ (x1::t1) ... (xn::tn) -&gt; b in e
&lt;/pre&gt;

And finally, to make it even more like Haskell, I'll allow the type signature to be on a line of its own, omitting types on the bound variables.
&lt;pre&gt;
let id :: forall (a::*) . a -&gt; a;
    id a x = x;
in  ...
&lt;/pre&gt;
It's pretty easy to translate to the bare expression type.  Why all these little syntactic extras?  Well, remember Mary Poppins "A spoonful of sugar makes the medicine go down."

Phew!  Enough syntax, on to the examples.

&lt;h4&gt;Bool&lt;/h4&gt;
We could start by throwing in all kinds of primitive types into our little language, but who knows what might happen then.  All the nice properties that have been shown about the language might not hold anymore.  So instead we'll code all the types we need with what we already have.
&lt;p&gt;
The Bool type has two values: False and True.  So we need to find a type that has exactly two different values.  Fortunately that's easy: &lt;tt&gt;a-&gt;a-&gt;a&lt;/tt&gt; (I'll be a bit sloppy and leave off top level quantifier sometimes, just like Haskell; That should be &lt;tt&gt;forall (a::*) . a-&gt;a-&gt;a&lt;/tt&gt;).
&lt;p&gt;
Why does that have two possible values?  Well, we have a function type that must return an &lt;tt&gt;a&lt;/tt&gt; for any possible &lt;tt&gt;a&lt;/tt&gt; that it's given, so it can't conjure up the return value out of thin air.  It has to return one of it's arguments.  And there are two arguments to choose from, so there are two different values in that type.

Here's &lt;tt&gt;Bool, False, True&lt;/tt&gt;:
&lt;pre&gt;
Bool :: *;
Bool = forall (a::*) . a-&gt;a-&gt;a;

False :: Bool;
False = \ (a::*) (x::a) (y::a) -&gt; x;

True  :: Bool;
True = \ (a::*) (x::a) (y::a) -&gt; y;
&lt;/pre&gt;
If the definition for &lt;tt&gt;Bool&lt;/tt&gt; was written in Haskell it would be
&lt;pre&gt;
type Bool = forall a . a-&gt;a-&gt;a

false :: Bool
false = \ x y -&gt; x

true :: Bool
true = \ x y -&gt; y
&lt;/pre&gt;
It would be easy to add more sugar and allow &lt;tt&gt;type&lt;/tt&gt; which just means you're defining something of type &lt;tt&gt;*&lt;/tt&gt;.  And you could also make an omitted type in a &lt;tt&gt;forall&lt;/tt&gt; mean type &lt;tt&gt;*&lt;/tt&gt;.  But I've not added these little extras.
&lt;p&gt;
Defining the &lt;tt&gt;if&lt;/tt&gt; function is trivial; it's just a matter of permuting the arguments a little, because the boolean values come with the "if" built in.
&lt;pre&gt;
if :: forall (a::*) . Bool -&gt; a -&gt; a -&gt; a;
if a b t f = b a t f;
&lt;/pre&gt;
Note how the type signature is exactly the same as you'd find in Haskell.  A difference is that we have explicit type abstraction and type application.  For instance, &lt;tt&gt;if&lt;/tt&gt; takes a first argument that is the type of the branches.  So when using &lt;tt&gt;if&lt;/tt&gt; we must pass in a type.
&lt;p&gt;
Given this we can try to type check some simple code:
&lt;pre&gt;
let Bool :: *;
    Bool = forall (a::*) . a-&gt;a-&gt;a;
    False :: Bool;
    False = \ (a::*) (x::a) (y::a) -&gt; x;
in  False
&lt;/pre&gt;
Here is what happens:
&lt;pre&gt;
*CubeExpr&gt; typeCheck $ read "let Bool :: *; Bool = forall (a::*) . a-&gt;a-&gt;a; False :: Bool;False = \\ (a::*) (x::a) (y::a) -&gt; x; in False"
*** Exception: Type error:
Bad function argument type:
Function: \ (False :: Bool) -&gt; False
argument: \ (a :: *) (x :: a) (y :: a) -&gt; x
expected type: Bool
     got type: forall (a :: *) . a-&gt;a-&gt;a
&lt;/pre&gt;
What is it whining about?  Expected &lt;tt&gt;Bool&lt;/tt&gt;, got &lt;tt&gt;forall (a :: *) . a-&gt;a-&gt;a&lt;/tt&gt;.  But it says right there in the code that &lt;tt&gt;Bool&lt;/tt&gt; is exactly that.  What's going on?
&lt;br&gt;
Well, the type checker knows the &lt;b&gt;type&lt;/b&gt; of everything, but not the &lt;b&gt;value&lt;/b&gt; of anything.  So the type checker knows that type of &lt;tt&gt;Bool&lt;/tt&gt; is &lt;tt&gt;*&lt;/tt&gt;, but it doesn't know what it is equal to.
&lt;p&gt;
The problem is that when we have a &lt;tt&gt;let&lt;/tt&gt; binding we know the value of the defined variable and to be able to do dependent type checking the type checker needs to know it too.  We need to change the type checking of &lt;tt&gt;let&lt;/tt&gt;.  Here's a simple solution:
&lt;pre&gt;
tCheck r (Let s t a e) = do
    tCheck r t
    ta &lt;- tCheck r a
    when (not (betaEq ta t)) $ throwError $ "Bad let def\n" ++ show (ta, t)
    te &lt;- tCheck r (subst s a e)
    tCheck r (Pi s t te)
    return te
&lt;/pre&gt;
Note how we substitute the value of the definition (&lt;tt&gt;a&lt;/tt&gt;) in the body before type checking it.  This is a sledge hammer approach.  A better one would be to change the environment in the type checker to carry values when they are known.  These values could then be used when computing the normal form of an expression.  But let's keep it simple for now.
&lt;p&gt;
Let's try again:
&lt;pre&gt;
*CubeExpr&gt; typeCheck' $ read "let Bool :: *; Bool = forall (a::*) . a-&gt;a-&gt;a; False :: Bool;False = \\ (a::*) (x::a) (y::a) -&gt; x; in False"
forall (a :: *) . a-&gt;a-&gt;a
&lt;/pre&gt;
That worked!

&lt;h4&gt;A simple top level&lt;/h4&gt;
To make experimentation easier I've added a simple top level where you can evaluate expression, make definitions, load files, etc.  The files are just a bunch of definitions the way they would go inside a &lt;tt&gt;let&lt;/tt&gt;.
&lt;p&gt;
Sample session:
&lt;pre&gt;
Welcome to the Cube.
Use :help to get help.
Cube&gt; \ (a::*) (x::a) -&gt; x
\ (a :: *) (x :: a) -&gt; x
  ::
forall (a :: *) . a-&gt;a
Cube&gt; 
&lt;/pre&gt;
When evaluating an expression it will first print the value, then &lt;tt&gt;::&lt;/tt&gt;, and finally the type.
&lt;br&gt;
Let's try some more.  Here's the file &lt;tt&gt;bool.cube&lt;/tt&gt;.
&lt;pre&gt;
-- The Bool type.

Bool :: *;
Bool = forall (boolT::*) . boolT-&gt;boolT-&gt;boolT;

False :: Bool;
False = \ (boolT::*) (false::boolT) (true::boolT) -&gt; false;

True  :: Bool;
True = \ (boolT::*) (false::boolT) (true::boolT) -&gt; true;

if :: forall (a::*) . Bool -&gt; a -&gt; a -&gt; a;
if a b t f = b a f t;

not :: Bool -&gt; Bool;
not b = if Bool b False True;

and :: Bool -&gt; Bool -&gt; Bool;
and x y = if Bool x y False;

or :: Bool -&gt; Bool -&gt; Bool;
or x y = if Bool x True y;
&lt;/pre&gt;
Note how I've named some of the type variables with more suggestive names.  More about that in a moment.
&lt;pre&gt;
Cube&gt; :load bool.cube
Cube&gt; True
\ (boolT :: *) (false :: boolT) (true :: boolT) -&gt; true
  ::
forall (boolT :: *) . boolT-&gt;boolT-&gt;boolT
Cube&gt; and True False
\ (boolT :: *) (false :: boolT) (true :: boolT) -&gt; false
  ::
forall (boolT :: *) . boolT-&gt;boolT-&gt;boolT
&lt;/pre&gt;
By staring carefully at these you can convince yourself that it's right.
What's making it a little hard to read are all those leading lambdas (and the corresponding function types).  But there's an option to suppress the printing of them.
&lt;pre&gt;
Cube&gt; :skip
Cube&gt; True
true
  ::
boolT
Cube&gt; and True False
false
  ::
boolT
&lt;/pre&gt;
Look, it got deceptively readable.

&lt;h4&gt;Characters&lt;/h4&gt;
So we can define booleans.  What about something like characters?  You can define those too.  What is the ASCII type?  It's a type with 128 different values.  We saw how we can make a type with two values (Bool); we can do the same for 128.  Since that takes a lot of room, I'll do it for just four values.
&lt;pre&gt;
Char :: *;
Char = forall (charT::*) . charT-&gt;charT-&gt;charT-&gt;charT-&gt;charT;

'a' :: Char;
'a' = \ (charT::*) (_'a'::charT) (_'b'::charT) (_'c'::charT) (_'d'::charT) -&gt; _'a';
'b' :: Char;
'b' = \ (charT::*) (_'a'::charT) (_'b'::charT) (_'c'::charT) (_'d'::charT) -&gt; _'b';
'c' :: Char;
'c' = \ (charT::*) (_'a'::charT) (_'b'::charT) (_'c'::charT) (_'d'::charT) -&gt; _'c';
'd' :: Char;
'd' = \ (charT::*) (_'a'::charT) (_'b'::charT) (_'c'::charT) (_'d'::charT) -&gt; _'d';

eqChar :: Char -&gt; Char -&gt; Bool;
eqChar x y = x Bool (y Bool True False False False)
                    (y Bool False True False False)
                    (y Bool False False True False)
                    (y Bool False False False True);
&lt;/pre&gt;
Since single quotes are allowed in identifiers in my little language the &lt;tt&gt;'a'&lt;/tt&gt; is just an identifier as usual.
Some sample use:
&lt;pre&gt;
Cube&gt; :load char.cube
Cube&gt; 'a'
_'a'
  ::
charT
Cube&gt; eqChar 'a' 'a'
true
  ::
boolT
Cube&gt; eqChar 'a' 'b'
false
  ::
boolT
&lt;/pre&gt;
&lt;p&gt;
A different way to use characters is just to assume that there is some character type and that is has some values.  How can we do that?  By using a lambda expression.
&lt;pre&gt;
\ (Char :: *) -&gt; \ ('a' :: Char) ('b' :: Char) ... -&gt; ...
&lt;/pre&gt;
One final little syntactic sugar is to allow lambda expressions to be written with &lt;tt&gt;let&lt;/tt&gt;.
So
&lt;pre&gt;
let x :: T;
in  e
&lt;/pre&gt;
is the same as
&lt;pre&gt;
\ (x :: T) -&gt; e
&lt;/pre&gt;
So for Char we can write
&lt;pre&gt;
Char :: *;
'a' :: Char;
'b' :: Char;
'c' :: Char;
'd' :: Char;
eqChar :: Char -&gt; Char -&gt; Bool;
&lt;/pre&gt;
This is a very handy notation during program development, btw.  Inside i &lt;tt&gt;let&lt;/tt&gt; you can first just write the type of something and then use it freely.  This something will then be lambda abstracted until such a time that you provide the definition.
&lt;p&gt;
Let's try the "fake" characters.
&lt;pre&gt;
Cube&gt; :load extchar.cube
Cube&gt; 'a'
'a'
  ::
Char
Cube&gt; :skip
Cube&gt; 'a'
\ (Char :: *) ('a' :: Char) ('b' :: Char) ('c' :: Char) ('d' :: Char)
  (eqChar :: Char-&gt;Char-&gt;forall (boolT :: *) . boolT-&gt;boolT-&gt;boolT)
  -&gt; 'a'
  ::
forall (Char :: *) .
       Char-&gt;Char-&gt;Char-&gt;Char-&gt;
       (Char-&gt;Char-&gt;forall (boolT :: *) . boolT-&gt;boolT-&gt;boolT)-&gt;
       Char
Cube&gt; :skip
Cube&gt; eqChar 'a' 'b'
eqChar 'a' 'b'
  ::
forall (boolT :: *) . boolT-&gt;boolT-&gt;boolT
&lt;/pre&gt;
Now &lt;tt&gt;eqChar&lt;/tt&gt; no longer gets evaluated, naturally, because it has no definition; it's &amp;lambda; bound.
&lt;p&gt;
Now we know how the &lt;tt&gt;Char&lt;/tt&gt; type could be implemented (or supplied from the outside).  It would then be all right to add it as a primitive (with the same behavior), because it will not ruin any properties.  (Speaking of ruining properties, the &lt;tt&gt;seq&lt;/tt&gt; function in Haskell is an example of a function that cannot be defined in the &amp;lambda;-calculus.  And sure enough, adding it to Haskell ruined the validity of &amp;eta;-reduction.)
&lt;p&gt;
It's not difficult to extend the &lt;tt&gt;Expr&lt;/tt&gt; type with some &lt;tt&gt;Prim&lt;/tt&gt; constructor for primitive functions, and adding some special cases in the syntax and type checking for them.  But since this is not intended to be a usable language I'll resist the temptation (for now).

&lt;h4&gt;Pairs&lt;/h4&gt;
OK, booleans and characters were pretty easy.  Let's do pairs next.  The &lt;tt&gt;Pair&lt;/tt&gt; type is parameterized over two types; the type of the first and the second component.  As with booleans the representation of pairs come with the case analysis "build in".  (Case analysis on pairs is &lt;tt&gt;flip uncurry&lt;/tt&gt; in Haskell.)  If we can correctly implement building pairs, &lt;tt&gt;fst&lt;/tt&gt;, and &lt;tt&gt;snd&lt;/tt&gt; we are done.
&lt;pre&gt;
Pair :: * -&gt; * -&gt; *;
Pair a b = forall (pairT::*) . (a-&gt;b-&gt;pairT) -&gt; pairT;

PairC :: forall (a::*) (b::*) . a -&gt; b -&gt; Pair a b;
PairC a b x y = \ (pairT::*) (pair :: a-&gt;b-&gt;pairT) -&gt; pair x y;

fst :: forall (a::*) (b::*) . Pair a b -&gt; a;
fst a b p = p a (\ (x::a) (y::b) -&gt; x);

snd :: forall (a::*) (b::*) . Pair a b -&gt; b;
snd a b p = p b (\ (x::a) (y::b) -&gt; y);
&lt;/pre&gt;
So the type is called &lt;tt&gt;Pair&lt;/tt&gt; and the constructor &lt;tt&gt;PairC&lt;/tt&gt;.  Since they live in the same name space we can't use Haskell's convention of giving them the same name (which is &lt;tt&gt;(,)&lt;/tt&gt; in Haskell.).
&lt;p&gt;
Again, in Haskell the first definition would be
&lt;pre&gt;
type Pair a b = forall pairT . (a-&gt;b-&gt;pairT) -&gt; pairT
&lt;/pre&gt;
Staring at these definitions makes it obvious(?) that they work.  But we can test it too.
&lt;pre&gt;
Cube&gt; :let S :: *; s :: S; T :: *; t :: T
Cube&gt; :load pair.cube
Cube&gt; :let p :: Pair S T; p = PairC S T s t
Cube&gt; p
pair s t
  ::
pairT
Cube&gt; fst S T p
s
  ::
S
Cube&gt; snd S T p
t
  ::
T
&lt;/pre&gt;
The &lt;tt&gt;:let&lt;/tt&gt; command is used to make bindings without having to load them from a file.  The first &lt;tt&gt;:let&lt;/tt&gt; just introduces some values to play with.
&lt;p&gt;
Again, the big difference compared to Haskell is that all types are explicit.  (But users of templates in C++ or generics in Java/C# might appeciate it?)

&lt;h4&gt;Maybe&lt;/h4&gt;
We can follow the same pattern and define Haskell's &lt;tt&gt;Maybe&lt;/tt&gt; type.  In fact any non-recursive Haskell data type has a mechanical translation and easy into lambda calculus.  We've seen boolean, characters, and pairs above.
&lt;pre&gt;
Maybe :: * -&gt; *;
Maybe a = forall (maybeT::*) . maybeT-&gt;(a-&gt;maybeT)-&gt;maybeT;

Nothing :: forall (a::*) . Maybe a;
Nothing a = \ (maybeT::*) (nothing::maybeT) (just::a-&gt;maybeT) -&gt; nothing;

Just :: forall (a::*) . a -&gt; Maybe a;
Just a x = \ (maybeT::*) (nothing::maybeT) (just::a-&gt;maybeT) -&gt; just x;

maybe :: forall (a::*) (r::*) . r -&gt; (a-&gt;r) -&gt; Maybe a -&gt; r;
maybe a r nothing just s = s r nothing just;
&lt;/pre&gt;

&lt;h4&gt;Natural numbers&lt;/h4&gt;
Let's carry on with some natural numbers.  Natural numbers are trickier because it's a recursive type.  In Haskell:
&lt;pre&gt;
data Nat = Zero | Succ Nat
&lt;/pre&gt;
Let's try the cube version:
&lt;pre&gt;
Cube&gt; :load nat.cube
Cube&gt; 2
succ (succ zero)
  ::
natT
Cube&gt; add 2 3
succ (succ (succ (succ (succ zero))))
  ::
natT
&lt;/pre&gt;
What does the definitions for natural numbers look like?
&lt;pre&gt;
Nat :: *;
Nat = forall (natT::*) . natT -&gt; (natT-&gt;natT) -&gt; natT;

0 :: Nat;
0 = \ (natT::*) (zero::natT) (succ::natT-&gt;natT) -&gt; zero;

Succ :: Nat -&gt; Nat;
Succ n = \ (natT::*) (zero::natT) (succ::natT-&gt;natT) -&gt; succ (n natT zero succ);

natprim :: forall (r::*) . (r-&gt;r) -&gt; r -&gt; Nat -&gt; r;
natprim r succ zero n = n r zero succ;

add :: Nat -&gt; Nat -&gt; Nat;
add x y = x Nat y Succ;

mul :: Nat -&gt; Nat -&gt; Nat;
mul x y = x Nat 0 (add y);

power :: Nat -&gt; Nat -&gt; Nat;
power x y = y Nat (Succ 0) (mul x);

isZero :: Nat -&gt; Bool;
isZero n = n Bool True (\ a::Bool -&gt; False);

1 :: Nat;
1 = Succ 0;
2 :: Nat;
2 = Succ 1;
3 :: Nat;
3 = Succ 2;
&lt;/pre&gt;
The &lt;tt&gt;natprim&lt;/tt&gt; function corresponds to &lt;tt&gt;foldr&lt;/tt&gt; for list and is our means of recursion.  This type for natural numbers come with primitive recursion built in, just as the non-recursive data types earlier came with case analysis built in.
&lt;p&gt;
I'll skip defining subtraction&amp;co, they are possible, but a little tedious (and very inefficient).

&lt;h4&gt;Lists&lt;/h4&gt;
Lists are like a hybrid of natural numbers and the &lt;tt&gt;Maybe&lt;/tt&gt; type.  They follow the same pattern as before.
&lt;pre&gt;
List :: * -&gt; *;
List e = forall (listT::*) . listT -&gt; (e-&gt;listT-&gt;listT) -&gt; listT;

Nil :: forall (e::*) . List e;
Nil e = \ (listT::*) (nil::listT) (cons::e-&gt;listT-&gt;listT) -&gt; nil;

Cons :: forall (e::*) . e -&gt; List e -&gt; List e;
Cons e x xs = \ (listT::*) (nil::listT) (cons::e-&gt;listT-&gt;listT) -&gt; cons x (xs listT nil cons);
&lt;/pre&gt;

And some sample functions.
&lt;pre&gt;
foldr :: forall (a::*) (b::*) . (a-&gt;b-&gt;b) -&gt; b -&gt; List a -&gt; b;
foldr a b f z xs = xs b z f;

map :: forall (a::*) (b::*) . (a-&gt;b) -&gt; List a -&gt; List b;
map a b f xs = foldr a (List b) (\ (x::a) (r::List b) -&gt; Cons b (f x) r) (Nil b) xs;

append :: forall (a::*) . List a -&gt; List a -&gt; List a;
append a xs ys = foldr a (List a) (Cons a) ys xs;

foldl :: forall (a::*) (b::*) . (b-&gt;a-&gt;b) -&gt; b -&gt; List a -&gt; b;
foldl a b f z xs = foldr a (b-&gt;b) (\ (x::a) (g::b-&gt;b) (v::b) -&gt; g (f v x)) (\ (x::b) -&gt; x) xs z;

reverse :: forall (a::*) . List a -&gt; List a;
reverse a xs = foldl a (List a) (\ (r :: List a) (x :: a) -&gt; Cons a x r) (Nil a) xs;
&lt;/pre&gt;
And a test:
&lt;pre&gt;
Welcome to the Cube.
Use :help to get help.
Cube&gt; :load bool.cube
Cube&gt; :load nat.cube
Cube&gt; :load list.cube
Cube&gt; :load pair.cube    
Cube&gt; :load listmisc.cube
Cube&gt; :load extchar.cube
Cube&gt; :skip
Cube&gt; replicate Char 3 'b'
cons 'b' (cons 'b' (cons 'b' nil))
  ::
listT
Cube&gt; :quit
&lt;/pre&gt;
And with that, I'll quit too for now.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-7429676325682374995?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/7429676325682374995/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=7429676325682374995' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/7429676325682374995'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/7429676325682374995'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2007/11/some-lambda-calculus-examples-syntax-in.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-1215723127547718916</id><published>2007-11-06T16:28:00.000Z</published><updated>2007-11-07T14:49:27.174Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Benchmark'/><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>&lt;h3&gt;Benchmarking ray tracing, Haskell vs. OCaml&lt;/h3&gt;
On his web site about OCaml Jon Harrop has a &lt;a href="http://www.ffconsultancy.com/languages/ray_tracer/index.html"&gt;benchmark&lt;/a&gt; for a simple ray tracing program written in a number of languages.  When I saw it I wondered why Haskell was doing so badly, especially since this benchmark was taken as some kind of proof that Haskell performs poorly in "real life".

So I have rerun the benchmarks.  I've also rewritten the Haskell versions of the programs.  The Haskell versions on Jon's web site were OK, but they were too far from the OCaml versions for my taste.  I prefer to keep the programs very similar in a situation like this.

My rewrite of the benchmarks from OCaml to Haskell was done with a minimum of intelligence.  Here are the only things I did that needed creative thought:
&lt;ol&gt;
  &lt;li&gt;Use the type &lt;tt&gt;Vector&lt;/tt&gt; to represent vectors instead of a tuple.  This allows the components to be strict.
  &lt;/li&gt;
  &lt;li&gt;Use the type &lt;tt&gt;Scene&lt;/tt&gt; instead of a tuple to represent a scene.  The tuple used in the OCaml code uses the dubious feature of equi-recursive types (even Xavier thinks it's strange enough to have a flag to enable it).
  &lt;/li&gt;
  &lt;li&gt;Rewrite the loop that computes a pixel's value using an accumulating updatable variable into a list comprehension that sums the list.
  &lt;/li&gt;
  &lt;li&gt;Finally, the compiler flags needed a bit of tweaking to get good performance, even though "&lt;tt&gt;-O3 -funbox-strict-fields -fexcess-precision -optc-ffast-math&lt;/tt&gt;" were pretty obvious.
  &lt;/li&gt;
&lt;/ol&gt;
In addition to this I made the code look a little more Haskellish, e.g., using overloading to allow &lt;tt&gt;+&lt;/tt&gt; and &lt;tt&gt;-&lt;/tt&gt; on vectors.  This is really just minor syntactic changes, but makes the code more readable.
&lt;p&gt;
To make the program size comparison fair I removed some dead code from the OCaml code.
&lt;p&gt;
I then reran the benchmarks using Haskell, OCaml, and C++.
&lt;p&gt;
The benchmarks are five programs that starts from very simple ray tracing and specializing the program more and more to speed it up.
&lt;p&gt;
The numbers are in the tables below.  The time is execution time in second, the characters are non-white characters in the file, and the lines are the number of lines in the file.  To ease comparison I also include the relative numbers compared to OCaml (smaller numbers are better).
&lt;p&gt;
Interestingly, and unlike Jon's benchmark, the Haskell code is always smaller than the OCaml code.  Furthermore, the Haskell code ranges from much faster to slightly faster than the OCaml code.  Again, this is very unlike Jon's benchmark.  I find the unoptimized version of the benchmark especially interesting since Haskell is 5 times(!) faster than OCaml.  I've not investigated why, but I suspect laziness.

&lt;h4&gt;Results&lt;/h4&gt;
The programs, ray1-ray5, are variations on the ray tracer as given on Jon's web site.  I've used the same size metrics as Jon does.
&lt;ul&gt;
  &lt;li&gt;Haskell: My Haskell code compiled with ghc 6.8.1&lt;/li&gt;
  &lt;li&gt;Haskell old: Jon's Haskell code, compiled with ghc 6.8.1&lt;/li&gt;
  &lt;li&gt;Haskell old 6.6: Jon's Haskell code, compiled with ghc 6.1.1&lt;/li&gt;
  &lt;li&gt;OCaml: Jon's OCaml code&lt;/li&gt;
  &lt;li&gt;C++: Jon's C++ code&lt;/li&gt;
&lt;/ul&gt;
&lt;ul&gt;
  &lt;li&gt;Time: execution time is second&lt;/li&gt;
  &lt;li&gt;Char: number of non-white chracters in the program&lt;/li&gt;
  &lt;li&gt;Lines: number of lines in the program&lt;/li&gt;
  &lt;li&gt;Rel T: execution time relative to OCaml&lt;/li&gt;
  &lt;li&gt;Rel C: non-white characters relative to OCaml&lt;/li&gt;
  &lt;li&gt;Rel L: lines relative to OCaml&lt;/li&gt;
  &lt;li&gt;Mem: Maximum resident memory&lt;/li&gt;
&lt;/ul&gt;
&lt;TABLE BORDER=4&gt;
&lt;TR&gt;&lt;TH&gt;ray1      &lt;/TH&gt; &lt;TD&gt;    Time&lt;/TD&gt; &lt;TD&gt;   Chars&lt;/TD&gt; &lt;TD&gt;   Lines&lt;/TD&gt; &lt;TD&gt;   Rel T&lt;/TD&gt; &lt;TD&gt;   Rel C&lt;/TD&gt; &lt;TD&gt;   Rel L&lt;/TD&gt;&lt;TD&gt;Mem&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;Haskell   &lt;/TD&gt; &lt;TD&gt;    15.3&lt;/TD&gt; &lt;TD&gt;    1275&lt;/TD&gt; &lt;TD&gt;      51&lt;/TD&gt; &lt;TD&gt;   0.202&lt;/TD&gt; &lt;TD&gt;   0.990&lt;/TD&gt; &lt;TD&gt;   1.020&lt;/TD&gt;&lt;TD&gt;5M&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;Haskell, old     &lt;/TD&gt; &lt;TD&gt;    15.8&lt;/TD&gt; &lt;TD&gt;    1946&lt;/TD&gt; &lt;TD&gt;      88&lt;/TD&gt; &lt;TD&gt;   0.208&lt;/TD&gt; &lt;TD&gt;   1.511&lt;/TD&gt; &lt;TD&gt;   1.760&lt;/TD&gt;&lt;TD&gt;9M&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;Haskell, old 6.6 &lt;/TD&gt; &lt;TD&gt;    28.1&lt;/TD&gt; &lt;TD&gt;    1946&lt;/TD&gt; &lt;TD&gt;      88&lt;/TD&gt; &lt;TD&gt;   0.370&lt;/TD&gt; &lt;TD&gt;   1.511&lt;/TD&gt; &lt;TD&gt;   1.760&lt;/TD&gt;&lt;TD&gt;9M&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;OCaml     &lt;/TD&gt; &lt;TD&gt;    75.9&lt;/TD&gt; &lt;TD&gt;    1288&lt;/TD&gt; &lt;TD&gt;      50&lt;/TD&gt; &lt;TD&gt;   1.000&lt;/TD&gt; &lt;TD&gt;   1.000&lt;/TD&gt; &lt;TD&gt;   1.000&lt;/TD&gt;&lt;TD&gt;18M&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;C++       &lt;/TD&gt; &lt;TD&gt;     8.1&lt;/TD&gt; &lt;TD&gt;    2633&lt;/TD&gt; &lt;TD&gt;     122&lt;/TD&gt; &lt;TD&gt;   0.106&lt;/TD&gt; &lt;TD&gt;   2.044&lt;/TD&gt; &lt;TD&gt;   2.440&lt;/TD&gt;&lt;TD&gt;8M&lt;/TD&gt;&lt;/TR&gt;
&lt;/TABLE&gt;&lt;P&gt;
&lt;TABLE BORDER=4&gt;
&lt;TR&gt;&lt;TH&gt;ray2      &lt;/TH&gt; &lt;TD&gt;    Time&lt;/TD&gt; &lt;TD&gt;   Chars&lt;/TD&gt; &lt;TD&gt;   Lines&lt;/TD&gt; &lt;TD&gt;   Rel T&lt;/TD&gt; &lt;TD&gt;   Rel C&lt;/TD&gt; &lt;TD&gt;   Rel L&lt;/TD&gt;&lt;TD&gt;Mem&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;Haskell   &lt;/TD&gt; &lt;TD&gt;    11.5&lt;/TD&gt; &lt;TD&gt;    1457&lt;/TD&gt; &lt;TD&gt;      50&lt;/TD&gt; &lt;TD&gt;   0.206&lt;/TD&gt; &lt;TD&gt;   0.912&lt;/TD&gt; &lt;TD&gt;   0.943&lt;/TD&gt;&lt;TD&gt;12M&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;Haskell, old     &lt;/TD&gt; &lt;TD&gt;    12.0&lt;/TD&gt; &lt;TD&gt;    2173&lt;/TD&gt; &lt;TD&gt;      99&lt;/TD&gt; &lt;TD&gt;   0.215&lt;/TD&gt; &lt;TD&gt;   1.360&lt;/TD&gt; &lt;TD&gt;   1.868&lt;/TD&gt;&lt;TD&gt;35M&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;Haskell, old 6.6 &lt;/TD&gt; &lt;TD&gt;    21.1&lt;/TD&gt; &lt;TD&gt;    2173&lt;/TD&gt; &lt;TD&gt;      99&lt;/TD&gt; &lt;TD&gt;   0.379&lt;/TD&gt; &lt;TD&gt;   1.360&lt;/TD&gt; &lt;TD&gt;   1.868&lt;/TD&gt;&lt;TD&gt;35M&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;OCaml     &lt;/TD&gt; &lt;TD&gt;    55.8&lt;/TD&gt; &lt;TD&gt;    1598&lt;/TD&gt; &lt;TD&gt;      53&lt;/TD&gt; &lt;TD&gt;   1.000&lt;/TD&gt; &lt;TD&gt;   1.000&lt;/TD&gt; &lt;TD&gt;   1.000&lt;/TD&gt;&lt;TD&gt;15M&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;C++       &lt;/TD&gt; &lt;TD&gt;     6.1&lt;/TD&gt; &lt;TD&gt;    3032&lt;/TD&gt; &lt;TD&gt;     115&lt;/TD&gt; &lt;TD&gt;   0.108&lt;/TD&gt; &lt;TD&gt;   1.897&lt;/TD&gt; &lt;TD&gt;   2.170&lt;/TD&gt;&lt;TD&gt;8M&lt;/TD&gt;&lt;/TR&gt;
&lt;/TABLE&gt;&lt;P&gt;
&lt;TABLE BORDER=4&gt;
&lt;TR&gt;&lt;TH&gt;ray3      &lt;/TH&gt; &lt;TD&gt;    Time&lt;/TD&gt; &lt;TD&gt;   Chars&lt;/TD&gt; &lt;TD&gt;   Lines&lt;/TD&gt; &lt;TD&gt;   Rel T&lt;/TD&gt; &lt;TD&gt;   Rel C&lt;/TD&gt; &lt;TD&gt;   Rel L&lt;/TD&gt;&lt;TD&gt;Mem&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;Haskell   &lt;/TD&gt; &lt;TD&gt;     9.7&lt;/TD&gt; &lt;TD&gt;    1794&lt;/TD&gt; &lt;TD&gt;      62&lt;/TD&gt; &lt;TD&gt;   0.970&lt;/TD&gt; &lt;TD&gt;   0.919&lt;/TD&gt; &lt;TD&gt;   0.939&lt;/TD&gt;&lt;TD&gt;12M&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;Haskell, old     &lt;/TD&gt; &lt;TD&gt;    11.1&lt;/TD&gt; &lt;TD&gt;    2312&lt;/TD&gt; &lt;TD&gt;     103&lt;/TD&gt; &lt;TD&gt;   1.112&lt;/TD&gt; &lt;TD&gt;   1.184&lt;/TD&gt; &lt;TD&gt;   1.561&lt;/TD&gt;&lt;TD&gt;35M&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;Haskell, old 6.6 &lt;/TD&gt; &lt;TD&gt;    19.7&lt;/TD&gt; &lt;TD&gt;    2312&lt;/TD&gt; &lt;TD&gt;     103&lt;/TD&gt; &lt;TD&gt;   1.984&lt;/TD&gt; &lt;TD&gt;   1.184&lt;/TD&gt; &lt;TD&gt;   1.561&lt;/TD&gt;&lt;TD&gt;35M&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;OCaml     &lt;/TD&gt; &lt;TD&gt;    10.0&lt;/TD&gt; &lt;TD&gt;    1953&lt;/TD&gt; &lt;TD&gt;      66&lt;/TD&gt; &lt;TD&gt;   1.000&lt;/TD&gt; &lt;TD&gt;   1.000&lt;/TD&gt; &lt;TD&gt;   1.000&lt;/TD&gt;&lt;TD&gt;15M&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;C++       &lt;/TD&gt; &lt;TD&gt;     5.4&lt;/TD&gt; &lt;TD&gt;    3306&lt;/TD&gt; &lt;TD&gt;     143&lt;/TD&gt; &lt;TD&gt;   0.545&lt;/TD&gt; &lt;TD&gt;   1.693&lt;/TD&gt; &lt;TD&gt;   2.167&lt;/TD&gt;&lt;TD&gt;8M&lt;/TD&gt;&lt;/TR&gt;
&lt;/TABLE&gt;&lt;P&gt;
&lt;TABLE BORDER=4&gt;
&lt;TR&gt;&lt;TH&gt;ray4      &lt;/TH&gt; &lt;TD&gt;    Time&lt;/TD&gt; &lt;TD&gt;   Chars&lt;/TD&gt; &lt;TD&gt;   Lines&lt;/TD&gt; &lt;TD&gt;   Rel T&lt;/TD&gt; &lt;TD&gt;   Rel C&lt;/TD&gt; &lt;TD&gt;   Rel L&lt;/TD&gt;&lt;TD&gt;Mem&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;Haskell   &lt;/TD&gt; &lt;TD&gt;     8.5&lt;/TD&gt; &lt;TD&gt;    1772&lt;/TD&gt; &lt;TD&gt;      66&lt;/TD&gt; &lt;TD&gt;   0.985&lt;/TD&gt; &lt;TD&gt;   0.867&lt;/TD&gt; &lt;TD&gt;   0.957&lt;/TD&gt;&lt;TD&gt;12M&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;Haskell, old     &lt;/TD&gt; &lt;TD&gt;    11.7&lt;/TD&gt; &lt;TD&gt;    2387&lt;/TD&gt; &lt;TD&gt;     110&lt;/TD&gt; &lt;TD&gt;   1.360&lt;/TD&gt; &lt;TD&gt;   1.168&lt;/TD&gt; &lt;TD&gt;   1.594&lt;/TD&gt;&lt;TD&gt;36M&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;Haskell, old 6.6 &lt;/TD&gt; &lt;TD&gt;    19.2&lt;/TD&gt; &lt;TD&gt;    2387&lt;/TD&gt; &lt;TD&gt;     110&lt;/TD&gt; &lt;TD&gt;   2.235&lt;/TD&gt; &lt;TD&gt;   1.168&lt;/TD&gt; &lt;TD&gt;   1.594&lt;/TD&gt;&lt;TD&gt;35M&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;OCaml     &lt;/TD&gt; &lt;TD&gt;     8.6&lt;/TD&gt; &lt;TD&gt;    2043&lt;/TD&gt; &lt;TD&gt;      69&lt;/TD&gt; &lt;TD&gt;   1.000&lt;/TD&gt; &lt;TD&gt;   1.000&lt;/TD&gt; &lt;TD&gt;   1.000&lt;/TD&gt;&lt;TD&gt;11M&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;C++       &lt;/TD&gt; &lt;TD&gt;     5.0&lt;/TD&gt; &lt;TD&gt;    3348&lt;/TD&gt; &lt;TD&gt;     149&lt;/TD&gt; &lt;TD&gt;   0.584&lt;/TD&gt; &lt;TD&gt;   1.639&lt;/TD&gt; &lt;TD&gt;   2.159&lt;/TD&gt;&lt;TD&gt;8M&lt;/TD&gt;&lt;/TR&gt;
&lt;/TABLE&gt;&lt;P&gt;
&lt;TABLE BORDER=4&gt;
&lt;TR&gt;&lt;TH&gt;ray5      &lt;/TH&gt; &lt;TD&gt;    Time&lt;/TD&gt; &lt;TD&gt;   Chars&lt;/TD&gt; &lt;TD&gt;   Lines&lt;/TD&gt; &lt;TD&gt;   Rel T&lt;/TD&gt; &lt;TD&gt;   Rel C&lt;/TD&gt; &lt;TD&gt;   Rel L&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;Haskell   &lt;/TD&gt; &lt;TD&gt;     7.0&lt;/TD&gt; &lt;TD&gt;    2246&lt;/TD&gt; &lt;TD&gt;      95&lt;/TD&gt; &lt;TD&gt;   0.999&lt;/TD&gt; &lt;TD&gt;   0.878&lt;/TD&gt; &lt;TD&gt;   0.950&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;OCaml     &lt;/TD&gt; &lt;TD&gt;     7.0&lt;/TD&gt; &lt;TD&gt;    2559&lt;/TD&gt; &lt;TD&gt;     100&lt;/TD&gt; &lt;TD&gt;   1.000&lt;/TD&gt; &lt;TD&gt;   1.000&lt;/TD&gt; &lt;TD&gt;   1.000&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;C++       &lt;/TD&gt; &lt;TD&gt;     4.7&lt;/TD&gt; &lt;TD&gt;    3579&lt;/TD&gt; &lt;TD&gt;     142&lt;/TD&gt; &lt;TD&gt;   0.674&lt;/TD&gt; &lt;TD&gt;   1.399&lt;/TD&gt; &lt;TD&gt;   1.420&lt;/TD&gt;&lt;/TR&gt;
&lt;/TABLE&gt;&lt;P&gt;
The source code is available in &lt;a href="http://darcs.augustsson.net/Darcs/RayBen/"&gt;a Darcs repository&lt;/a&gt;.
&lt;h4&gt;Software and hardware details&lt;/h4&gt;
Hardware: MacBook, Intel Core Duo 2GHz, 2MB L2 Cache, 1GB 667MHz DRAM
&lt;p&gt;
Software:
&lt;ul&gt;
&lt;li&gt;Haskell compiler: ghc-6.8.1&lt;/li&gt;
&lt;li&gt;OCaml compiler: 3.10.0&lt;/li&gt;
&lt;li&gt;g++: gcc version 4.0.1 (Apple Computer, Inc. build 5367)&lt;/li&gt;
&lt;/ul&gt;
Compilation commands:
&lt;ul&gt;
  &lt;li&gt;ghc: -O3 -fvia-C -funbox-strict-fields -optc-O3 -fexcess-precision -optc-ffast-math -funfolding-keeness-factor=10&lt;/li&gt;
  &lt;li&gt;OCaml: -rectypes -inline 100 -ffast-math -ccopt -O3&lt;/li&gt;
  &lt;li&gt;g++: -O3 -ffast-math&lt;/li&gt;
&lt;/ul&gt;
Target architecture is x86 (even though the processor is x86_64 capable).
&lt;br&gt;

&lt;h4&gt;Some observations&lt;/h4&gt;
Haskell should really have the definitions of &lt;tt&gt;infinity&lt;/tt&gt; and &lt;tt&gt;epsilon_float&lt;/tt&gt; in a library.  They are quite useful.  Also, having them in a library would have made the Haskell code somewhat shorter and faster.
&lt;p&gt;
Converting these programs from OCaml to Haskell was very mechanical; it could almost be done with just &lt;tt&gt;sed&lt;/tt&gt;.
&lt;p&gt;
I'm glad version 5 of the benchmark didn't show much improvement, because it's a really ugly rewrite. :)
&lt;p&gt;
Note that the code is all Haskell98, no strange extensions (even though -funbox-strict-fields deviates subtly from H98).
&lt;h4&gt;Conclusion&lt;/h4&gt;
Benchmarking is tricky.  I'm not sure why my and Jon's numbers are so different.  Different hardware, slightly different programs, different software.
&lt;p&gt;
Haskell is doing just fine against OCaml on this benchmark; the Haskell programs are always smaller and faster.

&lt;p&gt;
Edit: Updated tables with more numbers.
&lt;p&gt;
PS:  Phil Armstrong wrote the Haskell code on Jon's web site and I took some code from his original.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-1215723127547718916?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/1215723127547718916/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=1215723127547718916' title='19 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/1215723127547718916'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/1215723127547718916'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2007/11/benchmarking-ray-tracing-haskell-vs.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>19</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-5687095637180134906</id><published>2007-10-25T16:11:00.001+01:00</published><updated>2007-11-09T18:31:00.012Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Dependent types'/><category scheme='http://www.blogger.com/atom/ns#' term='Lambda calculus'/><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>&lt;h3&gt;Simpler, Easier!&lt;/h3&gt;

In a recent paper, &lt;a href="http://people.cs.uu.nl/andres/LambdaPi.html"&gt;Simply Easy! (An Implementation of a Dependently Typed Lambda Calculus)&lt;/a&gt;, the authors argue that type checking a dependently typed language is easy.  I agree whole-heartedly, it doesn't have to be difficult at all.  But I don't think the paper presents the easiest way to do it.

So here is my take on how to write a simple dependent type checker.

(There's nothing new here, and the authors of the paper are undoubtedly familiar with all of it.)

&lt;h3&gt;First, the &lt;a href="http://en.wikipedia.org/wiki/Untyped_lambda_calculus"&gt;untyped lambda calculus&lt;/a&gt;.&lt;/h3&gt;
I'll start by implementing the untyped lambda calculus.  It's a very simple language with just three constructs: variables, applications, and lambda expressions, i.e.,&lt;ul&gt;
&lt;item&gt;&lt;i&gt;x&lt;/i&gt;
&lt;item&gt;&lt;i&gt;e e&lt;/i&gt;
&lt;item&gt;&lt;i&gt;&amp;lambda;x.e&lt;/i&gt;
&lt;/ul&gt;
For example, &lt;i&gt;(&amp;lambda;x.&amp;lambda;y.x)(&amp;lambda;z.z)&lt;/i&gt;.

In Haskell I'll use strings to represent variables names; it's simple and easy.
&lt;pre&gt;
type Sym = String

data Expr
        = Var Sym
        | App Expr Expr
        | Lam Sym Expr
        deriving (Eq, Read, Show)
&lt;/pre&gt;
The example above represented by
&lt;tt&gt;App (Lam "x" $ Lam "y" $ Var "x") (Lam "z" $ Var "z")&lt;/tt&gt;.

What do we want to do with the &lt;tt&gt;Expr&lt;/tt&gt; type?  Well, evaluating an expression seems like the thing we need.
Now, there are many degrees of evaluation to choose from, Weak Head Normal Form, Head Normal Form, Normal Form, etc., etc.
They differ in exactly where there might be reducible expression lingering.

To evaluate lambda expression the most important step is &amp;beta;-reduction.
A &amp;beta;-reduction step can be performed anywhere a function meets an argument, i.e., an application where the function is on &amp;lambda; form, a.k.a. a redex.&lt;ul&gt;
&lt;item&gt;&lt;i&gt;(&amp;lambda;x.e)a&lt;/i&gt; reduces to &lt;i&gt;e&lt;/i&gt;[&lt;i&gt;x:=a&lt;/i&gt;]
&lt;/ul&gt;
Where the &lt;i&gt;e&lt;/i&gt;[&lt;i&gt;x:=a&lt;/i&gt;] notation means that all (free) occurrences of the variable &lt;i&gt;x&lt;/i&gt; in the expression &lt;i&gt;e&lt;/i&gt; are replaced by &lt;i&gt;a&lt;/i&gt;.
The example above has one redex, and doing a &amp;beta; step yields &lt;i&gt;&amp;lambda;y.&amp;lambda;z.z&lt;/i&gt;.

The other kind of reduction we will make use of is &amp;alpha;-substitution, which is simply renaming a bound variable, e.g., &lt;i&gt;&amp;lambda;x.x&lt;/i&gt; can be changed to &lt;i&gt;&amp;lambda;y.y&lt;/i&gt;.

Let's start with an easy evaluation strategy, normal order to WHNF.  In WHNF we only need to make sure there the there's no redex along the "spine" of the expression, i.e., starting from the root and following the left branch in applications.  Doing normal order reduction means that we do &lt;b&gt;not&lt;/b&gt; evaluate anything inside the argument of a &amp;beta; redex before doing the reduction.  It's sometimes called lazy evaluation, but I prefer to use that term for an implementation strategy for normal order reduction.

We implement normal order WHNF by walking down the spine collecting arguments (i.e., the right branch of a applications) until we reach a lambda or a variable.
If we reach a variable we already have WHNF so we just reconstitute the applications again.  If we hit a lambda we get to the crux of evaluation.
We need to perform a &amp;beta;-reduction, i.e., if we have &lt;tt&gt;App (Lam v e) a&lt;/tt&gt; we need to replace all (free) occurrences of the variable &lt;tt&gt;v&lt;/tt&gt; by the argument &lt;tt&gt;a&lt;/tt&gt; inside the lambda body &lt;tt&gt;e&lt;/tt&gt;.  That's what the &lt;tt&gt;subst&lt;/tt&gt; function does.
&lt;pre&gt;
whnf :: Expr -&gt; Expr
whnf ee = spine ee []
  where spine (App f a) as = spine f (a:as)
        spine (Lam s e) (a:as) = spine (subst s a e) as
        spine f as = foldl App f as
&lt;/pre&gt;
The &lt;tt&gt;subst&lt;/tt&gt; function is the only tricky part, so let's relax by first defining something easy, namely getting the free variables from an expression.  The free variables are those variables that occur in an expression, but are not bound in it.  We simply collect the variables in a set (using a list as a set here), removing anything bound.
&lt;pre&gt;
freeVars :: Expr -&gt; [Sym]
freeVars (Var s) = [s]
freeVars (App f a) = freeVars f `union` freeVars a
freeVars (Lam i e) = freeVars e \\ [i]
&lt;/pre&gt;

Back to substitution.
&lt;pre&gt;
subst :: Sym -&gt; Expr -&gt; Expr -&gt; Expr
subst v x b = sub b
  where sub e@(Var i) = if i == v then x else e
        sub (App f a) = App (sub f) (sub a)
        sub (Lam i e) =
            if v == i then
                Lam i e
            else if i `elem` fvx then
                let i' = cloneSym e i
                    e' = substVar i i' e
                in  Lam i' (sub e')
            else
                Lam i (sub e)
        fvx = freeVars x
        cloneSym e i = loop i
           where loop i' = if i' `elem` vars then loop (i ++ "'") else i'
                 vars = fvx ++ freeVars e

substVar :: Sym -&gt; Sym -&gt; Expr -&gt; Expr
substVar s s' e = subst s (Var s') e
&lt;/pre&gt;
The &lt;tt&gt;subst&lt;/tt&gt; function will replace all free occurrences of &lt;tt&gt;v&lt;/tt&gt; by &lt;tt&gt;b&lt;/tt&gt; inside &lt;tt&gt;x&lt;/tt&gt;, i.e., &lt;i&gt;b&lt;/i&gt;[&lt;i&gt;v:=x&lt;/i&gt;].

The &lt;tt&gt;Var&lt;/tt&gt; case is easy.  If it's the variable we are replacing then replace else leave it alone.

The &lt;tt&gt;App&lt;/tt&gt; case is also easy, just recurse in both branches.

The &lt;tt&gt;Lam&lt;/tt&gt; case has three alternative.  First, if the bound variable is the same as the one we are replacing then there can be no free occurrences inside it, so just return the lambda as is.  Second, if the lambda bound variable is among the free variables in &lt;tt&gt;x&lt;/tt&gt; we have a problem (see below).  Third case, just recurse in the body.

So, what about the case when the lambda bound variable occurs in &lt;tt&gt;x&lt;/tt&gt;?  Well, if we just blindly continue substitution the variable &lt;tt&gt;v&lt;/tt&gt; inside &lt;tt&gt;x&lt;/tt&gt; will no longer refer to the same thing; it will refer to the variable bound in the lambda.  That's no good.  For example, take the expression &lt;i&gt;&amp;lambda;x.((&amp;lambda;y.&amp;lambda;x.y)x)&lt;/i&gt;, the &amp;beta; reduction gives &lt;i&gt;&amp;lambda;x.&amp;lambda;x'.x&lt;/i&gt; (or similar), whereas doing the substitution blindly would give &lt;i&gt;&amp;lambda;x.&amp;lambda;x.x&lt;/i&gt;.  Which is wrong!

But it's easy to fix, just conjure up a variable, &lt;tt&gt;i'&lt;/tt&gt; that will not clash with anything (&lt;tt&gt;cloneSym&lt;/tt&gt; does that).  How do we come up with a good variable?  Well, we don't want to pick one that is free in the expression &lt;tt&gt;x&lt;/tt&gt; because that would lead to the same problem again.  Nor do we want to pick a variable that is free in &lt;tt&gt;e&lt;/tt&gt; because that would accidentally bind something in &lt;tt&gt;e&lt;/tt&gt;.  So we take the original identifier and tack on "'" until it fulfills our requirements.  (OK, efficiency affectionados are allowed to complain a little here, but this isn't that bad actually.)
Once we have a new variable we can do an &amp;alpha;-conversion to rename the offending variable to something better.

The &lt;tt&gt;substVar&lt;/tt&gt; function is a utility when we want to replace one variable with another.

Another useful thing to be able to do is to compare lambda expression for equality.  We already have syntactic equality derived for &lt;tt&gt;Expr&lt;/tt&gt;, but it is also very useful to be able to compare expressions modulo &amp;alpha;-conversions.  That is, we'd like &lt;i&gt;&amp;lambda;x.x&lt;/i&gt; to compare equal to &lt;i&gt;&amp;lambda;y.y&lt;/i&gt;.  Let's call that function &lt;tt&gt;alphaEq&lt;/tt&gt;.
&lt;pre&gt;
alphaEq :: Expr -&gt; Expr -&gt; Bool
alphaEq (Var v)   (Var v')    = v == v'
alphaEq (App f a) (App f' a') = alphaEq f f' &amp;&amp; alphaEq a a'
alphaEq (Lam s e) (Lam s' e') = alphaEq e (substVar s' s e')
alphaEq _ _ = False
&lt;/pre&gt;
Variables and applications just proceed along the structure of the expression.  When we hit a lambda the variables might be different, so we do an &amp;alpha;-conversion of the second expression to make them equal.

As the final functions, we will do reduction to Normal Form (i.e., to a form where no redexes remain) and comparison of expressions via their normal forms.
&lt;pre&gt;
nf :: Expr -&gt; Expr
nf ee = spine ee []
  where spine (App f a) as = spine f (a:as)
        spine (Lam s e) [] = Lam s (nf e)
        spine (Lam s e) (a:as) = spine (subst s a e) as
        spine f as = app f as
        app f as = foldl App f (map nf as)

betaEq :: Expr -&gt; Expr -&gt; Bool
betaEq e1 e2 = alphaEq (nf e1) (nf e2)
&lt;/pre&gt;
Computing the NF is similar to WHNF, but as we reconstruct expressions we make sure that all subexpression have NF as well.

Note that both &lt;tt&gt;whnf&lt;/tt&gt; and &lt;tt&gt;nf&lt;/tt&gt; may fail to terminate because not all expressions have a normal form.  The canonical non-terminating example is &lt;i&gt;(&amp;lambda;x.x x)(&amp;lambda;x.x x)&lt;/i&gt; which has one redex, and doing the reduction produces the same term again.  But if an expression has a normal form then it is unique (the Church-Rosser theorem).

Here are some sample lambda terms (named for convenience):&lt;ul&gt;&lt;i&gt;
&lt;item&gt;zero &amp;equiv; &amp;lambda;s.&amp;lambda;z.z
&lt;item&gt;one &amp;equiv; &amp;lambda;s.&amp;lambda;z.s z
&lt;item&gt;two &amp;equiv; &amp;lambda;s.&amp;lambda;z.s (s z)
&lt;item&gt;three &amp;equiv; &amp;lambda;s.&amp;lambda;z.s (s (s z))
&lt;item&gt;plus &amp;equiv; &amp;lambda;m.&amp;lambda;n.&amp;lambda;s.&amp;lambda;z.m s (n s z)
&lt;/i&gt;
&lt;/ul&gt;
Or, in Haskell
&lt;pre&gt;
[z,s,m,n] = map (Var . (:[])) "zsmn"
app2 f x y = App (App f x) y
zero  = Lam "s" $ Lam "z" z
one   = Lam "s" $ Lam "z" $ App s z
two   = Lam "s" $ Lam "z" $ App s $ App s z
three = Lam "s" $ Lam "z" $ App s $ App s $ App s z
plus  = Lam "m" $ Lam "n" $ Lam "s" $ Lam "z" $ app2 m s (app2 n s z)
&lt;/pre&gt;
And now we can check that addition works, &lt;tt&gt;betaEq (app2 plus one two) three&lt;/tt&gt; will produce &lt;tt&gt;True&lt;/tt&gt;.

&lt;h3&gt;A detour, &lt;a href="http://en.wikipedia.org/wiki/Simply_typed_lambda_calculus"&gt;simply typed lambda calculus&lt;/a&gt;.&lt;/h3&gt;
To do type checking we need to introduce types.  A very simple system is the simply typed lambda calculus.  It has one (or more) base type (think of it as Bool or Int if you want an example) and function types.  In Haskell terms:
&lt;pre&gt;
data Type = Base | Arrow Type Type
    deriving (Eq, Read, Show)
&lt;/pre&gt;
Or&lt;i&gt;&lt;ul&gt;
&lt;item&gt;t&amp;rarr;t
&lt;item&gt;B
&lt;/ul&gt;&lt;/i&gt;
The expressions themselves will have an explicit type on the bound variable in a lambda expression. So we now have&lt;ul&gt;
&lt;item&gt;&lt;i&gt;x&lt;/i&gt;
&lt;item&gt;&lt;i&gt;e e&lt;/i&gt;
&lt;item&gt;&lt;i&gt;&amp;lambda;x:t.e&lt;/i&gt;
&lt;/ul&gt;
For example, &lt;i&gt;(&amp;lambda;x:(B&amp;rarr;B).&amp;lambda;y:B.x)(&amp;lambda;z:B.z)&lt;/i&gt;.

The Haskell type for expressions is
&lt;pre&gt;
data Expr
    = Var Sym
    | App Expr Expr
    | Lam Sym Type Expr
    deriving (Eq, Read, Show)
&lt;/pre&gt;
The only difference is the &lt;tt&gt;Type&lt;/tt&gt; in &lt;tt&gt;Lam&lt;/tt&gt;.
All the functions we had for the untyped lambda calculus can be trivially extended to the simply typed one by simply carrying the type along.

So finally, time for some type checking.  The type checker will take an expression and return the type of the expression.  The type checker will also need the types of all free variables in the expression to be able to do this.  Otherwise, what type would it assign to, say, the expression &lt;i&gt;x&lt;/i&gt;?

To represent the types of the free variables we use an environment which is simply a list of variables and their types.
&lt;pre&gt;
newtype Env = Env [(Sym, Type)] deriving (Show)

initalEnv :: Env
initalEnv = Env []

extend :: Sym -&gt; Type -&gt; Env -&gt; Env
extend s t (Env r) = Env ((s, t) : r)
&lt;/pre&gt;

Type checking can go wrong; there can be type errors.  To cater for this the type checker will be written in monadic style where the monad is simply an error (exception) monad.  The error messages are strings, and the monad itself is the &lt;tt&gt;Either&lt;/tt&gt; type.  So &lt;tt&gt;TC&lt;/tt&gt; is the type checking monad.
&lt;pre&gt;
type ErrorMsg = String

type TC a = Either ErrorMsg a
&lt;/pre&gt;

We can now write variable lookup.
&lt;pre&gt;
findVar :: Env -&gt; Sym -&gt; TC Type
findVar (Env r) s =
    case lookup s r of
    Just t -&gt; return t
    Nothing -&gt; throwError $ "Cannot find variable " ++ s
&lt;/pre&gt;
It simply looks up the variable and returns the type.  If not found it throws an error with &lt;tt&gt;throwError&lt;/tt&gt; (from &lt;tt&gt;Control.Monad.Error&lt;/tt&gt;).

And then the type checker itself.
&lt;pre&gt;
tCheck :: Env -&gt; Expr -&gt; TC Type
tCheck r (Var s) =
    findVar r s
tCheck r (App f a) = do
    tf &amp;lt;- tCheck r f
    case tf of
     Arrow at rt -&gt; do
        ta &amp;lt;- tCheck r a
        when (ta /= at) $ throwError "Bad function argument type"
        return rt
     _ -&gt; throwError "Non-function in application"
tCheck r (Lam s t e) = do
    let r' = extend s t r
    te &amp;lt;- tCheck r' e
    return $ Arrow t te
&lt;/pre&gt;

For variables, just look up the type for it in the environment.

For application, type check the function part and the argument part.  The function should have function (arrow) type, and if it does the type of the application is the return type of the function.

Finally, for a lambda expression we extend the environment with the bound variable.  We then check the body, and the type of the lambda expression is a function type from the argument type to the type of the body.

For convenience:
&lt;pre&gt;
typeCheck :: Expr -&gt; Type
typeCheck e =
    case tCheck initalEnv e of
    Left msg -&gt; error ("Type error:\n" ++ msg)
    Right t -&gt; t
&lt;/pre&gt;

Pretty easy sailing so far.

The simply typed lambda calculus is a pain to use.  Take something like &lt;i&gt;&amp;lambda;x.x&lt;/i&gt; in the untyped world.  What type should we give it?  Well that depends on how we intend to use it.  Maybe &lt;i&gt;&amp;lambda;x:B.x&lt;/i&gt;, maybe &lt;i&gt;&amp;lambda;x:(B&amp;rarr;B).x&lt;/i&gt;, maybe &lt;i&gt;&amp;lambda;x:(B&amp;rarr;B&amp;rarr;B).x&lt;/i&gt;...
So we can no longer have one identity function; we need one for each type.  What a bummer!  It's as bad as C.

BTW, all (type correct) expression in the simply typed lambda calculus have a normal form (Tait 1967).

&lt;h3&gt;Almost going down the wrong track, the &lt;a href="http://en.wikipedia.org/wiki/System_F"&gt;polymorphic lambda calculus&lt;/a&gt;.&lt;/h3&gt;
(Don't get me wrong, the polymorphic lambda calculus is a work of marvel.)
So how can we fix the problem with one identity function for each type?  We can add polymorphism!
We can extend the expression language so that we also pass types around; we add type abstraction and type application.&lt;i&gt;&lt;ul&gt;
&lt;item&gt;x
&lt;item&gt;e e
&lt;item&gt;&amp;lambda;x:t.e
&lt;item&gt;&amp;Lambda;&amp;alpha;:k.e
&lt;item&gt;e[t]
&lt;/ul&gt;&lt;/i&gt;
Where &lt;i&gt;&amp;Lambda;&amp;alpha;:k.e&lt;/i&gt; is a type abstraction, i.e., &lt;i&gt;&amp;alpha;&lt;/i&gt; is a type variable which we can use in type expressions inside &lt;i&gt;e&lt;/i&gt;.
To supply a type argument we have type application, &lt;i&gt;e[t]&lt;/i&gt;.
So the types we have would functions, base type, and type variables.&lt;i&gt;&lt;ul&gt;
&lt;item&gt;t&amp;rarr;t
&lt;item&gt;B
&lt;item&gt;&amp;alpha;
&lt;/ul&gt;&lt;/i&gt;
And what is &lt;i&gt;k&lt;/i&gt; in the &lt;i&gt;&amp;Lambda;&amp;alpha;:k.e&lt;/i&gt;?  Well, now types have gotten so complicated that it is possible to construct types that make no sense, so we need a "type system" for the types.  We call them kinds.&lt;i&gt;&lt;ul&gt;
&lt;item&gt;k&amp;rarr;k
&lt;item&gt;*
&lt;/ul&gt;&lt;/i&gt;

Defining all this is Haskell would be something like
&lt;pre&gt;
data Expr
    = Var Sym
    | App Expr Expr
    | Lam Sym Type Expr
    | TLam Sym Kind Expr
    | TApp Expr Type
    deriving (Eq, Read, Show)
data Type
    = Arrow Type Type
    | Base
    | TVar Sym
    deriving (Eq, Read, Show)
data Kind
    = KArrow Type Type
    | Star
    deriving (Eq, Read, Show)
&lt;/pre&gt;
But wait, there's an awful lot of duplication here.  The structures on the three levels have a lot of similarities.
(Oh, and we don't really need &lt;tt&gt;Base&lt;/tt&gt; anymore now when we have variables.)

BTW, this system, called System F&lt;sub&gt;&amp;omega;&lt;/sub&gt;, is (a simplified version of) what GHC uses internally to represent Haskell code.
It's a beautiful system, really.

Oh, the identity function, well it would be &lt;i&gt;&amp;Lambda;&amp;alpha;:*.&amp;lambda;x:&amp;alpha;.x&lt;/i&gt;.  And using it: &lt;i&gt;id[B]a&lt;/i&gt;, assuming &lt;i&gt;a&lt;/i&gt; has type &lt;i&gt;B&lt;/i&gt;.



&lt;h3&gt;Simply easy, &lt;a href="http://en.wikipedia.org/wiki/Lambda_cube"&gt;the lambda cube&lt;/a&gt;.&lt;/h3&gt;
To simplify and (as often happens when you simplify) generalize the expressions above we are going to squish them all into one expression data type.

So &lt;tt&gt;TLam&lt;/tt&gt; and &lt;tt&gt;Lam&lt;/tt&gt; will join, as will &lt;tt&gt;TApp&lt;/tt&gt; and &lt;tt&gt;App&lt;/tt&gt;, &lt;tt&gt;TVar&lt;/tt&gt; and &lt;tt&gt;Var&lt;/tt&gt;, &lt;tt&gt;KArrow&lt;/tt&gt; and &lt;tt&gt;Arrow&lt;/tt&gt;.
But wait, there's nothing corresponding to &lt;tt&gt;Arrow&lt;/tt&gt; in &lt;tt&gt;Expr&lt;/tt&gt;.  We need to add something.  We could just add it as it is, but we won't.

TADA, enter dependent types.  Instead of the boring function type &lt;i&gt;t&amp;rarr;u&lt;/i&gt; we will use a more exciting one, &lt;i&gt;(x:t)&amp;rarr;u&lt;/i&gt;.
What does it mean?  It means that the variable &lt;i&gt;x&lt;/i&gt; can occur in &lt;i&gt;u&lt;/i&gt;.  If it doesn't then it's simply the same as the old fashioned function type.  If &lt;i&gt;x&lt;/i&gt; does occur it means that &lt;b&gt;type&lt;/b&gt; &lt;i&gt;u&lt;/i&gt; can depend on the &lt;b&gt;value&lt;/b&gt; of the argument (&lt;i&gt;x&lt;/i&gt;).

In Haskell:
&lt;pre&gt;
data Expr
    = Var Sym
    | App Expr Expr
    | Lam Sym Type Expr
    | Pi  Sym Type Type
    | Kind Kinds
    deriving (Eq, Read, Show)
type Type = Expr

data Kinds = Star | Box deriving (Eq, Read, Show)
&lt;/pre&gt;
The new arrow type is called &lt;tt&gt;Pi&lt;/tt&gt;.  We will also need more than one kind, &lt;tt&gt;Star&lt;/tt&gt; and &lt;tt&gt;Box&lt;/tt&gt;.

It's pretty easy to extend the functions from the first part to handle this expression type.  There's just a few more places to recurse.
Here's the code again.  Absolutly nothing subtle about it.
&lt;pre&gt;
freeVars :: Expr -&gt; [Sym]
freeVars (Var s) = [s]
freeVars (App f a) = freeVars f `union` freeVars a
freeVars (Lam i t e) = freeVars t `union` (freeVars e \\ [i])
freeVars (Pi i k t) = freeVars k `union` (freeVars t \\ [i])
freeVars (Kind _) = []

subst :: Sym -&gt; Expr -&gt; Expr -&gt; Expr
subst v x = sub
  where sub e@(Var i) = if i == v then x else e
        sub (App f a) = App (sub f) (sub a)
        sub (Lam i t e) = abstr Lam i t e
        sub (Pi i t e) = abstr Pi i t e
        sub (Kind k) = Kind k
        fvx = freeVars x
        cloneSym e i = loop i
           where loop i' = if i' `elem` vars then loop (i ++ "'") else i'
                 vars = fvx ++ freeVars e
        abstr con i t e =
            if v == i then
                con i (sub t) e
            else if i `elem` fvx then
                let i' = cloneSym e i
                    e' = substVar i i' e
                in  con i' (sub t) (sub e')
            else
                con i (sub t) (sub e)

&lt;/pre&gt;
To cut down on the code you could actually join the &lt;tt&gt;Lam&lt;/tt&gt; and &lt;tt&gt;Pi&lt;/tt&gt; constructors since they are treated identically in many cases.  I've left them separate for clarity.

The &lt;tt&gt;alphaEq&lt;/tt&gt; function extends in the natural way to the new type, so does &lt;tt&gt;nf&lt;/tt&gt;, but here it is anyway.
&lt;pre&gt;
nf :: Expr -&gt; Expr
nf ee = spine ee []
  where spine (App f a) as = spine f (a:as)
        spine (Lam s t e) [] = Lam s (nf t) (nf e)
        spine (Lam s _ e) (a:as) = spine (subst s a e) as
        spine (Pi s k t) as = app (Pi s (nf k) (nf t)) as
        spine f as = app f as
        app f as = foldl App f (map nf as)
&lt;/pre&gt;

So, now for the meaty part, the type checking itself.  The handling of the environment is just as before, so we'll just look at the different cases for the type checking.

&lt;pre&gt;
tCheck :: Env -&gt; Expr -&gt; TC Type
tCheck r (Var s) =
    findVar r s
&lt;/pre&gt;Just as before.

&lt;pre&gt;
tCheck r (App f a) = do
    tf &amp;lt;- tCheckRed r f
    case tf of
     Pi x at rt -&gt; do
        ta &amp;lt;- tCheck r a
        when (not (betaEq ta at)) $ throwError $ "Bad function argument type"
        return $ subst x a rt
     _ -&gt; throwError $ "Non-function in application"
&lt;/pre&gt;This is almost as before, but the arrow type is called &lt;tt&gt;Pi&lt;/tt&gt; now.
The key thing here &amp;mdash; and this is really where the fact that we are doing dependent types shows up &amp;mdash; is the return type.  For the simply typed lambda calculus it was just &lt;tt&gt;rt&lt;/tt&gt;, but now &lt;tt&gt;rt&lt;/tt&gt; can contain free occurences of the variable &lt;tt&gt;x&lt;/tt&gt;.  Since we are returning &lt;tt&gt;rt&lt;/tt&gt; the &lt;tt&gt;x&lt;/tt&gt; would no longer be in scope, so we substitute the value of the argument for it.  This is coolest part of the type checker.  You've seen it.  That's where it is.
Since types can now be arbitrary expression we use &lt;tt&gt;betaEq&lt;/tt&gt; to compare them instead of &lt;tt&gt;(==)&lt;/tt&gt;.

&lt;pre&gt;
tCheck r (Lam s t e) = do
    tCheck r t
    let r' = extend s t r
    te &amp;lt;- tCheck r' e
    let lt = Pi s t te
    tCheck r lt
    return lt
&lt;/pre&gt;The lambda case is similar to before, but we return a &lt;tt&gt;Pi&lt;/tt&gt; now, so we need to include the variable name.
Furthermore, to avoid nonsense like &lt;i&gt;&amp;lambda;x:5.e&lt;/i&gt; we make sure that the type we want to return actually has a valid kind itself.
(The first call to &lt;tt&gt;tCheck&lt;/tt&gt; is to ensure the type we're putting into the environment is valid; I'm sure there's a more elegant way to do this, but I can't remember what it is right now.)

&lt;pre&gt;
tCheck _ (Kind Star) = return $ Kind Box
tCheck _ (Kind Box) = throwError "Found a Box"
&lt;/pre&gt;Everything has a type, so what's the type of &lt;i&gt;*&lt;/i&gt; (&lt;tt&gt;Kind Star&lt;/tt&gt;), well it's a [] (&lt;tt&gt;Kind Box&lt;/tt&gt;) (excuse the ugly box, I can't find the HTML version of a box).
And what's the type of &lt;tt&gt;Box&lt;/tt&gt;?  Well, you could keep going, but instead we'll stop right here.  The idea is that the source language which we'll write our terms in will not allow the box to be written, so it should never occur.

&lt;pre&gt;
tCheck r (Pi x a b) = do
    s &amp;lt;- tCheckRed r a
    let r' = extend x a r
    t &amp;lt;- tCheckRed r' b
    when ((s, t) `notElem` allowedKinds) $ throwError "Bad abstraction"
    return t
&lt;/pre&gt;How do we check the type of the (dependent) function type?  Well, we check the type of the thing to the left of the arrow, extend the environment, and then check the thing to the right.  So now what should &lt;tt&gt;(s, t)&lt;/tt&gt; be?  Well, &lt;tt&gt;a&lt;/tt&gt; and &lt;tt&gt;b&lt;/tt&gt; should be types (or maybe kinds).  So their types should be kinds.  This leads to the following definition:
&lt;pre&gt;
allowedKinds :: [(Type, Type)]
allowedKinds = [(Kind Star, Kind Star), (Kind Star, Kind Box), (Kind Box, Kind Star), (Kind Box, Kind Box)]
&lt;/pre&gt;I.e., we allow (*,*), (*,[]), ([],*), and ([],[]).  What does it all mean?

Here's the beauty of the lambda cube.  By varying what we allow we can change what system we type check.&lt;ul&gt;
&lt;item&gt;(*,*) values can depend on values.  Just this gives the simply typed &amp;lambda; calculus.
&lt;item&gt;([],[]) types can depend on types.
&lt;item&gt;([],*) values can depend on type.  Include all these three and you get F&lt;sub&gt;&amp;omega;&lt;/sub&gt;.
&lt;item&gt;(*,[]) types can depend on values.  Include this one to get dependent types.
&lt;/ul&gt;
With all four combination allowed you get Calculus of Construction (CoC).  If you always include (*,*), but make a choice of the other 3 you get 8 choices; these are the corners of the lambda cube.  All of these system have been studied.

BTW, all the systems in the lambda cube have the property that a well typed expression has a normal form.  (Well, the proof of this is so complicated for some of these systems that some people kinda doubt it.)

&lt;h3&gt;Examples&lt;/h3&gt;
Here the syntax &lt;i&gt;s&amp;rarr;t&lt;/i&gt; means &lt;i&gt;(_:s)&amp;rarr;t&lt;/i&gt;, where "_" is some new variable not used in &lt;i&gt;t&lt;/i&gt;.
&lt;ul&gt;
&lt;item&gt;Identity&lt;i&gt;&lt;ul&gt;
&lt;item&gt;id &amp;equiv; &amp;lambda;a:*.&amp;lambda;x:a.x
&lt;/ul&gt;&lt;/i&gt;
&lt;item&gt;Pairs&lt;i&gt;&lt;ul&gt;
&lt;item&gt;Pair  &amp;equiv; &amp;lambda;a:*.&amp;lambda;b:*.(c:*&amp;rarr;((a&amp;rarr;b&amp;rarr;c)&amp;rarr;c))
&lt;item&gt;pair  &amp;equiv; &amp;lambda;a:*.&amp;lambda;b:*.&amp;lambda;x:a.&amp;lambda;y:b.&amp;lambda;c:*.&amp;lambda;f:(a&amp;rarr;b&amp;rarr;c).f x y
&lt;item&gt;split &amp;equiv; &amp;lambda;a:*.&amp;lambda;b:*.&amp;lambda;r:*.&amp;lambda;f:(a&amp;rarr;b&amp;rarr;r).&amp;lambda;p:(Pair a b).p r f
&lt;item&gt;fst   &amp;equiv; &amp;lambda;a:*.&amp;lambda;b:*.&amp;lambda;p:(Pair a b).split a b a (&amp;lambda;x:a.&amp;lambda;y:b.x) p
&lt;item&gt;snd   &amp;equiv; &amp;lambda;a:*.&amp;lambda;b:*.&amp;lambda;p:(Pair a b).split a b b (&amp;lambda;x:a.&amp;lambda;y:b.y) p
&lt;/ul&gt;&lt;/i&gt;
&lt;/ul&gt;

My fingers are numb from all these greek characters, so I'll continue with examples another time.
And, of course, a parser and pretty printer.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-5687095637180134906?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/5687095637180134906/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=5687095637180134906' title='10 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/5687095637180134906'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/5687095637180134906'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2007/10/simpler-easier-in-recent-paper-simply.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>10</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-1682710382303035989</id><published>2007-08-20T20:51:00.000+01:00</published><updated>2007-08-20T21:31:38.730+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>&lt;b&gt;Quicksort in Haskell&lt;/b&gt;
Quicksort is a commonly used example of how succinct Haskell programming can be.
It usually looks something likes this:
&lt;pre&gt;
qsort :: (Ord a Bool) =&gt; [a] -&gt; [a]
qsort [] = []
qsort (x:xs) = qsort (filter (&lt;= x) xs) ++ [x] ++ qsort (filter (&gt; x) xs)
&lt;/pre&gt;
The problem with this function is that it's not really Quicksort.  Viewed through sufficently blurry glasses (or high abstraction altitude) it's looks like the real Quicksort.  What they have in common is overall algorithm: pick a pivot (always the first element), then recursively sort the ones that are smaller, the ones that are bigger, and then stick it all together.

But in my opinion the real Quicksort has to be imperative because it relies on destructive update; and it uses a very elegant partitioning algorithm.  The partitioning works like this: scan from the left for an element bigger than the pivot, then scan from the right for an element smaller than the pivot, and then swap them.  Repeat this until the array has been partitioned.

Haskell has a variety of array types with destructive updates (in different monads), so it's perfectly possible to write the imperative Quicksort in Haskell.

To make the code look reasonably nice I'm going to use my C-like DSEL to write the code.

Here it is:
&lt;pre&gt;
qsortM :: forall i a m r arr .
         (MonadRef m r, Num i, Ix i, MArray arr a m, Ord a Bool) =&gt;
         arr i a -&gt; m (arr i a)
qsortM ma = runE $ do
    (lb, ub) &amp;lt;- embed $ getBounds ma
    let mlb = pure0 lb
        mub = pure0 ub
    a &amp;lt;- liftArray ma

    let qsort' l r =
            if1 (r &gt; l) $ do
                i &amp;lt;- auto l
                j &amp;lt;- auto (r+1)
                let v = a[l] :: E m a
                    iLTj = i &amp;lt; (j :: E m i)
                while iLTj $ do
                    while ((i += 1) &amp;lt; mub &amp;&amp; a[i] &amp;lt; v)
                        skip
                    while (a[j -= 1] &gt; v)
                        skip
                    if1 iLTj $ do
                        a[i] =:= a[j]
                a[l] =:= a[j]
                qsort' l (j-1)
                qsort' i r

    qsort' mlb mub
    return ma
&lt;/pre&gt;

So the type is &lt;tt&gt;arr i a -&gt; m (arr i a)&lt;/tt&gt;, i.e., &lt;tt&gt;qsortM&lt;/tt&gt; takes an array indexed with &lt;tt&gt;i&lt;/tt&gt; and elements of type &lt;tt&gt;a&lt;/tt&gt;.  It returns the sorted array, but the sorting takes places in some monad &lt;tt&gt;m&lt;/tt&gt;.
And then there are all kinds of constraints on the type variables.
The &lt;tt&gt;MonadRef m r&lt;/tt&gt; says that the monad has to have references so we can have some variables.
The array index has to be in the &lt;tt&gt;Ix&lt;/tt&gt;; that's part of the general array constraints.  It also have to be numeric so we can add and subtract indicies.
The array type has to fulfill &lt;tt&gt;MArray arr a m&lt;/tt&gt; which means that &lt;tt&gt;arr&lt;/tt&gt; is an array of &lt;tt&gt;a&lt;/tt&gt; and updatable in monad &lt;tt&gt;m&lt;/tt&gt;.
Finally, the elements have to be ordered.  I'm not using the normal &lt;tt&gt;Ord&lt;/tt&gt; class, but instead an overloaded &lt;tt&gt;Ord&lt;/tt&gt; where the return type is overloaded too.

A few comments on the code.  The &lt;tt&gt;liftArray&lt;/tt&gt; function lifts a regular array into one that can be indexed with &lt;tt&gt;[i]&lt;/tt&gt;.  The &lt;tt&gt;=:=&lt;/tt&gt; operator swaps two variables.  The &lt;tt&gt;skip&lt;/tt&gt; function does nothing.  In traditional C style we do all the side effects while computing the condition.
There are some type signatures in the code that are annoying, but that I have not found a way around yet.

But otherwise, the code proceeds like most any imperative Quicksort.  The &lt;tt&gt;i&lt;/tt&gt; and &lt;tt&gt;j&lt;/tt&gt; variables scan the array from left and right to locate two elements that need swapping.  We then swap them, and continue scanning until the indicies cross.  After the partitioning we swap the pivot into place and sort the two parts recursively.

So, this function is polymorphic in the monad.  But there is one monad that I think is extra interesting, namely &lt;tt&gt;ST&lt;/tt&gt;.  With this monad you can do references, updatable arrays, etc., and finally you can seal it all of with &lt;tt&gt;runST&lt;/tt&gt;.  The resulting type is pure and shows no signs of what happened inside.
This is a really amazing feat, in my opinion.  The type checker performs the proof that nothing about the "dirty" innards leaks out.  So instead of some informal reasoning that a function with an impure inside can be pure on the outside you have a machine checked proof.  Of course, there's a meta proof that this is all correct, but John Launchbury and Simon Peyton Jones have already done that once, and now we just need the type checker.

Here's the code:
&lt;pre&gt;
qsortA :: forall i a . (Num i, Ix i, Ord a Bool) =&gt; Array i a -&gt; Array i a
qsortA a = runSTArray sa
  where sa :: ST s (STArray s i a)
        sa = thaw a &gt;&gt;= qsortM
&lt;/pre&gt;

We're using &lt;tt&gt;runSTArray&lt;/tt&gt; instead of &lt;tt&gt;runST&lt;/tt&gt;, because it provides an efficient way to turn a mutable array on the inside into an immutable array on the outside.
The &lt;tt&gt;thaw&lt;/tt&gt; function turns an immutable array into a mutable one, but it has to make a copy to be safe, since we don't want to mutate the original.

Finally, if we want to sort lists we can always convert back and forth.
&lt;pre&gt;
asList :: (Array Int a -&gt; Array Int a) -&gt; ([a] -&gt; [a])
asList f xs = elems . f . listArray (1, length xs) $ xs 

qsort :: (Prelude.Ord a) =&gt; [a] -&gt; [a]
qsort = asList qsortA
&lt;/pre&gt;

The final &lt;tt&gt;qsort&lt;/tt&gt; has the normal type signature (I switched to the normal &lt;tt&gt;Ord&lt;/tt&gt; again).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-1682710382303035989?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/1682710382303035989/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=1682710382303035989' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/1682710382303035989'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/1682710382303035989'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2007/08/quicksort-in-haskell-quicksort-is.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-73036381902045144</id><published>2007-08-16T11:17:00.000+01:00</published><updated>2007-08-16T12:09:11.261+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>&lt;b&gt;What about arrays?&lt;/b&gt;

After doing my little C-like DSEL in Haskell I started wondering if you could do arrays that look like C arrays too.  It turns out you can, but I can't figure out a way to make it safe from certain runtime errors.

Array indexing in C is written &lt;tt&gt;a[i]&lt;/tt&gt;, and this is legal syntax in Haskell too, so we're off to a good start.  But to make it work we need &lt;tt&gt;a&lt;/tt&gt; to be a function that takes a list and returns something.  What should it return?  It must be something that is both an l-value and an r-value.

Just as I had a function &lt;tt&gt;auto&lt;/tt&gt; to create a local variable, I'll have a function &lt;tt&gt;arr&lt;/tt&gt; to make a local array.  It will take the size of the array and then return a function that takes the index.  For simplicity I'll make the index be an &lt;tt&gt;Int&lt;/tt&gt;.
&lt;pre&gt;
arr :: forall a . [E Int] -&gt; E (forall v . [E Int] -&gt; E' v a)
&lt;/pre&gt;
So now this type checks
&lt;pre&gt;
atest = do {
    a &amp;lt;- arr[2];
    a[1] =: a[0] + 1;
  }
&lt;/pre&gt;

Now we just need the body of &lt;tt&gt;arr&lt;/tt&gt;.
&lt;pre&gt;
arr [s] = do
    s' &amp;lt;- runE s
    a &amp;lt;- newArray (0, s' - 1) undefined :: IO (IOArray Int a)
    let ix [i] = runE i
    return (\ is -&gt; V (ix is &gt;&gt;= readArray a)
                      (\ x -&gt; ix is &gt;&gt;= \ i -&gt; writeArray a i x))
&lt;/pre&gt;
The &lt;tt&gt;arr&lt;/tt&gt; function takes a list with one element, the size, and allocates an array (indexed from 0) with this size.  It then returns a function that expects a singleton list with an index and returns the &lt;tt&gt;V&lt;/tt&gt; constructor which I used for variables.

For multidimensional arrays we can extend the arr function.
&lt;pre&gt;
arr ss = do
    ss' &amp;lt;- mapM runE ss
    let sz = product ss'
        ix is = do
                    is' &amp;lt;- mapM runE is
                    when (length is' /= length ss') $ error "wrong number of indicies"
                    return $ foldr (\ (i, s) r -&gt; r * s + i) 0 (zip is' ss')
    a &amp;lt;- newArray (0, product ss' - 1) undefined :: IO (IOArray Int a)
    return (\ is -&gt; V (ix is &gt;&gt;= readArray a) (\ x -&gt; ix is &gt;&gt;= \ i -&gt; writeArray a i x))
&lt;/pre&gt;

The problem with both these definitions is that the number of indicies is checked dynamically rather than statically.  To do it statically we'd have to be able to overload the syntax for list literals to use a type that keeps track of the number of elements.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-73036381902045144?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/73036381902045144/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=73036381902045144' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/73036381902045144'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/73036381902045144'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2007/08/what-about-arrays-after-doing-my-little.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-1130234130256692854</id><published>2007-08-13T23:47:00.000+01:00</published><updated>2007-08-14T18:56:18.039+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>&lt;b&gt;Programming in C, ummm, Haskell&lt;/b&gt;

Here's a small Haskell program for computing factorial.
&lt;pre&gt;
fac n = do {
    a &amp;lt;- auto 1;
    i &amp;lt;- auto n;
    while (i &gt;. 0) $ do {
        a *= i;
        i -= 1;
    };
    a;
  }
&lt;/pre&gt;
It even runs, &lt;tt&gt;runE (fac 10)&lt;/tt&gt; produces 3628800.

Let's compare that to the C program doing the same thing.
&lt;pre&gt;
int
fac(int n) {
    auto a = 1;
    auto i = n;
    while (i &gt; 0) {
        a *= i;
        i -= 1;
    }
    return a;
}
&lt;/pre&gt;
They look rather similar, don't they?  (BTW, I decided to use the heavily underused C keyword &lt;tt&gt;auto&lt;/tt&gt;.  If you don't know C, don't worry, &lt;tt&gt;auto&lt;/tt&gt; doesn't mean anything.  A local declaration &lt;tt&gt;auto int i;&lt;/tt&gt; is the same as &lt;tt&gt;int i;&lt;/tt&gt; in C.)

I often hear people complain that C-like programming in Haskell is ugly and verbose, but I don't think that has to be the case.  What people complain about is when they write code like this:
&lt;pre&gt;
fac' n = do
    a &amp;lt;- newIORef 1
    i &amp;lt;- newIORef n
    whileM (do i' &amp;lt;- readIORef i; return $ i' &gt; 0) $ do
        i' &amp;lt;- readIORef i
        a' &amp;lt;- readIORef a
        writeIORef a (a' * i')
        writeIORef i (i' + 1)
    a' &amp;lt;- readIORef a
    return a'
&lt;/pre&gt;
And I agree, it's ugly, it's verbose, but it's not necessary.

One of the reasons it's verbose are the uses of &lt;tt&gt;readIORef&lt;/tt&gt; and &lt;tt&gt;writeIORef&lt;/tt&gt;.  In C you just refer to the variable and you'll get an r-value when you need it, and you'll get an l-value when you need it.
And you can do the same in Haskell.

Let's look at a tiny version of the problem.  We want the following operations:
&lt;pre&gt;
type V a ...                     -- variables of type a
type E a ...                     -- expressions of type a
(=:) :: V a -&gt; E a -&gt; IO ()      -- assignment
plus :: E Int -&gt; E Int -&gt; E Int  -- addition
one :: E Int                     -- constant 1
&lt;/pre&gt;
Here's a tiny sample:
&lt;pre&gt;
  x &amp;lt;- auto one
  x =: x `plus` x

-- the following should be a type error
  (x `plus` x) =: one
&lt;/pre&gt;
How can we make this happen?  We want &lt;tt&gt;x&lt;/tt&gt; to be able to be the left hand side of an assignment as well as an expression on the right hand side.  But the assignment operator has different types on the left and right.  So we could possibly make &lt;tt&gt;V&lt;/tt&gt; the same as &lt;tt&gt;E&lt;/tt&gt;.  But this would be bad, because we want the left hand side to be a variable and now we'd have no way of knowing, which means that it would no longer be a type error to assign to a non-variable.

So, we need a little bit of type trickery.  First, we'll make &lt;tt&gt;V&lt;/tt&gt; and &lt;tt&gt;E&lt;/tt&gt; "subtypes" of the same type, so they have something in common.
&lt;pre&gt;
data E' v a = ...
data LValue
data RValue
type V a = E' LValue a
type E a = E' RValue a
&lt;/pre&gt;
The type &lt;tt&gt;E'&lt;/tt&gt; has an additional type argument that determines if the value is an l-value or an r-value.  So what about &lt;tt&gt;auto&lt;/tt&gt;, what type should it have?  It needs to return something that is both an l-value and an r-value.  So we'll make it polymorphic!  First attempt:
&lt;pre&gt;
auto :: E a -&gt; IO (E' v a)
&lt;/pre&gt;
But this won't work.  Saying "&lt;tt&gt;x &amp;lt;- auto 2; ...&lt;/tt&gt;" is the same as "&lt;tt&gt;auto 2 &gt;&gt;= \ x -&gt; ...&lt;/tt&gt;".  And when a variable is lambda bound it is not polymorphic.  What does this mean?  It means that we can use &lt;tt&gt;x&lt;/tt&gt; as either an l-value or as an r-value inside &lt;tt&gt;...&lt;/tt&gt;, but not both; the &lt;tt&gt;v&lt;/tt&gt; type variable can only have a single type.

Are we stuck?  Well, we would have been without higher ranked polymorphism.  But here's a type that works:
&lt;pre&gt;
auto :: E a -&gt; IO (forall v . E' v a)
&lt;/pre&gt;
So now &lt;tt&gt;x&lt;/tt&gt; will have type &lt;tt&gt;forall v . E' v a&lt;/tt&gt; which is indeed polymorphic in &lt;tt&gt;v&lt;/tt&gt; just as we wanted.

OK, so now we have some type signatures that work (using stub implementations it's easy to check that the types work out).  So now we need to implement the types and the operations.

Let's start with r-values.  To make a constructor that only can construct r-values we'll use a GADT.  The &lt;tt&gt;E&lt;/tt&gt; type will simply embed an &lt;tt&gt;IO&lt;/tt&gt; value.  So when we see the type &lt;tt&gt;E' RValue a&lt;/tt&gt; it's really isomorphic to &lt;tt&gt;IO a&lt;/tt&gt;.
To extract the &lt;tt&gt;IO&lt;/tt&gt; value we have the function &lt;tt&gt;runE&lt;/tt&gt;.
&lt;pre&gt;
data E' v a where
    E :: IO a -&gt; E' RValue a

runE :: E' v a -&gt; IO a
runE (E t) = t
&lt;/pre&gt;

So now we can do &lt;tt&gt;plus&lt;/tt&gt; and &lt;tt&gt;one&lt;/tt&gt;; they are totally straightforward except that we need to unwrap and wrap the &lt;tt&gt;E&lt;/tt&gt; constructor.
&lt;pre&gt;
plus :: E Int -&gt; E Int -&gt; E Int
plus x y = E $ do
    x' &amp;lt;- runE x
    y' &amp;lt;- runE y
    return $ x' + y'

one :: E Int
one = E $ return 1
&lt;/pre&gt;

And then the hard part, variables.  To represent a variable we'll use two fields, one that reads the variable and one that assigns the variable.  We need to extend the &lt;tt&gt;runE&lt;/tt&gt; function to use the variable read field to get the value.
The &lt;tt&gt;auto&lt;/tt&gt; function allocates a new variable and then packages up the two fields.
&lt;pre&gt;
data E' v a where
    E :: IO a -&gt; E' RValue a
    V :: IO a -&gt; (a -&gt; IO ()) -&gt; E' v a

runE :: E' v a -&gt; IO a
runE (E t) = t
runE (V t _) = t

auto :: E a -&gt; IO (forall v . E' v a)
auto x = do
    x' &amp;lt;- runE x
    r  &amp;lt;- newIORef x'
    return (V (readIORef r) (writeIORef r))
&lt;/pre&gt;

We're about done, just assignment left.  It's easy, just use the assignment field from the &lt;tt&gt;V&lt;/tt&gt; constructor.
&lt;pre&gt;
(=:) :: V a -&gt; E a -&gt; IO ()
(V _ asg) =: e = do
    e' &amp;lt;- runE e
    asg e'
&lt;/pre&gt;

Hmmm, but there's a missing case in that definition.  What about the constructor &lt;tt&gt;E&lt;/tt&gt;?  Can't we get a runtime failure?  No.  Look at the type of the first argument, it's &lt;tt&gt;V a&lt;/tt&gt;, i.e., &lt;tt&gt;E' LValue a&lt;/tt&gt;.  The &lt;tt&gt;E&lt;/tt&gt; constructor can only construct values of type &lt;tt&gt;E' RValue a&lt;/tt&gt;, so we just can't get a match failure.

Now our little example above compiles, and trying the bad expression &lt;tt&gt;(x `plus` x) =: one&lt;/tt&gt; gives this error:
&lt;pre&gt;
    Couldn't match expected type `LValue'
    against inferred type `RValue'
      Expected type: V a
      Inferred type: E Int
&lt;/pre&gt;

Once we've made the &lt;tt&gt;E&lt;/tt&gt; data type it's totally routine to make the factorial function I first showed work.  We just need to create an instance for &lt;tt&gt;Num (E a)&lt;/tt&gt; and a few more functions.

Is this the end of the story?  Is everything rosy?  Is C programming in Haskell as smooth as that?  Well, there are some flies in the ointment.  While it's true that programming with the &lt;tt&gt;E&lt;/tt&gt; type is easy, it's a little cumbersome if we want to use pure functions.  Pute functions will not have the &lt;tt&gt;E&lt;/tt&gt; type and will have to be lifted into this new brave world.  Likewise, IO types will have to be lifted.  So while we have reduced some verbosity, it has popped up again in a different place.
Which is better?  I don't know, but I thought I'd share the l-value/r-value trick.

Oh, and (of course), there's nothing special about the IO monad.  I just used it as an example; any monad with references will work.  You can parametrize &lt;tt&gt;E'&lt;/tt&gt; over the underlying monad.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-1130234130256692854?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/1130234130256692854/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=1130234130256692854' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/1130234130256692854'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/1130234130256692854'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2007/08/programming-in-c-ummm-haskell-heres.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-3590828127636416983</id><published>2007-07-09T10:07:00.001+01:00</published><updated>2007-07-09T17:56:40.799+01:00</updated><title type='text'></title><content type='html'>&lt;span style="font-weight: bold;"&gt;Experiencing magic&lt;/span&gt;


What happens when you let a road planner be on crack at work?  You get the &lt;a href="http://en.wikipedia.org/wiki/Magic_Roundabout_%28Swindon%29"&gt;Magic Roundabout&lt;/a&gt;!
I've seen and driven it now.  It's clear to me why the locals sometimes refers to this as the Tragic Roundabout, because there's no way an unsuspecting stranger can navigate through this unless totally sober, super alert, there's full daylight, light traffic, ...

I recommend a Sunday afternoon.

&lt;a href="http://local.live.com/default.aspx?v=2&amp;cp=sktdr5gvk40n&amp;amp;style=o&amp;lvl=2&amp;amp;scene=4343838"&gt;Have a look.&lt;/a&gt;  BTW, this is the the UK, so if you drive on the right, flip the image. :)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-3590828127636416983?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/3590828127636416983/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=3590828127636416983' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/3590828127636416983'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/3590828127636416983'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2007/07/experiencing-magic-what-happens-when.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-4815888566985055207</id><published>2007-07-01T03:16:00.000+01:00</published><updated>2007-07-01T10:07:02.102+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='overloading'/><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>&lt;b&gt;Massive overload&lt;/b&gt;

In &lt;a href="http://augustss.blogspot.com/2007/06/generating-more-code-with-harpy-after.html"&gt;my last post&lt;/a&gt; I had a little DSL embedded in Haskell.  Defining the Fibonacci function looked like this:
&lt;pre&gt;
mdo
    fib &lt;- fun $ \ n -&gt; cond (n .&lt; 2) 1 (fib(n-1) + fib(n-2))
    return fib
&lt;/pre&gt;
Compare that with ordinary Haskell:
&lt;pre&gt;
fib n = if n &lt; 2 then 1 else fib(n-1) + fib(n-2)
&lt;/pre&gt;

Why do they look different?  Could I make my DSL look exactly like Haskell?

First, lets look at the parts that look the same, like &lt;tt&gt;fib(n-1)&lt;/tt&gt;.  In Haskell, with &lt;tt&gt;n&lt;/tt&gt; of type &lt;tt&gt;Int&lt;/tt&gt;, this just an expression and it has some value (depending on &lt;tt&gt;n&lt;/tt&gt;).  In the DSL, with &lt;tt&gt;n&lt;/tt&gt; of type &lt;tt&gt;M Int&lt;/tt&gt; the value of this expression is in fact an abstract syntax tree, namely (something like) &lt;tt&gt;M (App (Func (FuncNo 0)) [App (FPrimOp I_Sub) [Arg 0, Con (ILit 1)]])&lt;/tt&gt;.

The reason we can make the same expression mean different things depending on the type is that numerical operators like &lt;tt&gt;-&lt;/tt&gt; are overloaded, and so they do different things for &lt;tt&gt;Int&lt;/tt&gt; and &lt;tt&gt;M Int&lt;/tt&gt;.

So what about &lt;tt&gt;&amp;lt;&lt;/tt&gt;, can we overload that too?
The &lt;tt&gt;&amp;lt;&lt;/tt&gt; operator is already overloaded and has type &lt;tt&gt;(Ord a) =&gt; a -&gt; a -&gt; Bool&lt;/tt&gt;.  But the &lt;tt&gt;.&amp;lt;&lt;/tt&gt; operator in the DSL returns &lt;tt&gt;M Bool&lt;/tt&gt;, so there's no way we can use &lt;tt&gt;&amp;lt;&lt;/tt&gt;; the return type is fixed.

What to do?  Well, there's nothing sacred with the operator &lt;tt&gt;&amp;lt;&lt;/tt&gt;, it's just another name, but defined in the Prelude.  We can make our own if we hide the Prelude.  So let's do that, but carefully.  We want to be able to still use &lt;tt&gt;&amp;lt;&lt;/tt&gt; as before, and for our &lt;tt&gt;M Bool&lt;/tt&gt; return type.  So we need to overload the return type as well as the argument type.

&lt;b&gt;Overloaded Booleans&lt;/b&gt;
Before we do that, let's define a class that is analogous to &lt;tt&gt;Num&lt;/tt&gt;, but for Boolean values.  We'll need this class soon.  The &lt;tt&gt;Bool&lt;/tt&gt; type has a few important functions and values, let's put those in the class.
&lt;pre
import qualified Prelude as P
import Prelude hiding ((&amp;&amp;), (||), not)

class Boolean b where
    false, true :: b
    (&amp;&amp;), (||) :: b -&gt; b -&gt; b
    not :: b -&gt; b

instance Boolean Bool where
    false = False
    true = True
    (&amp;&amp;) = (P.&amp;&amp;)
    (||) = (P.||)
    not = P.not
&lt;/pre&gt;
First we import the Prelude explicitly, this allows us to hide some things we want to redefine.  The we define the Boolean class, which gives new meaning to some prelude functions.  Finally, we make an instance saying that the new &lt;tt&gt;(&amp;&amp;), (||)&lt;/tt&gt;, and &lt;tt&gt;not&lt;/tt&gt; behave as the old ones for good oldfashioned &lt;tt&gt;Bool&lt;/tt&gt;.

After this definition we can use the Boolean operators just as before, they are just overloaded now, but for &lt;tt&gt;Bool&lt;/tt&gt; they resolve to their old meaning.

&lt;b&gt;Overloaded comparison&lt;/b&gt;
So back to comparison, &lt;tt&gt;Eq&lt;/tt&gt; first.
&lt;pre&gt;
class (Boolean b) =&gt; Eq a b | a -&gt; b where
    (==), (/=) :: a -&gt; a -&gt; b
    x /= y  =  not (x == y)
&lt;/pre&gt;
The new &lt;tt&gt;Eq&lt;/tt&gt; (we need to hide the Prelude one) is now more general; the return type is overloaded as well.  Too avoid ambiguities we have a functional dependency.  The type of the values we compare will determine the type of the Boolean result.  (This isn't necessary, but without this we might need many type annotations.)

Nothing is a member of this class, so we need to add the types we need.  We really should all types that have equality, but let's do two samples.
&lt;pre&gt;
instance Eq Int Bool where
    (==) = (P.==)
    (/=) = (P./=)
instance Eq Double Bool where
    (==) = (P.==)
    (/=) = (P./=)
&lt;/pre&gt;

And since the whole point of this was to allow special equality for our &lt;tt&gt;M&lt;/tt&gt; values, let's look at that too.
&lt;pre&gt;
instance Eq (M Int) (M Bool) where
    (==) = binOp I_EQ
    (/=) = binOp I_NE
&lt;/pre&gt;

A new (simplified) &lt;tt&gt;Ord&lt;/tt&gt; looks similar, both the class and the instances.
&lt;pre&gt;
class (Eq a b) =&gt; Ord a b | a -&gt; b where
    (&lt;), (&lt;=), (&gt;), (&gt;=) :: a -&gt; a -&gt; b
&lt;/pre&gt;

&lt;b&gt;Overloaded conditionals&lt;/b&gt;
So now for the next challange.  The Haskell Fibonacci function uses &lt;tt&gt;if&lt;/tt&gt;, and the DSL uses a special function &lt;tt&gt;cond&lt;/tt&gt;.  Well, now we are stuck.  These is no way at all to overload the special &lt;tt&gt;if&lt;/tt&gt; syntax in Haskell; it's wired in.  This could be viewed as a design mistake in Haskell, but so it is.

So could we use the &lt;tt&gt;cond&lt;/tt&gt; function in the Haskell version too?  Well, not the original &lt;tt&gt;cond&lt;/tt&gt;, it has type &lt;tt&gt;M Bool -&gt; M Int -&gt; M Int -&gt; M Int&lt;/tt&gt;.  We need to overload that too.
&lt;pre&gt;
class (Boolean b) =&gt; Cond a b | a -&gt; b where
    cond :: b -&gt; a -&gt; a -&gt; a
&lt;/pre&gt;
Again, we add a functional dependency.  And again, it will save some ambiguities.

The obvious instances:
&lt;pre&gt;
instance Cond Int Bool where
    cond x y z = if x then y else z

instance Cond (M Int) (M Bool) where
    cond = terOp I_Cond
instance Cond (M Bool) (M Bool) where
    cond = terOp I_Cond
terOp op (M c) (M t) (M e) = M $ App (FPrimOp op) [c, t, e]
&lt;/pre&gt;

Oh, and the instance for &lt;tt&gt;M Bool&lt;/tt&gt; that I skipped before.
&lt;pre&gt;
instance Boolean (M Bool) where
    false  = M $ Con $ LInt 0
    true   = M $ Con $ LInt 1
    x &amp;&amp; y = cond x y false
    x || y = cond x true y
    not x  = cond x false true
&lt;/pre&gt;

After this effort we can write the body of the Fibonacci function the same way for both type &lt;tt&gt;Int&lt;/tt&gt; and type &lt;tt&gt;M Int&lt;/tt&gt;, namely
&lt;pre&gt;
cond (n &lt; 2) 1 (fib(n-1) + fib(n-2))
&lt;/pre&gt;

And what about all these new classes and instances. how do we package them to make them easy to use?
&lt;pre&gt;
module MyPrelude(module Prelude, Boolean(..), Eq(..), Ord(..), Cond(..)) where
import qualified Prelude as P
import Prelude hiding (Eq(..), Ord(..), (&amp;&amp;), (||), not)

...
&lt;/pre&gt;
We hide the stuff we want to override, and re-export the rest of the Prelude.  Any module that wants to use this new Prelude now has to say:
&lt;pre&gt;
module M where
import Prelude()
import MyPrelude
...
&lt;/pre&gt;
After this all regular Haskell code works as before, but we can also take advantage of the new overloadings.

Enough for today.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-4815888566985055207?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/4815888566985055207/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=4815888566985055207' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/4815888566985055207'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/4815888566985055207'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2007/06/massive-overload-in-my-last-post-i-had.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-6298018920768383747</id><published>2007-06-29T17:53:00.001+01:00</published><updated>2007-06-29T19:00:29.114+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='Code generation'/><title type='text'></title><content type='html'>&lt;b&gt;Generating more code with Harpy&lt;/b&gt;
After a slight detour into expression representation I'm back to generating code again.

To recap, here's how I'm going to do my embedded expressions in Haskell.  For now I'll stick to ugly looking boolean operators.  Fixing that is a whole other story.
&lt;pre&gt;
module Exp(Exp(..), Lit(..), Fun(..), PrimOp(..)) where

data Exp
    = Con Lit
    | Arg Int
    | App Fun [Exp]
    deriving (Eq, Ord, Show)

data Lit
    = LInt  Int
    deriving (Eq, Ord, Show)

data Fun
    = FPrimOp PrimOp
    deriving (Eq, Ord, Show)

data PrimOp
    = I_Add | I_Sub | I_Mul | I_Quot | I_Rem | I_Neg
    | I_EQ | I_NE | I_LT | I_LE | I_GT | I_GE
    | I_Cond
    deriving (Eq, Ord, Show)
-----------------------------------------------------
module MExp(M(..)) where
import Exp

newtype M a = M { unM :: Exp }
    deriving (Show)
-----------------------------------------------------
module M(M, cond,
         (.==), (./=), (.&lt;), (.&lt;=), (.&gt;), (.&gt;=),
         false, true, (.&amp;&amp;), (.||)
        ) where
import Exp
import MExp

instance Eq (M a)
instance Ord (M a)

instance Num (M Int) where
    x + y  =  binOp I_Add x y
    x - y  =  binOp I_Sub x y
    x * y  =  binOp I_Mul x y
    negate x =  unOp I_Neg x
    fromInteger i = M $ Con $ LInt $ fromInteger i

instance Enum (M Int)
instance Real (M Int) where

instance Integral (M Int) where
    quot x y = binOp I_Quot x y
    rem x y = binOp I_Rem x y

binOp op (M x) (M y) = M $ App (FPrimOp op) [x, y]
unOp op (M x) = M $ App (FPrimOp op) [x]

--------

infix 4 ./=, .==, .&lt;, .&lt;=, .&gt;, .&gt;=
(.==), (./=), (.&lt;), (.&lt;=), (.&gt;), (.&gt;=) :: M Int -&gt; M Int -&gt; M Bool
(.==) = binOp I_EQ
(./=) = binOp I_NE
(.&lt;)  = binOp I_LT
(.&lt;=) = binOp I_LE
(.&gt;)  = binOp I_GT
(.&gt;=) = binOp I_GE

cond :: M Bool -&gt; M Int -&gt; M Int -&gt; M Int
cond (M c) (M t) (M e) = M $ App (FPrimOp I_Cond) [c, t, e]

condB :: M Bool -&gt; M Bool -&gt; M Bool -&gt; M Bool
condB (M c) (M t) (M e) = M $ App (FPrimOp I_Cond) [c, t, e]

false, true :: M Bool
false = M $ Con $ LInt 0
true  = M $ Con $ LInt 1

infixr 3 .&amp;&amp;
infixr 2 .||
(.&amp;&amp;), (.||) :: M Bool -&gt; M Bool -&gt; M Bool
x .&amp;&amp; y = condB x y false
x .|| y = condB x true y
&lt;/pre&gt;
So the &lt;tt&gt;Exp&lt;/tt&gt; type is the internal representation of expressions.  It's just constants, variables, and some primitive operations.
The &lt;tt&gt;M&lt;/tt&gt; (for machine) type is the phantom type that the DSL user will see.  The module &lt;tt&gt;M&lt;/tt&gt; contains all the user visible functions.  Note that it handles booleans by represented the C way, by an int that is 0 or 1.  There's a conditional function, &lt;tt&gt;cond&lt;/tt&gt;, and a corresponding primitive that will serve as our &lt;tt&gt;if&lt;/tt&gt;.

So now we need to generate code for these.  This is all very similar to what I did in a previous posting, I have just refactored some of the code generation.  The invariant (that I arbitrarily decided on) is that each block of code that implements a primitive operation will may clobber EAX but must leave all other registers intact.

This is mostly just a chunk of rather boring code.  Division is causing problems as usual, since the IA32 instructions don't allow you to specify a destination register.  One point worth a look is the code generation for &lt;tt&gt;cond&lt;/tt&gt;.  It has to test a boolean and then select one of two code blocks, but that is just what you'd expect.
&lt;pre&gt;
module CodeGen(cgExp, CGen, cgPrologue, cgEpilogue, compileFun) where
import Prelude hiding(and, or)
import Control.Monad.Trans
import Data.Maybe(fromJust)
import Foreign
import Harpy.CodeGenMonad
import Harpy.X86Assembler

import Exp
import MExp

type StackDepth = Int
type Env = ()
type CGen a = CodeGen Env StackDepth a

addDepth :: StackDepth -&gt; CGen ()
addDepth i = do
    d &lt;- getState
    setState (d+i)

cgExp :: Exp -&gt; CGen ()
cgExp (Con l) = cgLit l
cgExp (Arg a) = cgArg a
cgExp (App (FPrimOp I_Cond) [c, t, e]) = cgCond c t e
cgExp (App f es) = do mapM_ cgExp es; cgFun f (length es)

cgFun (FPrimOp p) _ = cgPrimOp p

cgPrimOp I_Add = twoOp add
cgPrimOp I_Sub = twoOp sub
cgPrimOp I_Mul = twoOp (imul InPlace)
cgPrimOp I_Quot = twoOp (cgQuotRem eax)
cgPrimOp I_Rem = twoOp (cgQuotRem edx)
cgPrimOp I_Neg = oneOp neg
cgPrimOp I_EQ = cmpOp sete
cgPrimOp I_NE = cmpOp setnz
cgPrimOp I_LT = cmpOp setl
cgPrimOp I_LE = cmpOp setle
cgPrimOp I_GT = cmpOp setg
cgPrimOp I_GE = cmpOp setge

cgCond c t e = do
    cgExp c
    popReg eax
    test eax eax
    l1 &lt;- newLabel
    l2 &lt;- newLabel
    jz l1
    cgExp t
    addDepth (-1)               -- pretend last cgExp didn't push anything
    jmp l2
    l1 @@ cgExp e
    l2 @@ return ()

cmpOp op = twoOp $ \ r1 r2 -&gt; do
    cmp r1 r2
    op (reg32ToReg8 r1)
    and r1 (1 :: Word32)

cgLit (LInt i) = do
    mov eax (fromIntegral i :: Word32)
    pushReg eax

cgArg :: Int -&gt; CGen ()
cgArg n = do
    d &lt;- getState
    let o = 4 * (d + n)
    mov eax (Disp (fromIntegral o), esp)
    pushReg eax

cgQuotRem res r1 r2 = do
    push edx                    -- save temp reg
    push r2                     -- push second operand (in case it's edx)
    xmov  eax r1                -- first operand must be in eax
    mov  edx eax
    sar  edx (31 :: Word8)      -- sign extend
    idiv (Disp 0, esp)          -- divide by second operand (now on stack)
    add  esp (4 :: Word32)      -- remove second operand
    xmov  r1 res                -- put result in r1
    pop  edx                    -- restore edx

-- Do a register to register move, but do generate no-ops
xmov r1 r2 = if r1 == r2 then return () else mov r1 r2

reg32ToReg8 (Reg32 r) = Reg8 r

--------

twoOp op = do
    pop ebx
    pop ecx
    op ecx ebx
    push ecx
    addDepth (-1)

oneOp op = do
    pop ebx
    op ebx
    push ebx

pushReg r = do
    push r
    addDepth 1

popReg r = do
    pop r
    addDepth (-1)

--------

compileFun :: (M Int -&gt; M Int) -&gt; CGen ()
compileFun f = do
    ensureBufferSize 1000
    setState startDepth
    cgPrologue
    cgExp $ unM $ f $ M $ Arg 0
    cgEpilogue

----

savedRegs = [ebx, ecx]

startDepth = length savedRegs + 1

-- Push all register we'll be using, except eax.
cgPrologue :: CGen ()
cgPrologue = do
    mapM_ push savedRegs

-- Pop return value to eax, restore regs, and return
cgEpilogue :: CGen ()
cgEpilogue = do
    popReg  eax
    mapM_ pop (reverse savedRegs)
    ret
&lt;/pre&gt;
Given all this, we can generate code for a function and call it from Haskell code.
But that's rather boring since we have only primitive functions and no way to call DSL functions from within the DSL.  So we have no recursion.  And as we all know, recursion is where the fun starts.

So we need to add expressions and code generation for calling functions.  First we extend the &lt;tt&gt;Exp&lt;/tt&gt; module.
&lt;pre&gt;
data Fun
    = FPrimOp PrimOp
    | Func FuncNo

newtype FuncNo = FuncNo Int
&lt;/pre&gt;
The idea being that our DSL functions will be represented by a &lt;tt&gt;FuncNo&lt;/tt&gt; which is an &lt;tt&gt;Int&lt;/tt&gt;.

Next, we will extend the code generation.  While generating code we will keep an environment that maps function numbers to the corresponding labels to call.
&lt;pre&gt;
type Env = [(FuncNo, Label)]
...
cgFun (Func f) n = do
    e &lt;- getEnv
    call $ fromJust $ lookup f e
    add esp (fromIntegral (4 * n) :: Word32)
    addDepth (-n)
    pushReg eax
&lt;/pre&gt;
To call a function we look up its label and generate a call instruction to that label.  Then we pop off the arguments that we pushed before the call, and finally we push EAX (where the return value is).  Simple, huh?

But where did the environment come from?  Well, it's time to be a little inventive and introduce a monad.  Functions are represented by unique integers in the expression type, so we'll need a state monad to generate them.

&lt;pre&gt;
module Gen where
import Control.Monad.State
import Harpy.CodeGenMonad
import Exp
import MExp
import CodeGen

data GenState = GenState { funcs :: [(FuncNo, CGen ())], nextFunc :: Int }

startGenState :: GenState
startGenState = GenState { funcs = [], nextFunc = 0 }

type G a = State GenState a

fun :: MInt2MInt -&gt; G MInt2MInt
fun f = do
    s &lt;- get
    let n = nextFunc s
        fno = FuncNo n
    put GenState{ funcs = (fno, compileFun f) : funcs s,
                  nextFunc = n + 1 }
    return $ \ x -&gt; M $ App (Func fno) [unM x]

runG :: G MInt2MInt -&gt; CGen ()
runG g = do
    let (ret, s) = runState g startGenState
        funs = reverse $ funcs s
    funMap &lt;- mapM (\ (fno, _) -&gt; do l &lt;- newLabel; return (fno, l)) funs
    withEnv funMap $ do
        compileFun ret
        zipWithM_ (\ (f, l) (_, g) -&gt; l @@ g) funMap funs
&lt;/pre&gt;
This code actually has some interesting points.

First, the monad, &lt;tt&gt;G&lt;/tt&gt;, keeps a list of defined functions.  The list has the function number and the &lt;tt&gt;CodeGen&lt;/tt&gt; block that will generate code for it.

Second, the &lt;tt&gt;fun&lt;/tt&gt; function is the one that creates a new function number.  It also calls &lt;tt&gt;compileFun&lt;/tt&gt; to get a &lt;tt&gt;CodeGen&lt;/tt&gt; block that will generate code for the function.  Note that no code is generated at this point.  The &lt;tt&gt;G&lt;/tt&gt; monad is just a simple state monad, not the IO based &lt;tt&gt;codeGen&lt;/tt&gt; monad.  Also note how &lt;tt&gt;fun&lt;/tt&gt; returns an expression that is of the same type as the argument, but it now uses an &lt;tt&gt;App&lt;/tt&gt; to call the function.

Finally, the &lt;tt&gt;runG&lt;/tt&gt; function generates all the code.  It uses &lt;tt&gt;runState&lt;/tt&gt; to obtain the list of defined function and the result to return.  Now code generation starts.  For each function we generate a new label.  Pairing up the function number and the label forms our environment, &lt;tt&gt;funMap&lt;/tt&gt;.  With this environment installed we start code generation for the functions, first the return value, then all the defined function; each with its label attached.

Well, that's it.  Let's wrap it all up.
&lt;pre&gt;
module Compile(compileIO, disasmIO, module M, module Gen) where
import Convert
import Gen
import M

type MInt = M Int

compileIO :: G (MInt -&gt; MInt) -&gt; IO (Int -&gt; Int)
compileIO f = fmap flex $ compileIOW32 f
  where flex g = fromIntegral . g . fromIntegral

compileIOW32 :: G (MInt -&gt; MInt) -&gt; IO (Word32 -&gt; Word32)
compileIOW32 = conv_Word32ToWord32 [] undefined . runG

disasmIO :: G (MInt -&gt; MInt) -&gt; IO String
disasmIO = disasm [] undefined . runG
&lt;/pre&gt;

And, of course, a test.  Here we hit a snag.  We want recursion, but monadic bindings (&lt;tt&gt;do&lt;/tt&gt;) are not recursive.
Luckily, ghc does implement recursive &lt;tt&gt;do&lt;/tt&gt; called &lt;tt&gt;mdo&lt;/tt&gt; (why that name, I have no idea) for any monad that is in the class &lt;tt&gt;MonadFix&lt;/tt&gt;.  And the state monad is, so we are in luck.

&lt;pre&gt;
{-# OPTIONS_GHC -fglasgow-exts #-}
module Main where
import Compile

main = do
    let g = mdo
                fib &lt;- fun $ \ n -&gt; cond (n .&lt; 2) 1 (fib(n-1) + fib(n-2))
                return $ fib
    test &lt;- compileIO g
    print (test 40)
&lt;/pre&gt;
It even runs and produces the right answer, 165580141.  Running time for this example is about 4.0s on my machine.  Running fib compiled with 'ghc -O' takes about 3.4s, so we're in the same ballpark.

Oh, and if anyone wonders what the code looks like for this example, here it is.  This is really, really bad code.
&lt;pre&gt;
003d6cd0  53                            push   ebx
003d6cd1  51                            push   ecx
003d6cd2  8b 44 24 0c                   mov    eax,dword ptr [esp+12]
003d6cd6  50                            push   eax
003d6cd7  e8 08 00 00 00                call   [3d6ce4H]
003d6cdc  83 c4 04                      add    esp,4H
003d6cdf  50                            push   eax
003d6ce0  58                            pop    eax
003d6ce1  59                            pop    ecx
003d6ce2  5b                            pop    ebx
003d6ce3  c3                            ret
003d6ce4  53                            push   ebx
003d6ce5  51                            push   ecx
003d6ce6  8b 44 24 0c                   mov    eax,dword ptr [esp+12]
003d6cea  50                            push   eax
003d6ceb  b8 02 00 00 00                mov    eax,2H
003d6cf0  50                            push   eax
003d6cf1  5b                            pop    ebx
003d6cf2  59                            pop    ecx
003d6cf3  3b cb                         cmp    ecx,ebx
003d6cf5  0f 9c c1                      setl   cl
003d6cf8  83 e1 01                      and    ecx,1H
003d6cfb  51                            push   ecx
003d6cfc  58                            pop    eax
003d6cfd  85 c0                         test   eax,eax
003d6cff  0f 84 0b 00 00 00             je     [3d6d10H]
003d6d05  b8 01 00 00 00                mov    eax,1H
003d6d0a  50                            push   eax
003d6d0b  e9 37 00 00 00                jmp    [3d6d47H]
003d6d10  8b 44 24 0c                   mov    eax,dword ptr [esp+12]
003d6d14  50                            push   eax
003d6d15  b8 01 00 00 00                mov    eax,1H
003d6d1a  50                            push   eax
003d6d1b  5b                            pop    ebx
003d6d1c  59                            pop    ecx
003d6d1d  2b cb                         sub    ecx,ebx
003d6d1f  51                            push   ecx
003d6d20  e8 bf ff ff ff                call   [3d6ce4H]
003d6d25  83 c4 04                      add    esp,4H
003d6d28  50                            push   eax
003d6d29  8b 44 24 10                   mov    eax,dword ptr [esp+16]
003d6d2d  50                            push   eax
003d6d2e  b8 02 00 00 00                mov    eax,2H
003d6d33  50                            push   eax
003d6d34  5b                            pop    ebx
003d6d35  59                            pop    ecx
003d6d36  2b cb                         sub    ecx,ebx
003d6d38  51                            push   ecx
003d6d39  e8 a6 ff ff ff                call   [3d6ce4H]
003d6d3e  83 c4 04                      add    esp,4H
003d6d41  50                            push   eax
003d6d42  5b                            pop    ebx
003d6d43  59                            pop    ecx
003d6d44  03 cb                         add    ecx,ebx
003d6d46  51                            push   ecx
003d6d47  58                            pop    eax
003d6d48  59                            pop    ecx
003d6d49  5b                            pop    ebx
003d6d4a  c3                            ret
&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-6298018920768383747?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/6298018920768383747/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=6298018920768383747' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/6298018920768383747'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/6298018920768383747'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2007/06/generating-more-code-with-harpy-after.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-5567427237624288854</id><published>2007-06-28T17:11:00.001+01:00</published><updated>2007-06-28T22:36:32.489+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='DSL'/><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>&lt;b&gt;Representing DSL expressions in Haskell&lt;/b&gt;

When you want to embed a DSL in Haskell you will almost certainly be faced with having some kind of expressions in your DSL.  If you are lucky you can get away with using the Haskell types as the DSL type (i.e., use the meta types as object types).  E.g., if you DSL needs integer expressions you can use the Haskell type &lt;tt&gt;Integer&lt;/tt&gt;.
But sometimes you're not that lucky, so you need to represent the DSL expressions as a Haskell data type.  What are the options for representing expressions in Haskell?

&lt;b&gt;A normal data type&lt;/b&gt;
Let's start with the most obvious one, an ordinary data type with constructors for the different kinds of expressions.  As an example I'll have a simple expression language that has constants (of type &lt;tt&gt;Int&lt;/tt&gt;), addition, less-or-equal, and conditional.
&lt;pre&gt;
data Exp
    = ConI Int
    | Add Exp Exp
    | LE Exp Exp
    | Cond Exp Exp Exp
    deriving (Show)
&lt;/pre&gt;
Just to exemplify, here's an evaluator for these expressions.  We need a type of values that the evaluator can return as well.
&lt;pre&gt;
data Val = ValI Int | ValB Bool
    deriving (Show)

eval :: Exp -&gt; Val
eval (ConI i) = ValI i
eval (Add x y) = case (eval x, eval y) of
                 (ValI vx, ValI vy) -&gt; ValI (vx + vy)
                 _ -&gt; error "Bad arguments to Add"
eval (LE x y) = case (eval x, eval y) of
                (ValI vx, ValI vy) -&gt; ValB (vx &lt;= vy)
                _ -&gt; error "Bad arguments to LE"
eval (Cond x y z) = case (eval x) of
                    ValB b -&gt; if b then eval y else eval z
                    _ -&gt; error "Bad arguments to Cond"
&lt;/pre&gt;
And a test run:
&lt;pre&gt;
main = print $ eval e
  where e = Cond (LE (ConI 3) (ConI 5)) (Add (ConI 1) (ConI 2)) (ConI 0)
&lt;/pre&gt;
prints &lt;tt&gt;ValI 3&lt;/tt&gt;

This is OK, but look at all the error cases in the evaluator.  Remember that these are expressions in a DSL, so we are going to have expression fragments in our Haskell code.  If we make a mistake, like &lt;tt&gt;Cond (ConI 1) (ConI 1) (ConI 1)&lt;/tt&gt; this is not going to be caught by the Haskell type checker since everything is of type &lt;tt&gt;Exp&lt;/tt&gt;.  Instead it is going to cause some error when the program is run.  Being advocates of static type checking (why else use Haskell?), this is rather disgusting.

&lt;b&gt;GADTs&lt;/b&gt;
So let's try a new and shiny feature in GHC, namely GADTs. (GADTs are new in ghc, but the idea is old; it's been well known in constructive type theory for ages, Kent Petersson and I suggested it as an addition to Haskell almost 15 years ago.)  With a GADT you can specify more precise types for the constructors.
&lt;pre&gt;
data Exp a where
    ConI :: Int -&gt; Exp Int
    Add  :: Exp Int -&gt; Exp Int -&gt; Exp Int
    LE   :: Exp Int -&gt; Exp Int -&gt; Exp Bool
    Cond :: Exp Bool -&gt; Exp a -&gt; Exp a -&gt; Exp a
&lt;/pre&gt;
These are the types we want.  It's now impossible to construct ill typed value of type &lt;tt&gt;Exp t&lt;/tt&gt;; it will be caught by the Haskell type checker.

The evaluator looks very neat and natural with GADTs
&lt;pre&gt;
eval :: Exp a -&gt; a
eval (ConI i) = i
eval (Add x y) = eval x + eval y
eval (LE x y) = eval x &lt;= eval y
eval (Cond x y z) = if eval x then eval y else eval z
&lt;/pre&gt;
GADTs is really the way to go, but it has the disadvantage of not being standard Haskell.  It can also get somewhat cumbersome when we have variables; the evaluator now needs a typed environment.  It's certainly doable, but starting to look less nice.

&lt;b&gt;Phantom types&lt;/b&gt;
Let's explore a third way that in some sense combines the previous two.  The idea is to have an untyped representation, like &lt;tt&gt;Exp&lt;/tt&gt;, but to use an abstract data type on top of it that only allows constructing well typed expressions.
We start with the original type, but rename it to &lt;tt&gt;Exp'&lt;/tt&gt;
&lt;pre&gt;
data Exp'
    = ConI Int
    | Add Exp' Exp'
    | LE Exp' Exp'
    | Cond Exp' Exp' Exp'
    deriving (Show)
&lt;/pre&gt;
On top of this we provide the type the user will see.
&lt;pre&gt;
newtype Exp a = E Exp'
    deriving (Show)

conI :: Int -&gt; Exp Int
conI = E . ConI

add :: Exp Int -&gt; Exp Int -&gt; Exp Int
add (E x) (E y) = E $ Add x y

le :: Exp Int -&gt; Exp Int -&gt; Exp Bool
le (E x) (E y) = E $ LE x y

cond :: Exp Bool -&gt; Exp a -&gt; Exp a -&gt; Exp a
cond (E x) (E y) (E z) = E $ Cond x y z
&lt;/pre&gt;
The functions &lt;tt&gt;conI, add, le&lt;/tt&gt;, and &lt;tt&gt;cond&lt;/tt&gt; are the only ones the DSL user will see.  Note that they have the same types as the constructors in the GADT.  To ensure the DSL user can only use these we need to put it all in a module and export the right things.
&lt;pre&gt;
module Exp(Exp, conI, add, le, cond) where ...
&lt;/pre&gt;

We can write an evaluator again (inside the &lt;tt&gt;Exp&lt;/tt&gt; module).
&lt;pre&gt;
eval :: Exp a -&gt; Val
eval (E x) = eval' x

eval' :: Exp' -&gt; Val
eval' (ConI i) = ValI i
eval' (Add x y) = case (eval' x, eval' y) of
                 (ValI vx, ValI vy) -&gt; ValI (vx + vy)
                 _ -&gt; error "Bad arguments to Add"
eval' (LE x y) = case (eval' x, eval' y) of
                 (ValI vx, ValI vy) -&gt; ValB (vx &lt;= vy)
                 _ -&gt; error "Bad arguments to LE"
eval' (Cond x y z) = case (eval' x) of
                    ValB b -&gt; if b then eval' y else eval' z
                    _ -&gt; error "Bad arguments to Cond"
&lt;/pre&gt;
Not as elegant as the GADT evaluator, but at least the error cases are impossible when expressions are constructed using the exported interface.

And the test would now look like
&lt;pre&gt;
main = print $ eval e
  where e = cond (le (conI 3) (conI 5)) (add (conI 1) (conI 2)) (conI 0)
&lt;/pre&gt;

Phantom types gets its name from the fact that the type variable in the definition of the &lt;tt&gt;Exp&lt;/tt&gt; type doesn't occur anywhere in the constructors; so it's a phantom.  Now and then people suggest implementing a dynamically typed version of Haskell.  Phantom types is one of the things that makes such an implementation difficult.  There are no values that carry the type information at runtime, the types only exist at compile time.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-5567427237624288854?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/5567427237624288854/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=5567427237624288854' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/5567427237624288854'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/5567427237624288854'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2007/06/representing-dsl-expressions-in-haskell.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-1335498223547976619</id><published>2007-06-26T18:17:00.000+01:00</published><updated>2007-06-26T18:39:42.953+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='Code generation'/><title type='text'></title><content type='html'>&lt;b&gt;Disassembly&lt;/b&gt;

The Harpy package also contains a disassembler, so let's put it to work.
&lt;pre&gt;
disasm :: e -&gt; s -&gt; CodeGen e s a -&gt; IO String
disasm e s cg = do
    let cg' = do cg; getCodeBufferList
    (_, r) &lt;- runCodeGen cg' e s
    case r of
        Left msg -&gt; error (show msg)
        Right bs -&gt; fmap concat $ mapM disAsm bs
  where disAsm (ptr, n) = do
            r &lt;- disassembleBlock ptr n
            case r of
                Left msg -&gt; error $ show msg
                Right insns -&gt; return $ unlines $ map showIntel insns
&lt;/pre&gt;

Using the same example as in the last post we get.
&lt;pre&gt;
main = do
    let fun x = (x+1) * x `quot` 2
    str &lt;- disasmIO fun
    putStr str
&lt;/pre&gt;
And here is the sad truth:
&lt;pre&gt;
003d6d20  53                            push   ebx
003d6d21  51                            push   ecx
003d6d22  52                            push   edx
003d6d23  8b 44 24 10                   mov    eax,dword ptr [esp+16]
003d6d27  50                            push   eax
003d6d28  6a 01                         push   1H
003d6d2a  5b                            pop    ebx
003d6d2b  58                            pop    eax
003d6d2c  03 c3                         add    eax,ebx
003d6d2e  50                            push   eax
003d6d2f  8b 44 24 14                   mov    eax,dword ptr [esp+20]
003d6d33  50                            push   eax
003d6d34  5b                            pop    ebx
003d6d35  58                            pop    eax
003d6d36  0f af c3                      imul   eax,ebx
003d6d39  50                            push   eax
003d6d3a  6a 02                         push   2H
003d6d3c  5b                            pop    ebx
003d6d3d  58                            pop    eax
003d6d3e  8b d0                         mov    edx,eax
003d6d40  c1 fa 1f                      sar    edx,1fH
003d6d43  f7 fb                         idiv   eax,ebx
003d6d45  50                            push   eax
003d6d46  58                            pop    eax
003d6d47  5a                            pop    edx
003d6d48  59                            pop    ecx
003d6d49  5b                            pop    ebx
003d6d4a  c3                            ret
&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-1335498223547976619?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/1335498223547976619/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=1335498223547976619' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/1335498223547976619'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/1335498223547976619'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2007/06/disassembly-harpy-package-also-contains.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-1936438678790224431</id><published>2007-06-26T11:32:00.000+01:00</published><updated>2007-06-26T23:36:15.284+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Compilation'/><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='Code generation'/><title type='text'></title><content type='html'>&lt;b&gt;A simple compiler&lt;/b&gt;
In my last post I played a little with Harpy to generate code from what looked like assembly language.  So the next logical step is to make a compiler.  And I mean a real machine-code compiler, not some wimpy byte-code nonsense.

To make life simple (it is a &lt;b&gt;simple&lt;/b&gt; compiler, after all) I'm going to start by generating code that uses the stack all the time.  So all operands will live on the stack and all operations on them will push and pop the stack.  Of course, the code generated from such a compiler will make any serious compiler writer weep.

The &lt;tt&gt;CodeGen&lt;/tt&gt; type in Harpy is the monad used during code generation.  It allows you to keep some extra information around besides the generated code; it gives you access to a state monad.  I will use the state to keep track of the current stack depth.  (We will see why soon.)

&lt;pre&gt;
type StackDepth = Int
type Gen = CodeGen () StackDepth ()

addDepth :: StackDepth -&gt; Gen
addDepth i = do
    d &lt;- getState
    setState (d+i)
&lt;/pre&gt;
The &lt;tt&gt;addDepth&lt;/tt&gt; function changes the current stack depth by grabbing the old one, adding the argument and storing it back.  The &lt;tt&gt;getState&lt;/tt&gt; and &lt;tt&gt;setState&lt;/tt&gt; functions don't generate any code, they just manipulate the state available in the &lt;tt&gt;CodeGen&lt;/tt&gt; monad.

With that out of the way, let's implement code generation for addition.
&lt;pre&gt;
gadd :: Gen
gadd = do
    pop  ebx
    pop  eax
    add  eax ebx
    push eax
    addDepth (-1)
&lt;/pre&gt;
It pops the two operands off the stack, adds them, and pushes the result.  The net effect on the stack is that it has one word less on it (I count my stack depth in words), so there's also a call to &lt;tt&gt;addDepth&lt;/tt&gt;.

Subtraction and multiplication are very similar.
&lt;pre&gt;
gsub :: Gen
gsub = do
    pop  ebx
    pop  eax
    sub  eax ebx
    push eax
    addDepth (-1)

gmul :: Gen
gmul = do
    pop  ebx
    pop  eax
    imul InPlace eax ebx
    push eax
    addDepth (-1)
&lt;/pre&gt;

On the i386 (signed) division and remainder is computed with the idiv instruction.  It divides the EDX:EAX 64 bit number with the given operand.  So we must convert a 32 signed number to a 64 bit signed number, this is simply done by moving EAX to EDX and then shifting right 31 steps.  This will copy the sign bit into every bit of EDX.  Depending of if we want the quotient or remainder we need to push EAX or EDX.
&lt;pre&gt;
gquot :: Gen
gquot = do
    gquotRem
    push eax
    addDepth (-1)

grem :: Gen
grem = do
    gquotRem
    push edx
    addDepth (-1)

gquotRem :: Gen
gquotRem = do
    pop  ebx
    pop  eax
    mov  edx eax
    sar  edx (31 :: Word8)
    idiv ebx
&lt;/pre&gt;

To put a constant on the stack we simply push it and increment the remembered stack depth.
&lt;pre&gt;
gconst :: Int -&gt; Gen
gconst c = do
    push (fromIntegral c :: Word32)
    addDepth 1
&lt;/pre&gt;

OK, so now for something more interesting.  Assuming we are generating code for a function, we also want to access the arguments to the function.  Where are the arguments?  Well, according to the IA32 calling conventions the caller pushes the arguments on the stack, so we'll follow those.  First we have a bunch things on the stack, how many is kept track of in the stack depth in the &lt;tt&gt;CodeGen&lt;/tt&gt; monad, and then the arguments follow in order, pushed right-to-left.  So to get an argument we compute the offset, convert it to a byte offset, and push that word on the stack.
&lt;pre&gt;
-- Get the Nth argument
gargN :: Int -&gt; Gen
gargN n = do
    d &lt;- getState
    let o = 4 * (d + n)
    mov  eax (Disp (fromIntegral o), esp)
    push eax
    addDepth 1
&lt;/pre&gt;

When generating code for a function we should not clobber any callee-save registers, so to be on the safe side we save all used registers on function entry and restore them on function exit.  On function exit we also return the result in EAX.
&lt;pre&gt;
savedRegs = [ebx, edx]

-- Push all register we'll be using, except eax.
gprologue :: Gen
gprologue = do
    mapM_ push savedRegs

-- Pop return value to eax, restore regs, and return
gepilogue :: Gen
gepilogue = do
    pop  eax
    mapM_ pop (reverse savedRegs)
    ret
&lt;/pre&gt;

OK, so that was a lot of stuff, let's put it together for a test.
&lt;pre&gt;
testGen = conv_Word32ToWord32 () (length savedRegs + 1) $ do
    gprologue
    gargN 0
    gconst 1
    gadd
    gepilogue

main = do
    test &lt;- testGen
    print (test 10)
&lt;/pre&gt;

The &lt;tt&gt;testGen&lt;/tt&gt; function generates the prologue, push argument, push 1, add, and the epilogue.  The &lt;tt&gt;conv_Word32ToWord32&lt;/tt&gt; (from &lt;a href="http://augustss.blogspot.com/2007/06/playing-with-harpy-recently-there-was.html"&gt;my previous post&lt;/a&gt;) converts the machine code to a Haskell function.  We also have to give the start value of the stack depth.  The stack initially contains the return address and the saved registers, so that's the number we pass.

Running this gives 11, as it should.

OK, so let's actually write a compiler and not just a code generator.  Here is a data type for integer expressions.
&lt;pre&gt;
data Exp
    = Con Int
    | Arg Int
    | BinOp BOp Exp Exp
    deriving (Show)

data BOp = Add | Sub | Mul | Quot | Rem
    deriving (Show)
&lt;/pre&gt;
We have constants, arguments (variables), and a few binary operators.

It's all easily translated to machine code.
&lt;pre&gt;
translate :: Exp -&gt; Gen
translate (Con c) = gconst c
translate (Arg n) = gargN n
translate (BinOp op x y) = do translate x; translate y; binop op
  where binop Add = gadd
        binop Sub = gsub
        binop Mul = gmul
        binop Quot = gquot
        binop Rem = grem
&lt;/pre&gt;

For simplicity, let's compile only functions of one argument for now.
&lt;pre&gt;
compileIOW32 :: (Exp -&gt; Exp) -&gt; IO (Word32 -&gt; Word32)
compileIOW32 f = conv_Word32ToWord32 () (length savedRegs + 1) $ do
    gprologue
    translate (f (Arg 0))
    gepilogue 
&lt;/pre&gt;
This function takes an &lt;tt&gt;Exp-&gt;Exp&lt;/tt&gt; function, by giving this function the argument &lt;tt&gt;Arg 0&lt;/tt&gt; we get an expression to translate.  We tack on the usual prologue and epilogue.

So let's try it.
&lt;pre&gt;
main = do
    let fun x = BinOp Add x (Con 1)
    test &lt;- compileIOW32 fun
    print (test 10)
&lt;/pre&gt;
Which prints 11.

But yuck, writing &lt;tt&gt;BinOp&lt;/tt&gt; etc. isn't nice.  Let's make some instances.

&lt;pre&gt;
instance Eq Exp
instance Ord Exp

instance Num Exp where
    x + y  =  BinOp Add x y
    x - y  =  BinOp Sub x y
    x * y  =  BinOp Mul x y
    fromInteger i = Con (fromInteger i)

instance Enum Exp
instance Real Exp

instance Integral Exp where
    quot x y = BinOp Quot x y
    rem x y = BinOp Rem x y
&lt;/pre&gt;

And give it a whirl:
&lt;pre&gt;
main = do
    let fun x = (x+1) * x `quot` 2
    test &lt;- compileIOW32 fun
    print (test 10)
&lt;/pre&gt;
And this prints 55, as expected.

Not bad, a compiler from (a tiny subset of) Haskell functions to machine code in a few pages.  But I do admit being embarrassed about generating such incredibly poor code.  But there's always future blog posts to rectify that.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-1936438678790224431?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/1936438678790224431/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=1936438678790224431' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/1936438678790224431'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/1936438678790224431'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2007/06/simple-compiler-in-my-last-post-i.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-1801788608980767027</id><published>2007-06-25T18:34:00.000+01:00</published><updated>2007-06-26T18:39:09.577+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='Code generation'/><title type='text'></title><content type='html'>&lt;b&gt;Playing with Harpy&lt;/b&gt;
Recently there was an announcement of the &lt;a href="http://uebb.cs.tu-berlin.de/harpy/"&gt;Harpy package&lt;/a&gt; for Haskell.  Harpy is a very cool package that lets you generate machine code from a Haskell program into a memory buffer.  Using the Haskell FFI interface you can then turn the machine code into a Haskell function and call it.  The way that you generate machine code from Harpy looks very much like assembly code.

Oh, btw, apologies to those who are not using x86 machines, this blog post is very machine dependent, since Harpy generate x86 machine code.

&lt;b&gt;A small Harpy example&lt;/b&gt;
First, some sample code (stolen from the Harpy tutorial).  It might look like assembly code, but it's actually Haskell.
&lt;pre&gt;
asm_fac = do
    loopTest  &lt;- newLabel
    loopStart &lt;- newLabel
    ensureBufferSize 160
    push ecx
    mov  ecx (Disp 8, esp)
    mov  eax (1 :: Word32)
    jmp  loopTest
    loopStart @@ mul ecx
    sub  ecx (1 :: Word32)
    loopTest @@ cmp ecx (0 :: Word32)
    jne  loopStart
    pop  ecx
    ret
&lt;/pre&gt;

Just some comments: This is (obviously) monadic code.  The newLabel operation creates a new label that later has to be defined and that can be used as a jump target.  The &lt;tt&gt;ensureBufferSize&lt;/tt&gt; is a tedious function to make sure there is enough space to emit the instructions.  I hope the next version of Harpy will not require this, since it's really the computers job to count bytes, not mine (and it's not hard to do).  Apart from that, most of the code should be obvious for anyone who has done assembly programming on an x86.

To generate code and then convert the generated code we need some utility functions.
&lt;pre&gt;
type Importer f = FunPtr f -&gt; f
foreign import ccall safe "dynamic" import_Word32ToWord32 :: Importer (Word32 -&gt; Word32)

conv_Word32ToWord32 :: e -&gt; s -&gt; CodeGen e s a -&gt; IO (Word32 -&gt; Word32)
conv_Word32ToWord32 = conv import_Word32ToWord32

conv :: (Importer f) -&gt; e -&gt; s -&gt; CodeGen e s a -&gt; IO f
conv imp e s cg = do
    let cg' = do cg; getEntryPoint
    (_, r) &lt;- runCodeGen cg' e s
    case r of
        Left msg -&gt; error (show msg)
        Right addr -&gt; return $ imp $ castPtrToFunPtr addr
&lt;/pre&gt;

The important function is &lt;tt&gt;conv_Word32ToWord32&lt;/tt&gt; which takes a block like &lt;tt&gt;asm_fac&lt;/tt&gt; and turns it into a Haskell function.  The &lt;tt&gt;e&lt;/tt&gt; and &lt;tt&gt;s&lt;/tt&gt; arguments we can ignore for now, they are for more advanced uses of the machine code generator.  Note that there is no type safety here.  The &lt;tt&gt;asm_fac&lt;/tt&gt; block has nothing that says what kind of function it might implement, so it's all up to us to use it correctly.  (Once you use FFI Haskell isn't any safer than C.)

Finally, to use it all:
&lt;pre&gt;
main = do
    fac &lt;- conv_Word32ToWord32 () () asm_fac
    print (fac 5, fac 10)
&lt;/pre&gt;

Pretty nifty, huh?  Calling &lt;tt&gt;conv_Word32ToWord32&lt;/tt&gt; with the &lt;tt&gt;asm_fac&lt;/tt&gt; argument will emit machine code for the factorial function into a memory buffer and then wrap it up so it can be called as a regular Haskell function.

Running the program will print (120,3628800) as expected.

&lt;b&gt;Final thoughts&lt;/b&gt;
Harpy is really cool and uses Haskell type classes in a clever way to make Harpy "assembly" code look almost like normal assembly code.  But there is room for improvement.  Currently you have no control over the buffer handling; Harpy uses &lt;tt&gt;mallocBytes&lt;/tt&gt; internally.  I would like to see the buffer handling (&lt;tt&gt;mallocBytes&lt;/tt&gt;, &lt;tt&gt;peek&lt;/tt&gt;, the IO monad) abstracted out.  That way I don't have to generate code into a memory buffer if I don't want to.  I could generate code in, e.g., a list.  Or I could not generate code at all, but just count bytes.  Or I could allocate buffers with something better than &lt;tt&gt;mallocBytes&lt;/tt&gt;.

On the whole, I think Harpy is great for experimenting with code generation.  More about that soon.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-1801788608980767027?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/1801788608980767027/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=1801788608980767027' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/1801788608980767027'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/1801788608980767027'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2007/06/playing-with-harpy-recently-there-was.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-4784718730632823160</id><published>2007-05-07T13:27:00.000+01:00</published><updated>2007-05-07T18:43:31.560+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='overloading'/><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>I was asked if my fixed number module (&lt;tt&gt;Data.Number.Fixed&lt;/tt&gt;) which has the precision built into the type could be used with a dynamic epsilon, i.e., one that is not know at compile time.  The answer is, yes.  It only takes a fraction of an oleg to do.

A recap:
The idea with the Fixed type was that, e.g., &lt;tt&gt;Fixed Eps1&lt;/tt&gt;, is the type where computation happens with an epsilon of 1.  To get the epsilon for a type I have a class
&lt;pre&gt;class Epsilon e where
    eps :: e -&gt; Rational
instance Epsilon Eps1 where eps _ = 1
...
&lt;/pre&gt;

So what do we do if we need an epsilon of say, 0.01?  Well, there's another instance for the type constructor &lt;tt&gt;EpsDiv10&lt;/tt&gt;:
&lt;pre&gt;
instance (Epsilon e) =&gt; Epsilon (EpsDiv10 e) where
    eps x = eps x / 10
&lt;/pre&gt;
So the type &lt;tt&gt;EpsDiv10 (EpsDiv10 Eps1)&lt;/tt&gt; will have an epsilon of 0.01.  Similarly, with the right set of primitive types and type constructors you could build any epsilon you want.

To simplify a little the function below only finds an epsilon within a factor of 10 from the requested one.
&lt;pre&gt;
dynamicEps :: forall a . Rational -&gt;
             (forall e . Epsilon e =&gt; Fixed e -&gt; a) -&gt;
             (Rational -&gt; a)
dynamicEps r f v = loop (undefined :: Eps1)
 where loop :: forall x . (Epsilon x) =&gt; x -&gt; a
       loop e = if eps e &lt;= r then f (fromRational v :: Fixed x)
                else loop (undefined :: EpsDiv10 x)
&lt;/pre&gt;This function takes a desired (rational) epsilon, r, and a function, f, and returns a function that behaves like f, but that will round its argument to the desired epsilon and compute with that.
&lt;pre&gt;
Main&gt; putStrLn $ dynamicEps 1e-40 (show . sin) 1
0.8414709848078965066525023216302989996226
&lt;/pre&gt;
How does it work?  It's starts by passing something of type Eps1 to loop, and then the loop uses that epsilon if it is small enough, otherwise it will pass something that have an epsilon that is a tenth to the next iteration of the loop.
There are two "tricks" here, first the loop function recurses with an argument of a different type than it was given.  So it needs polymorphic recursion, and thus needs a type signature.  Second, the function, f, has a rank two type.
This way the function can work with any epsilon it is given; otherwise we would
not know that.
Note how the argument to loop is actually never ysed, so &lt;tt&gt;undefined&lt;/tt&gt; works great.  All we need is the right type.  And with scoped type variables this is easy to do.

There are some improvements that should be made to this function: it is inefficient and it doesn't give you the exact epsilon back.  But it has the right general idea.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-4784718730632823160?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/4784718730632823160/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=4784718730632823160' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/4784718730632823160'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/4784718730632823160'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2007/05/i-was-asked-if-my-fixed-number-module.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-2157917308021167541</id><published>2007-04-28T11:48:00.000+01:00</published><updated>2007-04-28T11:50:07.563+01:00</updated><title type='text'></title><content type='html'>&lt;b&gt;Frustration&lt;/b&gt;

My trusty MacBook suddenly died, :(
It can no longer boot, and the old way of getting an OpenFirmware prompt doesn't seem to work.
Oh well.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-2157917308021167541?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/2157917308021167541/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=2157917308021167541' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/2157917308021167541'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/2157917308021167541'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2007/04/frustration-my-trusty-macbook-suddenly.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-2628302790469218285</id><published>2007-04-27T20:24:00.000+01:00</published><updated>2007-04-27T20:35:20.085+01:00</updated><title type='text'></title><content type='html'>&lt;b&gt;Fixed precision, an update&lt;/b&gt;

So I was a bit sloppy in my last post.  When doing arithmetic it was performed exactly using Rational and not truncated according to the epsilon for the type.

So, for instance, computing &lt;tt&gt;4/10 + 4/10&lt;/tt&gt; with type &lt;tt&gt;Fixed Eps1&lt;/tt&gt; would give the answer 1.  While this might seem nice, it's also wrong if every operation would be performed with epsilon 1, since 4/10 would be 0, and 0+0 is 0.

So I'll amend my implementation a little.
&lt;pre&gt;
instance (Epsilon e) =&gt; Num (Fixed e) where
    (+) = lift2 (+)
    (-) = lift2 (-)
    (*) = lift2 (*)
    negate (F x) = F (negate x)
    abs (F x) = F (abs x)
    signum (F x) = F (signum x)
    fromInteger = F . fromInteger

instance (Epsilon e) =&gt; Fractional (Fixed e) where
    (/) = lift2 (/)
    fromRational x = r
        where r = F $ approx x (getEps r)

lift2 :: (Epsilon e) =&gt; (Rational -&gt; Rational -&gt; Rational) -&gt; Fixed e -&gt; Fixed e
 -&gt; Fixed e
lift2 op fx@(F x) (F y) = F $ approx (x `op` y) (getEps fx)

approx :: Rational -&gt; Rational -&gt; Rational
approx x eps = approxRational (x + eps/2) eps
&lt;/pre&gt;
So after each operation we add half an epsilon (so we get rounding instead of truncation) and call the magical &lt;tt&gt;approxRational&lt;/tt&gt; to get the closest rational within an epsilon.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-2628302790469218285?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/2628302790469218285/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=2628302790469218285' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/2628302790469218285'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/2628302790469218285'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2007/04/fixed-precision-update-so-i-was-bit.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-8281566451632105396</id><published>2007-04-25T23:22:00.000+01:00</published><updated>2007-04-27T20:40:28.890+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='overloading'/><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>&lt;b&gt;Overloading Haskell numbers, part 3, Fixed Precision.&lt;/b&gt;

Fixed precision numbers can be handy to have sometimes.  Haskell only
provides integral, rational, and floating point in the Prelude, but I
want more!

I want to define a type of fixed precision numbers, i.e., numbers that
have a certain number of decimals.  Rational numbers are handy to
represent these numbers so here we go.
&lt;pre&gt;
newtype Fixed = F Rational deriving (Eq, Ord, Enum, Num, Fractional, Real, RealFrac)
&lt;/pre&gt;
This uses an GHC extension that allows you to derive anything from the
representation type so we don't have to define all those tedious
instances.

But wait, where is the fixed precision?  What I just defined is just
the Rational numbers with a different name.  I really want a type
which is parametrized over what precision it has.

OK, so here we go again:
&lt;pre&gt;
newtype Fixed e = F Rational deriving (Eq, Ord, Enum, Num, Fractional, Real, RealFrac)
&lt;/pre&gt;
Here &lt;tt&gt;e&lt;/tt&gt; is meant to be a type that specifies the precision.

Let's capture those types that we allow as the precision as a type class:
&lt;pre&gt;
class Epsilon e where
    eps :: e -&gt; Rational
&lt;/pre&gt;
The idea here is that the actual argument to the &lt;tt&gt;eps&lt;/tt&gt; function is not
used, it's just a proxy and the type carries all the information.
Here's an example:
&lt;pre&gt;
data Eps1
instance Epsilon Eps1 where
    eps _ = 1
&lt;/pre&gt;
The &lt;tt&gt;eps&lt;/tt&gt; returns 1 for the &lt;tt&gt;Eps1&lt;/tt&gt; type.  So the
intended meaning is that the type &lt;tt&gt;Fixed Eps1&lt;/tt&gt; are numbers
within an epsilon of the right answer, and epsilon is 1.

Similarly
&lt;pre&gt;
data Prec50
instance Epsilon Prec50 where
    eps _ = 1e-50
&lt;/pre&gt;
The type &lt;tt&gt;Fixed Prec50&lt;/tt&gt; has an epsilon of 1e-50, i.e., 50
decimals.  Btw, notice that what looks like a floating point literal
is actually a rational number; one of the very clever decisions in the
original Haskell design.

We can also define type constructors like
&lt;pre&gt;
data EpsDiv10 p
instance (Epsilon e) =&gt; Epsilon (EpsDiv10 e) where
    eps e = eps (un e) / 10
       where un :: EpsDiv10 e -&gt; e
             un = undefined
&lt;/pre&gt;

The type &lt;tt&gt;Fixed (EpsDiv10 Prec50)&lt;/tt&gt; has an epsilon of 1e-50/10 =
1e-51.  Note how the function &lt;tt&gt;un&lt;/tt&gt; is just a dummy to get the
&lt;tt&gt;e&lt;/tt&gt;.  If we had used the scoped type variable extension we
could have avoided this function.

OK, so we can compute with these fixed precision numbers, but where
does the epsilon enter?  Nowhere, so far.  All arithmetic is performed
exactly by the rationals, but our first use of the epsilon is in the
show function.
&lt;pre&gt;
instance (Epsilon e) =&gt; Show (Fixed e) where
    showsPrec = showSigned showFixed
      where showFixed f@(F x) = showString $ show q ++ "." ++ decimals r (eps $ un f)
              where q :: Integer
             (q, r) = properFraction x
      un :: Fixed e -&gt; e
             un = undefined
             decimals a e | e &gt;= 1 = ""
                          | otherwise = intToDigit b : decimals c (10 * e)
                 where (b, c) = properFraction (10 * a)
&lt;/pre&gt;
To print a fixed point number we first print the integer part.  The
&lt;tt&gt;properFraction&lt;/tt&gt; method handily returns the integer part and
some left over fraction that is &lt;1.
We then print decimal after decimal by multiplying the fraction by 10,
converting the integer part to a digit, and then recursing.  While
generating these decimals we also multiply the epsilon by 10 each time
and stop printing when epsilon is &gt;= 1, because then we don't need any
more decimals.

All right, lets put it to the test:
&lt;pre&gt;
Data.Number.Fixed&gt; 1/3 :: Fixed Prec50
0.33333333333333333333333333333333333333333333333333
Data.Number.Fixed&gt; 3/8 :: Fixed Prec10
0.3750000000
Data.Number.Fixed&gt; (1 :: Fixed Prec10) + (0.5 :: Fixed Prec50)
    Couldn't match expected type `Prec10'
           against inferred type `Prec50'
      Expected type: Fixed Prec10
      Inferred type: Fixed Prec50
&lt;/pre&gt;
As we want, we can't mix different fixed point numbers with
different precisions.  Sometimes we do want to convert, so let's
provide such a function
&lt;pre&gt;
convertFixed :: Fixed e -&gt; Fixed f
convertFixed (F x) = F x
&lt;/pre&gt;
It looks like this function does nothing, and it's true.  It just
converts the type, and since the epsilon type is not involved in the
number itself it's trivial.

Well, basic arithmetic isn't that much fun.  What about the
transcendental functions?  Luckily Jan Skibinski has already
implemented all of them.  They compute the various function to with an
epsilon using continued fractions.

The instance declaration looks something like:
&lt;pre&gt;
instance (Epsilon e) =&gt; Floating (Fixed e) where
    sqrt = toFixed1 F.sqrt
    exp = toFixed1 F.exp
    ...

toFixed1 :: (Epsilon e) =&gt; (Rational -&gt; Rational -&gt; Rational) -&gt; Fixed e -&gt; Fixed e
toFixed1 f x@(F r) = F (f (eps (un x)) r)
  where un :: Fixed e -&gt; e
        un = undefined
&lt;/pre&gt;
And all the clever code I stole from Jan is in the module F, see link
below.

Now it's getting more fun.
&lt;pre&gt;
Data.Number.Fixed&gt; pi :: Fixed Prec50
3.14159265358979323846264338327950288419716939937510
Data.Number.Fixed&gt; exp 10 :: Fixed Prec10
22026.4657881843
Data.Number.Fixed&gt; log 2 :: Fixed Prec50
0.69314718055994530941723212145817656807550013436026
Data.Number.Fixed&gt; exp (sin (sqrt (54/42))) :: Fixed Prec10
2.4745696343
Data.Number.Fixed&gt; sqrt 2 :: Fixed Prec500
1.41421356237309504880168872420969807856967187537694807317667973799073247
8462107038850387534327641572735013846230912297024924836055850737212644121
4970999358314132226659275055927557999505011527820605714701095599716059702
7453459686201472851741864088919860955232923048430871432145083976260362799
5251407989687253396546331808829640620615258352395054745750287759961729835
5752203375318570113543746034084988471603868999706990048150305440277903164
5424782306849293691862158057846311159666871301301561856898723724

Data.Number.Fixed Data.Complex&gt; let i = 0 :+ 1 :: Complex (Fixed Prec50) 
Data.Number.Fixed Data.Complex&gt; exp (i*pi)
(-0.99999999999999999999999999999999999999999999999999) :+ 0.00000000000000000000000000000000000000000000000000
&lt;/pre&gt;

Naturally, we need to package up all these definitions in a module
that will keep the Fixed type abstract, and you can also keep the
Epsilon class and the various Epsilon types abstract if you provide
enough building blocks.


Source at &lt;a href="http://www.augustsson.net/Darcs/Data.Number/"&gt;http://www.augustsson.net/Darcs/Data.Number/&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-8281566451632105396?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/8281566451632105396/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=8281566451632105396' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/8281566451632105396'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/8281566451632105396'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2007/04/overloading-haskell-numbers-part-3.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-4123422667783969014</id><published>2007-04-20T00:51:00.000+01:00</published><updated>2007-04-27T20:40:04.274+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>Following a suggestion from Cale Gibbard I added a convenient feature to Djinn.
You can now define type classes and have contexts on functions.
As with Djinn in general there is no polymorphic instantiation, so the methods of a class must not mention any type variables, but the class parameters.  This makes the whole thing rather limited, but it was a quick hack.

Sample session
&lt;pre&gt;
Welcome to Djinn version 2007-04-20.
Type :h to get help.
Djinn&gt; f ? (Eq a) =&gt; a -&gt; a -&gt; (b,c) -&gt; Either b c
f :: (Eq a) =&gt; a -&gt; a -&gt; (b, c) -&gt; Either b c
f a b =
    case a == b of
    False -&gt; \ (c, _) -&gt; Left c
    True -&gt; \ (_, d) -&gt; Right d
&lt;/pre&gt;

The Djinn source is at darcs.augustsson.net/Darcs/Djinn as usual.

Thanks Cale!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-4123422667783969014?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/4123422667783969014/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=4123422667783969014' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/4123422667783969014'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/4123422667783969014'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2007/04/following-suggestion-from-cale-gibbard.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-5447010763538785423</id><published>2007-04-14T23:42:00.000+01:00</published><updated>2007-11-30T20:13:43.182Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='overloading'/><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>&lt;b&gt;Overloading Haskell numbers, part 2, Forward Automatic Differentiation.&lt;/b&gt;

I will continue my overloading by some examples that have been nicely illustrated by an &lt;a href="http://users.info.unicaen.fr/~karczma/arpap/diffalg.pdf"&gt;article&lt;/a&gt; by Jerzy Karczmarczuk.  And &lt;a href="http://sigfpe.blogspot.com/2005/07/automatic-differentiation.html"&gt;blogged about by sigfpe&lt;/a&gt;.
But at least I'll end this entry with a small twist I've not seen before.

When computing the derivative of a function you normally do by either symbolic derivation, or by a numerical approximation.
Say that you have a function
&amp;nbsp;&amp;nbsp;&amp;nbsp;  &lt;i&gt;f(x) = x&lt;sup&gt;2&lt;/sup&gt; + 1&lt;/i&gt;
and you want to know the derivative at &lt;i&gt;x=5&lt;/i&gt;.  Doing it symbolically you first get &lt;i&gt;f'&lt;/i&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;  &lt;i&gt;f'(x) = 2x&lt;/i&gt;
using high school calculus (maybe they don't teach it in high school anymore?), and then you plug in 5
&amp;nbsp;&amp;nbsp;&amp;nbsp;  &lt;i&gt;f'(5) = 2*5 = 10&lt;/i&gt;

Computing it by numeric differentiation you compute
&amp;nbsp;&amp;nbsp;&amp;nbsp;  &lt;i&gt;f'(x) = (f(x+h) - f(x)) / h&lt;/i&gt;
for some small h.  Let's pick h=1e-5, and we get &lt;i&gt;f'(5)&lt;/i&gt; = 10.000009999444615.  Close, but not that good.

So why don't we always use the symbolic method?  Well, some functions are not that easy to differentiate.  Take this one
&amp;nbsp;&amp;nbsp;&amp;nbsp;  &lt;tt&gt;g x = if abs (x - 0.7) &lt; 0.4 then x else g (cos x)&lt;/tt&gt;
What's the derivative?  Well, it's tricky because this is not really a proper definition of &lt;i&gt;g&lt;/i&gt;.  It's an equation that if solved will yield a definition of &lt;i&gt;g&lt;/i&gt;.  And like equations in general, it could have zero, one, or many solutions.
(If we happen to use CPOs there is always a unique smallest solution which is what programs compute, as if by magic.)

If you think &lt;i&gt;g&lt;/i&gt; is contrived, lets pick a different example: computing the square root with Newton-Raphson.
&lt;pre&gt;
sqr x = convAbs $ iterate improve 1
  where improve r = (r + x/r) / 2
        convAbs (x1:x2:_) | abs (x1-x2) &lt; 1e-10 = x2
 convAbs (_:xs) = convAbs xs
&lt;/pre&gt;

So symbolic is not so easy here, and numeric differentiation is not very accurate.
But there is a third way!  &lt;a href="http://en.wikipedia.org/wiki/Automatic_differentiation"&gt;Automatic differentiation.&lt;/a&gt;

The idea behind AD is that instead of computing with with just numbers, we instead compute with pairs of numbers.  The first component is the normal number, and the second component is the derivative.

What are the rules for these numbers?  Let's look at addition
&amp;nbsp;&amp;nbsp;&amp;nbsp;  &lt;i&gt;(x, x') + (y, y') = (x+y, x'+y')&lt;/i&gt;
To add two numbers you just add the regular part and the derivatives.

For multiplication you have to remember how to compute the derivative of a product:
&amp;nbsp;&amp;nbsp;&amp;nbsp;  &lt;i&gt;(f(x)*g(x))' = f(x)*g'(x) + f'(x)*g(x)&lt;/i&gt;
So for our pairs we get
&amp;nbsp;&amp;nbsp;&amp;nbsp;  &lt;i&gt;(x, x') * (y, y') = (x*y, x*y' + x'*y)&lt;/i&gt;
i.e., first the regular product, then the derivative according to the recipe above.

Let's see how it works on
&amp;nbsp;&amp;nbsp;&amp;nbsp;  &lt;i&gt;f(x) = x&lt;sup&gt;2&lt;/sup&gt; + 1&lt;/i&gt;
We want the derivative at &lt;i&gt;x=5&lt;/i&gt;.  So what is the pair we use for &lt;i&gt;x&lt;/i&gt;?  It is (5, 1).
Why?  Well it has to be 5 for the regular part, and since this represents &lt;i&gt;x&lt;/i&gt; and the derivative of &lt;i&gt;x&lt;/i&gt; is 1, the pair is (5, 1).
In the right hand side for f we need to replace 1 by (1,0), since the derivative of a constant is 0.
So then we get
&amp;nbsp;&amp;nbsp;&amp;nbsp;  &lt;i&gt;f (5,1) = (5,1)*(5,1) + (1,0) = (26,10)&lt;/i&gt;
using the rules above.  And look!  There is the normal result, 26, as well as the derivative, 10.

Let's turn this into Haskell, using the type PD to hold a pair of Doubles
&lt;pre&gt;
data PD = P Double Double deriving (Eq, Ord, Show)
instance Num PD where
    P x x' + P y y' = P (x+y) (x'+y')
    P x x' - P y y' = P (x-y) (x'-y')
    P x x' * P y y' = P (x*y) (x*y' + y'*x)
    fromInteger i   = P (fromInteger i) 0
&lt;/pre&gt;

A first observation is that there is nothing Double specific in this definitions; it would work for any Num.  So we can change it to
&lt;pre&gt;
data PD a = P a a deriving (Eq, Ord, Show)
instance Num a =&gt; Num (PD a) where ...
&lt;/pre&gt;
Let's also add abs&amp;signum and the Fractional instance
&lt;pre&gt;
    ...
    abs (P x x') = P (abs x) (signum x * x')
    signum (P x x') = P (signum x) 0

instance Fractional a =&gt; Fractional (PD a) where
    P x x' / P y y' = P (x / y) ( (x'*y - x*y') / (y * y))
    fromRational r  = P (fromRational r) 0
&lt;/pre&gt;

We can now try the sqr example
&lt;pre&gt;
Main&gt; sqr (P 9 1)
P 3.0 0.16666666666666666
&lt;/pre&gt;
The derivative of x**0.5 is 0.5*x**(-0.5), i.e., 0.5*9**(-0.5) = 0.5/3 = 0.16666666666666666.
So we got the right answer.

BTW, if you want to be picky the derivative of signum is not 0.  The signum function makes a jump from -1 to 1 at 0.  So the "proper" value would be 2*dirac, if dirac is a &lt;a href="http://en.wikipedia.org/wiki/Dirac_delta_function"&gt;Dirac pulse&lt;/a&gt;.  But since we don't have numbers with Dirac pulses (yet), I'll just pretend the derivative is 0 everywhere.

The very clever insight that Jerzy had was that when doing these numbers in Haskell there is no need to limit yourself to just the first derivative.  Since Haskell is lazy we can easily keep an infinite list of all derivatives instead of just the first one.

Let's look at how that definition looks.  It's very similar to what we just did.  But instead of the derivative being just a number, it's now one of our new numbers with a value, and all derivatives...

Since we are now dealing with an infinite data structure we need to define our own show, (==), etc.
&lt;pre&gt;
data Dif a = D a (Dif a)

val (D x _) = x

df (D _ x') = x'

dVar x = D x 1

instance (Show a) =&gt; Show (Dif a) where
    show x = show (val x) 

instance (Eq a) =&gt; Eq (Dif a) where
    x == y  =  val x == val y

instance (Ord a) =&gt; Ord (Dif a) where
    x `compare` y  =  val x `compare` val y

instance (Num a) =&gt; Num (Dif a) where
    D x x' + D y y'          =  D (x + y) (x' + y')
    D x x' - D y y'          =  D (x - y) (x' - y')
    p@(D x x') * q@(D y y')  =  D (x * y) (x' * q + p * y')
    fromInteger i            =  D (fromInteger i) 0
    abs p@(D x x')           =  D (abs x) (signum p * x')
    signum (D x _)           =  D (signum x) 0

instance (Fractional a) =&gt; Fractional (Dif a) where
    recip (D x x') = ip
 where ip = D (recip x) (-x' * ip * ip)
    fromRational r = D (fromRational r) 0
&lt;/pre&gt;
This looks simple, but it's rather subtle.  For instance, take the 0 in the definition of fromInteger.  It's actually of Dif type, so it's a recursive call to fromInteger.

So let's try with our sqr function again, this time computing up to the third derivative.
The &lt;tt&gt;dVar&lt;/tt&gt; is used to create a value for "variable" where we want to differentiate.
&lt;pre&gt;
Main&gt; sqr $ dVar 9
3.0
Main&gt; df $ sqr $ dVar 9
0.16666666666666669
Main&gt; df $ df $ sqr $ dVar 9
-9.259259259259259e-3
Main&gt; df $ df $ df $ sqr $ dVar 9
1.5432098765432098e-3
&lt;/pre&gt;

And the transcendentals in a similar way:
&lt;pre&gt;
lift (f : f') p@(D x x') = D (f x) (x' * lift f' p)

instance (Floating a) =&gt; Floating (Dif a) where
    pi               = D pi 0
    exp (D x x')     = r where r = D (exp x) (x' * r)
    log p@(D x x')   = D (log x) (x' / p)
    sqrt (D x x')    = r where r = D (sqrt x) (x' / (2 * r))
    sin              = lift (cycle [sin, cos, negate . sin, negate . cos])
    cos              = lift (cycle [cos, negate . sin, negate . cos, sin])
    acos p@(D x x')  = D (acos x) (-x' / sqrt(1 - p*p))
    asin p@(D x x')  = D (asin x) ( x' / sqrt(1 - p*p))
    atan p@(D x x')  = D (atan x) ( x' / (p*p - 1))
    sinh x           = (exp x - exp (-x)) / 2
    cosh x           = (exp x + exp (-x)) / 2
    asinh x          = log (x + sqrt (x*x + 1))
    acosh x          = log (x + sqrt (x*x - 1))
    atanh x          = (log (1 + x) - log (1 - x)) / 2
&lt;/pre&gt;

And why not try the function g we defined above?
&lt;pre&gt;
Main&gt; g 10
0.6681539175313869
Main&gt; g (dVar 10)
0.6681539175313869
Main&gt; df $ g (dVar 10)
0.4047642621121782
Main&gt; df $ df $ g (dVar 10)
0.4265424381635987
Main&gt; df $ df $ df $ g (dVar 10)
-1.4395397945007182
&lt;/pre&gt;

It all works very nicely.  So now when we can compute the derivative of a function, let's define something somewhat more interesting with it.

Let's revisit the sqr function again.  It uses Newton-Raphson to find the square root.  How does &lt;a href="http://en.wikipedia.org/wiki/Newton-Raphson"&gt;Newton-Raphson&lt;/a&gt; actually work?  Given a differentiable function, &lt;i&gt;f(x)&lt;/i&gt;, it finds a zero by starting with some &lt;i&gt;x&lt;sub&gt;1&lt;/sub&gt;&lt;/i&gt; and then iterating
&amp;nbsp;&amp;nbsp;&amp;nbsp;  &lt;i&gt;x&lt;sub&gt;n+1&lt;/sub&gt; = x&lt;sub&gt;n&lt;/sub&gt; - f(x&lt;sub&gt;n&lt;/sub&gt;)/f'(x&lt;sub&gt;n&lt;/sub&gt;)&lt;/i&gt;
until we meet some convergence criterion.

Using this, let's define a function that finds a zero of another function:
&lt;pre&gt;
findZero f = convRel $ cut $ iterate step start
    where step x = x - val fx / val (df fx) where fx = f (dVar x)

   start = 1  -- just some value
   epsilon = 1e-10
   cut = (++ error "No convergence in 1000 steps") . take 1000
   convRel (x1:x2:_) | x1 == x2 || abs (x1+x2) / abs (x1-x2) &gt; 1/epsilon = x2
   convRel (_:xs) = convRel xs
&lt;/pre&gt;
The only interesting part is the step function that does one iteration with Newton-Raphson.  It computes &lt;tt&gt;f x&lt;/tt&gt; and then divides the normal value with the derivative.
We then produce the infinite list of approximations using step, then cut it of at some point (in case it doesn't converge), and then we look down the list for two values that are within some relative epsilon.

And it even seems to work.
&lt;pre&gt;
Main&gt; findZero (\x -&gt; x*x - 9)
3.0
Main&gt; findZero (\x -&gt; sin x - 0.5)
0.5235987755982989
Main&gt; sin it
0.5
Main&gt; findZero (\x -&gt; x*x + 9)
*** Exception: No convergence in 1000 steps
Main&gt; findZero (\x -&gt; sqr x - 3)
9.0
&lt;/pre&gt;
Note how it finds a zero of the sqr function which is actually using recursion internally to compute the square root.

So now we can compute numerical derivatives.  But wait!  We also have symbolic numbers.
Can we combine them?  Of course, that is the power of polymorphism.

Let's load up both modules:
&lt;pre&gt;
Data.Number.Symbolic Dif3&gt; let x :: Num a =&gt; Dif (Sym a); x = dVar (var "x")
Data.Number.Symbolic Dif3&gt; df $ x*x
x+x
Data.Number.Symbolic Dif3&gt; df $ sin x
cos x
Data.Number.Symbolic Dif3&gt; df $ sin (exp (x - 4) * x)
(exp (-4.0+x)*x+exp (-4.0+x))*cos (exp (-4.0+x)*x)
Data.Number.Symbolic Dif3&gt; df $ df $ sin (exp (x - 4) * x)
(exp (-4.0+x)*x+exp (-4.0+x)+exp (-4.0+x))*cos (exp (-4.0+x)*x)+(exp (-4.0+x)*x+exp (-4.0+x))*(exp (-4.0+x)*x+exp (-4.0+x))*(-sin (exp (-4.0+x)*x))
&lt;/pre&gt;
We define x to be a differentiable number, "the variable", over symbolic numbers, over some numbers.  And then we just happily use df to get the differentiated versions.

So we set out to compute numeric derivatives, and we got these for free.  Not too bad.

One final note, the Dif type is defined above can be made more efficient by not keeping all the infinite tails with 0 derivatives around.  In a real module for this, you'd want to make this optimization.

[Edit: fixed typo.]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-5447010763538785423?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/5447010763538785423/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=5447010763538785423' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/5447010763538785423'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/5447010763538785423'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2007/04/overloading-haskell-numbers-part-2.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-6604811264383212025</id><published>2007-04-11T23:02:00.000+01:00</published><updated>2007-04-13T13:10:57.380+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='overloading'/><category scheme='http://www.blogger.com/atom/ns#' term='Haskell'/><title type='text'></title><content type='html'>&lt;b&gt;Overloading Haskell numbers, part 1, symbolic expressions.&lt;/b&gt;

Haskell's overloaded numerical classes can be (ab)used to do some symbolic maths.  This is in no way a new discovery, but I thought I'd write a few lines about it anyway since I've been playing with it the last few days.

First we need a data type to represent expressions.  We want constants, variables, and function applications.  But we don't want to fix the type of the constants, so that will be a parameter to the type.
&lt;pre&gt;
data Sym a = Con a | Var String | App String [Sym a]
    deriving (Eq, Show)
&lt;/pre&gt;
And we also take the opportunity to derive Eq and Show.

Now we can actually claim that the type Sym N is a number if N is a number.  Let do it:
&lt;pre&gt;
instance (Num a) =&gt; Num (Sym a) where
    x + y         = App "+" [x, y]
    x - y         = App "-" [x, y]
    x * y         = App "*" [x, y]
    negate x      = App "negate" [x]
    abs    x      = App "abs"    [x]
    signum x      = App "signum" [x]
    fromInteger x = Con (fromInteger x)
&lt;/pre&gt;
A small interactive session shows that we are on the right track.
&lt;pre&gt;
Sym1&gt; let x = Var "x"
Sym1&gt; x*x + 5
App "+" [App "*" [Var "x",Var "x"],Con 5]
Sym1&gt; x*x*x
App "*" [App "*" [Var "x",Var "x"],Var "x"]
Sym1&gt; 2 + 3 :: Sym Int
App "+" [Con 2,Con 3]
&lt;/pre&gt;

We can type in normal looking expressions, but when they are printed the Show instance is used so we get to see the raw syntax tree.  That has its uses, but it gets old quickly.  We want a pretty printer.  To get the precedences right we need to define showsPrec and pass it the right arguments.  It's a little tedious, but nothing strange.
&lt;pre&gt;
instance (Show a) =&gt; Show (Sym a) where
    showsPrec p (Con c) = showsPrec p c
    showsPrec _ (Var s) = showString s
    showsPrec p (App op@(c:_) [x, y]) | not (isAlpha c) =
        showParen (p&gt;q) (showsPrec ql x . showString op . showsPrec qr y)
        where (ql, q, qr) = fromMaybe (9,9,9) $ lookup op [
                   ("**", (9,8,8)),
                   ("/",  (7,7,8)),
                   ("*",  (7,7,8)),
                   ("+",  (6,6,7)),
                   ("-",  (6,6,7))]
    showsPrec p (App "negate" [x]) =
        showParen (p&gt;=6) (showString "-" . showsPrec 7 x)
    showsPrec p (App f xs) =
        showParen (p&gt;10) (foldl (.) (showString f)
                                (map (\ x -&gt; showChar ' ' . showsPrec 11 x) xs))
&lt;/pre&gt;

Let's try the same examples again:
&lt;pre&gt;
Sym2&gt; let x = var "x"
Sym2&gt; x*x + 5
x*x+5
Sym2&gt; x*x*x
x*x*x
Sym2&gt; 2 + 3 :: Sym Int
2+3
&lt;/pre&gt;

Look we can type expressions and get them back again!

The instance Num (Sym a) isn't too bad, the only fishy thing about it is the Eq superclass that is required for Num.  We have Eq for Sym, but it doesn't really behave like it should.  E.g., the expression 'x==1' would come out as False since the syntax trees are not equal.  But this isn't really what we would like, ideally (==) would also turn into something symbol, but that is impossible with the standard Prelude.

Let's make some more instances.  A few of these definitions are just there to appease the Haskell numerical hierarchy and supply some operations it need.
&lt;pre&gt;
instance (Fractional a) =&gt; Fractional (Sym a) where
    x / y          = App "/" [x, y]
    fromRational x = Con (fromRational x)

instance (Real a) =&gt; Real (Sym a) where
    toRational (Con c) = toRational c

instance (RealFrac a) =&gt; RealFrac (Sym a) where
    properFraction (Con c) = (i, Con c') where (i, c') = properFraction c

instance (Floating a) =&gt; Floating (Sym a) where
    pi = App "pi" []
    exp = app1 "exp"
    sqrt = app1 "sqrt"
    log = app1 "log"
    (**) = app2 "**"
    logBase = app2 "logBase"
    sin = app1 "sin"
    tan = app1 "tan"
    cos = app1 "cos"
    asin = app1 "asin"
    atan = app1 "atan"
    acos = app1 "acos"
    sinh = app1 "sinh"
    tanh = app1 "tanh"
    cosh = app1 "cosh"
    asinh = app1 "asinh"
    atanh = app1 "atanh"
    acosh = app1 "acosh"

instance (RealFloat a) =&gt; RealFloat (Sym a) where
    exponent _ = 0
    scaleFloat 0 x = x
    atan2 = app2 "atan2"

app1 :: String -&gt; Sym a -&gt; Sym a
app1 f x = App f [x]

app2 :: String -&gt; Sym a -&gt; Sym a -&gt; Sym a
app2 f x y = App f [x, y]
&lt;/pre&gt;

Let's put this code to the test by bringing the Complex number module into scope.
&lt;pre&gt;
Sym3&gt; :m +Data.Complex
Sym3 Data.Complex&gt; let x=Var "x"; y=Var "y"
Sym3 Data.Complex&gt; sin (x:+y)
sin x*cosh y :+ cos x*sinh y
&lt;/pre&gt;
And by that last expression we have recovered the definition of complex sin as it is given in the Data.Complex module.  Let's try another one.

&lt;pre&gt;
Sym3 Data.Complex&gt; asinh(x:+y)
log (sqrt ((x+sqrt ((sqrt ((1.0+(x*x-y*y))*(1.0+(x*x-y*y))+(0.0+(x*y+y*x))*(0.0+
 (x*y+y*x)))+abs (1.0+(x*x-y*y)))/2.0))*(x+sqrt ((sqrt ((1.0+(x*x-y*y))*(1.0+(x*x
 -y*y))+(0.0+(x*y+y*x))*(0.0+(x*y+y*x)))+abs (1.0+(x*x-y*y)))/2.0))+(y+abs (0.0+
 (x*y+y*x))/(sqrt ((sqrt ((1.0+(x*x-y*y))*(1.0+(x*x-y*y))+(0.0+(x*y+y*x))*(0.0+
 (x*y+y*x)))+abs (1.0+(x*x-y*y)))/2.0)*2.0))*(y+abs (0.0+(x*y+y*x))/(sqrt ((sqrt
 ((1.0+(x*x-y*y))*(1.0+(x*x-y*y))+(0.0+(x*y+y*x))*(0.0+(x*y+y*x)))+abs (1.0+(x*x
 -y*y)))/2.0)*2.0)))) :+ atan2 (y+abs (0.0+(x*y+y*x))/(sqrt ((sqrt ((1.0+(x*x-y*y))
 *(1.0+(x*x-y*y))+(0.0+(x*y+y*x))*(0.0+(x*y+y*x)))+abs (1.0+(x*x-y*y)))/2.0)*2.0))
 (x+sqrt ((sqrt ((1.0+(x*x-y*y))*(1.0+(x*x-y*y))+(0.0+(x*y+y*x))*(0.0+(x*y+y*x)))+
 abs (1.0+(x*x-y*y)))/2.0))
&lt;/pre&gt;

Hmmmm, that might be right, but it's rather ugly.  There's also a lot of '0.0+...' in that expression.  We need something that can simplify expressions.  It would also be nice if all constant expressions were evaluated instead of stored.

To achieve this we are going to change the representation a little.  The App constructor will store the real function used to work on constants as well as the name of it.  And while we are at it, we'll get rid of the Var constructor.  We might as well use App with an empty argument list.

Furthermore, since this is starting to look useful, we'll give the module a proper name and export only the interface we want to be visible.  We will hide the details of the Sym type and just export some accessor functions.

The simplification happens in the binOp and unOp functions.  I have just put some algebraic laws there (assuming the underlying numeric type is a field).  The list of rewrites performed by these functions is far from complete.  It's just a few that I found useful.

Note how the code in binOp pattern matches on constants like 0, 1, and -1 directly.  This actually works because of the semantics of Haskell pattern matching against numeric literals.  Also note that the constraint on `a' is just Num, even though we do some simplifications with (/) which belongs in Fractional.

The instance declarations have been extended somewhat so that constant expressions in the Sym type will behave as the corresponding expressions in the underlying type.

A small final run
&lt;pre&gt;
*Data.Number.Symbolic Data.Complex&gt; 1+x+2
3+x
*Data.Number.Symbolic Data.Complex&gt; 1+x*(y-y)-1
0
*Data.Number.Symbolic Data.Complex&gt; sin(x:+1e-10)
sin x :+ 1.0e-10*cos x
*Data.Number.Symbolic Data.Complex&gt; asinh(x:+y)
log (sqrt ((x+sqrt ((sqrt ((1.0+(x*x-y*y))*(1.0+(x*x-y*y))+(x*y+y*x)*(x*y+y*x))+abs
 (1.0+(x*x-y*y)))/2.0))*(x+sqrt ((sqrt ((1.0+(x*x-y*y))*(1.0+(x*x-y*y))+(x*y+y*x)*
 (x*y+y*x))+abs (1.0+(x*x-y*y)))/2.0))+(y+abs (x*y+y*x)/(2.0*sqrt ((sqrt ((1.0+(x*x
 -y*y))*(1.0+(x*x-y*y))+(x*y+y*x)*(x*y+y*x))+abs (1.0+(x*x-y*y)))/2.0)))*(y+abs (x*y
 +y*x)/(2.0*sqrt ((sqrt ((1.0+(x*x-y*y))*(1.0+(x*x-y*y))+(x*y+y*x)*(x*y+y*x))+abs
 (1.0+(x*x-y*y)))/2.0))))) :+ atan2 (y+abs (x*y+y*x)/(2.0*sqrt ((sqrt ((1.0+(x*x
 -y*y))*(1.0+(x*x-y*y))+(x*y+y*x)*(x*y+y*x))+abs (1.0+(x*x-y*y)))/2.0))) (x+sqrt
 ((sqrt ((1.0+(x*x-y*y))*(1.0+(x*x-y*y))+(x*y+y*x)*(x*y+y*x))+abs (1.0+(x*x-y*y)))/2.0))
&lt;/pre&gt;

As the final example shows, there is still a lot to do.
Also note how the underlying numeric type has defaulted to Double, and we have a loss of precision in the second to last example.  But an implementation of real numbers instead of floating point numbers will have to wait until a later posting.

&lt;pre&gt;
module Data.Number.Symbolic(Sym, var, con, subst, unSym) where

import Data.Char(isAlpha)
import Data.Maybe(fromMaybe)
import Debug.Trace

data Sym a = Con a | App String ([a]-&gt;a) [Sym a]

instance (Eq a) =&gt; Eq (Sym a) where
    Con x      == Con x'        =  x == x'
    App f _ xs == App f' _ xs'  =  (f, xs) == (f', xs')
    _          == _             =  False

instance (Ord a) =&gt; Ord (Sym a) where
    Con x      `compare` Con x'        =  x `compare` x'
    Con _      `compare` App _ _ _     = LT
    App _ _ _  `compare` Con _         = GT
    App f _ xs `compare` App f' _ xs'  =  (f, xs) `compare` (f', xs')

var :: String -&gt; Sym a
var s = App s undefined []

con :: a -&gt; Sym a
con = Con

subst :: (Num a) =&gt; String -&gt; Sym a -&gt; Sym a -&gt; Sym a
subst _ _ e@(Con _) = e
subst x v e@(App x' _ []) | x == x' = v
                         | otherwise = e
subst x v (App s f es) =
    case map (subst x v) es of
    [e] -&gt; unOp (\ x -&gt; f [x]) s e
    [e1,e2] -&gt; binOp (\ x y -&gt; f [x,y]) e1 s e2
    es' -&gt; App s f es'

unSym :: (Show a) =&gt; Sym a -&gt; a
unSym (Con c) = c
unSym e = error $ "unSym called: " ++ show e

instance (Show a) =&gt; Show (Sym a) where
    showsPrec p (Con c) = showsPrec p c
    showsPrec _ (App s _ []) = showString s
    showsPrec p (App op@(c:_) _ [x, y]) | not (isAlpha c) =
        showParen (p&gt;q) (showsPrec ql x . showString op . showsPrec qr y)
        where (ql, q, qr) = fromMaybe (9,9,9) $ lookup op [
                   ("**", (9,8,8)),
     ("/",  (7,7,8)),
     ("*",  (7,7,8)),
     ("+",  (6,6,7)),
     ("-",  (6,6,7))]
    showsPrec p (App "negate" _ [x]) =
        showParen (p&gt;=6) (showString "-" . showsPrec 7 x)
    showsPrec p (App f _ xs) =
        showParen (p&gt;10) (foldl (.) (showString f) (map (\ x -&gt; showChar ' ' . showsPrec 11 x) xs))

instance (Num a) =&gt; Num (Sym a) where
    x + y         = binOp (+) x "+" y
    x - y         = binOp (-) x "-" y
    x * y         = binOp (*) x "*" y
    negate x      = unOp negate "negate" x
    abs    x      = unOp abs    "abs"    x
    signum x      = unOp signum "signum" x
    fromInteger x = Con (fromInteger x)

instance (Fractional a) =&gt; Fractional (Sym a) where
    x / y          = binOp (/) x "/" y
    fromRational x = Con (fromRational x)

-- Assume the numbers are a field and simplify a little
binOp :: (Num a) =&gt; (a-&gt;a-&gt;a) -&gt; Sym a -&gt; String -&gt; Sym a -&gt; Sym a
binOp f (Con x) _ (Con y) = Con (f x y)
binOp _ x "+" 0 = x
binOp _ 0 "+" x = x
binOp _ x "+" (App "+" _ [y, z]) = (x + y) + z
binOp _ x "+" y | isCon y &amp;&amp; not (isCon x) = y + x
binOp _ x "+" (App "negate" _ [y]) = x - y
binOp _ x "-" 0 = x
binOp _ x "-" x' | x == x' = 0
binOp _ x "-" (Con y) | not (isCon x) = Con (-y) + x
binOp _ _ "*" 0 = 0
binOp _ x "*" 1 = x
binOp _ x "*" (-1) = -x
binOp _ 0 "*" _ = 0
binOp _ 1 "*" x = x
binOp _ (-1) "*" x = -x
binOp _ x "*" (App "*" _ [y, z]) = (x * y) * z
binOp _ x "*" y | isCon y &amp;&amp; not (isCon x) = y * x
binOp _ x "*" (App "/" f [y, z]) = App "/" f [x*y, z]
{-
binOp _ x "*" (App "+" _ [y, z]) = x*y + x*z
binOp _ (App "+" _ [y, z]) "*" x = y*x + z*x
-}
binOp _ x "/" 1 = x
binOp _ x "/" (-1) = -x
binOp _ x "/" x' | x == x' = 1
binOp _ x "/" (App "/" f [y, z]) = App "/" f [x*z, y]
binOp f x op y = App op (\ [a,b] -&gt; f a b) [x, y]

unOp :: (Num a) =&gt; (a-&gt;a) -&gt; String -&gt; Sym a -&gt; Sym a
unOp f _ (Con c) = Con (f c)
unOp _ "negate" (App "negate" _ [x]) = x
unOp _ "abs" e@(App "abs" _ _) = e
unOp _ "signum" e@(App "signum" _ _) = e
unOp f op x = App op (\ [a] -&gt; f a) [x]

isCon :: Sym a -&gt; Bool
isCon (Con _) = True
isCon _ = False


instance (Real a) =&gt; Real (Sym a) where
    toRational (Con c) = toRational c

instance (RealFrac a) =&gt; RealFrac (Sym a) where
    properFraction (Con c) = (i, Con c') where (i, c') = properFraction c

instance (Floating a) =&gt; Floating (Sym a) where
    pi = var "pi"
    exp = unOp exp "exp"
    sqrt = unOp sqrt "sqrt"
    log = unOp log "log"
    x ** y = binOp (**) x "**" y
    logBase x y = binOp logBase x "logBase" y
    sin = unOp sin "sin"
    tan = unOp tan "tan"
    cos = unOp cos "cos"
    asin = unOp asin "asin"
    atan = unOp atan "atan"
    acos = unOp acos "acos"
    sinh = unOp sinh "sinh"
    tanh = unOp tanh "tanh"
    cosh = unOp cosh "cosh"
    asinh = unOp asinh "asinh"
    atanh = unOp atanh "atanh"
    acosh = unOp acosh "acosh"

instance (RealFloat a) =&gt; RealFloat (Sym a) where
    floatRadix = floatRadix . unSym
    floatDigits = floatDigits . unSym
    floatRange  = floatRange . unSym
    decodeFloat (Con c) = decodeFloat c
    encodeFloat m e = Con (encodeFloat m e)
    exponent (Con c) = exponent c
    exponent _ = 0
    significand (Con c) = Con (significand c)
    scaleFloat k (Con c) = Con (scaleFloat k c)
    scaleFloat _ x = x
    isNaN (Con c) = isNaN c
    isInfinite (Con c) = isInfinite c
    isDenormalized (Con c) = isDenormalized c
    isNegativeZero (Con c) = isNegativeZero c
    isIEEE = isIEEE . unSym
    atan2 x y = binOp atan2 x "atan2" y
&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-6604811264383212025?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/6604811264383212025/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=6604811264383212025' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/6604811264383212025'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/6604811264383212025'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2007/04/overloading-haskell-numbers-part-1.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-25541020.post-117612774383256677</id><published>2007-04-09T15:08:00.000+01:00</published><updated>2007-04-09T15:19:48.176+01:00</updated><title type='text'></title><content type='html'>&lt;span style="font-weight: bold;"&gt;Well, it was bound to happen.&lt;/span&gt;

I have started a blog, just like everyone else.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/25541020-117612774383256677?l=augustss.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://augustss.blogspot.com/feeds/117612774383256677/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=25541020&amp;postID=117612774383256677' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/117612774383256677'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/25541020/posts/default/117612774383256677'/><link rel='alternate' type='text/html' href='http://augustss.blogspot.com/2007/04/well-it-was-bound-to-happen.html' title=''/><author><name>augustss</name><uri>http://www.blogger.com/profile/07327620522294658036</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
