Quicksort in Haskell
Quicksort is a commonly used example of how succinct Haskell programming can be.
It usually looks something likes this:
qsort :: (Ord a Bool) => [a] -> [a]
qsort [] = []
qsort (x:xs) = qsort (filter (<= x) xs) ++ [x] ++ qsort (filter (> x) xs)
The problem with this function is that it's not really Quicksort. Viewed through sufficently blurry glasses (or high abstraction altitude) it's looks like the real Quicksort. What they have in common is overall algorithm: pick a pivot (always the first element), then recursively sort the ones that are smaller, the ones that are bigger, and then stick it all together.
But in my opinion the real Quicksort has to be imperative because it relies on destructive update; and it uses a very elegant partitioning algorithm. The partitioning works like this: scan from the left for an element bigger than the pivot, then scan from the right for an element smaller than the pivot, and then swap them. Repeat this until the array has been partitioned.
Haskell has a variety of array types with destructive updates (in different monads), so it's perfectly possible to write the imperative Quicksort in Haskell.
To make the code look reasonably nice I'm going to use my C-like DSEL to write the code.
Here it is:
qsortM :: forall i a m r arr .
(MonadRef m r, Num i, Ix i, MArray arr a m, Ord a Bool) =>
arr i a -> m (arr i a)
qsortM ma = runE $ do
(lb, ub) <- embed $ getBounds ma
let mlb = pure0 lb
mub = pure0 ub
a <- liftArray ma
let qsort' l r =
if1 (r > l) $ do
i <- auto l
j <- auto (r+1)
let v = a[l] :: E m a
iLTj = i < (j :: E m i)
while iLTj $ do
while ((i += 1) < mub && a[i] < v)
skip
while (a[j -= 1] > v)
skip
if1 iLTj $ do
a[i] =:= a[j]
a[l] =:= a[j]
qsort' l (j-1)
qsort' i r
qsort' mlb mub
return ma
So the type is
arr i a -> m (arr i a), i.e.,
qsortM takes an array indexed with
i and elements of type
a. It returns the sorted array, but the sorting takes places in some monad
m.
And then there are all kinds of constraints on the type variables.
The
MonadRef m r says that the monad has to have references so we can have some variables.
The array index has to be in the
Ix; that's part of the general array constraints. It also have to be numeric so we can add and subtract indicies.
The array type has to fulfill
MArray arr a m which means that
arr is an array of
a and updatable in monad
m.
Finally, the elements have to be ordered. I'm not using the normal
Ord class, but instead an overloaded
Ord where the return type is overloaded too.
A few comments on the code. The
liftArray function lifts a regular array into one that can be indexed with
[i]. The
=:= operator swaps two variables. The
skip function does nothing. In traditional C style we do all the side effects while computing the condition.
There are some type signatures in the code that are annoying, but that I have not found a way around yet.
But otherwise, the code proceeds like most any imperative Quicksort. The
i and
j variables scan the array from left and right to locate two elements that need swapping. We then swap them, and continue scanning until the indicies cross. After the partitioning we swap the pivot into place and sort the two parts recursively.
So, this function is polymorphic in the monad. But there is one monad that I think is extra interesting, namely
ST. With this monad you can do references, updatable arrays, etc., and finally you can seal it all of with
runST. The resulting type is pure and shows no signs of what happened inside.
This is a really amazing feat, in my opinion. The type checker performs the proof that nothing about the "dirty" innards leaks out. So instead of some informal reasoning that a function with an impure inside can be pure on the outside you have a machine checked proof. Of course, there's a meta proof that this is all correct, but John Launchbury and Simon Peyton Jones have already done that once, and now we just need the type checker.
Here's the code:
qsortA :: forall i a . (Num i, Ix i, Ord a Bool) => Array i a -> Array i a
qsortA a = runSTArray sa
where sa :: ST s (STArray s i a)
sa = thaw a >>= qsortM
We're using
runSTArray instead of
runST, because it provides an efficient way to turn a mutable array on the inside into an immutable array on the outside.
The
thaw function turns an immutable array into a mutable one, but it has to make a copy to be safe, since we don't want to mutate the original.
Finally, if we want to sort lists we can always convert back and forth.
asList :: (Array Int a -> Array Int a) -> ([a] -> [a])
asList f xs = elems . f . listArray (1, length xs) $ xs
qsort :: (Prelude.Ord a) => [a] -> [a]
qsort = asList qsortA
The final
qsort has the normal type signature (I switched to the normal
Ord again).
Interesting. I've sometimes thought of doing a shuffle function like this, since O(n) shuffling as described by Knuth also needs a constant time swap item operation. Oleg has described a functional shuffle algorithm, but its O(n * log n)
ReplyDeleteOn the other hand I wonder if you couldn't do swapping in some wierd zipper structure. I'll think about it.
Thanks for posting -- I'll definitely be looking at this again as I learn more Haskell.
ReplyDeletewonderful! You've made Haskell into C!
ReplyDeleteI thought the whole point of using high level languages was to favor and focus on higher level abstractions and get away from low level details and mechanics. You've just brought it all into your version: state, lots of verbose low-level machinery and imperative thinkage...
look: if you want C, you know where to get it. Let's not fill Haskell code with lots of monads and verbosity just for the sake of it, right?
namekuseijin: Whilst I appreciate what you're saying I firmly believe that if you can save such a large amount of memory, do so.
ReplyDeletepaul johnson: thanks for the tip about an O(n) shuffle, I'll have a look at it.
Blog Author: Thanks for such an informative article on the 'proper' quicksort.
Thanks for posting this code. Only yesterday I was wondering what quicksort in an 'elegant' language (e.g. lisp) would look like. I tried it, and quickly realized making things as efficient as C wasn't so simple. I remembered that Haskell had the 'shortest' quicksort definition of nearly any language, then wondered what an efficient haskell version would look like... and now thanks to haskellwiki, here I am. :)
ReplyDeleteThanks for the articles. It's so can be increased my knowlegde about this..
ReplyDeleteIndonesia Siap Bersaing di SERP | Rumah Mungil Yang Sehat
So, it's certainly more verbose, and uglier. But how much faster is this code than the elegant version?
ReplyDeleteThe problem is not functional vs. imperative. The extra time of the "elegant" solution is because 'filter' is called twice with the same argument on each recursion, so that total time is much more than twice the neccessary.
ReplyDeleteThe solution to keep it elegant and short is to use a modified version of filter who return both, the filtered list and the rest of the list, in order to call it once:
qsort :: Ord a => [a] -> [a]
qsort [] = []
qsort (x:xs) = qsort ls ++ [x] ++ qsort gs
where (ls, gs) = splitFilter (< x) xs
splitFilter :: (a -> Bool) -> [a] -> ([a], [a])
splitFilter p xs = sep xs [] [] where
sep [] ps qs = (ps, qs)
sep (o:os) ps qs
| p o = sep os (o:ps) qs
| otherwise = sep os ps (o:qs)
Enrique, I'm aware of how to make functional quicksort faster. If you take a look at Data.List you can find a qsort (and a merge sort) originally written by me.
ReplyDeleteI stand by my original post; if you don't do the clever in-place partitioning I don't really think it's quicksort.
Ok, augustss, I agree, sorry. Your point of view is good, as well as your article and your programming work.
ReplyDeleteThe problem is that this article is cited in Why Haskell matters after the typical Haskell version of quicksort with the following sentence:
"This implementation has very poor runtime and space complexity, but that can be improved, at the expense of some of the elegance."
It is true, of course, but it makes people who try Haskell to think that it is not a serious language, just a beautiful academic one. The same idea is expressed in Haskellwiki Introduction:
"the Haskell program allocates quite a lot of extra memory behind the scenes, and runs rather slower than the C program."
My point is that there are better implementations of quicksort in Haskell (ok, not true quicksort) between the typical simple one and the long and difficult to read like yours. So, it is not a criticism to you, but to the introduction texts which trying to show Haskell so simple and elegant makes new people to get out of it.
I am not Haskell expert (that is why I read introduction texts). I noticed later that there is the function 'partition' on Data.List that makes the same as my 'splitFilter'. If we substitute the slow (++) operator with (:) in the second place we gain a bit more. Or better we could use Data.Sequence instead of Data.List, improving efficiency with very low loss in elegance, simplicity and readability. So, introduction texts are comparing a bad quicksort implementation in functional programming with the best in imperative programming. That is my criticism.
Your proposal is very interesting for experts, Haskell with monads provides a cleaner way to do destructive updates than C or other imperative languages.
Nice tutorial sir,
ReplyDeleteI have taken reference from this link below
Its easy to understand
Algorithm and Implementation Program of Quick Sort
http://geeksprogrammings.blogspot.com/2014/02/algorithm-quick-sort-program.html
This comment has been removed by the author.
ReplyDeleteCould you explain how to write the code without the C-like DSEL, or give a link to where I can find the documentation for the DSEL? Thanks.
ReplyDelete