Quicksort

Posted on February 14, 2016 by Tommy McGuire
Labels: theory, software development, notation, pony, Dijkstra, programming language

Recently, I got pointed to another new language (yet another new language). And yet again, this one holds a great deal of appeal for me. Among other things, it is built around actor-based concurrency (which, given my network protocol background makes me a happy person (although network protocols and in-language communication/remote procedure calls are generally a bad idea)), it is strongly typed with a modern type system that includes techniques to make concurrency faster, it uses garbage collection with a custom high-performance concurrent collector, and it ahead-of-time compiles to native code making it fast for the kind of things I tend to do. Plus, it's named Pony. I mean, what more could I want?

While I was playing with it, I discovered a need to sort an array of stuff. Unfortunately, Pony is very new, and hasn't grown a sort yet in its standard library. Fortunately, we have the technology to remedy that. We can make Pony better, stronger, faster. Or something. (Anyway, this is the first reasonably serious Pony code that I've written; I'd appreciate any feedback from anyone on it. The Pony version I'm using is 0.2.1-484-g298b292 (built locally from github).)

(Linguistic note: I'm a native English speaker. Lower array indices are to the left, higher to the right.)

Here's quicksort. It's fast, very simple, and lovely in many respects.

fun sort[T: (Comparable[T] val & Stringable)](a: Array[T] ref) =>
try _quicksort[T](a, 0, a.size() - 1) end

fun _quicksort
[T: (Comparable[T] val & Stringable)]
(a: Array[T] ref, left: USize val, right: USize val) ? =>
let pivot = _find_pivot[T](a, left, right)
(let i, let j) = _partition[T](a, pivot, left, right)
if left < j then
_quicksort[T](a, left, j)
end
if i < right then
_quicksort[T](a, i, right)
end

Here's some syntactic details:

So, that's quicksort. (Yay, Sir Tony Hoare!) Simple, eh? Nothin' to it.

Oh, yeah, there are a couple of other parts: finding the pivot element for partitioning and partitioning the array (which actually does the work of sorting). Guess I should mention those...

The pivot

Finding the right pivot is very important for quicksort. A basic solution is to choose the first or last element of the array, but that causes problems when sorting a previously-sorted or nearly sorted array: quicksort's basic performance is O(n log n) in the usual case, but in the worst case becomes O(n2); the worst case is a previously sorted array when the pivot is chosen from the first or last elements.

In _find_pivot, I'm following a suggestion from Robert Sedgewick: the median-of-three method. The pivot is the median of the first element of the array, the last element of the array, and the midpoint of the array. (First index plus last index, divided by two; yay integer division!)

fun _min[T: Comparable[T] val](a: T, b: T): T =>
if a < b then a else b end

fun _max[T: Comparable[T] val](a: T, b: T): T =>
if a < b then b else a end

fun _find_pivot
[T: Comparable[T] val]
(ary: Array[T] box, left: USize val, right: USize val): T ? =>
let mid = (left + right) / 2
let a = ary(left)
let b = ary(right)
let median = _max[T](_min[T](a,b), _min[T](ary(mid), _max[T](a,b)))
median

It's a bit of a complicated dance with max and min, but you can just take my word for it if you like.

The pivot here is just the value of an array element. I'm not swapping it into the first or last location in the array; instead I am copying the value and treating the pivot's location as just another array location. This may not be the best approach---it involves creating a copy of the value---but should work for a garbage collected language where I can assume that any large, complicated structures are allocated on the heap and so I'm just copying a pointer or other primitive.

Partitioning

In case you haven't heard this often enough, here's how quicksort works: it starts with an array and a pivot, partitions the array into (at least) two parts (each possibly empty) with all of the elements in one part being less than or equal to the pivot and all of the elements in the other part being greater than or equal to the pivot, and then recursively calls itself (as seen above) on the two parts. The recursion stops when there are too few elements in a partition to be sorted (left is greater than or equal to j, for example, in the code above).

There is one complication to partitioning: most published versions of the algorithm use signed integers as array indices, but most modern programming languages use unsigned integers. For example, this is a line from a basic quicksort implementation (in C++) from a presentation by Robert Sedgewick:

int i = l-1, Item v = a[r];

l in this case is the left, lowest, index into the array. This works fine if indices are signed; if l is zero, i is -1, and is a perfectly valid flag value if we are careful not to use it as an index before incrementing it. But if array indices are unsigned and i is used as an index and l is zero, then i is assigned something like 264-1, which is larger than j, l, r, and all the other values around. Ain't nobody going to be happy with that.

Robert even says, as part of a list of subtle implementation issues,

Staying in bounds. If the smallest item or the largest item in the array is the partitioning item, we have to take care that the pointers do not run off the left or right ends of the array, respectively.

But he doesn't mention unsigned indices. Thanks, Robert.

It is not very hard to adjust a simple partition algorithm to work with unsigned integers. The normal partitioning algorithm above has an invariant, a logical predicate that applies after the variables are initialized, after each execution of the loop, and at the end of the loop, such that the next item to look at (in the part of the algorithm moving from left to right using i) is at i+1. Adjusting this algorithm for unsigned indices means changing that invariant to have the next item to look at be at index i.

Invariants like that are an important part of algorithms and writing code in general. But more on that later...

Here is the adjusted partition function:

fun _partition
[T: Comparable[T] val]
(a: Array[T] ref,
pivot: T,
left: USize val,
right: USize val): (USize val, USize val) ? =>
var i = left
var j = right
while i <= j do
while (i < right) and (a(i) < pivot) do
i = i + 1
end
while (left < j) and (a(j) > pivot) do
j = j - 1
end
if i <= j then
if i < j then
a(i) = a(j) = a(i)
end
if i < right then
i = i + 1
end
if left < j then
j = j - 1
end
end
end
(i,j)

Syntax note: One neat feature of Pony is that an assignment expression,

x = y

returns the old value of x, rather than the value of y, which is the new value of x. This makes it possible to swap two values by writing

x = y = x

The right assignment assigns the old value of x to y and the left assignment assigns the old value of y to x. Very nice for type systems that involve uniqueness; I wish the Rust folks had thought of it.

The invariant is:

∧ ∀x : left ⩽ x < i : a(x) ⩽ pivot

∧ ∀y : j < y ⩽ right : pivot ⩽ a(y)

∧ left ⩽ i,j ⩽ right

English translation: all of the elements in the array to the left of i are less than or equal to the pivot. All of the elements in the array to the right of j are greater than or equal to the pivot. Oh, and i and j are valid indices in the array. At the beginning, this is true because i equals left and j equals right.

The first inner loop moves i to the right until it finds an element greater than or equal to the pivot; the second moves j to the left likewise. The two elements are swapped if necessary, and the outer loop continues until i is greater than j. At that point, the invariant insures that the array has been broken into three possibly empty partitions: a leftmost from left to j that is less than or equal to the pivot, a rightmost from i to right that is greater than or equal to the pivot, and a partition between j and i that will be equal to the partition. (Demonstrating that last bit is left as an exercise for the reader. It does provide an interface that comes in useful later.) This code goes to no effort to ensure that the third, middle, partition is not empty; It might be, because of the invariant and because i is greater than j, but there is no effort to maximize it by ensuring all of the elements equal to the pivot are in that partition.

Oh, and above, where I mentioned that i indicated an element that hadn't been examined yet? Having i greater than j ensures that the element indicated by i has been examined.

Quicksort, as presented here, works. The code is relatively simple and likely to result in good code from the compiler. There are a couple of things that could be done to improve it:

Three partitions

The key is to officially partition the array into three parts: one with elements less than the pivot, one with elements equal to the pivot, and one with elements greater than the pivot. The recursive calls are limited to the first and last partitions.

This is where E.W. Dijkstra comes into the picture. One of his creations is a "programming exercise" (as Sedgewick puts it) called the Dutch National Flag problem: sorting an array with three possible values for each element, ending with something that resembles the three color bands of the Netherlands flag. See Dijkstra's book, A Discipline of Programming, for more information. (I thought there would be orange in there, somewhere. Hm.) Unfortunately, Dijkstra's algorithm for the problem is a bit excessive...er, I mean, not quite as snazzy as one from Jon Bentley and Doug McIlroy.

The Bentley-McIlroy algorithm is a variant of the simple partition above, except that as it scans the array it accumulates the elements equal to the pivot at the extreme left and right of the array. During its initial phase, the array is a group of elements equal to the pivot (and seen by the iterator moving right), a group of elements less than the pivot, the unexamined elements, a group of elements greater than the pivot, and another group equal to the pivot (but found by the iterator moving left). After this, it swaps the portions of the array equal to the pivot into the center and produces the less than, equal to, greater than structure that we need.

And with that, we're back to the problem of working an incompatible description of an algorithm into an uncaring programming language. Unfortunately, Bentley-McIlroy is a bit more complicated. I spent a bit of time trying to adjust it, but I could never get it to work properly.

You see, I'm not an algorithms guy. I'm a moderately good systems programmer and network guy, and a tolerable sysadmin, but I do struggle with algorithms. Fortunately, when I'm starting on a project, whether it's a completely new application (or whatever), or an algorithmic struggle, I have a very good rule of thumb: Get something working, even if it's just hello world, then improve it, going from working state to working state, until I'm satisfied with it.

So I decided to start from scratch, using a basic definition of the algorithm. One basic definition of the algorithm and some C++ code is available on slides 8 and 9 of Quicksort is Optimal.

Starting from scratch in this case meant going back to the invariant that I wanted; I came up with this:

∧ ∀w : left ⩽ w < p : a(w) = pivot

∧ ∀x : p ⩽ x < i : a(x) < pivot

∧ ∀y : j < y ⩽ q : pivot < a(y)

∧ ∀z : q < z ⩽ right : pivot = a(z)

∧ left ⩽ i,j,p,q ⩽ right

English translation: All of the elements of the array from left to p, not including p, are equal to the pivot. All of the elements from p to i, not including i, are less than the pivot. All of the elements between j and q, but not including j, are greater than the pivot. All of the elements between q and right are equal to the pivot. And i, j, p, and q are valid indices.

The invariant is established when p and i are equal to left and q and j are equal to right.

And then I wrote the following code following Robert and Jon's examples and discussion for the initial loop maintaining that invariant.

fun _bentley_mcilroy
[T: (Comparable[T] val & Stringable)]
(a: Array[T] ref,
pivot: T,
left: USize val,
right: USize val): (USize val, USize val) ? =>
var p = left
var i = left
var j = right
var q = right
while true do
// step 1
while (i < j) and (a(i) <= pivot) do
i = i + 1
end
// step 2
var p' = p
while p' < i do
if a(p') == pivot then
a(p') = a(p) = a(p')
p = p + 1
end
p' = p' + 1
end
// step 3
while (i < j) and (pivot <= a(j)) do
j = j - 1
end
// step 4
var q' = q
while j < q' do
if a(q') == pivot then
a(q') = a(q) = a(q')
q = q - 1
end
q' = q' - 1
end
// step 5
if i < j then
a(i) = a(j) = a(i)
else
break
end
end
// step 6
if (a(i) <= pivot) and (i < right) then
i = i + 1
end
if (pivot <= a(j)) and (left < j) then
j = j - 1
end
...

There's more to it. I never said the initial states of a project were pretty.

Syntax note: Pony allows a single quote character in identifiers, so I can use q' to indicate a variable that is related to q, but different.

Here's what it does:

  1. Step 1 scan from left to right using i until i points to an element greater than the pivot value (or until i is greater than or equal to j). This helps establish the second branch of the invariant and advances the overall algorithm.

    One note: Sedgewick makes a big point of the fact that this iteration should stop when a(i) == pivot, in order to split the array into similarly sized chunks for the recursive calls. My feeling is that this is less necessary, since a three-way partition will not recurse on elements that are equal to the partition.

  2. Step 2 scan from the left again, starting from p. If one of these elements is equal to the pivot, swap it with p and advance p. This maintains the first branch of the invariant and finishes maintaining the second branch.

  3. Step 3 and Step 4 perform the same operations as Step 1 and Step 2, moving from the right and manipulating j and q. These steps maintain the third and fourth branch of the invariant.

  4. Step 5 If i is less than j, then a(i) is greater than the pivot and a(j) is less than the pivot. Exchange them and continue the loop. If i is greater than or equal to j (By the nature of that code, this branch will only be true when i equals j.), the loop has examined (almost) all of the elements in the array; stop.

  5. Step 6 Since at this point, i equals j and this element hasn't been examined, move i to the right if it is less than or equal to the pivot and move j to the left if it is greater than or equal. Following this step, j < i and any elements between i and j are equal to the pivot.

At this point, we are set up to swap the elements equal to the pivot from the leftmost and rightmost of the array into the center. We have also done a ton of extra work, particularly in Steps 2 and 4; repeatedly scanning from p to i converts this from a linear algorithm to O(n2).

The first thing I did was to remove that extra work, since it was killing my run-times while I was trying to make sure I was on the right track here. It's relatively easy to do; move the declarations of p' and q' outside the loop and use them to record which elements have not only been examined, but have been examined for equality. Then, I was back with a linear running time and a working algorithm.

But the next thing I did obviated the need for that entirely. What the Bentley-McIlroy algorithm wants to do is to fold the loops of Steps 1 and 2 together. Removing p' and fusing the loops produces something like this, for Steps 1 and 2:

  while (i < j) and (a(i) <= pivot) do
if a(i) == pivot then
a(i) = a(p) = a(i)
p = p + 1
end
i = i + 1
end

Following this, i == j or a(i) > pivot. Following similar code for j, i equals j or a(j) < pivot. If i does not equal j, the two elements need to be swapped and the outer loop continued. Otherwise the outer loop needs to be broken. Now, in general, I don't like using break, and I don't like using infinite loops. In this case, I don't have to, since the break is at the end of the iteration. So here's my final version of the first bit of the Bentley-McIlroy partition:

fun _bentley_mcilroy
[T: (Comparable[T] val & Stringable)]
(a: Array[T] ref,
pivot: T,
left: USize val,
right: USize val): (USize val, USize val) ? =>
var p = left
var i = left
var j = right
var q = right
while i < j do
while (i < j) and (a(i) <= pivot) do
if a(i) == pivot then
a(i) = a(p) = a(i)
p = p + 1
end
i = i + 1
end
while (i < j) and (pivot <= a(j)) do
if a(j) == pivot then
a(j) = a(q) = a(j)
q = q - 1
end
j = j - 1
end
if i < j then
a(i) = a(j) = a(i)
end
end
if (a(i) <= pivot) and (i < right) then
i = i + 1
end
if (pivot <= a(j)) and (left < j) then
j = j - 1
end
...

As for the remainder of the function, it needs to swap the elements equal to the pivot from the ends to the middle. For the left end, that is done with this code:

  if p <= j then
var k = left
while k < p do
a(k) = a(j) = a(k)
k = k + 1
j = j - 1
end
else
j = left
end

There are two cases (obviously) here:

The right side of the array needs to be handled similarly.

A slight further improvement is possible. Introducing the new variable k is fun and all, but a certain natural parsimony makes me want to get rid of it if I can. It's not difficult, but requires modifying p before the exchange. And so here, in all its glory, is the final version of the Bentley-McIlroy partition, for unsigned array indices in general and Pony specifically.

fun _bentley_mcilroy
[T: (Comparable[T] val & Stringable)]
(a: Array[T] ref,
pivot: T,
left: USize val,
right: USize val): (USize val, USize val) ? =>
var p = left
var i = left
var j = right
var q = right
while i < j do
while (i < j) and (a(i) <= pivot) do
if a(i) == pivot then
a(i) = a(p) = a(i)
p = p + 1
end
i = i + 1
end
while (i < j) and (pivot <= a(j)) do
if a(j) == pivot then
a(j) = a(q) = a(j)
q = q - 1
end
j = j - 1
end
if i < j then
a(i) = a(j) = a(i)
end
end
if (a(i) <= pivot) and (i < right) then
i = i + 1
end
if (pivot <= a(j)) and (left < j) then
j = j - 1
end
if p <= j then
while left < p do
p = p - 1
a(p) = a(j) = a(p)
j = j - 1
end
else
j = left
end
if i <= q then
while q < right do
q = q + 1
a(q) = a(i) = a(q)
i = i + 1
end
else
i = right
end
(i,j)

For comparison, this is Robert's Bentley-McIlroy 3-way partition in Java (Note: a[lo] holds the pivot.):

    // Bentley-McIlroy 3-way partitioning
int i = lo, j = hi+1;
int p = lo, q = hi+1;
Comparable v = a[lo];
while (true) {
while (less(a[++i], v))
if (i == hi) break;
while (less(v, a[--j]))
if (j == lo) break;

// pointers cross
if (i == j && eq(a[i], v))
exch(a, ++p, i);
if (i >= j) break;

exch(a, i, j);
if (eq(a[i], v)) exch(a, ++p, i);
if (eq(a[j], v)) exch(a, --q, j);
}


i = j + 1;
for (int k = lo; k <= p; k++)
exch(a, k, j--);
for (int k = hi; k >= q; k--)
exch(a, k, i++);

(Careful use of pre- and post-increments is a marketable skill.)

My final version, used in the quicksort function, is slightly slower on my test case than the simple partition above, due to the extra work it does to separate elements equal to the pivot. Which is not bad; the initial working version, even after fixing the O(n2) behavior, was significantly slower. Fiddling around to improve the code, to join loops and simplify tests, also improved its performance.

Victory!

So what?

Weirdly, I've actually got a point to make this time with this pile of gibberish. I know, it's odd, and I'm not entirely sure what to do. Here goes.

There is a stream of anti-intellectualism in software development. (Whew. I said it.)

I think there always has been. Way back, years ago, in fact several oil-booms ago, a grad student I knew in Computer Science went to a job interview at Schlumberger. At that time, some of Schlumberger's seismic processing software was written in Common Lisp, and UT Austin was a big Common Lisp school. During the interview, a senior engineering guy told the student that hiring Computer Science people was a bad idea and that Schlumberger would be better off hiring an engineering student and teaching them to program. (And people wonder where CS's perpetual self-image problem comes from. (CS does have a self-image problem, right? Play along with me, anyway.))

One example of this anti-intellectual sentiment that I ran across while writing this post is from a review of A Discipline of Programming:

Dijkstra's book is mis-titled "A Discipline of Programming" - this title points immediately to a fundamental flaw of the book. It suggests, no actually makes claims, that it will depict a system, a formalism, a scientific method of developing programs for average working programmers, something that we have all dreamed of. But that is not at all what it is. It is, in fact, a book with the pretension and advertised claim that if we master its formalism methodology of program development by weakest precondition considerations, enumerations and deductions we will have a formal symbolic logic tool which will literally generate, on paper, the body of the algorithm needed to accomplish the desired task. In point of fact, such a mechanistic approach might be, and probably is of interest to computer scientists doing research in, for example, machine generation of programs or related areas. If Dijkstra's claims are true, and he does give some beautifully worked out examples, easily translatable to C code, then the title of the books should have been something like "A Straightjacket of Programming". But I believe the correct title for this book should have been, even though it would probably have sold far less copies, "A Discipline of Computer Science".

Bottom line - brilliant author, iconoclastic, but unaware apparently that programming is part art, part science, just like being a Doctor or a physicist. Nothing in life is generated in the manner Dijkstra suggests unless one happens to be a mathematician doing formal proofs, a robot, or AI program, the only entities to which this book should have been addressed other than researchers building such things. In 32 years of software, I've known only a few Bell Telephone engineers who used the method on occasion and by now they've probably forgotten it too.

Now, on one hand, Dijkstra was arrogant. Certainly at least some of his friends, if not he himself, had no particular interest in, or consideration for, what we would consider the "average working programmer". And yes, they made claims for their formal methods that are pretty ridiculous. And yes, those particular methods have mostly vanished. (Regrettably, in my opinion. Roland Backhouse has the most recent textbook on the subject that I know about, Program Construction: Calculating Implementations from Specifications, but that's been a while and I don't know anything about its uptake.)

But I was introduced to the whole weakest-precondition thing as an undergraduate; I read A Discipline of Programming over a winter break and had David Gries' A Science of Programming as a text for a (fairly difficult) class. And even though I have never gone through the whole weakest-precondition method in my professional career (and would really have to work to tool up for it now), it became a significant touchstone whenever I need to write difficult code in an imperative language. (At some point, I should write about the difference between pure functional programming, as in Haskell, and its basis in denotational semantics, and general imperative programming and what we called "axiomatic semantics". There are some fun differences there.)

Tips like these became the base for whatever success I've had slinging code:

But back to anti-intellectualism. It comes it an least two forms. One is like that above: No one really uses that ivory-tower gobbledygook. No one needs that ivory-tower gobbledygook.

Sure, no one needs any particular one; there are more than a few brands of gobbledygook and they're all about equal. But it sure is nice to have some organized way to think about things. Attacking one because you don't always need the whole monstrosity, or because of some visceral reaction to the monstrosity isn't helpful. And misrepresenting it still isn't fair. Even though some of the claims about it verge on saying so, no one really believes that anything will let you mechanically generate algorithms. Yes, there's always going to be something like an art to this; that's the difference between quicksort, heap sort, insertion sort, and all the other sortish things out there. It's also why the book celebrating Dijkstra's sixtieth birthday was titled Beauty Is Our Business.

But the other form of anti-intellectualism is more pernicious: I don't need that gobbledygook. There are examples of this one all over the place, from those boot-camps that claim to make you a web developer in just twelve weeks to famous people arguing that you are better off skipping school and getting practical experience. To get a good job, to make good money, to have a good life, you don't really need all this background stuff. Plus, most of it is already done for you, by really smart people, smarter than you've ever seen, smarter than you 'n me put together, and if you tried to do it, you'd probably just screw up things anyway.

Ok, some short, basic training is all you need to do simple things, and almost all things are simple. Most of the work beyond the basics, in practice, has nothing to do with technical skill. And education does not, in any way, substitute for practical experience. Getting a good job is indeed a skill that can be learned quickly (but for the love of god, don't try to get me to teach you; I'm the bad example, I'm the one who demonstrates what not to do). Making good money, in this industry, doesn't necessarily require great technical expertise.

But most of those smart people who have built all these wonderful things for you to use aren't really all that smart. Those wonderful things have all sorts of corners and sharp edges sticking out, just right to bang an elbow on while they slice your belt and leave you with your trousers around your ankles. The people who built them may know more than you do, but they're not any smarter than you are. And the thing is, practical experience doesn't really substitute for education, either.

This very post was brought about by my own need to do something involving a well-known algorithmic problem. I didn't have a sort, so I had to write my own. It happens, surprisingly often. There are plenty of environments that don't have some common thing that most programmers think they cannot live without: sort functions, hash tables, queues, concurrency gimcracks... If someone stands by their belief that they don't need to know anything about these common tools work, they will limit their opportunities severely, and completely unnecessarily. Learning about these things is not that much harder than avoiding having to learn about them.

But that's not really what most people mean when they say, "I don't need that." What they mean is that, if and when they do need it, they can learn it. Until then, these academic exercises are, well, academic. That is to say, useless.

Unfortunately, that isn't true either. There is nothing fundamentally different between thinking hard about a sort algorithm and thinking hard about any other algorithmic task. And those cover the whole of software development, from sorting stuff to figuring out which rows of a database need to be returned (and how to do so quickly---remember, latencies longer than a second mean eyeballs meandering somewhere else), to all the security trivia that keeps your own favorite name from appearing on court documents or worse, news articles.

And how do you learn how to do this kind of thinking? If there is a better way than by studying existing solutions to well-known problems, I do not know what it is. And while you're at it, taking the time to look at the tools that other people have found useful when they were thinking about these things, not as academic exercises, but as real-life problems that they have to solve before they get to eat dinner.

active directory applied formal logic ashurbanipal authentication books c c++ comics conference continuations coq data structure digital humanities Dijkstra eclipse virgo electronics emacs goodreads haskell http java job Knuth ldap link linux lisp math naming nimrod notation OpenAM osgi parsing pony programming language protocols python quote R random REST ruby rust SAML scala scheme shell software development system administration theory tip toy problems unix vmware yeti
Member of The Internet Defense League
Site proudly generated by Hakyll.