Applied Formal Logic: Verifying Quicksort
Have you ever tried to solve a problem, gotten stuck, backed off, tried again, changed approaches, failed, and then suddenly, with some approach that you’d already abandoned, solved the problem as sweetly as Penelope Doglington begging for part of your sandwich? I’ve had one of those weeks, and now I honestly don’t know what the problem was. Everything seems to be all better now.
So, here it is: my proof of Quicksort. The proof, which is mine, is mine. And therefore, my proof, which belongs to me, is as follows.
Proving sorting algorithms
The proof of a sorting algorithm generally has two major steps:

1. A proof that the train coming out the far end of the tunnel is sorted, and
2. A proof that the train is made up of the same cars that went into the tunnel.

(Yeah, I don’t know where the train thing came from. Let’s all hope it doesn’t happen again.)
In other words, there is a proof the resulting array is sorted, and a proof that the resulting array is a permutation of the input array, and the two proofs have little to do with each other. Further, weirdly, proving that the result is sorted is frequently easier than proving that the result is a permutation.
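To make that concrete, here is roughly the shape of the top-level contract this post is working toward, stated with the sorted and permutation predicates defined below. (This is a sketch for orientation, not the final annotations; the real contract on qsort appears at the end of the post.)

/*@
@ ensures
@ sorted(array, l, u);
@ ensures
@ permutation(\old(array), array, l, u);
@*/
void qsort(int *array, int l, int u);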
There are, however, several other steps in using a verification tool like Frama-C's WP plugin to prove a function. Yannick Moy, in his post, VerifyThis Challenge in SPARK, uses the exercise of proving a pair insertion sort to show what some of those steps are, from the standpoint of AdaCore and Thales’ Implementation Guidance for the Adoption of SPARK. The post also demonstrates that the steps short of a complete proof of correctness provide significant value.
Here, in this post, in addition to presenting the proof, I would like to look at the value of the ACSL annotations as documentation for the code. These annotations provide two very important pieces of information:

1. Functions’ contracts supply detailed and guaranteed documentation about what the functions do, in the form of assertions which the user can rely on.
2. Internal annotations such as loop invariants show the expected state at the given locations in the code. They may not describe why the code is doing what it is doing, but they do hopefully describe very thoroughly what the code is doing.
But first, some formalities. I am sorting an array of ints, with the range of indexes to be specified by a lower bound, l, and an upper bound, u. The range is inclusive-exclusive: [l, u). Almost all of the predicates and functions have parameters including array, l, and u.
Sorting
As I mentioned above, sorting was the easier of the two major components to prove, although it wasn’t entirely straightforward. Here is how I define sorted formally:
/*@
@ predicate sorted(int *array, int l, int u) =
@ ∀ int p, q; l ≤ p ≤ q < u ⇒ array[p] ≤ array[q];
@*/
In this predicate, p and q are array indexes between l (inclusive) and u (exclusive), with q ranging from p to u.
The first thing the Quicksort algorithm does (Spoilers!) is to identify a pivot element in the array, then partition the array into two (for this simplest version) sub-arrays, one with elements less than or equal to the pivot, and one with elements greater than the pivot. The predicate partitioned describes the array once that step has been taken.
/*@
@ predicate partitioned(int *array, int l, int m, int u) =
@ (∀ int p; l ≤ p < m ⇒ array[p] ≤ array[m])
@ ∧ (∀ int q; m < q < u ⇒ array[m] < array[q]);
@*/
Quicksort works (More spoilers!) by partitioning the array around a pivot and then recursing on the two subarrays. After the recursion, though, it’s necessary to demonstrate that the recursion preserves the partitioning; i.e. that the subarrays are still less than or equal to the pivot, or greater than it.
The following two ACSL predicates describe the preservation: given two states of the program, L1 and L2, if the elements in the specified range are less than or equal to the value v in state L1, then they must also be less than or equal to v in state L2.
Technically, the two states aren’t states, they are C labels, as in the destination of a goto. The use of locations in the text of the program, rather than some kind of underlying, denotational-semantics-ish state, is an important philosophical point in the difference between Hoare-style axiomatic verification and something like dependent typing. I’ll go deeper into that at some point.
In any case, here are prsv_leq and prsv_gt:
/*@
@ predicate prsv_leq{L1,L2}(int *a, int l, int u, int v) =
@ (∀ int p; l ≤ p < u ⇒ \at(a[p],L1) ≤ v) ⇒
@ (∀ int p; l ≤ p < u ⇒ \at(a[p],L2) ≤ v);
@
@ predicate prsv_gt{L1,L2}(int *a, int l, int u, int v) =
@ (∀ int p; l ≤ p < u ⇒ \at(a[p],L1) > v) ⇒
@ (∀ int p; l ≤ p < u ⇒ \at(a[p],L2) > v);
@*/
I will discuss the use of these guys below.
Permutation
As I mentioned above, proving that the result is a permutation of the input was a bit of a hairy problem for me. I see three ways of doing it:

1. The ACSL document (Example 2.47, page 56 in my version) has an axiomatic specification of a permutation predicate. This specification describes the form of the predicate and then gives axiomatic rules for the conditions under which it is true. For permutation, the rules describe the predicate as being preserved by a swap (as described below), as well as being reflexive, symmetric, and transitive. (A rough sketch of what such an axiomatization looks like appears after this list.)

   Unfortunately, this was the approach I settled on first. You see, it seems to have a nasty habit of blowing up the search space for the proofs of every verification condition. I did eventually manage to get it to work, but I think I had a typo in one of the permutation axioms; when I corrected that, parts of the proof of sorting blew up.
2. If a modified array is a permutation of an earlier state of the same array, then there is a “witness” mapping between the locations of elements of the two. If you can prove that array[i] == original[witness[i]] for all i, then you have a permutation. Further, witness is easy to construct and maintain if the changes are due to swapping elements; it starts as the identity mapping, witness[i] == i, and updating it to track exchanges is as easy as swapping the witness elements at the same time and in the same way that the array elements are swapped. (There is a sketch of this bookkeeping after the list as well.)

   I played with this approach for quite a while. In fact, I got out Ada, SPARK, gnatprove, and the GNAT Programming Studio to see if I would have better luck in that environment (and damn, it’s sweet). But since I was foiled in my attempt to allocate memory (I’ve got to look into that), this technique requires threading the original and witness arrays through the sorting functions as extra parameters. It works pretty well (although I then couldn’t prove sorting in SPARK) but doesn’t really seem kosher. You can’t have ghost parameters, for one thing.
3. The third method is to count the number of occurrences of each element in the array, then compare the counts. If the counts are the same, then the arrays are permutations of each other. Simple, right? This is the approach I finally found to work. I really don’t know why it didn’t seem happy the first time around.
(Technically, another method of demonstrating a permutation would be to sort the two arrays and then compare the two resulting canonical representations. But circularity is circular. Insert recursion joke here, too.)
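For the curious, the axiomatization from the first approach looks roughly like the following. This is my paraphrase of the ACSL document’s example, adapted to a single array with two labels; treat the names and the reads clauses as approximate, and note that I have not run this version through WP:

/*@
@ axiomatic Permut {
@   predicate permut{L1,L2}(int *a, integer l, integer u)
@     reads \at(a[..], L1), \at(a[..], L2);
@
@   axiom permut_refl{L}:
@     ∀ int *a, integer l, u; permut{L,L}(a, l, u);
@
@   axiom permut_sym{L1,L2}:
@     ∀ int *a, integer l, u;
@       permut{L1,L2}(a, l, u) ⇒ permut{L2,L1}(a, l, u);
@
@   axiom permut_trans{L1,L2,L3}:
@     ∀ int *a, integer l, u;
@       permut{L1,L2}(a, l, u) ∧ permut{L2,L3}(a, l, u)
@         ⇒ permut{L1,L3}(a, l, u);
@
@   axiom permut_swap{L1,L2}:
@     ∀ int *a, integer l, u, i, j;
@       l ≤ i < u ∧ l ≤ j < u
@       ∧ \at(a[i],L1) == \at(a[j],L2)
@       ∧ \at(a[j],L1) == \at(a[i],L2)
@       ∧ (∀ integer k; l ≤ k < u ∧ k != i ∧ k != j ⇒
@            \at(a[k],L1) == \at(a[k],L2))
@       ⇒ permut{L1,L2}(a, l, u);
@ }
@*/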
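And here is a sketch of the witness bookkeeping from the second approach, as a hypothetical variant of the swap function shown later in this post. The point is only that witness is updated in lockstep with the data:

/* Hypothetical sketch of the witness approach: the witness array is
 * swapped in lockstep with the data, so array[k] == original[witness[k]]
 * remains true for every k after each exchange. */
void
swap_with_witness(int *array, int *witness, int i, int j)
{
    int tmp = array[i];
    array[i] = array[j];
    array[j] = tmp;

    tmp = witness[i];
    witness[i] = witness[j];
    witness[j] = tmp;
}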
The counting method has a bit of a subtlety. The point is not to actually count the elements of the array; instead, I define a way to count elements, count_elt, and then prove that the count isn’t changed by the operations in the sorting function.
The proof-time function count_elt uses recursion to count all the elements matching a value v; or rather, it defines what would happen if someone were to count the elements.
/*@
@ logic integer count_elt(int *a, integer l, integer u, int v) =
@ (l == u) ? 0
@ : (((v == a[l]) ? 1 : 0) + count_elt(a, l + 1, u, v));
@*/
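To see how the definition unfolds, count the 3s in the three-element array {3, 1, 3}: count_elt(a, 0, 3, 3) = 1 + count_elt(a, 1, 3, 3) = 1 + 0 + count_elt(a, 2, 3, 3) = 1 + 0 + 1 + count_elt(a, 3, 3, 3) = 1 + 0 + 1 + 0 = 2.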
And the permutation predicate uses that ability to count to demonstrate that the same segments of two arrays, a and b, are permutations of each other. The condition is
∀ v : count_elt(a, l, u, v) ≡ count_elt(b, l, u, v)
In other words, for all values v, the number of occurrences of v in a is the same as the number in b. (Do you see why we’re not going to actually count anything?)
/*@
@ predicate permutation(int *a, int *b, int l, int u) =
@ \valid(a + (l .. u - 1)) ∧ \valid(b + (l .. u - 1)) ∧
@ ∀ int v; count_elt(a, l, u, v) == count_elt(b, l, u, v);
@*/
That is all the mechanism I need in order to demonstrate permutation-hood. (It’s possible the validity of the arrays was missing when I first tried it. Maybe that’s needed? It shouldn’t be. I dunno.)
Above, when I described prsv_leq, I said that it was necessary to show that the recursive calls preserved the partition properties. In theory, that would be a trivial corollary of the fact that the pre- and post-recursion arrays are permutations. However, it seems that the permutation properties are not specific enough for the prover: that the count of elements is preserved is not enough for it to show that the properties of the elements are preserved. Therefore, this information is partially duplicated between prsv_leq, prsv_gt, and permutation.
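If I wanted to eliminate that duplication, the corollary would look something like the following lemma. Note that this is a hypothetical sketch: it assumes a label-parameterized variant of count_elt (written count_elt{L} here), which is not how I declared it above, and I have not tried to prove it with WP:

/*@
@ // Sketch only: if the count of every value is preserved between two
@ // states, then an upper bound on the elements is preserved as well.
@ lemma count_preserves_upper_bound{L1,L2}:
@   ∀ int *a, integer l, u, int v;
@     (∀ int w; count_elt{L1}(a, l, u, w) == count_elt{L2}(a, l, u, w)) ⇒
@     (∀ integer k; l ≤ k < u ⇒ \at(a[k],L1) ≤ v) ⇒
@     (∀ integer k; l ≤ k < u ⇒ \at(a[k],L2) ≤ v);
@*/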
Swapping elements
There are two basic operations for comparison sorting algorithms: a less-than/greater-than comparison, and exchanging two elements in the array being sorted. swap is the latter operation.
/*@
@ requires
@ \valid(array + i)
@ ∧ \valid(array + j);
@ assigns
@ array[i], array[j];
@ ensures
@ array[i] == \old(array[j])
@ ∧ array[j] == \old(array[i]);
@*/
void
swap(int *array, int i, int j)
{
int tmp = array[i];
array[i] = array[j];
array[j] = tmp;
}
Given an array and two valid locations, the result is exactly what you would expect. (Truth in advertising!)
Choosing the pivot
In normal behavior, by which I mean under the best of circumstances, Quicksort is an O(n log n) algorithm, which is optimal for a comparison sort. It is also one of the faster O(n log n) sorts. The element of the array used as the pivot of the partition is important to the performance of Quicksort. One traditional option is to pick the first element of the array; this leads to O(n²) behavior if the array is already sorted. Likewise, choosing the last element has the same O(n²) problem. In both cases, partitioning picks one element of the array as the pivot and uses the entire rest of the array as one side of the partition.
The function pivot here picks the midpoint of the array segment, which is a reasonable choice. However, the interface of the function includes the array, allowing for a possibly smarter choice, such as median-of-three or Tukey’s ninther, the median-of-three-medians-of-three. (Another option is the Chinese zither.)
/*@
@ requires
@ 0 ≤ l < l + 1 < u < INT_MAX;
@ requires
@ \valid(array + (l .. u - 1));
@ assigns
@ \nothing;
@ ensures
@ l ≤ \result < u;
@*/
int
pivot(int *array, int l, int u)
{
return l + ((u - l) / 2);
}
This is one case where the ACSL annotations should not and need not be too precise. If I were more detailed about which element was chosen, the annotations would require modification whenever that choice changed. As is, I can adjust the pivot choice without too much extra work. The point is that the particular choice does not matter for the correctness of the algorithm: as long as the pivot element is within the array segment, we’re all good.
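For instance, a median-of-three variant along these lines ought to satisfy the exact same contract. (This is a sketch I have not run through WP; the function name is mine.)

/*@
@ requires
@   0 ≤ l < l + 1 < u < INT_MAX;
@ requires
@   \valid(array + (l .. u - 1));
@ assigns
@   \nothing;
@ ensures
@   l ≤ \result < u;
@*/
int
pivot_median3(int *array, int l, int u)
{
    int m = l + ((u - l) / 2);
    int a = array[l], b = array[m], c = array[u - 1];

    /* Return the index of whichever of the three probes holds the
     * median value; every branch returns an index within [l, u). */
    if ((a <= b && b <= c) || (c <= b && b <= a))
        return m;
    if ((b <= a && a <= c) || (c <= a && a <= b))
        return l;
    return u - 1;
}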
Partitioning the array
The next function to be presented is partition, the workhorse of the algorithm. It is complicated, so I have attached line numbers for your locating pleasure. I’ve also broken it into pieces so that I can see what I’m writing about.
1 /*@
2 @ requires
3 @ 0 ≤ l < l + 1 < u < INT_MAX;
4 @ requires
5 @ \valid(array + (l .. u - 1));
6 @ assigns
7 @ *(array + (l .. u - 1));
8 @ ensures
9 @ l ≤ \result < u;
10 @ ensures
11 @ partitioned(array, l, \result, u);
12 @ ensures
13 @ ∀ int v; prsv_leq{Pre,Post}(array, l, u, v);
14 @ ensures
15 @ ∀ int v; prsv_gt{Pre,Post}(array, l, u, v);
16 @ ensures
17 @ permutation(\old(array), array, l, u);
18 @*/
19 int
20 partition(int *array, int l, int u)
The two requirements, lines 2-5, simply ensure the validity of the array. The assigns clause, lines 6-7, indicates that the function only modifies the array segment, but can modify all the locations in the segment.
The postconditions are where the function gets interesting.
- The first, lines 8-9, asserts that the pivot is within the array.
- The second, lines 10-11, notes that the function creates a partition around the element returned, \result. The elements between l and \result are less than or equal to the element at \result, and the elements between \result and u are greater.
- The third and fourth, lines 12-15, indicate that partition preserves the relationship between the elements in the array segment and any otherwise-unspecified number. This becomes important later.
- Finally, the last postcondition, lines 16-17, asserts that the array after the function is a permutation of the array before the function (as indicated by the ACSL notation \old).
These clauses concisely describe the important details of what partition does, from what it requires to what it provides. Some of the clauses, like lines 12-15, duplicate in a different form other clauses, like lines 16-17, which is an unhappy consequence of the proof technology. It might be possible to add that corollary as a lemma function or something, but again, that is extra work that does not improve the Quicksort implementation.
Now to the first part of partition itself.
19 int
20 partition(int *array, int l, int u)
21 {
22 int i = l + 1;
23 int j = u - 1;
24 int m = pivot(array, l, u);
25 swap(array, l, m);
26 /*@
27 @ loop invariant
28 @ l < i ≤ j < u;
29 @ loop invariant
30 @ (∀ int p; l < p < i ⇒ array[p] ≤ array[l])
31 @ ∧ (∀ int q; j < q < u ⇒ array[l] < array[q]);
32 @ loop invariant
33 @ ∀ int v; prsv_leq{Pre,Here}(array, l, u, v);
34 @ loop invariant
35 @ ∀ int v; prsv_gt{Pre,Here}(array, l, u, v);
36 @ loop invariant
37 @ permutation(\at(array, Pre), \at(array, Here), l, u);
38 @ loop assigns
39 @ i, j, *(array + (l .. u - 1));
40 @*/
41 while (i < j) {
...
partition uses three variables: i, j, and m. The pivot index, m, is not used further, but here, lines 24-25, it allows the pivot element to be swapped into a safe location at the lowest element of the array. In lines 22-23, i is initialized to be the second location in the array, while j is initialized to the last location in the array segment. In general terms, the outer loop, begun on line 41, maintains the condition that array elements between l + 1 and i and between j and u have been compared with the pivot, with array[i] and array[j] being the next elements to be examined.
Specifically, the outer loop has a stack of invariants.
- Lines 27-28 put limits on the values of i and j, a safety property, as is the assigns clause on lines 38-39.
- The invariant on lines 29-31 describes the partial partitioning of the segment.
- The two invariants on lines 32-35 assert prsv_leq and prsv_gt for the body of this loop, and
- lines 36-37 indicate that the current value of the array (i.e. \at(array, Here)) is a permutation of the array when the function was entered.
In documentation terms, this block is kind of a mess. The only drastically important part is the second loop invariant, lines 29-31, which describes how the array is partitioned: by moving the two partitioned segments from the ends of the array towards the center. The first invariant is arguable, since it’s good to know the loop variables are well-behaved, but the last three exist only to allow the function postconditions to be proved.
Inside the body of the outer loop are two further loops.
...
40 @*/
41 while (i < j) {
42 //@ ghost CurrentOuter:
43 /*@
44 @ loop invariant
45 @ l + 1 ≤ i ≤ j < u;
46 @ loop invariant
47 @ ∀ int p; \at(i, CurrentOuter) ≤ p < i ⇒
48 @ array[p] ≤ array[l];
49 @ loop assigns
50 @ i;
51 @*/
52 while (i < j && array[i] <= array[l]) {
53 i += 1;
54 }
55 //@ assert ∀ int p; l < p < i ⇒ array[p] <= array[l];
56 /*@
57 @ loop invariant
58 @ l + 1 ≤ i ≤ j < u;
59 @ loop invariant
60 @ ∀ int q; j < q ≤ \at(j, CurrentOuter) ⇒
61 @ array[l] < array[q];
62 @ loop assigns
63 @ j;
64 @*/
65 while (i < j && array[l] < array[j]) {
66 j -= 1;
67 }
68 //@ assert ∀ int q; j < q < u ⇒ array[l] < array[q];
...
The two loops are symmetric. The first increments i, and the second decrements j, as long as doing so satisfies the partitioning property. These two loops close the distance between i and j until either

- i = j, in which case the outer loop should terminate as well, or
- array[i] > array[l] and array[l] >= array[j]. This second case is handled next.
Note the appearance of a ghost label on line 42 and its use on lines 46-48 and 59-61, identifying the state of the program at the start of the current outer iteration. The loop invariants compare the current values of i and j with their values at the top of the iteration, showing that the partitioning is proceeding.
Here, again, the loop invariants are mostly pro forma, with a possible argument for the second invariant of each loop: that the newly examined elements of the array are in position for the partition.
The two asserts following the bodies of the loops show the transitivity of the inner loop invariants and the outer partitioning invariant. I’m not even sure those are required for the proofs; I put them there for documentation.
...
67 }
68 //@ assert ∀ int q; j < q < u ⇒ array[l] < array[q];
69 if (i < j) {
70 //@ assert array[i] > array[l] >= array[j];
71 swap(array, i, j);
72 //@ assert array[i] <= array[l] < array[j];
73 }
74 }
75 if (array[l] < array[i]) {
76 i -= 1;
77 }
78 swap(array, l, i);
79 return i;
80 }
In the second case I described above, two array elements identified by i and j are out of place: array[i] > array[l] and array[l] >= array[j]. The thing to do, then, is to swap the two elements, as seen in line 71, putting each on the correct side of the unpartitioned portion of the array. Then the loop continues, and the inner loops are guaranteed to advance past the swapped elements.
The two assertions around the swap, lines 70 and 72, are also intended primarily for documentation, showing the effects of the swap on the partitioning state.
If i = j (remember that both i and j advance by one element at a time; they cannot be advanced past each other), then the outer loop is completed and partitioning is almost complete, except for the element identified by i and j, which hasn’t been compared to array[l]. Lines 75-77 handle that task, adjusting i to put it on the correct side of the partition. Then line 78 swaps elements i and l, putting the pivot element in the middle.
At this point, the postconditions of partition have been satisfied: the array is partitioned and the important properties have been preserved.
Quicksort
The Quicksort function itself is relatively simple. It has basically the same two safety preconditions as partition, and the same assigns clause. The four postconditions are evenly divided between “internal” matters, the preservation of less-than-or-equal and greater-than, and the headline results, sorting and permutation. The first two are needed by the recursive calls, as are, technically, the latter two, since the recursion “scales up” the properties to a wider segment of the array.
1 /*@
2 @ requires
3 @ 0 ≤ l ≤ u < INT_MAX;
4 @ requires
5 @ \valid(array + (l .. u - 1));
6 @ assigns
7 @ *(array + (l .. u - 1));
8 @ ensures
9 @ ∀ int v; prsv_leq{Pre,Post}(array, l, u, v);
10 @ ensures
11 @ ∀ int v; prsv_gt{Pre,Post}(array, l, u, v);
12 @ ensures
13 @ sorted(array, l, u);
14 @ ensures
15 @ permutation(\old(array), array, l, u);
16 @*/
17 void
18 qsort(int *array, int l, int u)
19 {
20 if (u - l <= 1) {
21 return;
22 } else {
23 int m = partition(array, l, u);
24 //@ ghost PostPartition:
25 //@ assert permutation(\at(array, Pre), \at(array, Here), l, u);
26 qsort(array, l, m);
27 //@ ghost int mm = m + 1;
28 //@ assert ∀ int v; prsv_leq{Pre, Here}(array, l, u, v);
29 //@ assert ∀ int v; prsv_gt{Pre, Here}(array, l, u, v);
30 //@ assert prsv_gt{PostPartition, Here}(array, mm, u, array[m]);
31 qsort(array, m + 1, u);
32 //@ assert prsv_leq{PostPartition, Here}(array, l, m, array[m]);
33 //@ assert prsv_gt{PostPartition, Here}(array, mm, u, array[m]);
34 }
35 }
Internally, the base case of the recursion is trivial, and all of the necessary properties are trivially satisfied. For the recursive case, I had to do a little bit of a shuffle with assertions to get the proofs of the postconditions to follow.
- A ghost label on line 24 is used to allow references to the state of the array after partitioning.
- Line 25 asserts that the array is a permutation of the original array; since qsort itself also ensures the result is a permutation, nothing further needs to be done for that requirement.
- The key assertions for sorting are on lines 30, 32, and 33. Lines 30 and 33 claim that the portion of the array after m is greater than the element array[m], while line 32 claims that the region before m is less than or equal to array[m]. Since these individual sub-segments are sorted, the overall array between l and u is sorted.
In this case, I attempted to minimize the assertions by removing any that weren’t important to the proofs. The result is kind of confusing from a documentation standpoint. permutation is used once, and the two preservation predicates see a lot of use, in two different ways: general, applying to all values v, and specific, applying to the pivot element, array[m]. Not ideal.
And with that, my proof is completed. Quicksort, ladies and gentlemen.
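If you would rather watch it run than read the proof, a throwaway driver like the following will do. (This is hypothetical and no part of the verification; note also that this qsort shadows the C library’s name, so keep stdlib.h out of the file or rename the function.)

#include <stdio.h>

void qsort(int *array, int l, int u);   /* the function proved above */

int
main(void)
{
    int a[] = { 5, 3, 8, 1, 9, 2, 7 };
    int n = (int)(sizeof a / sizeof a[0]);

    /* The range is inclusive-exclusive, so [0, n) sorts the whole array. */
    qsort(a, 0, n);

    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);
    printf("\n");   /* Prints: 1 2 3 5 7 8 9 */
    return 0;
}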
Overall, I like the use of ACSL properties as documentation; the use of a formal notation is much clearer than an English description of what is supposed to be happening. On the other hand, it is very clear that development of the proofs proceeded by finding a function-level condition that needed to hold, say sorted(array, l, u) or permutation(\old(array), array, l, u), and then adding assertions until the condition could be found to hold. (I admit that I knew roughly how Quicksort was supposed to be written and did not discover any need for major code changes.)
Update: Fri Aug 11 20:47:01 CDT 2017
Hopefully fixed some sloppiness with the difference between array elements and array indexes. Thanks, /u/Enamex and /u/Nathanfenner on /r/programming.