Abstracted Algebra in Rust

Posted on July 19, 2015 by Tommy McGuire

Labels: c++, notation, rust, toy problems, math

or, Why Rust is better than C++

or, How To Break a Coding Interviewer

This week I have been reading Alexander Stepanov and Daniel Rose's From Mathematics to Generic Programming. (I've had Sepanov and Paul McJones' Elements of Programming on the unread pile for quite a while, but so far it has been the Australian sparkling wine of the bookshelf: something for laying down and avoiding. I'll have to unearth it, once I have recovered from this.) From Mathematics to Generic Programming is the sort of book intended to convince you that programming is not only applied formal logic but also applied abstract algebra, and it does a pretty fair job. If you're unfamiliar with Stepanov, he is one of the people behind generic programming in general and both Ada and the C++ Standard Template Library specifically. Do be warned, though, that so far the book has been rather heavy on abstract algebra and its history, and that of number theory, and thin on programming. On the other hand, the programming bits have been very good, particularly those on optimizing an algorithm.

While reading chapter 7, Deriving a Generic Algorithm, the authors pose an interesting example and I thought it might be entertaining to try to convert the example into Rust.

To begin, consider multiplication. (No, I'm serious. Just keep going for a minute.) Multiplication is repeated addition; a*n is a + a + ... + a, with n a's and n-1 + signs. Implemented naively, it is a O(n) algorithm, right? But there is an alternative algorithm, Egyptian multiplication, which is O(log n). I don't want to go into how this algorithm works, or the derivation of the following version; see Count like an Egyptian or From Mathematics to Generic Programming for yourself. But this is the optimized, accumulative, iterative version, in Rust:


fn odd(n: usize) -> bool { n & 0x01 == 1 }
fn half(n: usize) -> usize { n >> 1 }
fn mult_acc(mut r: usize, mut n: usize, mut a: usize) -> usize {
    loop {
        if odd(n) {
            r += a;
            if n == 1 {
                return r;
            }
        }
        n = half(n);
        a += a;
    }
}

The important part of this algorithm is that you have an element, a, an operation, + (that obeys some algebraic laws), and a counter, n, that indicates how many times to apply the operation to the element. And an efficient algorithm for doing so. Now, what can you do with this?

Natural numbers

To start with, what are the requirements needed from the counters? It needs a zero, an additive identity (purely for predicate purposes), and a one (also for predicates), and a couple of operations: a predicate indicating that a value is odd and integral division by 2. Further, it needs to be ordered (see predicates), and it will be convenient later to have a subtraction operator. In Rust, those requirements are described by traits:


pub trait Zero { fn zero() -> Self; }¹
pub trait One { fn one() -> Self; }
pub trait Odd { fn odd(&self) -> bool; }
pub trait Half { fn half(&self) -> Self; }
pub trait Natural: cmp::Ord + Sub<Output=Self> + Zero + One + Odd + Half { }

macro_rules! natural_for {
    ( $( $t:ty ),* ) => {
        $(
            impl Zero for $t { fn zero()      -> $t   { 0 } }
            impl One  for $t { fn one()       -> $t   { 1 } }
            impl Half for $t { fn half(&self) -> $t   { *self >> 1 } }
            impl Odd  for $t { fn odd(&self)  -> bool { *self & 0x01 == 1 } }
            impl Natural for $t { }
        )*
    }
}
natural_for!(usize,u8,u16,u32,u64,i32,i64);

The trait Natural there indicates that a type implementing it also has to implement Ord and Sub from the standard library along with Zero, One, Odd, and Half. All of those either already are implemented by the primitive types or are pretty easy. I have taken the liberty of using a macro to implement this set of traits for a collection of useful counting types: usize, u8, u16, etc.

The collected Natural trait provides all the operations needed for the counter type, to implement the Egyptian multiplication algorithm in a general form. On the other hand, the algorithm operates on a type (for a) and an associated operation. These two form an algebraic structure, a semigroup.

Semigroups


trait Semigroup {
    type Elt: Clone;
    fn op(Self::Elt, Self::Elt) -> Self::Elt;
}

For a semigroup, the operation should be associative: (x `op` y) `op` z == x `op` (y `op` z). Unfortunately, I have not figured out how to enforce that. Maybe later. On the other hand, the type of elements should be closed under the operation, which is enforced by having the return type of the operation be Self::Elt.

Note that the operation, op, is not a trait method; it does not have a &self argument. In other words, it's a static function, not bound to any objects implementing the trait, although it does operate on the type associated with the trait, Elt.

On numbers (as in, integers), there are two common semigroups: addition and multiplication. This is actually one of the stumbling points of simple approaches to describing abstract algebra in code: because a value can only have one type, a simple method would only show one operation, addition say, rather than allowing an unlimited number of semigroup definitions over the same types with different operations. Fortunately, Rust has very frightening and complicated way out:


struct MultiplicativeSemigroup<Elt>(PhantomData<Elt>);
struct AdditiveSemigroup<Elt>(PhantomData<Elt>);

macro_rules! add_mult_semigroups_for {
    ( $( $t:ty ),* ) => {
        $(
            impl Semigroup for AdditiveSemigroup<$t> {
                type Elt = $t;
                fn op(x: $t, y: $t) -> $t { x + y }
            }
            impl Semigroup for MultiplicativeSemigroup<$t> {
                type Elt = $t;
                fn op(x: $t, y: $t) -> $t { x * y }
            }
        )*
    }
}
add_mult_semigroups_for!(usize,u8,u16,u32,u64,i8,i16,i32,i64,f32,f64);

This code defines additive and multiplicative semigroups over the generic type Elt. The structures, weirdly, don't contain any data fields. Instead, they will be used purely at the type level in the following code. (Beware! You have been warned! Here be dragons! And ferrets looking to get into your trousers.) The PhantomData bit tells the compiler that I know that the structure does not use the type Elt. To quote from std::marker::PhantomData<T>:

PhantomData<T> allows you to describe that a type acts as if it stores a value of type T, even though it does not. This allows you to inform the compiler about certain safety properties of your code.

Though they both have scary names, PhantomData<T> and "phantom types" are unrelated. 👻👻👻

Yes, those seem to be little unicode ghosts. I'm not the only one with a weird sense of humor.

I have again taken the liberty of implementing both semigroups for the common primitive types.

You may be wondering where Rust's Add and Mul traits appear in all this. Currently, for this code, they are implicit in the operations + and *, and the types I have defined the two semigroups for. However, when I get down to doing more complicated stuff, they will indeed be needed.

Power!

Once you have the traits Natural and Semigroup, and implementations of those traits, you can implement the Egyptian multiplication algorithm in a generic form:


fn power_accumulate_semigroup<N,S>(mut r: S::Elt, mut a: S::Elt, mut n: N) -> S::Elt
    where N: Natural,
          S: Semigroup
{
    loop {
        assert!(n > N::zero());²
        if n.odd() {
            r = S::op(r.clone(), a.clone());
            if n == N::one() { return r; }
        }
        n = n.half();
        a = S::op(a.clone(), a.clone());
    }
}

fn power_semigroup<N: Natural, S: Semigroup>(mut a: S::Elt, mut n: N) -> S::Elt {
    assert!(n > N::zero());
    while !n.odd() {
        a = S::op(a.clone(),a.clone());
        n = n.half();
    }
    if n == N::one() { return a }
    let y = S::op(a.clone(),a.clone());
    return power_accumulate_semigroup::<N,S>(a, y, (n - N::one()).half());
}

Yes, I know, mutable function arguments are, stylistically, sketchy in Rust. But that's what the authors have and I'm just going with their work.

As an aside, the associativity of the semigroup operation is important here, because the Egyptian multiplication algorithm works by reordering the evaluation of the operations.

The first function there, power_accumulate_semigroup, should be visibly the same as the original algorithm above, with some changes related to:

the operator, which is fetched out of the type argument S,
the use of clone (and the related requirement of the Clone trait on Semigroup::Elt) which is needed because primitive types are implicitly copyable and passed by value while the generic type Elt is not but is being passed by value here, and
the excess verbosity required by using generic Naturals for the counter.

The second function, power_semigroup, is a wrapper for the accumulative version, providing simplified parameters and optimizing the arguments to power_accumulate_semigroup. I do not want to go into the derivation of that, either. See Stepanov and Rose.

Why go to this much effort? Because power_semigroup provides a way of repeatedly applying an operation to a value, and is as generic as it can be over both the types of the values and the operations (and over the types of the counter describing how many times to apply the operation). With the additive and multiplicative semigroups, you can now write code like


    power_semigroup::<_,MultiplicativeSemigroup<usize>>(2, 5)

and get back the result 32 = 2⁵. Or, you can write


    power_semigroup::<_,AdditiveSemigroup<u64>>(2, 5)

and get back the result 10 = 2 * 5.

Hardly seem worth it? Perhaps not, but that's the nature of abstraction: you go through a lot of rigmarole (my spelling checker suggested 'oleomargarine' there) to get something idiotically simple in trivial cases but remarkably powerful in complex ones. So, wait for it.

There is one thing you cannot do with this monstrosity. Calling


    power_monoid::<_,AdditiveSemigroup<i64>>(2, 0)

results in a run-time error; because this is dealing with a semigroup, there are some weird limits on what you can do. In particular, we cannot currently describe what 2*0 or 2^0 (for the multiplicative semigroup) means. But it is possible to extend our algebraic structure.

Monoids

(What was that about mathematical notation being clean, elegant, and beautiful?)

A monoid is a semigroup with an identified element, an identity. Calling the identity e, the additional required behavior of the operation is (x `op` e) == (e `op` x) == x. Following the style above, I define a Monoid trait and supply implementations for the additive and multiplicative operations. (Yeah, AdditiveSemigroup and MultiplicativeSemigroup are looking more and more like poor names.)


trait Monoid: Semigroup { fn identity() -> Self::Elt; }

macro_rules! add_mult_monoids_for {
    ( $( $t:ty ),* ) => {
        $(
            impl Monoid for AdditiveSemigroup<$t>       { fn identity() -> $t { 0 as $t } }
            impl Monoid for MultiplicativeSemigroup<$t> { fn identity() -> $t { 1 as $t } }
        )*
    }
}
add_mult_monoids_for!(usize,u8,u16,u32,u64,i8,i16,i32,i64,f32,f64);

The additive identity is 0, because x + 0 = 0 + x = x, and the multiplicative identity is 1, because x * 1 = 1 * x = x. As before, I supply an implementation for all of the standard primitive types. Using the Monoid trait, I can then define a function power_monoid:


fn power_monoid<N: Natural, M: Monoid>(a: M::Elt, n: N) -> M::Elt {
    if n == N::zero() { M::identity() }
    else              { power_semigroup::<_,M>(a, n) }
}

With this function, the operation that failed above now works; supplying an operation count argument of zero results in the appropriate identity value;


    power_monoid::<_,MultiplicativeSemigroup>(2, 0)

produces 1 = 2⁰. Likewise, the additive operation would produce 0 = 2*0.

Unfortunately, there is still another problem. Calling


    power_monoid::<_,AdditiveSemigroup<i64>>(2, -2);

Would result in another runtime error, because neither the semigroup nor the monoid support a negative operation count. (What does that even mean? Screw it, this is mathematics. Damn the torpedoes; full speed ahead!)

Extending Natural to Integral

Before the power operation can be extended, the Natural trait needs to be extended to support negation; otherwise simply writing the function call would be an error. Once again, defining mathematical structures as traits is the necessary task:


trait Integral: Natural + Neg<Output=Self> { }

macro_rules! integral_for {
    ( $( $t:ty ),* ) => { $( impl Integral for $t { } )* }
}
integral_for!(i32,i64);

As with the traits Mul and Add, I am using Rust's standard Neg trait to express negation; anything implementing the trait Integral has to implement Natural and have the negation operation. The number of types for which I can implement the Integral trait is significantly limited. Floating point values seem wrong, and none of Rust's unsigned primitive types will work, leaving only i32 and i64.

Groups

A group, as an algebraic structure, is a monoid with an operation to compute the inverse of each element, relative to the operation. The specific requirement for the inverse is, (x `op` x^-1) == (x^-1 `op` x) == e. This is the trait that describes groups:


trait Group: Monoid { fn inverse(&Self::Elt) -> Self::Elt; }

macro_rules! add_groups_for {
    ( $( $t:ty ),* ) => {
        $(
            impl Group for AdditiveSemigroup<$t> {
                fn inverse(n: &$t) -> $t { -(*n as $t) }
            }
        )*
    }
}
add_groups_for!(i8,i16,i32,i64,f32,f64);

macro_rules! mult_groups_for {
    ( $( $t:ty ),* ) => {
        $(
            impl Group for MultiplicativeSemigroup<$t> {
                fn inverse(n: &$t) -> $t { (1 as $t) / (*n as $t) }
            }
        )*
    }
}
mult_groups_for!(f32,f64);

And once again I provided an implementation of the Group for the types where it makes sense. Unfortunately, that is becoming limited: negative numbers, the additive inverses of positive numbers, are not supported by unsigned types and 1/n, the multiplicative inverse of a number n, is only supported by floating point types (since I don't have rationals handy). However, once Natural and Monoid have been extended to Integral and Group, a function power_group can be defined.


fn power_group<N: Integral, G: Group>(mut a: G::Elt, mut n: N) -> G::Elt {
    if n < N::zero() {
        n = -n;
        a = G::inverse(&a);
    }
    power_monoid::<_,G>(a, n)
}

This function is a wrapper for power_monoid, which wraps power_semigroup, which does the heavy lifting here. In this wrapper, if the count n is less than zero, it is inverted to be greater than zero and the base argument a is also inverted, by the operation that depends on the definition of the group for the operation. For addition, a becomes -a; for multiplication, a becomes 1/a.

As a result, the following function invocations work:


    power_group::<_,AdditiveSemigroup<i64>>(2, -2i64)

produces -4 = 2*-2, and


    power_group::<_,MultiplicativeSemigroup<f64>>(2.0, -2i64)

produces 0.25 = 2^-2. So, now we have the ability to invoke an operation on a element in a very general but well-defined way. Wouldn't it be nice if we could do something cool with it?

Fibonacci

The Fibonacci sequence (If you don't know what that is, who Fibonacci was, or why it exists at all, go look it up. I'll wait, but this post is getting too long anyway.) is normally defined as:

F₀ = 0
F₁ = 1
F_i = F_i-1 + F_i-2

This leads to a simple recursive implementation which is spectacularly inefficient, and a host of techniques like memoization and iterative versions that are more efficient. As a result, finding the nth Fibonacci number is O(n). Or is it? What if we could use our Egyptian multiplication trick, plus slathering on some other stuff, to get an O(log n) algorithm?

Matrix multiplication

Although I have no intention to explore the reasoning, it turns out to be possible to find the nth Fibonacci number using matrix multiplication:

\[ \left[ \begin{array}{c}v_{i+1} \\ v_{i}\end{array} \right] = \left[ \begin{array}{cc}1 & 1 \\ 1 & 0\end{array} \right] \left[ \begin{array}{c}v_{i} \\ v_{i-1}\end{array} \right] \]

...and therefore...

\[ \left[ \begin{array}{c}v_n \\ v_{n-1}\end{array} \right] = \left[ \begin{array}{cc}1 & 1 \\ 1 & 0\end{array} \right]^{n-1} \left[ \begin{array}{c}1 \\ 0\end{array} \right] \]

(If the TeX doesn't show up correctly, check out the damn book. Trust me, it looks beautiful.)

In other words, we can compute the nth Fibonacci number by raising a certain matrix to a power. [...] Matrices are a multiplicative monoid, so we already have an O(log n) algorithm---our power algorithm from Section 7.6.

Therefore, all we need to do is whip up a quick matrix data structure...


struct Matrix<T:Clone> {
    m: Vec<Vec<T>>,
}

impl<T: Clone + Zero> Matrix<T> {
    fn zero(n: usize, m: usize) -> Matrix<T> {
        let mut rows = Vec::with_capacity(n);
        for _ in 0..n {
            let row = vec!(T::zero(); m);
            rows.push(row);
        }
        Matrix{ m: rows }
    }
}

impl<T:Clone + Zero> Clone for Matrix<T> {
    fn clone(&self) -> Matrix<T> {
        let m = self.m.len();
        let n = self.m[0].len();
        let mut res = Matrix::zero(m, n);
        for i in 0..m {
            for j in 0..n {
                res.m[i][j] = self.m[i][j].clone();
            }
        }
        res
    }
}

..., provide a spiffy multiplication operation...


impl<T> Mul for Matrix<T> where T: Clone + Zero + Add<Output=T> + Mul<Output=T>
{
    type Output = Matrix<T>;

    fn mul(self, rhs: Matrix<T>) -> Matrix<T> {
        let n = self.m.len();
        let m = self.m[0].len();        // lhs    is n x m
        let p = rhs.m[0].len();
        assert_eq!(rhs.m.len(), m);     // rhs    is m x p
        let mut ab = Matrix::zero(n,p); // result is n x p
        for i in 0..n {
            for j in 0..p {
                ab.m[i][j] =
                    (0..m)
                    .map( |k| self.m[i][k].clone() * rhs.m[k][j].clone() )
                    .fold( T::zero(), Add::add );
            }
        }
        ab
    }
}

(Note the requirement that T, the type of the elements in the matrix, have addition and multiplication (and a zero).)

...and then make a multiplicative semigroup for matrices:


impl<T> Semigroup for MultiplicativeSemigroup<Matrix<T>>
     where T: Clone + Zero + Add<Output=T> + Mul<Output=T>
{
    type Elt = Matrix<T>;
    fn op(x: Matrix<T>, y: Matrix<T>) -> Matrix<T> { (x) * (y) }
}

And with that I can print out all the Fibonacci numbers that you need (or at least all of those that will fit in a 64-bit integer).


    let mut fib_l: Matrix<i64> = Matrix::zero(2,2);
    fib_l.m[0][0] = 1;
    fib_l.m[1][0] = 1;
    fib_l.m[0][1] = 1;

    let mut fib_r: Matrix<i64> = Matrix::zero(2,1);
    fib_r.m[0][0] = 1;

    for i in 2..n {
        let fib_l = fib_l.clone();
        let fib_r = fib_r.clone();
        let res =
            power_semigroup::<_,MultiplicativeSemigroup<Matrix<i64>>>(fib_l, i-1)
            * fib_r;
        println!("{} {}", i, res.m[0][0]);
    }

That code prints out from the second to the sixty-second Fibonacci number; I cannot use it for the 0th or first number because I never added the identity necessary to make matrix multiplication a monoid; I'm lazy and it would require either knowing what size matrix to produce or having a specialized identity matrix. And 64-bits is only enough for 62 Fibonacci numbers.


2 1
3 2
4 3
5 5
6 8
7 13
...
57 365435296162
58 591286729879
59 956722026041
60 1548008755920
61 2504730781961
62 4052739537881

To sum up, I'll quote Stepanov and Rose:

The process we went through---taking an efficient algorithm, generalizing it (without losing efficiency) so that it works on abstract mathematical concepts, and then applying it to a variety of situations---is the essence of generic programming.

Subtitles

How is Rust better than C++? (I'm not sure it is better than C??+.) Two reasons:

Stepanov and Rose's C++ code uses an additional operator parameter to pass the necessary algebraic structure operations into the generic functions; my Rust code here uses type arguments to do the same thing. In other words, in C++, they have
```
template <Regular A, Integer N, SemigroupOperation Op>
A power_semigroup(A a, N n, Op op) {...}
```
While in Rust, I have
```
fn power_semigroup<N: Natural, S: Semigroup>(mut a: S::Elt, mut n: N) -> S::Elt {...}
```
Here, Natural and Regular have the same function and their Integer and SemigroupOperation correspond to Semigroup. Using their power_semigroup requires three arguments; mine, two.
In the code examples above, Regular, Integer, and SemigroupOperation are C++ concepts, according to the authors. Unfortunately, as of C++11, as used by the authors, C++ did not include concepts, so for the code in the book, they #define Regular, etc., as "typename". (I don't follow C++ much; I have no idea about C++15/7.) To tell the truth, I'm not entirely sure I understand the concept of concepts, but it appears they act as a type system for C++ templates (the C++ template sublanguage is untyped and interpreted by the compiler) and that they act like Rust and Haskell traits: specifying an interface that must be implemented by types that use the generic function. Rust, on the other hand, has traits.

One side effect of this is that Rust potentially has better compiler error messages; a trait can be typechecked when it is defined and implemented, while a C++ template is, for the most part, typechecked only when it is instantiated, and the resulting error messages are specific to the instantiation. Rust's errors aren't great, but they aren't 27 pages of angle brackets and line noise.

How do you break a coding interviewer?

Suppose you are in an interview, or on a phone interview, and the person on the other end asks you to write the code to find the nth Fibonacci number without all that icky recursion stuff. Memorize this post, and you can start your answer with, "You know what a semigroup is, right? And Egyptian multiplication?" The interviewer will be bleeding from the eyes and ears before you get through a few paragraphs.

Footnotes

¹ As of this writing, Rust's own Zero and One traits are unstable.

² Stepanov and Rose do not have this assertion there, but have more general code, seen below. But, I don't think their code makes algebraic sense here, and replacing it with what I have does not seem to change the outcome.


        assert!(n >= N::zero());
        if n == N::zero() { return r; }

Comments

Could you do `impl Monoid for AdditiveSemigroup where T: Zero {

fn identity(&self) -> T { T::Zero(self) }

}`

And similar for MultiplicativeSemigroup?

Ashkan
'2015-08-04T03:54:16.797-05:00'

Sure, and it's not a bad idea.

In general, though, I am a bit conflicted about Zero and One. They're useful for the generic counter-thing that I'm using them for here, but you can't really rely on them algebraically. If you used Zero instead of that identity() function for addition, you'd have a hard time also having an identity for multiplication on the same types.

Tommy McGuire
'2015-08-09T20:06:00.813-05:00'