Tuesday, March 26, 2013

Creating a Letterpress cheating program in Rust: Traits

Ok, I'm not saying you would want to do this. But you could. That's all I want to get across. Just that you could do this, if you really wanted to.

What could you do? Let's say you are writing a Letterpress cheating thingy and need a program to generate the anagram dictionary: a file that contains a list of sorted letters, followed by the word or words that are made from those letters:

...
aabcdiinot abdication
aabcdkrsw backwards drawbacks
aabcdkrsy backyards
...

One way would be to just do it, which I've mentioned before. But that would hardly be any fun, right? How about we use (or overuse) traits?

Rust's traits are akin to Java interfaces or Haskell's type classes. Traits provide two things in Rust: a set of related functions (or "methods" in this case) which can be implemented for any suitable data structure, and as the Rust tutorial describes, a way to bound the type parameters of generic operations, limiting the types that can be arguments to the generic functions to those which implement the given trait as well as making the trait's methods available on the values of the generic type.

The first thing the mk_anadict program does is to read the /usr/dict/words file and create a dictionary mapping sorted sequences of letters to a list of words that can be made from those letters. Here is a trait that describes that operation:

trait DictReader {
    fn read_dict(&self) -> ~LinearMap<~str,~[~str]>;
}

That declaration says that an implementation of DictReader provides a method, read_dict, that takes no arguments (other than the object the method is invoked on) and returns a map from strings to vectors of strings. Now, I could provide a sui generis implementation of that trait, by creating a new type and defining an implementation on it, but that doesn't really show the power of traits. Instead, I wrote an implementation of the trait for an existing type, in effect adding the read_dict method to the existing type:

impl DictReader for Reader {

    fn read_dict(&self) -> ~LinearMap<~str,~[~str]> {
        let mut map = ~LinearMap::new();
        for self.each_line |line| {
            let line = line.trim();
            let length = line.len();
            // Original is using pre-strip() line for comparisons
            if length >= 2 && length < 19
                && line.all(|ch| (char::is_ascii(ch)
                                  && char::is_lowercase(ch))) {
                let mut chars = str::chars(line);
                std::sort::quick_sort(chars, |a,b| *a <= *b);
                let key = str::from_chars(chars);

                // What I'd like to do:
                // map.find_or_insert(key, ~[]).push(line);
                // But find_or_insert's result isn't mutable.

                let mut value = match map.pop(&key) {
                    None => { ~[] }
                    Some(old) => { old }
                };
                value.push(line);
                map.insert(key,value);
            }
        }
        return map;
    }

}

Reader is an existing Rust core library trait for reading input, in this case from a file. It has a method, each_line, which is an iterator over the lines from the input stream. What this implementation is saying is that anything that is a Reader is also a DictReader; any type that implements Reader also implements DictReader, including whatever file_reader returns.

How's that for spiffy?

I did a similar thing for writing the output, anagram dictionary file:

trait DictWriter {
    fn write_dict(&self, dict : &LinearMap<~str,~[~str]>);
}

impl DictWriter for Writer {

    fn write_dict(&self, dict : &LinearMap<~str,~[~str]>) {
        for dict.each_key_sorted() |key| {
            let line = str::connect(dict.get(key).map(|v| copy *v), " ");
            self.write_str(fmt!("%s %s\n", *key, line));
        }
    }

}

...with a small difficulty: LinearMap doesn't provide a way to iterate across the keys of the dictionary in sorted order. So, I added a third trait and implementation (see how addictive these things are?):

trait SortedKeys<K : Ord> {
    fn each_key_sorted(&self, blk : &fn(key : &K) -> bool);
}

impl<K : Hash + IterBytes + Eq + Ord, V> SortedKeys<K> for LinearMap<K,V> {

    fn each_key_sorted(&self, blk : &fn(key : &self/K) -> bool) {
        let mut keys : ~[&self/K] = vec::with_capacity(self.len());
        for self.each |&(k,_)| { keys.push(k); }
        std::sort::quick_sort(keys, |a,b| *a <= *b);
        for keys.each |&k| { if !blk(k) { break; } }
    }

}

...which is where the each_key_sorted method in the DictWriter implementation came from.

The type constraints on the implementation of SortedKeys<K> for LinearMap<K,V> deserve some explanation. The Hash and Eq traits are required of anything that can be a key in a LinearMap, and the IterBytes trait is required by Hash if I understand things correctly. The Ord trait requirement is added by the SortedKeys<K> trait, because the keys have to be ordered in order to be sortable. That implementation declaration is saying, if a type K is Hash-able, IterBytes-able, Eq-able, and Ord-able, then any LinearMap with keys of type K also implements Sortedkeys and has an each_key_sorted method.

The bottom line is the new implementation of mk_anadict's main function:

fn main() {
    match (file_reader(&Path("/usr/share/dict/words")),
           file_writer(&Path("anadict-rust.txt"), [Create,Truncate])) {
        (Ok(r),    Ok(w))     => {
            // It's like magic, but not as exciting.
            w.write_dict(r.read_dict());
        }
        (Err(msg), _)         => { fail!(msg); }
        (_,        Err(msg))  => { fail!(msg); }
    }
}

Like magic, indeed. The neat part is that I have done nothing to file_reader and file_writer; those functions from Rust's core io module still return whatever they return; I have just added some capabilities to the values they do return.

Like I wrote above, I am not advocating this style as a good way of writing idiotic programs like mk_anadict. I'm just saying that it's possible. And that you should get some exposure to traits (and type classes in Haskell) because they're powerful juju. Anyway, the trait version of the program, mk_anadict_traits.rs, is on github.

No comments: