# Rust 0.8: External iterators

If you have browsed through my ridiculous Rust posts, you will know that I have been using a sad, silly combinatorics problem that I stole from someone else's blog to play with the programming language Rust. I am still doing it, I'm sorry to say.

If you have been following the development of Rust, you will know that between 0.6 and the recent release of 0.8, Rust's **for** expressions have changed from an "internal" iterator basis to an "external" iterator basis. (I've described the change before, way back when 0.7 was released. Geeze, that was a long time ago.)

Basically, an "internal" iterator involves a higher-order function that is passed as an argument a new function that represents the body of the **for** expression. This new function is called once on each element of whatever is being iterated over. An "external" iterator is instead an implementation of the trait **Iterator**, which means that it has a **next()** method. External iterators have many advantages, such as more flexibility and better composability, which is why the behavior of the **for** expression was changed. However, external iterators also seem to have downsides.

This is part of my implementation of an external iterator for generating all permutations of the contents of a vector:

impl<'self,T:Clone> Iterator<~[T]> for PermutationIterator<'self,T> {

#[inline]

fn next(&mut self) -> Option<~[T]> {

if self.initial {

self.initial = false;

} else if self.length <= 1 {

return None;

} else {

// find largest k such that indices[k] < indices[k+1]

// if no such k exists, all permutations have been generated

let mut k = self.length - 2;

while k > 0 && self.indices[k] >= self.indices[k+1] { k -= 1; }

if k == 0 && self.indices[0] > self.indices[1] { return None; }

// find largest l such that indices[k] < indices[l]

// k+1 is guaranteed to be such

let mut l = self.length - 1;

while self.indices[k] >= self.indices[l] { l -= 1; }

// swap indices[k] and indices[l]; sort indices[k+1..]

// (they're just reversed)

self.indices.swap(k,l);

self.indices.mut_slice(k+1, self.length).reverse();

// fixup permutation based on indices

for i in range(k, self.length) {

self.permutation[i] = self.values[ self.indices[i] ].clone();

}

}

return Some(self.permutation.clone());

}

}

The algorithm for generating the permutations (in lexical order) is straight out of the Wikipedia page, with the modification that it is working on an internal vector of indices rather than the data vector (to avoid potentially expensive copies). What I want to discuss is the line

return Some(self.permutation.clone());

**permutation** contains the current permutation being generated by the iterator. It is intended to be passed directly to the body of the loop, which would avoid an allocation and data copies. Unfortunately, this code calls **clone()** there.

The reason it calls **clone()** is that this code implements an **Iterator<~[T]>** and therefore has a return type of **Option<~[T]>**. In other words, it returns an owned vector; hence the **clone()**. What I think I would like to do is:

// impl<'self,T:Clone> Iterator<&'self[T]> for PermutationIterator<'self,T> {

// fn next(&mut self) -> Option<&'self[T]> {

// ...

// return Some(self.permutation.slice_from(0));

// }

// }

Unfortunately, I cannot convince the compiler to let me return a borrowed reference to the **permutation** vector, which *is* an element of the **PermutationIterator** structure and should have the same lifetime^{1}. On one hand, it *should* work, I think. On the other, well, maybe it is a sketchy thing to do.

Why is this important? Time to break out a performance comparison:

Variation | Time (seconds) |
---|---|

vec permutations_iter | 11 |

vec permutations_iter (manual) | 7 |

lexical iterator | 6 |

old each_permutation | 1.2 |

These timings are for generating all 39,916,800 permutations of a vector of 11 integers, counting the number of permutations.

*vec permutations_iter* is the iterator version currently in the **vec** module, based on the Steinhaus–Johnson–Trotter algorithm. It allocates and copies the returned permutation as well.

*vec permutations_iter (manual)* is based on a suggestion from the implementer of the vec iterator (from the commit message introducing the S-J-T iterator); it does not copy the permutation, but iterates over element swaps from the permutation state space and lets the body of the loop update its own copy of the permutation. The code looks like:

let mut v = do vec::from_fn(VECTOR_SIZE) |i| { i };

let mut count = 0;

for (a,b) in vec::ElementSwaps::new(VECTOR_SIZE) {

v.swap(a,b);

count = count + 1;

}

assert_eq!(count, NUM_PERMUTATIONS);

*lexical iterator* is my iterator above, also allocating and copying the return permutation.

*old each_permutation* is the previous permutation generator from the **vec** module. It used similar code to my iterator above, also based on the Wikipedia lexical generator algorithm. Critically, it *does not* allocate and copy the returned permutation; it is the "internal" iterator version.

The code for **each_permutation** can trivially be converted to be useful in a **do** expression, which may be the best option. It may be possible that this kind of generator is a bad fit for external iterators.

On the other hand, I am hoping that some very smart Rust developer will show up and tell me that, A), I'm an idiot, and B), there is a trivial way to do this that I'm not seeing.

## Update

In the discussion on reddit, englabenny writes,

I'm surprised the manual ElementSwaps version performs so poorly. I didn't expect that. The premise of my Permutations Iterator was really that it could be implemented without copies -- which it could, but acrichto helped discover that was through a borrowchecker hole. So the fixed version that clones the whole vector for each iterator element is quite terrible.

Pull req: https://github.com/mozilla/rust/pull/9062 most of the discussion was on irc, and the initial copyless version isn't available, I'll pastebin it if git still has it somewhere. [Edit] Thanks git reflog: Buggy copyless permutations iter

In the discussion of that pull request, alexcrichton opened issue 9069, "Unsound behavior when returning &-pointers", writing,

I thought something like this should be a compiler error? It seems to me that the lifetime ofbwould conflict with attempting to re-borrowaasmut.

That would seem to imply that my desired behavior is wrong. That issue was marked as a duplicate of 8624, which seems to emphasize that changing the elements of a vector that potentially has an outstanding reference is a no-go.

In the same reddit thread, ericanderton suggested using managed vectors, which I had not thought about. It might work, but I'm stuck trying to dynamically allocate a mutable managed vector. And it still has the objection that mutating the iterated value is not entirely kosher.

## Notes

1. For what it's worth, the error message is:

$ rust build --opt-level 2 --lib combinatorics.rs

combinatorics.rs:62:20: 62:51 error: cannot infer an appropriate lifetime due to conflicting requirements

combinatorics.rs:62 return Some(self.permutation.slice_from(0));

^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

combinatorics.rs:62:20: 62:36 note: first, the lifetime cannot outlive the expression at 62:20...

combinatorics.rs:62 return Some(self.permutation.slice_from(0));

^~~~~~~~~~~~~~~~

combinatorics.rs:62:20: 62:36 note: ...due to the following expression

combinatorics.rs:62 return Some(self.permutation.slice_from(0));

^~~~~~~~~~~~~~~~

combinatorics.rs:62:20: 62:51 note: but, the lifetime must be valid for the method call at 62:20...

combinatorics.rs:62 return Some(self.permutation.slice_from(0));

^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

combinatorics.rs:62:20: 62:36 note: ...due to the following expression

combinatorics.rs:62 return Some(self.permutation.slice_from(0));

^~~~~~~~~~~~~~~~