Automatic memory management: reference counting

Posted on February 12, 2014 by Tommy McGuire
Labels: rust, programming language

Last December, when I wrote about Flex- and BisonModule, I mentioned that it was one of my favorite hacks. But I did not describe what the hack was.

Python uses reference-counted memory management. When I wrote BisonModule, that became the chief obstacle: while flex behaves very simply, bison takes control of the program and calls back into Python code to read tokens and to build the parse tree. As a result, managing the lifetimes of symbol objects is a bit tricky. My first attempt dug rather deep into bison's runtime and used the fact that you can poke around with negative sub-expression references ("$1" in the right-hand part of a rule, for example) to access previous symbols. The result was ugly and stood a very good chance of leaking memory like a sieve.

So I scrubbed that and started over; I created an array of Python objects for tokens and symbols and arranged for this array to "own" the objects. Every new symbol or token object was added to the array, with a reference count of 1, for the array ownership. Adding an object to the parse tree incremented the reference count for that object, and setting the tree root provided an external pointer and a reference-count bump to keep the tree from being deleted. Getting rid of the extraneous objects was as simple as walking the array when the parse completed and decrementing each object's reference count. Objects that then had zero counts were garbage and immediately deleted; objects that were part of the tree had non-zero counts and were returned to Python. The algorithm is simple and does not require intimate knowledge of bison's internals, but it does have a weakness: all of the tokens and symbols produced in intermediate steps in the parse are kept around until the parse is completed.

It also does not leak memory, even when using Python's reference counting.

Automatic memory management

(There was a beautiful segue here, somewhere.)

There are three fundamentally different styles of managing dynamically-allocated memory in a computer program:

Before I go further, I need some specialized vocabulary. If you are peering through automatic-memory-manager-colored glasses, the world is a strange place.

The program that you or I would think of as doing all the work is just a mutator. A mutator does one of four things: it allocates an object; deallocates an object; links two objects, adding a pointer from one object to another; and unlinks two objects, removing a link from the first to the second. (Strictly speaking, the last may not be necessary; I am including it for extra verisimilitude.) And an object is a block of memory allocated and used by the mutator. These objects are allocated from a heap, the chaotic source of all memory.

Continuing with my list above, there are several different kinds of automated memory managers:

There are hordes of variations on each of these categories; the originals of most date back to Lisp in 1960, and many people have done a great deal of exploration; some of the most recent changes in the Java Virtual Machine have been new garbage collectors. The reference I am currently using is Garbage Collection: Algorithms for Automatic Dynamic Memory Management, by Richard Jones and Rafael D. Lins, a book that I've had for years. It describes a great many variations on the basic themes. My copy of The Garbage Collection Handbook: The Art of Automatic Memory Management, by Jones, Antony Hosking, and Eliot Moss, is on its way from Amazon as I type; I hope it is as good and more up-to-date.

In the meantime, let's talk reference counting. Here is a simple reference counter in Rust:

Reference counting in Rust


#[no_send]
pub struct Rc<T> { priv ptr : *mut Box<T>, }

A value of type Rc<T> represents a pointer to a box containing a value of type T. The type is marked no_send to prevent it from being shared with another thread; there is no locking here. Also, an unsafe '*' pointer is used to prevent the behavior of Rust's other pointer types from getting in the way.

A Box is:


struct Box<T> {
    count : uint,
    value : T,
}

The count is the number of live Rc pointers referring to the box, and value is the boxed value itself. (Note that the Rust standard library's own Rc type goes through an extra level of indirection for mutable values, in order to support run-time borrow checking and to match the language's other safety guarantees.)

The basic operations

The basic operations are creating a new Rc pointer and Box, and borrowing the value in the Box.


// unsafe operations
impl<T> Rc<T> {
    pub unsafe fn unsafe_new(t : T) -> Rc<T> {
        Rc { ptr : cast::transmute( ~Box{ count : 1, value : t } ) }
    }
}

// safe operations
impl<T> Rc<T> {
    pub fn new(t : T) -> Rc<T> { unsafe { Rc::unsafe_new(t) } }

    pub fn borrow<'l>(&'l self) -> &'l T { unsafe { &(*self.ptr).value } }

    // DANGER! DANGER!
    pub fn borrow_mut<'l>(&'l mut self) -> &'l mut T {
        unsafe { &mut (*self.ptr).value }
    }
}

unsafe_new creates a new box with a given value and a count of one, and returns a pointer to it wrapped in an Rc value. cast::transmute converts the ~ (owning) pointer into a * (unsafe) pointer; using an owned pointer makes it easy to free the Box when the time comes, but the Rc value does not own the box, and the raw owned-pointer behavior (freeing the Box at the end of its scope) would get in the way.

The new function provides a safe, simple wrapper for creating boxes and Rc values, and the borrow method allows the mutating code access to the value in the box. The lifetime on borrow indicates that the borrowed pointer must not outlive the Rc value; by construction, the Rc value will not outlive the Box containing the value.

borrow_mut is a bad idea. It allows a mutator to change the value in the box. Rust's standard library Rc type goes to considerable lengths to ensure that it cannot be used to violate Rust's safety. This thing doesn't. Don't do this. But for my simple example here, it's fine.
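
To make the danger concrete, here is a small hypothetical sketch of the aliasing that borrow_mut permits, using the types above plus the clone method defined in the next section: two Rc values pointing at the same Box can each hand out a mutable reference at the same time, which is exactly what Rust's borrow rules normally forbid.


// Hypothetical misuse: two live mutable references to the same Box,
// obtained through two clones of one Rc.
fn aliasing_demo() {
    let mut a = Rc::new(0u);
    let mut b = a.clone();       // same Box, count == 2
    let ra = a.borrow_mut();     // &mut uint into the shared Box
    let rb = b.borrow_mut();     // a second &mut uint into the same Box
    *ra = *ra + 1;
    *rb = *rb + 1;               // ra and rb alias; nothing stops this
}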

The magic

The magic happens on dropping and cloning Rc values.


#[unsafe_destructor]
impl<T> Drop for Rc<T> {
    fn drop(&mut self) {
        if self.ptr.is_not_null() {
            unsafe {
                (*self.ptr).count -= 1;
                if (*self.ptr).count == 0 {
                    let _ : ~Box<T> = cast::transmute(self.ptr);
                    self.ptr = ptr::mut_null();
                }
            }
        }
    }
}

impl<T> Clone for Rc<T> {
    fn clone(&self) -> Rc<T> {
        unsafe { (*self.ptr).count += 1; }
        Rc { ptr : self.ptr }
    }
}

The Drop trait specifies a destructor, or finalizer, for a Rust type. In this case, dropping an Rc decrements the count in the Box; if the count reaches zero, the box is converted back into an owned pointer, which is immediately destroyed. Because the type T is carried all the way through the Rc code, any finalization required by the T value itself also happens.

A "bad interaction" between Drop and @-pointers forces the #[unsafe_destructor] annotation; otherwise Rust rejects a Drop implementation for a parameterized type.

The Clone trait copies the Rc value by creating a new Rc value pointing to the same box, in which the count is incremented.
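
Putting new, clone, borrow, and the destructor together, a minimal usage sketch (assuming the code above is in scope) looks something like this; the comments track the count in the shared Box.


fn count_demo() {
    let x = Rc::new(41);             // Box allocated, count == 1
    {
        let y = x.clone();           // same Box, count == 2
        assert!(*x.borrow() == *y.borrow());
    }                                // y dropped here; count back to 1
    assert!(*x.borrow() == 41);
}                                    // x dropped; count reaches 0, Box freed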

And that is all she wrote, as far as simple reference counters go.

Rust includes a reference counting memory management module in its standard library. That module is roughly the basis of this one, but mine is considerably simplified.

Mutants! Mutants everywhere!

In order to play with this and other memory managers, I need a mutator. Theoretically, I could write a number of real, semi-useful programs that use the managers and try them in a live situation. But I'm not going to because I'm lazy. Instead, I am going to use a quick and dirty mutator harness to exercise the manager, to see if it fails spectacularly.

The first requirement for the mutator simulator is an implementation of the four operations.


enum Op { Create, Delete, Link, Unlink, }

pub trait Store {
    fn perform_create(&mut self);          // Create
    fn perform_delete(&mut self);          // Delete
    fn perform_link(&mut self);            // Link
    fn perform_unlink(&mut self) -> bool;  // Unlink
}

An object implementing the trait Store provides methods (using a heap) to create an object, delete an object, and link and unlink objects. Because the object chosen for the operation may not have links, perform_unlink returns a bool indicating whether it did anything. All of the other operations either succeed or cause the mutator simulator to fail.

The mutator

A Mutator itself contains an implementation of Store, a random-number generation framework, and a set of operation counts.


pub struct Mutator<'s, S> {
    store : &'s mut S,
    range : Range<f32>,
    rng : TaskRng,
    c_count : uint,
    d_count : uint,
    l_count : uint,
    u_count : uint,
    o_count : uint,
}

impl<'s, S> ToStr for Mutator<'s, S> {
    fn to_str(&self) -> ~str {
        format!("[ creates: {}, deletes: {}, links: {}, unlinks: {}, ops: {} ]",
                self.c_count, self.d_count, self.l_count, self.u_count, self.o_count)
    }
}

The only output of the test program is the number of operations that have been performed; any actual verdict will be provided by valgrind, an instrumentation framework that can identify a plethora of memory-management bugs.

Creating a new Mutator is trivial, as expected.


impl<'s, S:Store> Mutator<'s, S> {
    pub fn new(s : &'s mut S) -> Mutator<'s, S> {
        Mutator {
            store : s,
            range : Range::new(0f32, 1f32),
            rng : rand::task_rng(),
            c_count : 0,
            d_count : 0,
            l_count : 0,
            u_count : 0,
            o_count : 0,
        }
    }

Once the mutator has been configured, it can repeatedly choose and execute an operation.


    pub fn perform_operation(&mut self) {
        match self.choose() {
            Create => { self.store.perform_create(); self.c_count += 1; },
            Delete => { self.store.perform_delete(); self.d_count += 1; },
            Link   => { self.store.perform_link();   self.l_count += 1; },
            Unlink => { if self.store.perform_unlink() { self.u_count += 1; } },
        }
        self.o_count += 1;
    }

The mutator chooses the operation to perform randomly; in this code, with a 40% chance of creating a new object, a 30% chance of deleting an object, a 20% chance of linking two objects, and a 10% chance of unlinking them.


    fn choose(&mut self) -> Op {
        let x = self.range.ind_sample( &mut self.rng );
        if x < 0.4f32 { Create }
        else if x < 0.7f32 { Delete }
        else if x < 0.9f32 { Link }
        else { Unlink }
    }
}

The store

The final element of the mutator is a simulation of a heap of memory: a collection of cells of type T. For this heap simulation there are two basic operations: peek, which removes a random cell from the heap and returns its contents, and poke, which stores an object into a cell in the heap. In order to mimic the memory access behavior of a real program, for loose definitions of "mimic", a random cell is chosen on each call to peek.


trait RandomHeap<T> {
    fn peek(&mut self) -> T;
    fn poke(&mut self, t : T);
}

impl<T> RandomHeap<T> for ~[T] {
    fn peek(&mut self) -> T {
        let i = rand::task_rng().gen_range(0, self.len());
        return self.swap_remove(i)
    }
    fn poke(&mut self, t : T) { self.push(t); }
}

(In peek, rand::task_rng() returns a task-local random number generator.)

The test program

Using my simple Rc, a memory cell capable of being linked and unlinked is represented as Rc<~[...]>, a box containing an array of something. The something is ultimately going to be more references, so to prevent an infinite type-level recursion, I introduce a Cell type:


enum Cell { C( Rc<~[Cell]> ), }

Cell and its constructor, C, do nothing other than break up the otherwise-infinite type Rc<~[Rc<...>]>.

As a next step, I implemented Store for a vector of Rc cells, making use of the random heap operations.


impl Store for ~[Rc<~[Cell]>] {
    fn perform_create(&mut self) { self.poke( Rc::new( ~[] ) ); }

    fn perform_delete(&mut self) { self.peek(); }

    fn perform_link(&mut self) {
        let mut l : Rc<~[Cell]> = self.peek();
        let r : Rc<~[Cell]> = self.peek();
        l.borrow_mut().push( C(r.clone()) );
        self.poke( l );
        self.poke( r );
    }

    fn perform_unlink(&mut self) -> bool {
        let mut res = false;
        let mut l = self.peek();
        {
            let contents = l.borrow_mut();
            if contents.len() > 0 {
                match contents.pop() {
                    C(r) => { self.poke(r); res = true; },
                }
            }
        }
        self.poke(l);
        res
    }
}

And finally, I create the Store and the Mutator, introduce an initial population of 100 cells, and set the mutator off to work.


fn main() {
    let mut s : ~[Rc<~[Cell]>] = ~[];
    let mut m = Mutator::new(&mut s);
    for _ in range(0, 100) { m.store.perform_create(); }
    for _ in range(0, 1000000) { m.perform_operation(); }
    println(m.to_str());
}

Results

If the mutator simulator is run without the ability to link objects together, it allocates and frees a fair amount of memory without losing track of any.


[ creates: 398879, deletes: 301084, links: 200009, unlinks: 0, ops: 1000000 ]
==20013==
==20013== HEAP SUMMARY:
==20013== in use at exit: 40 bytes in 1 blocks
==20013== total heap usage: 800,686 allocs, 800,685 frees, 45,327,266 bytes allocated
==20013==
==20013== LEAK SUMMARY:
==20013== definitely lost: 0 bytes in 0 blocks
==20013== indirectly lost: 0 bytes in 0 blocks
==20013== possibly lost: 0 bytes in 0 blocks
==20013== still reachable: 40 bytes in 1 blocks
==20013== suppressed: 0 bytes in 0 blocks

The mutator made approximately 400,000 create operations, approximately 300,000 deletes, and 200,000 attempts to link objects (I emasculated the operation itself, not its invocation by the simulator, so link operations were counted but did nothing). The remaining hundred thousand or so operations were attempts to unlink objects, all of which failed.

The remainder of the output comes courtesy of valgrind and indicates that only one of the 800,686 blocks allocated was not freed before the program ended. (That remaining 40-byte block is somewhere in the Rust runtime.)

If, however, the simulator has the ability to link objects, valgrind tells a different story.


[ creates: 399155, deletes: 300946, links: 199940, unlinks: 31003, ops: 1000000 ]
==19863==
==19863== HEAP SUMMARY:
==19863== in use at exit: 124,616 bytes in 2,547 blocks
==19863== total heap usage: 803,544 allocs, 800,997 frees, 45,679,158 bytes allocated
==19863==
==19863== LEAK SUMMARY:
==19863== definitely lost: 240 bytes in 3 blocks
==19863== indirectly lost: 124,336 bytes in 2,543 blocks
==19863== possibly lost: 0 bytes in 0 blocks
==19863== still reachable: 40 bytes in 1 blocks
==19863== suppressed: 0 bytes in 0 blocks

Here, the program has lost 2,546 objects, amounting to about 124 KB of memory; not a lot on this run, but enough.
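
The culprit is the classic weakness of reference counting: the link operation can, and with enough random links eventually will, create cycles, and two cells that refer to each other keep each other's counts above zero even after the heap array lets go of them. A hypothetical sketch of the smallest such cycle, using the Cell and Rc types above:


// Two cells that point at each other: neither count can reach zero,
// so both Boxes leak when a and b go out of scope.
fn leak_a_cycle() {
    let mut a : Rc<~[Cell]> = Rc::new(~[]);
    let b : Rc<~[Cell]> = Rc::new(~[C(a.clone())]);   // b -> a
    a.borrow_mut().push(C(b.clone()));                // a -> b
}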
