Thursday, November 25, 2010

Finger Trees

I want to take a minute to think about finger trees.[1]

Ralf Hinze and Ross Paterson published "Finger trees: a simple general-purpose data structure" in the Journal of Functional Programming in 2006, and since then it has become relatively famous in functional programming circles as, as someone put it, "a data structure in a can". Finger trees combine several features that make them incredibly useful, including
  • purity; finger trees are a purely functional data structure,
  • flexibility; finger trees can be used as the basis for a host of more specialized data structures,
  • efficiency; finger trees provide efficient implementations of a number of operations, and
  • simplicity; like most extra fancy data structures, finger trees are more tedious than complex, and their relatively simple structure even reduces the complexity.
Until now, however, I have not seen any implementations in Java. Let's see what we can do about that.

But first...


I want to make a quick digression about purely functional data structures. A purely functional data structure is one which does not provide nor make use of modification operations. The most common example of a purely functional data structure is a singly-linked list, as seen in Lisp and other functional languages. These structures have a number of advantages over traditional data structures: they are "persistent", in the sense that an update to the structure creates a new structure, leaving the original unchanged; and they are almost automatically thread safe. However, they have had problems achieving the performance of traditional structures. For example, while a non-functional hash table can provide best-case O(1) lookups, a search tree cannot normally provide better than O(log n) lookups.

2-3 trees


Several of the discussions about finger trees such as apfelmus' and Good Math, Bad Math's have focused on the interaction of monoids with tree structures; that is certainly where finger trees get much of their flexibility, but I think missing the first half of the solution Hinze and Paterson present is a significant loss. For one thing, bare finger trees by themselves support efficient, amortized O(1) deque operations (accessing the first and last elements) as well as O(log n) sequence concatenation.

Finger trees, as originally described, are based on a form of search tree called a 2-3 tree. 2-3 trees are a little-known (to me, at least) variant designed for simplicity in keeping trees balanced. A 2-3 tree consists of
  • an empty tree,
  • a tree with a single value
  • a tree made of nodes, each of which has either two or three values and two or three child links.

Part of Hinze and Paterson's improvement involves modifying the base 2-3 tree to provide "fingers" akin to Gérard Huet's zippers. These fingers provide efficient access to one to four elements on either end of the structure, while the remainder of the tree creates a "spine" down the center of the structure. As an end result, a finger tree consists of two collections of 1 to 4 elements, the first and last few elements of the collection, and a central spine made of a tree of 2-3 tree nodes of elements.

Digits


In this implementation, the "fingers" are instances of Digit (there is a pun there somewhere, but I am not sure I understand it). A Digit is a non-empty sequence of one to four elements. My original implementation used separate subclasses for the four possibilities, but that seemed excessively verbose.
public class Digit<T> implements Iterable<T> {
  private final int length;
  private final List<T> contents;
Instead, I have implemented the Digits using a list in order to share the implementation. (I used the alternate approach on the Nodes below.)

Methods like length(), isEmpty(), and get() are straightforward.
public int length() { return length; }
  public boolean isEmpty() { return length == 0; }
  public T get(int index) { return this.contents.get(index); }

On the other hand, because a Digit is necessarily a functional data structure, head()[2] (which returns a Digit containing all but the last element of this Digit) may seem a little strange.
public Digit<T> head() { return new Digit<T>(this.contents.subList(0, this.length - 1)); }
This method creates and returns a new digit instance containing the first length-1 elements of this Digit; the method tail() is symmetrical. These methods as written may not preserve the invariant of Digit, that there is at least one element.

Likewise, append() and prepend return new Digits. The constructors do ensure that the length invariant is established.
public Digit<T> append(T t) { return new Digit<T>(this.contents, t); }

I also implemented a small number of collect() static methods to create Digits from zero to four elements.
public static <T> Digit<T> collect(T t1, T t2) { return new Digit<T>(Arrays.asList(t1,t2)); }

The remainder of the Digit class consists of the Iterable<T> implementation and assorted constructors.

Nodes


The Node class is the implementation of the 2-3 tree internal nodes; for reasons that probably made sense at the time, I wrote the 2-nodes and 3-nodes as separate subclasses of a general Node class.
public abstract class Node<T> {
  public abstract Digit<T> toDigit();
Fortunately, the only method converts a Node to a Digit with the same elements.

Two static methods provide easy constructors for Node2s and Node3s. (These serve the same function as Digit's collect(), which originally worked the same way.
public static <T> Node<T> two(T x, T y) { return new Node2<T>(x,y); }
  public static <T> Node<T> three(T x, T y, T z) { return new Node3<T>(x,y,z); }

The Node2 and Node3 classes are about as simple as possible.
private static class Node2<T> extends Node<T>
  {
    public final T x, y;

    public Node2(T x, T y)
    {
      this.x = x;
      this.y = y;
    }

    @Override public Digit<T> toDigit() { return Digit.collect(x, y); }
  }

Finger tree


The base finger tree provides efficient implementations of the deque operations: accessing, adding, and removing elements from each end of a sequence. It adds implementations of sequence concatination. The interface is below.
public abstract class FingerTree<T>
{
  public abstract boolean isEmpty();

  public abstract FingerTree<T> push(T t);  
  public abstract FingerTree<T> pop();
  public abstract T top();
  
  public abstract FingerTree<T> unshift(T t);
  public abstract FingerTree<T> shift();
  public abstract T bot();

  public abstract FingerTree<T> append(FingerTree<T> other);
  public abstract FingerTree<T> prepend(FingerTree<T> other);
Notice that the deque operations come in trios: push() adds an element and pop() removes an element with both returning a new, modified FingerTree, and top() returns the last element in the sequence without changing the sequence. (Keep in mind I am misusing the words "add", "remove", and "modified"; the original FingerTree is still available and the new, returned tree shares as much of the structure of the original as possible.)

Empty


The FingerTree is implemented as three classes, Empty representing an empty tree, Single representing a tree containing one element, and Deep representing a tree containing more than one element. The Empty and Single classes are very simple; Empty is below.
public static class Empty<T> extends FingerTree<T>
  {
    @Override public boolean isEmpty() { return true; }

    @Override public Single<T> push(T t) { return new Single<T>(t); }
    @Override public FingerTree<T> pop() { throw new IllegalStateException(); }
    @Override public T top() { throw new IllegalStateException(); }
    // ...
    @Override public FingerTree<T> append(FingerTree<T> other) { return other; }
    @Override public FingerTree<T> prepend(FingerTree<T> other) { return other; }
  }
The implementations of shift(), unshift(), and bot() are identical to push(), pop(), and top().

Single


Single is very nearly as simple.
public static class Single<T> extends FingerTree<T>
  {
    public final T t;

    public Single(T t) { this.t = t; }

    @Override public boolean isEmpty() { return false; }

    @Override public FingerTree<T> push(T n)
      { return new Deep<T>(Digit.collect(t), new Empty<Node<T>>(), Digit.collect(n)); }

    @Override public T top() { return t;  }

    @Override public FingerTree<T> pop() { return new Empty<T>(); }
    // ...
    @Override public FingerTree<T> append(FingerTree<T> other) { return other.unshift(t); }
    @Override public FingerTree<T> prepend(FingerTree<T> other) { return other.push(t); }
  }
The important method is push() (and unshift()), which introduce Deep, the class which implements the trio of left and right Digits and central FingerTree spine. The important detail is that, while the new Deep object is a collection of T, the spine FingerTree is a collection of Node<T>.

Deep


Deep is indeed the magic ingredient of the FingerTree. As expected, it consists of left and right Digit<T> and a centeral FingerTree<Node<T>> spine.
public static class Deep<T> extends FingerTree<T>
  {
    public final Digit<T> left;
    public final FingerTree<Node<T>> spine;
    public final Digit<T> right;

    public Deep(Digit<T> l, FingerTree<Node<T>> s, Digit<T> r)
    {
      this.left = l;
      this.spine = s;
      this.right = r;
    }

    @Override public boolean isEmpty() { return false; }

    @Override public T top() { return right.get(right.length() - 1); }

    @Override
    public FingerTree<T> push(T t)
    {
      if (right.length() < 4)
      {
        return new Deep<T>(left, spine, right.append(t));
      }
      else
      {
        Node<T> node = Node.three(right.get(0), right.get(1), right.get(2));
        return new Deep<T>(left, spine.push(node), Digit.collect(right.get(3), t));
      }
    }
In the push() method, if the current length of the right Digit is less than four, the new element t can be handled by adding it to the Digit. If the right Digit is already full, it must be broken into the first three elements, which are placed in a Node<T> and pushed onto the spine, and the fourth and new elements, which become the contents of the new right Digit.

push() demonstrates one of the principles that makes FingerTrees efficient. Adding a new element to a FingerTree is O(1) 2/3 of the time, when the new element fits in the terminal Digit; it only needs to recurse 1/3 of the time. As a result, amortized over a number of operations, the push() (and the other deque operations) are O(1). Hinze and Paterson say,

Each deque operation may recurse down the spine of the finger tree, and thus take Θ(log n) time in the worst case. However, it can be shown that these operations take only Θ(1) amortized time, even in a persistent setting. The analysis is essentially the same as Okasaki’s for implicit deques (Okasaki, 1998). We outline the main points here.

We classify digits of two or three elements (which are isomorphic to elements of type [Node<T>]) as safe, and those of one or four elements as dangerous. A deque operation may only propagate to the next level from a dangerous digit, but in doing so it makes that digit safe, so that the next operation to reach that digit will not propagate. Thus, at most half of the operations descend one level, at most a quarter two levels, and so on. Consequently, in a sequence of operations, the average cost is constant.

In other words, while any specific operations may require O(log n) operations, a large number of sequential operations will behave appear to be O(1).

The pop() method is longer, but not any more difficult. It has three cases:
  • If the right Digit has more than one element, the new FingerTree can be built by dropping the rightmost element.
  • If the right Digit has only one element, that element can be dropped and a new FingerTree constructed by pulling the rightmost Node from the spine and unpacking it into the right Digit.
  • If the right Digit has only one element and the spine is empty, then pop() must redistribute elements from the left Digit, potentially making the new FingerTree an instance of Single.
@Override
    public FingerTree<T> pop()
    {
      if (right.length() > 1)
      {
        return new Deep<T>(left, spine, right.head());
      }
      else if (!spine.isEmpty())
      {
        return new Deep<T>(left, spine.pop(), spine.top().toDigit());
      }
      else
      {
        switch (left.length())
        {
        case 1:
          return new Single<T>(left.get(0));
        case 2:
          return new Deep<T>(Digit.collect(left.get(0)),
                                spine,
                                Digit.collect(left.get(1)));
        case 3:
          return new Deep<T>(Digit.collect(left.get(0)),
                                spine,
                                Digit.collect(left.get(1), left.get(2)));
        case 4:
          return new Deep<T>(Digit.collect(left.get(0), left.get(1)),
                                spine,
                                Digit.collect(left.get(2), left.get(3)));
        default:
          throw new IllegalStateException();
        }
      }
    }
The analysis of pop() is similar to that of push().

Concatenation


In addition to the deque operations, Deep implements sequence concatenation in the form of append() and prepend() (which is symmetric with append()).
@Override
    public FingerTree<T> append(FingerTree<T> other)
    {
      if (! (other instanceof Deep<?>))
      {
        return other.prepend(this);
      }
      else
      {
        Deep<T> o = (Deep<T>) other;
        return new Deep<T>(left, foldWings(this, o), o.right);
      }
    }
In the new FingerTree, the left Digit is the left digit of the appended-to tree, the right Digit is the right digit of of the appended tree, and the spine is built from the remaining two digits and the spines by the foldWings() method.

The foldWings() method first collects all of the elements from the two inner Digits and, arbitrarily choosing the left spine, pushes them into the spine. This process is the task of the while loop in foldWings() and is complicated by the need to break the two to eight elements into Nodes of two or three elements. This code prefers three-element Nodes, and only uses two-element Nodes when either only two elements are left or four elements are left, in which case two two-element Nodes are needed. Finally, the new spine is created and returned, by recursively appending the new left and original right spines.
private static <T> FingerTree<Node<T>> foldWings(Deep<T> port, Deep<T> starboard)
    {
      List<T> fist = new ArrayList<T>(port.right.length() + starboard.left.length());
      for (T t : port.right) { fist.add(t); }
      for (T t : starboard.left) { fist.add(t); }
      int seen = 0;
      int all = fist.size();  /* 2 <= all <= 8 */
      FingerTree<Node<T>> spine = port.spine;
      while (seen < all)
      {
        int diff = all - seen;  /* Initially, 2 <= diff <= 8 */
        if (diff == 2)
        {
          T a = fist.get(seen++);
          T b = fist.get(seen++);
          spine = spine.push(Node.two(a,b));
          /* diff reduced by 2 */
        }
        else if (diff == 4)
        {
          T a = fist.get(seen++);
          T b = fist.get(seen++);
          T c = fist.get(seen++);
          T d = fist.get(seen++);
          spine = spine.push(Node.two(a,b)).push(Node.two(c, d));
          /* diff reduced by 4 */
        }
        else
        {
          T a = fist.get(seen++);
          T b = fist.get(seen++);
          T c = fist.get(seen++);
          spine = spine.push(Node.three(a,b,c));
          /* diff reduced by 3 */
        }
      }
      /*    (port.spine)
         + ("loose" port-right nodes, folded)
         + ("loose" starboard-left nodes, folded)
         + (starboard.spine) */
      return spine.append(starboard.spine);
    }
  }
The analysis of concatenation by Hinze and Paterson is,

It is easy to check that in each of the invocations of [append, the list fist has length at most 8], so each of these invocations takes Θ(1) time. The recursion terminates when we reach the bottom of the shallower tree, with up to 4 insertions, so the total time taken is Θ(log(min{n1 , n2 })), where n1 and n2 are the numbers of elements in the argument trees.

Note that the debit analysis of Section 3.2 remains valid, because although [append] must discharge any debits on the nodes on the spines that are merged, there will be at most Θ(log(min{n1 , n2 })) of these.

Comparisons


Given the analysis of the behavior of finger trees as sequences, how do they actually perform? I ran some quick tests, comparing them with JDK-standard ArrayLists and LinkedLists on element addition, removal, and concatenation. The contents of the sequences were Integers.

Keep in mind that these results are benchmarks (at the lower level of lies, damned lies, statistics, and benchmarks), and are also poorly done benchmarks. I did not use a quiescent machine, I did not "warm up" the program, I did not make any effort to optimize anything, I even ran the things in Eclipse for crying out loud. As a result, they are full of noise, they are not statistically significant (although they were repeatable), and I would not hesitate to disclaim them if someone really wants to complain. But they are kind of interesting.

Push


For inserting elements, I started with a sequence of n elements and timed adding n elements, reporting the time taken divided by n. The additions were performed by
t = t.push(j);         /* FingerTree */
al.add(j);             /* ArrayList */
ll.add(j);             /* LinkedList */
The results are shown in Figure 1.

graph of FingerTree vs ArrayList and LinkedList performance for element addition
Figure 1: Adding an element

Yes, the left side of the graph is very inaccurate. Sorry.

I added some guide lines showing log n, n, and \(n^2\) performance. I also used log scales for both horizontal and vertical axes, which I really wish more systems researchers used in papers, since otherwise the right half of the graph would be effectively empty.

The primary thing to notice is that the trends of the three lines are roughly the same, and indicate something like O(1) performance, which all three should exhibit.

Pop


For removing elements, I followed a similar process, starting with a sequence of n elements, inserting n more elements and then timing removing those n elements with the following operations:
t = t.pop();                /* FingerTree */
al.remove(al.size() - 1);   /* ArrayList */
ll.remove(ll.size() - 1);   /* LinkedList */
The results are shown in Figure 2.

graph of FingerTree vs ArrayList and LinkedList performance for element removal
Figure 2: Removing an element

Concatenation


Comparing concatenation, I concatenated two sequences of n elements 100 times (resulting in some truly huge sequences), reporting the total time. The operations were:
t = t.append(t2);       /* FingerTree */
al.addAll(al2);         /* ArrayList */
ll.addAll(ll2);         /* LinkedList */
The results are shown in Figure 3.

graph of FingerTree vs ArrayList and LinkedList performance for concatenation
Figure 3: Concatenating sequences

Figure 3 shows the one advantage of finger trees: the slope of the finger tree concatenation line is significantly less than that of the ArrayList or LinkedList; FingerTree looks, in fact, to be roughly log n, while the other two are roughly parallel to n.

With some luck, I will be following this post with another that tells the rest of the finger tree story.

Footnotes

1. Cue Petey Fisk.


2. Think the head and tail Unix commands.

[Edit: fixed Ralf Hinze's and Ross Paterson's names. Darn you, traitorous fingers! Darn you!]

No comments: