Showing posts with label java. Show all posts
Showing posts with label java. Show all posts

Wednesday, February 1, 2012

State space search: A*

Last time, I quoted Wikipedia, "For the 15-puzzle, lengths of optimal solutions range from 0 to 80 moves". Unfortunately, the two solutions I found require 255 and 205 moves. That seems less than satisfactory.

So, here's the state of 15-puzzle solutions as it stands: we have depth-first search, which is relatively fast but which is not guaranteed to find a solution. On the other hand, we have breadth-first search, which is guaranteed to find an optimal solution, in the sense of the smallest number of moves for the 15-puzzle, but which is slow and expensive. And finally, we have a couple of heuristics which, used alone, relatively quickly find a solution, having some of the benefits of depth-first search, but which find markedly sub-optimal solutions.

What this situation calls for is a slightly modified algorithm. The algorithm I have in mind is called A*. A* is actually a basic modification of the state evaluation function used by an informed search. Instead of the Manhattan distance or the number of tiles out of position (which I have learned should be called the Hamming distance) as the evaluation function, A* uses the sum of a forward-looking heuristic function and the cost of the path from the initial state to the state being evaluated. For the 15-puzzle, this would be, say, the Manhattan distance of the current state from the goal state plus the number of moves from the initial state to the current state.

A* works by being a best-first search; by expanding the state that looks like it will produce the best solution, rather than a state that is simply structurally next (as in the case of uninformed searches) or a state that looks like it is nearest to a solution. As a result, A* is likely to use fewer resources than breadth-first search while finding a better solution than an informed depth-first search.

In fact, A* has a very nice property: if the forward-looking heuristic function is admissible, then the search algorithm is guaranteed to find the optimal solution. Admissible simply means that the heuristic function does not overestimate the cost of the path from the current state to a goal. Possibly the simplest way of finding an admissible heuristic function is to relax some of the constraints on the problem; for example, the Hamming distance heuristic relaxes the 15-puzzle problem by allowing tiles to be directly moved to their goal position and counting the number of tiles thus moved while the Manhattan distance heuristic relaxes the constraints by allowing tiles to move directly to their goal states (rather than requiring sliding them into the empty space) and counting the number of moves needed. Both the Hamming distance and the Manhattan distance are admissible.

Converting the fifteenpuzzle code to use A* is incredibly easy:

  public static class AStarDistance implements Comparator<FState>
  {
    @Override
    public int compare(FState o1, FState o2)
    {
      return (o1.totalDistance() + o1.pathLength) - (o2.totalDistance() + o2.pathLength);
    }
  }

I created a new implementation of Comparator<FState> which compares the two states by computing the totalDistance value (the Manhattan distance) of each object and the length of the path from the initial state to the states and then comparing the resulting sums. When creating an instance of Search<FState>, use a PriorityQueue based on this Comparator.

How well does it work? I've got no clue. Sorry. The A* search goes through hundreds of thousands of states before it exhausts the default JVM heap size, and it takes long enough to do so that I don't really feel interested in trying it with more memory. The implementation could be optimized, but this is an exponential algorithm and optimization would be unlikely to make too much of a dent. (On the other hand, there is the possibility, however remote, that I've really hoarked something up. If so, I'd appreciate knowing about it.)

But I have the next best thing: by modifying the A* evaluation function, I can bias the evaluation towards the heuristic function, producing a solution in reasonable time and space. Unfortunately by doing so, I am likely to effectively make the heuristic function inadmissible, meaning that A* is no longer guaranteed to find an optimal solution. This is the implementation:

  public static class BiasedAStarDistance implements Comparator<FState>
  {
    private final int bias;
    
    public BiasedAStarDistance(int bias)
    {
      this.bias = bias;
    }

    @Override
    public int compare(FState o1, FState o2)
    {
      return (bias * o1.totalDistance() + o1.pathLength) - (bias * o2.totalDistance() + o2.pathLength);
    }
  }

For me, it works significantly better than the un-biased A* search.

BiasStates examinedPath length
5386575
4426683
315,44273
219,87957

You can get the code on github.

Sunday, January 22, 2012

State space search: heuristics

Last time, I wrote, without any explanation

Using a FIFO queue searches the state space breadth-first, visiting all of the states at a given distance from the initial state before any states at a greater distance. Using a LIFO queue searches the state space depth-first, visiting all of the states following a given state before visiting any other states.

The way that works is important to the power of search. Consider using a first-in, first-out queue and expanding an initial state \(n\). Expanding \(n\) enqueues the states \((o_1, o_2, o_3)\). The next step is to remove the first state from the queue, expand it, and enqueue the results, leaving the queue as \((o_2, o_3, p_1, p_2)\) (assuming expanding \(o_1\) produced \(p_1\) and \(p_2\)). As a result, following this queuing strategy, all of the states at depth \(o\) from the initial state are expanded before any of the states at depth \(p\). Hence, "breadth-first".

On the other hand, if the algorithm uses a last-in, first-out structure, expanding the same initial state \(n\) still produces a queue of \((o_1, o_2, o_3)\). However, the next step is to remove the last state from the queue, expand it, and enqueue the results, producing \((o_1, o_2, p_1, p_2)\), assuming expanding \(o_3\) produced \(p_1\) and \(p_2\) this time. This time, the next step will be to expand \(p_2\), with the result that all of the states below \(o_3\) will be examined before any of the other states \(o_1\) or \(o_2\). Thus, "depth-first".

So far, I have been discussing uninformed search; these approaches order their exploration of the problem's state space based on the pattern of state-space expansion. There is a better way: informed search applies information about the problem to focus the exploration of the state space on important regions. Here's an example:

Sliding number puzzles: fifteenpuzzle

Returning to Gerhard Wickler, he describes the basics of the 8-puzzle (a smaller version of the 15-puzzle) as,

Hideous illustration of an 8-puzzle

The 8-puzzle is a smaller version of the slightly better known 15-puzzle. The puzzle consists of an area divided into a grid, 3 by 3 for the 8-puzzle, 4 by 4 for the 15-puzzle. On each grid square is a tile, expect for one square which remains empty. Thus, there are eight tiles in the 8-puzzle and 15 tiles in the 15-puzzle. A tile that is next to the empty grid square can be moved into the empty space, leaving its previous position empty in turn. Tiles are numbered, 1 thru 8 for the 8-puzzle, so that each tile can be uniquely identified.

The aim of the puzzle is to achieve a given configuration of tiles from a given (different) configuration by sliding the individual tiles around the grid as described above.

This puzzle is not hard to put into a form suitable for state space search, which should provide a solution as long as the algorithm does not expand a state more than once. If the shortest solution is needed, the breadth-first approach is a straightforward choice.

The state space for sliding puzzles is significantly larger than that of the missionaries and cannibals puzzle; there are \(n!\) ways to permute \(n\) tiles, although only half of those are reachable from a given goal state, although the proof is enlightening itself. (The 8-puzzle therefore illustrates one of the hazards of state space manipulation: some states may be unsolvable.) So, the 8-puzzle version has 181,440 possible states, including the empty tile; the 15-puzzle has 10,461,394,944,000.

As a result, a complete search of the state space is likely to be infeasible and any uninformed search will also likely take more resources than strictly desirable. But there are ways to exploit knowledge about the problem to shorten the search.

Suppose, for example, you can find out, given a state, approximately how many moves are needed to get from the state to a goal? Then, you could select the next state to explore based on your estimate of how close it is to the solution. Fortunately, both of these are possible: there are a number of heuristics that can provide the necessary estimate, and our friend, the priority queue, can handle the selection of the next state to expand.

One possible heuristic is the number of out of place tiles, because that is obviously going to be related to the number of moves necessary to get the tiles into their places. Importantly, for reasons that I don't intend to go into here, it always underestimates the number of moves. If the search explores the state space, focusing on the states with fewer and fewer tiles out of place, the search is likely to find the goal while significantly reducing the resources required.

To implement the states for the 15-puzzle, I chose to represent the positions of the tiles as locations in an ArrayList of integers, with the tile values being 1 through 15 with 0 as the empty tile. (For expediency, I also record the location of the empty tile in the array.) As a result, determining whether a state is the goal is simple. For locations 0 to 14, whenever the value of the location is one greater than the location number, the puzzle is solved.

  @Override
  public boolean isGoal()
  {
    // The final square should be 0 but I don't care.
    for (int i = 0; i < BOARD - 1; ++i)
    {
      if (board.get(i) != i + 1) { return false; }
    }
    return true;
  }

Expanding a state a bit more complicated since it requires attempting to move the empty space left, right, up, and down. (Or, alternately, move the tile to the left into the empty space. Depends on how you look at these things.) One of the directional methods is shown below; the other three are similar.

  @Override
  public List<FState> expand()
  {
    List<FState> result = new ArrayList<FState>();
    for (FState newState : Arrays.asList(this.left(), this.right(), this.up(), this.down()))
    {
      if (newState != null) { result.add(newState); }
    }
    return result;
  }

[...]

  private FState left()
  {
    if (empty % ROW != 0)
    {
      // Empty location is not at left edge
      return new FState(this, empty - 1);
    }
    else
    {
      return null;
    }
  }

(I'll leave it to you to figure out how modular arithmatic lets me fiddle with a rectangular grid in a one-dimensional array.)

Implementing the out-of-position heuristic is similar to checking isGoal, with the complication of counting the number of tiles out of place.

  public int outOfPosition()
  {
    int out = 0;
    for (int i = 0; i < BOARD - 1; ++i)
    {
      if (board.get(i) != i + 1)
      {
        ++out;
      }
    }
    return out;
  }

And finally, to allow the heuristic function to interface with a priority queue, I need a Comparator subclass.

  public static class OutOfPosition implements Comparator<FState>
  {
    @Override
    public int compare(FState o1, FState o2)
    {
      return o1.outOfPosition() - o2.outOfPosition();
    }
  }

The following code runs the search on a "harder" problem (harder than the one in the illustration above, anyway).

    System.out.println("Heuristic search: out of position");
    Queue priority = new PriorityQueue(100, new FState.OutOfPosition());
    Search search = new Search(queue, new HashSet(), FState.harder());
    FState end = search.findGoal();

This search exampines 4387 states to find a goal with a path length of 255.

There are certainly many possible heuristics. An alternative heuristic for the 15-puzzle is the total distance for each tile in a state between its position and its goal position. Specifically, the Manhattan distance, which is sum of the number of rows and columns each tile is out of position. The total of the Manhattan distance is computed by the totalDistance method. The advantage of this heuristic is that it more closely estimates the number of moves necessary to solve a puzzle from a given state.

  public int totalDistance()
  {
    int out = 0;
    for (int i = 0; i < BOARD; ++i)
    {
      if (i == empty) continue;
      int position = board.get(i) - 1;
      out += Math.abs((i % ROW) - (position % ROW));
      out += Math.abs((i / ROW) - (position / ROW));
    }
    return out;
  }

In this case, rather than implementing Comparator, I made the FState class Comparable

  @Override
  public int compareTo(FState o)
  {
    return this.totalDistance() - o.totalDistance();
  }

The driver program creates a PriorityQueue with no arguments to use the Comparable.

    System.out.println("Heuristic search: manhattan distance");
    Queue priority = new PriorityQueue();
    Search search = new Search(queue, new HashSet(), FState.harder());
    FState end = search.findGoal();

This version only examines 1469 states to find a solution of 205 moves.

For comparison, according to Wikipedia, "For the 15-puzzle, lengths of optimal solutions range from 0 to 80 moves". The difference is that the raw heuristic search that I am writing about here behaves similarly to depth-first search in that it makes no claim to find an optimal path. On the other hand, the raw heuristic search does find a solution quickly, much faster than a breadth-first search.

You can get the code on github .

Sunday, January 15, 2012

State space search: the basics

A while back, briefly discussing Steve McConnell's Code Complete, I wrote, "On the other hand, McConnell previously presented as a good use of recursion a maze-solving function; maze solving is an application of state-space search, which has a simple, clear, and wildly flexible iterative implementation based on a priority queue." Between that, and the Stanford AI course (which only briefly mentioned most of what I spent a semester learning back in the late '80's), I have been pondering state space search lately. It really is an elegant algorithm. So, I thought I'd spend some spare time writing an elaborate and likely unnecessary post or three about it.

I'll begin with a state. A state is a particular configuration of the system in question. For example, a board position in chess is a state. When you place the pieces on the board, you create an initial state; after some number of moves, the board is in some other state. When the game is over, the board is in a terminal state; if you won, it's in a goal state, but the difference is the topic of some other post. For the moment, I will only be talking about systems where the only terminal states are goals.

A state space is the set of all possible states for a system. For chess, the state space is all possible board positions, and there are a lot of them. Chess has a fairly large state space. The states in a state space are related by the valid actions of the system; given a state those actions produce a collection of the next states reachable by one operation of the system. For chess, the actions are the valid moves of individual chess pieces and, because the initial state is defined by the rules of chess, the state space includes all board positions reachable from that initial state by legal chess moves.

State space search is the technique of searching for a goal state, beginning with an initial state and using the operations available in each state to generate new states. Or, alternatively, beginning with a goal state and searching for a suitable initial state. Or possibly even starting with an initial state and a goal state and searching for a way to connect the two in the state space. Each of those approaches is suitable for some problems, depending on the nature of the problem. The key point is to explore the state space of the problem, searching for a solution.

State space search is one of the most powerful techniques in computer science. If a problem can be framed in a way suitable to state space search (and most can), then the algorithm is guaranteed to find a solution. Theoretically, anyway. If you have a problem and do not have any more direct algorithm, it is the go-to way to solve it. Or, at least it was; there are other techniques that have become popular, usually for very good reasons.

On the other hand, state space search has some serious weaknesses. For one thing, not all problems can be usefully framed in terms of a state space. It requires that the problem be both discrete and deterministic; discrete in such a way that all states are distinct from one another and deterministic so that an action produces exactly one subsequent state from an input state. But that is not the worst part. State space search is the poster child for exponential algorithms. I said it was guaranteed to find a solution, but I carefully didn't say you would be alive to see it. Or, anything would be alive to see it. But there are plenty of problems to which state space search is applicable.

The algorithm

So, with all that buildup, the algorithm is a typical let-down:

def search(queue):
    while not empty?(queue):
        current_state := remove_element(queue)
        if is_goal?(current_state):
            return current_state
        for next_state in expand(current_state):
            if not seen?(next_state):
                mark_seen(next_state)
                add(queue, next_state)
    return nil

The intelligence of the algorithm is actually in the:

  • Queue that is used to store states before they are examined and expanded and which determines the nature of the search,
  • The rules that determine whether a state is a goal and how the state is expanded into subsequent states, and
  • The set used to determine whether a state has been seen before; this and the queue will determine how much memory the algorithm uses.

How does this look in Java?

  public T findGoal()
  {
    while (!queue.isEmpty())
    {
      statesExamined++;

      T current = queue.remove();
      if (current.isGoal())
      {
        return current;
      }
      for (T newState : current.expand())
      {
        if (!seen.contains(newState))
        {
          seen.add(newState);
          queue.add(newState);
        }
      }
    }
    return null;
  }

statesExamined is an accounting addition, just used for entertainment purposes. The rest of the class similarly simple, initializing seen and queue based on constructor arguments. Like I said, nothing up my sleeve....

public class Search<T extends State<T>>
{
  private int statesExamined;
  private final Queue<T> queue;
  private final Set<T> seen;

  public Search(Queue<T> queue, Set<T> seen, T initialState)
  {
    this.queue = queue;
    queue.add(initialState);
    this.seen = seen;
    this.statesExamined = 0;
  }

  [...findGoal...]

  public int statesExamined()
  {
    return statesExamined;
  }
}

The Set seen is used to prevent states from being expanded repeatedly. It records when a state is first seen and subsequently prevents it from being re-added to the queue. Without this, the algorithm could enter an infinite loop if the state space is not acyclic.

The State interface is not much more interesting. The only required methods are to expand the state, producing a list of subsequent states produced by valid actions, and a predicate that is true if the state is a goal.

public interface State
{
  public List expand();

  public boolean isGoal();
}
If the state space search code is so simple, how is the algorithm capable of such power? How about an example?

Missionaries and cannibals: rivercrossing

The problem is, according to Gerhard Wickler,

On one bank of a river are three missionaries and three cannibals. There is one boat available that can hold up to two people and that they would like to use to cross the river. If the cannibals ever outnumber the missionaries on either of the river’s banks, the missionaries will get eaten.

How can the boat be used to safely carry all the missionaries and cannibals across the river?

(I have seen versions where the threat is the missionaries converting the cannibals, rather than the cannibals cannibalizing (ahem) the missionaries. I am a traditionalist, however.)

Gerhard Wickler also has an excellent illustration of the complete state space for this problem. The initial state is on the left; the goal state is on the right. With 16 states, the state space is tiny in comparison with the space of other problems, but it gets my point across.

Each state is a configuration of the missionaries and cannibals on each side of the river as well as which shore the boat is on. The process of crossing the river is abstracted away; each state transition involves the boat, with at least one passenger, crossing the river.

Specifically, a state in the puzzle is made up of the participants on this side of the river, the participants on that side of the river, and the location of the boat, this side or that side.

  private static enum BoatSide { THIS, THAT };

On either side of the river are some number of missionaries and cannibals. I abstracted out the state of each side of the river as a class. There are a couple of special features of RiverSide, though.

  • First, the class includes methods missionaries, cannibals, and both which returns a new RiverSide instance with the number of missionaries or cannibals modified. These are used when generating new states while expanding a given state.
  • Second, the class implements equals and hashCode, making instances eligible to be stored in a HashMap.
  private static class RiverSide
  {
    public final int missionaries;
    public final int cannibals;
    
    public RiverSide(int missionaries, int cannibals)
    {
      this.missionaries = missionaries;
      this.cannibals = cannibals;
    }
    
    public RiverSide missionaries(int n)
    {
      return new RiverSide(missionaries + n, cannibals);
    }

    public RiverSide cannibals(int n)
    {
      return new RiverSide(missionaries, cannibals + n);
    }

    public RiverSide both(int n)
    {
      return new RiverSide(missionaries + n, cannibals + n);
    }
    
    @Override
    public boolean equals(Object o)
    {
      if (! (o instanceof RiverSide)) return false;
      final RiverSide other = (RiverSide) o;
      return this.missionaries == other.missionaries
          && this.cannibals == other.cannibals;
    }
    
    @Override
    public int hashCode()
    {
      return missionaries ^ cannibals;
    }
    
    @Override
    public String toString()
    {
      return String.format("[%d missionaries, %d cannibals]", missionaries, cannibals);
    }
  }

There are two constructors for the state class: the first, public, one creates the initial state for the problem, with three missionaries, three cannibals, and the boat on this side of the river. The second, private, one creates a new MCState based on a previous one and updated inhabitants of this and that side. The previous state is recorded so that the program can reconstruct the sequences of moves in a solution.

The methods required for State are next. isGoal is implemented by checking how many missionaries and cannibals are on that side of the river. expand is more complex, but breaks down into a number of branches; if the boat is on this side of the river, then expand generates all possible ways for it to get to the other side of the river, carrying one or two missionaries or cannibals; if the boat is on the other side, it does the reverse. Specifically, it collects a list of subsequent states; if at least one missionary is on this side, then a possible next state has the boat and one more missionary on that side. Likewise, if there are two missionaries on this side with the boat, another next state would be two less missionaries on this side and two more and the boat on that. The same logic applies to the cannibals and to a missionary and a cannibal moving to the other side. All of these branches go through queueNewState, which first checks whether the new state is invalid if it presents the opportunity for some cannibals to devour some missionaries. As a result of the branches, each state has five possible next states, although some (or most) of those states are invalid.

MCState also implements hashCode and equals, making instances eligible for HashSet storage.

public class MCState implements State<MCState>
{
  [...]
  
  public final BoatSide boatSide;
  public final RiverSide thisSide;
  public final RiverSide thatSide;
  public final MCState previous;
  
  public MCState()
  {
    boatSide = BoatSide.THIS;
    thisSide = new RiverSide(3, 3);
    thatSide = new RiverSide(0, 0);
    previous = null;
  }
  
  private MCState(MCState previous, RiverSide thisSide, RiverSide thatSide)
  {
    this.boatSide = previous.boatSide == BoatSide.THIS ? BoatSide.THAT : BoatSide.THIS;
    this.thisSide = thisSide;
    this.thatSide = thatSide;
    this.previous = previous;
  }
  
  @Override
  public boolean isGoal()
  {
    return thatSide.missionaries == 3 && thatSide.cannibals == 3;
  }

  @Override
  public List<MCState> expand()
  {
    List<MCState> newStates = new ArrayList<MCState>();
    if (boatSide == BoatSide.THIS)
    {
      if (thisSide.missionaries > 0)
      {
        final MCState newState =
            new MCState(this, thisSide.missionaries(-1), thatSide.missionaries(1));
        queueNewState(newStates, newState);
      }
      if (thisSide.missionaries > 1)
      {
        final MCState newState =
            new MCState(this, thisSide.missionaries(-2), thatSide.missionaries(2));
        queueNewState(newStates, newState);
      }
      if (thisSide.cannibals > 0)
      {
        final MCState newState =
            new MCState(this, thisSide.cannibals(-1), thatSide.cannibals(1));
        queueNewState(newStates, newState);
      }
      if (thisSide.cannibals > 1)
      {
        final MCState newState =
            new MCState(this, thisSide.cannibals(-2), thatSide.cannibals(2));
        queueNewState(newStates, newState);
      }
      if (thisSide.missionaries > 0 && thisSide.cannibals > 0)
      {
        final MCState newState =
            new MCState(this, thisSide.both(-1), thatSide.both(1));
        queueNewState(newStates, newState);
      }
    }
    else
    {
      if (thatSide.missionaries > 0)
      {
        final MCState newState =
            new MCState(this,thisSide.missionaries(1), thatSide.missionaries(-1));
        queueNewState(newStates, newState);
      }
      if (thatSide.missionaries > 1)
      {
        final MCState newState =
            new MCState(this,thisSide.missionaries(2), thatSide.missionaries(-2));
        queueNewState(newStates, newState);
      }
      if (thatSide.cannibals > 0)
      {
        final MCState newState =
            new MCState(this,thisSide.cannibals(1), thatSide.cannibals(-1));
        queueNewState(newStates, newState);
      }
      if (thatSide.cannibals > 1)
      {
        final MCState newState =
            new MCState(this,thisSide.cannibals(2), thatSide.cannibals(-2));
        queueNewState(newStates, newState);
      }
      if (thatSide.missionaries > 0 && thatSide.cannibals > 0)
      {
        final MCState newState =
            new MCState(this,thisSide.both(1), thatSide.both(-1));
        queueNewState(newStates, newState);
      }
    }
    return newStates;
  }
  
  private void queueNewState(final List newStates, final MCState newState)
  {
    if (newState.isValid())
    {
      newStates.add(newState);
    }
  }
  
  private boolean isValid()
  {
    return (thisSide.missionaries == 0 || thisSide.missionaries >= thisSide.cannibals)
        && (thatSide.missionaries == 0 || thatSide.missionaries >= thatSide.cannibals);
  }

  @Override
  public String toString()
  {
    final String format = (boatSide == BoatSide.THIS) ? "%s * -> %s" : "%s -> * %s";
    return String.format(format, thisSide, thatSide);
  }
  
  @Override
  public boolean equals(final Object o)
  {
    if (! (o instanceof MCState)) return false;
    final MCState other = (MCState) o;
    return this.thisSide.equals(other.thisSide)
        && this.thatSide.equals(other.thatSide)
        && this.boatSide == other.boatSide;
  }
  
  @Override
  public int hashCode()
  {
    return thisSide.hashCode() ^ thatSide.hashCode() ^ boatSide.hashCode();
  }
}

That is a fair amount of code, and nicely tedious. The key points, though, are the implementation of the State interface; the State knows whether or not it is a goal, and it knows how to generate the next states, following immediate, single step actions.

After that big wall o' code, you are probably waiting for a punch line. Unfortunately, there isn't one. Just a driver class invoking the search twice with a couple of different queue parameters. The first option uses ArrayDeque, which implements a First-In, First-Out queue.

    Queue fifo = new ArrayDeque();
    Search search = new Search(fifo, new HashSet(), new MCState());
    MCState end = search.findGoal();

The second option uses an anonymous subclass of ArrayDequeue where remove calls removeLast rather than removeFirst. As a result, it implements a Last-In, First-Out queue, or a stack.

    Queue lifo = new ArrayDeque() {

      @Override
      public MCState remove()
      {
        return super.removeLast();
      }
      
    };
    Search search = new Search(lifo, new HashSet(), new MCState());
    MCState end = search.findGoal();

These two options are my first demonstration of the flexibility of the state space search algorithm. Using a FIFO queue searches the state space breadth-first, visiting all of the states at a given distance from the initial state before any states at a greater distance. Using a LIFO queue searches the state space depth-first, visiting all of the states following a given state before visiting any other states. The difference in this case is modest; a FIFO queue visits 16 states while a LIFO queue visits 13 to find a solution. But that is just the tip of the iceberg.

Update: completely forgot about this. If you want to try it yourself, there is a related problem involving a farmer crossing a river. The farmer is carrying a bushel of corn, a duck, and a fox. (Yeah, I don't really get it either.) If the duck is left unattended with the corn, the corn gets eaten. If the fox is left unattended with the duck, the duck gets eaten. The boat will only carry the farmer and one of the other items. How can the farmer get to the other side of the river with the corn, duck, and fox intact? You can get the state space search code, such as it is, and you will need to write a State for the problem similar to MCState and a driver program.

For the continuing story of state space search, check out the next post, State space search: heuristics.

You can get the code on github. I owe a considerable debt to Gerhard Wickler for his problem definition and illustration. He has a Java library of documented search code that you might want to check out as well.

Monday, December 12, 2011

Mad science, abstract data types, and objects

Mad science, and mad scientists, are always good to find. They represent guaranteed entertainment. But they do show up in the oddest places. How about this one: OOPSLA 2009, William Cook presents a talk entitled "No one listened! But I'll show them, I'll show them all!". And the weird part was, nobody noticed. Well, sure, it's actually a popular paper that gets referenced frequently and the title is really "On Understanding Data Abstraction, Revisited", but there's no real sign that the peasants went scurrying for their pitchforks and torches.

William Cook

So, what's the deal with Cook and his paper?

A little backstory

In 1990, Cook published "Object-Oriented Programming Versus Abstract Data Types" which is almost identical in content to, but somewhat longer than, "On Understanding Data Abstraction, Revisited". In both papers Cook describes the long-established, but little known, difference between objects and abstract data types. (For simplicity, the object-oriented side is limited to a subset of what I learned to call "object-based programming"; objects, as you might normally think of them, minus inheritance and also without modifications.)

Given those limitations, objects are data abstractions that "have a collection of methods, or procedures, that share access to private local state." Consider the type specimen of object-oriented programming: Smalltalk. In Smalltalk, objects have internal fields that contain the object's state, but this state is entirely unavailable to the outside world. Instead the object has a collection of messages that it responds to, exposing information potentially based on its internal state as necessary. Once an object is created, it is entirely encapsulated, isolated from any other code than that implementing its behavior. Because the interface of the data abstraction is the methods exposed by the object, Cook identifies objects as procedural data abstractions. The key feature of an object is that its interface is defined by its constructor: the collection of methods the object responds to and the semantics of those methods are defined by the construction of the object. In a sense, those methods are associated with the specific value or instance.

Abstract data types, on the other hand, have "a public name, a hidden representation, and operations to create, combine, and observe values of the abstraction." The classic examples of them are the primitive types in Java, C++, or most other common languages. A value has a defined type name, say "double", some representation (Do you really know, or care, what a double actually looks like? (Note: You should. Floating point numbers are a notoriously leaky abstraction.)), and operations like reading one or converting one from some other representation, adding or subtracting two of them, and converting one to another representation, such as a string of characters. The key feature of an ADT is that its interface is defined by a collection of operations that apply to a specific type.

Constructors
Observations[]::
empty?truefalse
headerrorvalue
Let's say we create a matrix of operations on a data abstraction. My data abstraction is a list; it has two constructors, the Haskell-like "[]", or nil, building an empty list, and "::" or "cons", representing a constructor that takes a new element and an existing list and produces a new, combined list with the new element at the head. The list also has two "observation" operations, empty? which returns true for an empty list and false otherwise, and head, which produces some kind of error on an empty list, but returns the first value in the list otherwise. (This example is taken directly from "OOP Versus ADT".)

Constructors
Observations[]::
empty?truefalse
headerrorvalue

Object-oriented

Suppose I want to create an object-oriented version of this data abstraction. I would divide up the operations by constructor, so that I had an implementation of empty? and head for an empty list and empty? and head for a cons cell. The details of each operation is divided among the objects created by each specific constructor. Adding a new constructor is easy, since it simply requires providing implementations of the observations on the new object. On the other hand, adding a new observation is difficult, since it requires modifying all of the existing constructors. This should sound familiar to anyone who has written code in almost any object oriented language.

Constructors
Observations[]::
empty?truefalse
headerrorvalue

ADT

On the other hand, suppose I want an ADT version of this data abstraction. In this case, I would divide the operations by observation, so that I had an implementation of empty? and head that each covered all of the constructors. The details of each constructor is spread among the observation implementations. Adding new observation is easy; it just has to handle the individual cases of the constructors. Adding a new constructor is hard, though, because every observation needs to be modified. This option should be familiar to anyone who has used ML or Haskell, since it is how their algebraic data types work.

There are some less obvious consequences to these two styles of data abstraction. For one thing, the two approaches are almost duals; the advantages of one are the weaknesses of the other, and vice-versa. Both approaches do, in fact, provide data abstraction; objects by encapsulating the state of the data instance behind procedural abstractions, ADTs by encapsulating that state behind a type boundary. Both also describe (if you squint a bit) common data abstraction patterns. On the other hand, each approach has some peculiarities: because an object only has access to its own state, complex operations require weakening the encapsulation of objects. Cook calls "an operation 'complex' if it inspects multiple representations," like for example equality testing; I may not want to expose all of the implementation information necessary to compare two instances in the methods that an object responds to, yet I have no other option in order to implement some observations. Cook claims that object oriented, procedural abstractions are possible in dynamically typed languages but ADTs are not; "abstract data types depend upon a static type system to enforce type abstraction. It is not an accident that dynamic languages use objects instead of user-defined abstract data types. Dynamic languages typically support built-in abstract data types for primitive types; the type abstraction here is enforced by the runtime system."

I do not doubt the validity of this deconstruction of objects and abstract data types, if for no other reason than that it exactly covers Philip Wadler's "expression problem". Also, both ideas seem pretty productive, which is always the primary test of science on the hoof. If that was all there was to it, there would be no mad science involved.

The mad science part

Mad Science Tip #1: If you have to repeat the same message over 20 years, and no one seems to be listening to you, they're the crazy ones. They're the ones who don't see.

Mad Science Tip #2: A beautiful theory, one that explains everything, is obviously true. Even if reality doesn't seem to particularly agree. The ADT half of Cook's papers seems to be relatively uncontroversial, perhaps because no one actually uses the languages (like ML and Haskell) that have user-defined ADTs as the primary data abstraction scheme. On the other hand, Cook's minor point about the inability to use ADTs in dynamically typed languages doesn't seem particularly, well, true; most Lisps, for example, allow you to define structures (or write macros to build such structures) that have representations hidden from any functions other than a limited group. Such pseudo-ADTs may not be as encapsulated as Cook would like, but coupled with a common coding style rule, say "Don't do stupid things," they work well enough. Objects, likewise, do not actually seem to exist. It's been too long since I last used Smalltalk, but every other "object-oriented" language allows a method in a class to access the internals of more than one object of the same class, neatly avoiding (some cases of) the "complex" operation issue. (To avoid all of the difficulties presented by complex operations requires something like multimethods or other techniques that are even more rare.) Maybe some of the object systems for Scheme satisfy the requirements for objects, but Scheme doesn't have a standard anything, much less an object system; it is a toolkit for implementing other languages. Further, Cook's ideas on actually doing object-oriented programming in Java would likely horrify most Java-istas, even those that love interfaces as much as I do.

Mad Science Tip #3: You get extra points for hosting dinner parties where you torment your rivals.

What is the relationship between objects and abstract data types (ADTs)? I have asked this question to many groups of computer scientists over the last 20 years. I usually ask it at dinner, or over drinks. The typical response is a variant of “objects are a kind of abstract data type”.
Whereupon he goes all maniacal on them.

Last and possibly least, Mad Science Tip #4: Being a mad scientist is fun; you get to meet interesting people, there's nothing like the je ne sais quoi of carving your name in the moon with a giant laser. (For that matter, giant lasers are just plain cool no matter what you do with them.) But it may have detrimental effects on your social life.

Most groups eventually work through the differences between objects and ADTs, but I can tell they walk away feeling uneasy, as if some familiar signposts now point in different directions.
...or perhaps as if they're just glad to get away with all the right number of appendages.

Now, I'm not saying William Cook is going to destroy the world if Java 8 doesn't get objects right. But if you see him being followed by a giant tongue monster or psychopathic, hyper-intelligent gerbils, don't say I didn't warn you.

Friday, October 28, 2011

Oracle JDBC API changes: things that make me sigh

For reasons that escape me at the moment (something about support from Oracle, I think), we are having to upgrade the Oracle JDBC driver from 10.2.0.4 to 11.2.0.3. Ideally, this would be a simple, transparent change. But it isn't.

A co-worker quickly found an issue with the driver's handling of java.sql.Date and java.sql.Timestamp. Specifically, the previous driver would map SQL DATE columns to java.sql.Date, while the current driver maps SQL DATE columns to java.sql.Timestamp. The details are described by the Oracle JDBC faq, What is going on with DATE and TIMESTAMP? (Note: that entry describes the previous, 10.2.0.4 behavior as the "problem".)

The bottom line is that if you use resultSet.getObject(i) then for a DATE column you will now get a java.sql.Timestamp object where you would previously get a java.sql.Date object. Also, if you use code like

      switch (metaData.getColumnType(i))
      {
      case Types.DATE:
        map.put(name, resultSet.getDate(i));
        break;
      case Types.TIMESTAMP:
        map.put(name, resultSet.getTimestamp(i));
        break;
      ...
You will also get a java.sql.Timestamp instead of a java.sql.date, since the java.sql.Types mapping has changed as well.

Now, I don't know whether or not this is the Right Thing To Do. I don't actually care either way. But the change, even to fix a problem, just makes me want to sob quietly.

And the Java standard library's API is just as wacky as ever.

Monday, July 4, 2011

Snakes! On a Yeti!

Continuing on my Lisp kick, I have been reading Programming Clojure, by Stuart Halloway. But that fact is a red herring for the moment; I have also been looking at a language called Yeti. In order to take a pass at Yeti, I was casually trying to find a good sample program, which I finally found in Programming Clojure: the Snake game from section 6.6.

Yeti is a ML-style programming language for the JVM, featuring Hindley-Milner type inference with structurally-typed records and open variant types similar to O'Caml. It also provides reasonable access to Java and compiles to JVM bytecode. Yeti is similar to Scala, and andrewcooke posted what I believe is a good comparison on Hacker News; the key point is, "scala is baroque; yeti minimal."

As described by Halloway, "The snake game features a player-controlled snake that moves around a game grid hunting for an apple. When your snake eats an apple, it grows longer by a segment, and a new apple appears. If your snake reaches a certain length, you win. But if your snake crosses over its own body, you lose." The implementation is simple Swing and Halloway lists a number of other implementations for comparison:
Bear in mind that this is literally my second Yeti program, following hello-world. I have fiddled with ML a bit and O'Caml a lot (for a while, it was my official favorite programming language). I have also done rather a lot of Java lately, but only a small amount of Swing. Comments and suggestions are more than welcome.

Given that, this is my version of Snake, snake.yeti.

import java.util.Random;
import java.awt: Color, Dimension, Graphics;
import java.awt.event: ActionEvent, KeyEvent, ActionListener, KeyListener;
import javax.swing: JFrame, JPanel, JOptionPane, Timer;

The program begins with a stack of imports, for the Java classes needed later. Each class can be imported separately or multiple classes from the same package can be listed together.

A functional model


width = 75;
height = 50;
pointSize = 10;
turnMillis = 75;
winLength = 5;
dirs =
  [ KeyEvent#VK_LEFT :  {x = -1, y = 0},
    KeyEvent#VK_RIGHT : {x = 1,  y = 0},
    KeyEvent#VK_UP :    {x = 0,  y = -1},
    KeyEvent#VK_DOWN :  {x = 0,  y = 1} ];

These variables (in the mathematical sense) describe the width and height of the game grid, the conversion factor between the grid and pixels, the length of a game time step in milliseconds, and the number of segments required to win. (5 is a "boringly small number suitable for testing.") dirs is a map from numbers, given by static constants in the Java class KeyEvent, to points, as represented by a record or structure with x and y fields. dirs will be used to map key press events to changes in direction for the snake.

Using the Yeti REPL, I captured Yeti's inference of dirs' type:
dirs is hash<number, {`x is number, `y is number}> = [38:{x=0, y=-1},...elided...]
Similar to Java's Map<K,V>, dirs is a map from number (Yeti's only numeric type, as far as I know), representing the key event values, to structurally-typed x,y pairs where each field is also a number.

Because any key event will wind up looking in dirs, it is convenient to set a default for unrecognized key event values using setHashDefault. The first argument to setHashDefault is the map to be updated, while the second is an anonymous function (using Yeti's do notation) which ignores the argument (which is the unrecognized key) and always returns a null direction. (This will have the effect of pausing the game, if your snake has not eaten any apples. If your snake has eaten apples, you will lose immediately. But at least it doesn't throw any exceptions. Don't like it? Submit a bug report.)
setHashDefault dirs do _: {x = 0, y = 0} done;
See below for more about functions and do.

When I originally wrote this, I needed a couple of list utility functions, which I knew might be part of the standard library (the only documentation I had was a cached copy of the Yeti tutorial), or better expressed another way. The lists I am working with are the segments of the snake, which will not be terribly long, so I figured it did not matter much.
// butLast lst =
//     if (length lst) <= 1 then [] else (head lst) :: (butLast (tail lst)) fi;

// member? v lst =
//     if empty? lst then false
//     elif v == (head lst) then true
//     else member? v (tail lst)
//     fi;
butLast copies a list, excluding the last element. member? (and yes, Yeti allows the question mark in function names) tests whether the value v is an element of the list. Both written written using basic recursion, although Yeti is reputed to support better tail call optimization than Scala. Also, at the time, I could not figure out how to do pattern-matching destructuring on lists; instead, I used head, tail, length, and empty?.

Fortunately, Madis pointed out
  1. that the standard library does contain member?, known as "contains", and
  2. that Yeti does do pattern-matching destructuring on lists, very cleanly.
butLast lst = case lst of
    [_] : [];
    h :: t : h :: butLast t;
    _ : [];
    esac;
The underscore '_' is an anonymous variable; a placeholder that requires a value in the destructuring, but which cannot be referenced later. In the first case the underscore indicates that the lst should have exactly one element; the third that anything is acceptable. See below for more on case and destructuring.

The basic form of a function definition in Yeti is
var = do args... : expressions... done
Since the form of the bare lambda expression in Yeti is
do args... : expressions... done
the basic form is just assigning a function expression to a variable. However, because function definitions are fairly common in Yeti code, there is a shorthand syntax illustrated by butLast:
fcn args... = expressions...

Another simple function is addPoints, which adds two x,y points together, producing another point.
addPoints m n = {x = m.x + n.x, y = m.y + n.y};
The type of addPoints is
addPoints is {.x is number, .y is number}
 -> {.x is number, .y is number}
 -> {x is number, y is number} = <snake$addPoints>
The type indicates that addPoints takes two structures containing (at least) numeric fields x and y and returns another structure with fields x and y. Because Yeti uses structural typing, the structure representing points is actually anonymous; any structure containing suitable x and y fields could be added by addPoints.

Because Yeti supports the ML-style definition of new operators, addPoints could have been written as
// (.+.) m n = {x = m.x + n.x, y = m.y + n.y};
The resulting operator could be used like:
> {x=4,y=3} .+. {x=1,y=1}
{x=5, y=4} is {x is number, y is number}
I chose instead to remain consistent with Halloway's code.

pointToScreenRec converts a game board point to a rectangle in screen coordinates for the display functions.
pointToScreenRec pt =
  { x = (pt.x * pointSize),
    y = (pt.y * pointSize),
    w = pointSize,
    h = pointSize };
Once again, this function just builds and returns a new structure, with x, y, width, and height fields. The fields in the argument are accessed pretty traditionally, by a period followed by the field name.

Yet another pseudo-constant is the random number generator used to place apples on the game board.
r = new Random();
r's value will be an instance of java.util.Random, illustrating some of Yeti's Java integration.

The next two functions create apples and snakes, both of which are also structures.
createApple () =
   { location = {x = r#nextInt(width), y = r#nextInt(height)},
     color = new Color(210,50,90) };

createSnake () =
  { body = [{x = 1, y = 1}],
    dir = {x = 1, y = 0},
    color = new Color(15,160,70) };
Both also illustrate Java integration: the location of the apple is set randomly, by calling the nextInt(n) method. '#' replaces '.' in this use for Yeti. As well, the color of both apples and snakes is defined using java.awt.Color.

createApple and createSnake take the special Yeti 'unit', (), as an argument. If they were defined without taking something as an argument, they would be evaluated immediately as variables. The unit argument allows them to be defined as functions while simultaneously indicating that they do not take any interesting arguments. () indicates both a type and the single, uninteresting value of that type.

The function turn is simply used to replace the direction the snake is traveling. Originally, I wrote it as:
// turn snake newdir =
//   { dir = newdir,
//     body = snake.body,
//     color = snake.color };

Madis also pointed out that Yeti
  1. has a with syntax to create a new structure based on an existing one, and
  2. has a shortcut for associating variables and fields within a structure.
As a result, turn is:
turn snake dir = snake with {dir};
The value of turn is a copy of the structure snake, with modifications. The form "{dir}" is equivalent to "dir = dir", meaning that the dir field in the new structure is given the value of the dir parameter.

On the other hand, move is slightly more complex.
move snake grow =
  ( newhead = addPoints (head snake.body) snake.dir;
    newbody = if grow then snake.body else (butLast snake.body) fi;
    snake with {body = newhead :: newbody} );
newhead is computed to be the location of the snake's head plus the current direction. newbody is the either current body of the snake (if the snake is growing a new segment) or the current body without the terminal segment---if the snake is growing, it does so by extending at the head; if the snake is simply moving, a new head segment is added at one end while a tail segment is removed.

Since the move function uses two local variables, newhead and newbody, and thus multiple expressions, the expressions in the function's body are enclosed in parentheses.

This part of the program completes with a number of predicates used to determine if the player has won, lost, or if the snake can eat an apple.
win? snake = length (snake.body) >= winLength;

headOverlapsBody? body = member? (head body) (tail body);

lose? snake = headOverlapsBody? snake.body;

eats? snake apple = (head snake.body) == apple.location;

Mutable state


The state of the game is kept in two mutable variables, snake and apple.
var snake = createSnake ();
var apple = createApple ();
The addition of the 'var' to the definition allows the two variables to be modified by assignment.

There are three functions which update the state of the game.
resetGame () =
  ( snake := createSnake ();
    apple := createApple () );

updateDirection newdir = snake := turn snake newdir;

updatePositions () =
    if eats? snake apple
    then apple := createApple ();
         snake := move snake true;
    else snake := move snake false
    fi;
resetGame returns the game to something like it's initial state (remember, apple placement is random). updateDirection is a simple wrapper for turn, replacing the value of snake with a new snake traveling in the new direction. Finally, updatePositions tests whether the snake and apple are co-located; if they are, a new apple is created and the snake moves and grows a new segment; if they are not, the snake simply moves.

resetGame and updatePositions also take the Yeti unit value to prevent the expression from being evaluated immediately. All three functions return unit:

resetGame is () -> () = <snake$resetGame>
updateDirection is {`x is number, `y is number} -> () = <snake$updateDirection>
updatePositions is () -> () = <snake$updatePositions>
Since () indicates both a type and the single value of that type, these typings indicate that these three functions do not return a useful value; any effect they have must be done via side effects, in this case by updating variables.

GUI


In comparison with the mutable state, the GUI is a bit more complex and fancy, due to its integration with Java Swing.

My first step is to define a type name, point, for the structure containing fields x and y.
typedef point = { x is number, y is number };
The reason I did it is the next function, fillPoint.
fillPoint g pt c is ~Graphics -> point -> ~Color -> () =
   ( {x,y,w,h} = pointToScreenRec pt;
     g#setColor(c);
     g#fillRect(x,y,w,h) );
fillPoint is the first function I have used that needs a type specification. It takes an argument, g, which is a Graphics context (marked with a '~' to indicate it is a raw Java class instead of a Yeti type), a point structure, and a Java Color. This function needs the type declaration because Yeti's type inference is unable to do anything with g, since the only thing I do is call the methods setColor and fillRect. fillPoint is also the first function in which I have used a destructuring bind, which picks apart the structure returned by pointToScreenRec into variables x, y, w, and h. These variables are then used to provide the necessary arguments to Graphics#fillRect. That statement also makes use of the "{x}" == "{x=x}" equivalence shortcut.

I used the point type definition to clean up the fillPoint type declaration. Because it acts as a simple alias, the type of fillPoint reported by the REPL is
fillPoint is ~java.awt.Graphics -> {`x is number, `y is number} -> ~java.awt.Color -> () = <snake$fillPoint>

Next, I have a bug. I left it in the code because it is instructive.
paintApple g a = fillPoint g apple.location apple.color;
paintApple draws an apple using fillPoint, but it should take graphics context and the apple as an argument. Instead, it is using the apple variable. I noticed the type of paintApple is
paintApple is ~java.awt.Graphics -> 'a -> () = <snake$paintApple>
The interesting part is the 'a; that is a raw type variable. Type variables appear with parametric polymorphism, as in Java's List<t> or the equivalent Yeti's list<'a>. However, in this case it indicates something seriously wrong with paintApple since a type variable can take any type. Specifially in this code, it means that the parameter a is never used.

paintSnake does not have the bug.
paintSnake g s =
    for s.body do segment: fillPoint g segment s.color done;
Its type is
paintSnake is ~java.awt.Graphics
 -> {.body is list?<{`x is number, `y is number}>, .color is ~java.awt.Color}
 -> () = <snake$paintSnake>

paintSnake also shows a for loop, the use of the function for to iterate through the segments of the snake, calling the anonymous function do...done to draw each segment.

At this point things get a little hairy. To display the game board and interact with the player, I need to create a subclass of JPanel and an implementation of ActionListener and KeyListener. That is, I need GamePanel.

class GamePanel(JFrame frame) extends JPanel, ActionListener, KeyListener
    void paintComponent(Graphics g)
        super#paintComponent(g);
        g#clearRect(0,0,(width+1) * pointSize,(height+1)*pointSize);
        paintSnake g snake;
        paintApple g apple,

    void actionPerformed(ActionEvent e)
        updatePositions ();
        if lose? snake
        then resetGame();
             JOptionPane#showMessageDialog(frame,"You lose!")
        elif win? snake
        then resetGame();
             JOptionPane#showMessageDialog(frame,"You win!")
        fi;
        this#repaint(),

    void keyPressed(KeyEvent e)
        updateDirection dirs.[e#getKeyCode()],

    Dimension getPreferredSize()
        new Dimension((width + 1) * pointSize,
                      (height + 1) * pointSize),

    void keyReleased(KeyEvent e),

    void keyTyped(KeyEvent e)
end;

GamePanel is a Java class defined in Yeti code. The methods it implements are paintComponent (JPanel), actionPerformed (ActionListener), keyPressed (KeyListener), getPreferredSize (JPanel), keyReleased (KeyListener), and keyTyped (KeyListener).

paintComponent clears the game board's window and draws the snake and the apple. It also shows how to make a call to a superclass' method, the call to the superclass' paintComponent.

actionPerformed is called based on a timer; it updates positions and tests whether the win or loss conditions are satisfied and completes by updating the game board.

keyPressed is used to change the direction of the snake, based on arrow-key input. getPreferredSize returns a Dimension containing the on-screen size of the game board. keyReleased and keyTyped are ignored and do nothing.

An instance of GamePanel is created using the gamePanel function, which accepts a JFrame as the constructor argument.
gamePanel frame = new GamePanel(frame);
Like Scala, Yeti uses an implicit constructor scheme, where the arguments to the constructor are stored as members of the instance. Unlike Scala, bizarre do-what-I-mean semantics do not seem to be involved.

Finally, the game function itself is relatively simple imperative code:
game () =
  ( frame = new JFrame("Snake");
    panel = gamePanel frame;
    timer = new Timer(turnMillis,panel);
    panel#setFocusable(true);
    panel#addKeyListener(panel);
    frame#add(panel);
    frame#pack();
    frame#setVisible(true);
    timer#start() );

Endgame


To play the game, fire up the Yeti REPL, load the module, and invoke game:
$ java -jar yeti.jar
Yeti REPL.

> load snake
...elided...
> game ()

Miscellanea


The remaining part of Yeti which I explored, but did not ultimately use in the snake game, is open variant types. To use these, the record returned by createApple could be tagged with 'Apple':
// createApple () = Apple {
//              location = {x = r#nextInt(width),
//                          y = r#nextInt(height)},
//              color = new Color(210,50,90),
//              };
Using that, I could define a single function, paint, to replace paintApple and paintSnake:
// paint g object =
//     case object of
//         Snake s : for s.body do segment: fillPoint g segment s.color done;
//         Apple a : fillPoint g a.location a.color;
//     esac;
The type of paint is:
paint is
    ~java.awt.Graphics
 ->   Apple {.color is ~java.awt.Color, .location is {`x is number, `y is number}}
    | Snake {.body is list?<{`x is number, `y is number}>, .color is ~java.awt.Color}
 -> () = <snake$paint>
The second argument of paint is either an appropriate structure tagged with Apple or an appropriate structure tagged with Snake; the tag determines which branch is expected.

Conclusion


I like Yeti. It seems a pretty nice representative of the ML family on the JVM. The integration with Java seems to be a little clunky, but not irritatingly so.


Madis made several excellent suggestions about this posting (and code changes to yeti). Thanks!

Sunday, May 22, 2011

De re profanae

According to Wikipedia, Alfréd Rényi, rather than Paul Erdős, said, "a mathematician is a device for turning coffee into theorems." My personal corollary to that is, "a programmer is a device for turning coffee into profanity." In the same vein, some other wit also said, "Profanity is the one language all programmers know best." There are many reasons why invective and informatics go hand-in-hand. I personally do not have a problem with that, though for most of my career I have been the one in the office who cursed the least. My sole reason was that when I do, I want people to pay attention. At the moment, however, I seem to be one of the fouler folks around. This is somewhat unsettling, but it makes for a quieter workplace anyway.

I do not have a problem with profanity, but I do have a problem with euphemisms. Specifically, the term "f-bomb", which seems to be more common than I had realized. A while back, I ran across a story in the Chronicle of Higher Education involving a student at Hinds Community College who received 4/5 of a suspension after uttering the word "fuck" after class. What I found upsetting was not the story (Who expects maturity from one of those diploma mills?), but the response. The title of the story was entertaining, but several of the commentors seemed to be using the term "f-bomb" with a straight face. Most of the Chronicle's readers are teachers and administrators at institutes of higher education; I expect them to know better.

I am afraid I have news for some. "Fuck" is a word. It is not a bomb. It will not explode. It cannot kill anyone. If you, like those Chronicle readers, are dealing with people potentially in military service who are likely to face real explosive devices, it is a bit silly to dance around a word as if it were dangerous.

To bring this back to programming a bit, I am not a really big fan of profanity in code. Again, it does not bother me, but seeing a comment such as "This in fucking stupid!" is less useful to me than seeing a description why it is stupid and letting me exercise my own vocabulary. On the other hand, if you cannot mention "brainfuck" without turning red, I am going to have to suggest that you need to grow up a bit. (I hesitate to mention Scunthorpe, North Lincolnshire, England.) And if you interrupt me while I am battling Java, that will not be Yosemite Sam I am quoting.

This post brought to you by the geniuses behind Java, who have shown their creativity by introducing every possible interface mistake, misdesign, and gratuitous stupidity known to mankind.

Monday, May 9, 2011

Tony Morris on static types

Tony's blog has an interesting discussion of API design and static typing that includes one of the best reasons for very strongly typed interfaces:
Now, why these rules? Well, because if you can achieve the goal of enforcing these rules, then the next phone call that I get in L3 support from an upset client, I can be guaranteed that one of the following are true:
  • I have a bug in my code, in which case, the sooner the call, the better! …unless of course, fixing the bug results in only more bugs — but we have avoided that possibility — hopefully you can see why.
  • The client has misused the API and circumvented the type system. The client has used null, thrown an exception or performed a side-effect within the API or perhaps even used such things as Java reflection or even type-casting or type-casing. Unfortunately, in some environments, there’s not much I can do about enforcing that except impose a de facto rule where you assume non-existence of these possibilities (Just don’t do that!). Hopefully you haven’t yet jumped to the conclusion that any of these things are necessary or even useful — they aren’t.
  • The client is simply mistaken about the merits of their complaint.

The only better reason I have heard of is the one I use: I would rather think hard for an hour or so to come up with an interface I would have a hard time screwing up than spend the next weeks or months trying to remember what is and what is not legitimate. In short, I'm lazy.

What are the rules Morris is talking about? He has an advanced programming course assignment for
students to write an API for the game tic-tac-toe. No need for the computer to tactically play tic-tac-toe — just an API that you can use to model the game itself. You can use any programming language you like, however, I think you will find certain environments to be lacking in the available tools, so I will guide you so that you’re not off somewhere “shaving yaks” so to speak.
The rules of the assignment are:
  • If you write a function, I must be able to call it with the same arguments and always get the same results, forever.
  • If I, as a client of your API, call one of your functions, I should always get a sensible result. Not null or an exception or other backdoors that cause the death of millions of kittens worldwide.
  • If I call move on a tic-tac-toe board, but the game has finished, I should get a compile-time type-error. In other words, calling move on inappropriate game states (i.e. move doesn’t make sense) is disallowed by the types.
  • If I call takeMoveBack on a tic-tac-toe board, but no moves have yet been made, I get a compile-time type-error.
  • If I call whoWonOrDraw on a tic-tac-toe board, but the game hasn’t yet finished, I get a compile-time type-error.
  • I should be able to call various functions on a game board that is in any state of play e.g. isPositionOccupied works for in-play and completed games.
  • It is not possible to play out of turn.

The first rule is a typical virtue of functional programming: referential transparency; a piece of code which violates that rule unnecessarily is much harder to understand than code that obeys it. The second rule is somewhat less typical, but still a well-known virtue: the function should be total. A total function always returns a meaningful value, as opposed to a partial function which is not well-defined on some part of its domain; in other words, if the code calling it compiles without error, a total function always returns a useful value, while a partial function may throw an exception or return a bad value for some arguments.

The third, fourth, fifth, and seventh rules are the interesting ones. (The sixth just says that there may be operations that the other rules don't apply to.) These require encoding a model of the valid operations in the game into the types of the API.

First, a little background. The type of a computational artifact (a variable or function, for example) is actually a predicate in a formal system defined by the type system of the programming language. (Go read up on the Curry-Howard correspondence, if you want. I'll still be here.) Some type systems are strong enough to make very strong, and very useful, statements about the program being written. In this case, Morris wants the API to specify limits on its interface to prevent invalid operations in the game of tic-tac-toe.

Take the third rule: the move(board, position, player) operation should be invalid on a board that is in a finished-game state, for example. In other words, if you define "write" to be limited to code that compiles (and in particular, typechecks), then it should not be possible to write code that attempts to make a move on a finished board. Or, the move operation is governed by a predicate that prevents it from being applied to a finished board. Or, a board is governed by a predicate that prevents it from having a move applied to it.

Now, writing the API and the predicates to satisfy these rules is not exactly easy. But it is possible in at least some programming languages. Specifically, Morris mentions Haskell, Scala, C# and Java. (Morris' API for Java uses Functional Java, if you are interested.) There are a fair number of approaches to doing this in various languages, some of which can be euphemistically called "open research areas"; I would like to focus on a few of the less euphemistic.

I found Morris' post from a link in Vasil Remeniuk's solution in Scala that uses phantom types. I may come back in a bit to see how this looks in Java.

If I am not completely confused, Morris' API uses higher-order functions to convince Java's type system to make the correct promises. Rather than have move(board, position, player) return a new board (to drastically oversimplify what I think Morris has done), move effectively takes three additional objects, each of which is a functor (in C++ lingo) or "command object" (in Patternish), an object that encapsulates a function along with necessary data; that is, a half-assed implementation of a closure, or a first-class functional value. These three function-objects handle three possible cases: the move's position is already occupied, the game is already over, or the move is valid and the game should continue. This approach is uncommon in Java, and certainly looks hideous. But, it works.

The third approach I want to mention actually involves weakening Morris' rules, in order to get an API that is somewhat less uncomfortable. Consider what happens if you make a valid move in tic-tac-toe: the resulting board may represent the end of the game, or it may allow the game to continue. Morris' API specifies that client code has to handle this bifurcation (Hi, Del!) by passing in two different function-objects for the two different cases, because Java's options are limited. (As I understand Morris, anyway.) The alternative I have in mind is to allow a certain minimal bit of non-total functionality, by having the move operation return either an in-progress board or a final board, where the "either" is given by an Either algebraic type:
public abstract class Either<S,T> {
  private static class Left<S,T> extends Either<S,T> {
    private final S s;
    private Left(S s) { this.s = s; }
    @Override public S isLeft()  { return s; }
    @Override public T isRight() { return null; }
  }

  private static class Right<S,T> extends Either<S,T> {
    private final T t;
    private Right(T t) { this.t = t; }
    @Override public S isLeft()  { return null; }
    @Override public T isRight() { return t; }
  }

  public abstract S isLeft();
  public abstract T isRight();

  public static <S,T> Either<S,T> left(S s)  { return new Left<S,T>(s); }
  public static <S,T> Either<S,T> right(T t) { return new Right<S,T>(t); }
}
This class is used something like:

    Either<Integer,String> is1 = Either.left(4);
    Either<Integer,String> is2 = Either.right("four");
    
    System.out.println(is1.isLeft() + " " + is1.isRight());
    System.out.println(is2.isLeft() + " " + is2.isRight());
where the output is:

4 null
null four
Notice that Either encodes a type-level choice; a value of Either contains a value of one of two other types. In the tic-tac-toe case, move can return Either<InProgressBoard,FinishedBoard> and the client code can pick out the right board, at which point it can type-safely take the next step. The non-totality enters because calling isRight on a Left value returns null; client code using that value cannot do anything unsound, but it also will not get a compile-time error. This approach provides an API that more closely resembles normal Java but weakens Morris' rule three to something like "If I call move on a tic-tac-toe board, but the game has finished, I should get a compile-time type-error unless I have done something god-awful stupid. In other words, I am required to check the result and actively be idiotic to get a run-time error." Which is bad, don't get me wrong, but may not be as bad as trying to explain Functional Java to the numbskulls soiling your API by trying to use it.

Morris would disagree, as witnessed by his comments quoted above and by the following, from another post:
Some people like to think “correctness” includes the thoughts of one or more persons in order to make the assessment. For example, one might proclaim, “sure you have a proof of program termination, but that is not the program that I asked for!” I think this is a poor use of the term “correctness” and I am not considering it any further here.
As a bit of a formalist myself, I can certainly feel the attraction to that definition of correctness, but it still leaves one in the unpleasant position of having a rather strange, technical meaning for a commonly-used word; somewhat like the difference between true and formally proven. For myself, I have seen too many comments to the effect of "if we proved everything correct, there wouldn't be any bugs" (for example, see David Harel's Algorithmics) or "[system X] has [huge percentage Y] devoted to handling situations that shouldn't occur" (see David Gries' The Science of Programming); even with software that is correct according to Morris, my to-do list would likely not be much shorter and TCP would still have to send retransmissions.

In any case, Morris has given a really nice challenge as well as an excellent description of why the challenge is worthwhile, and I too look forward to the day when I can be assured that that phone call likely represents genuine bug.

[Edit] Amazingly, a quick-and-dirty translation of Remeniuk's Scala code to Java seems to demonstrate the properties I was looking for. (This code was put together very quickly and does not attempt to satisfy all of Morris' rules. Nor does it meet all of the guarantees made by Remeniuk's code. But it's not bad.)
public class Phantom
{
  static abstract class Mark { }
  static class Nought extends Mark { }
  static class Cross extends Mark { }
  
  /* ... */
    
  static abstract class GameState { }

  static class NotFinished extends GameState { }
  static class Started extends NotFinished { }
  static class NotStarted extends NotFinished { }
  
  static class Finished extends GameState { }
  static class Failed extends Finished { }
  static class WinState extends Finished { }
  static class DrawState extends Finished { }

  static class Position {
    public final int x, y;
    Position(int x, int y) { this.x = x; this.y = y; }
  }
  
  static final class NextTurn<Last extends Mark,Next extends Mark> {
    // Type witnesses: values used only to force specific types.
    public final NextTurn<Nought,Cross> crossesTurn = new NextTurn<Nought, Cross>();
    public final NextTurn<Cross,Nought> noughtsTurn = new NextTurn<Cross, Nought>();
  }
  
  // S: Board's position in the game: NotStarted, Started, etc.
  // M: Last mark to be played.
  // Next: Next mark to be played.
  static class Board<S extends GameState, M extends Mark> {
    <Next extends Mark, S extends Started>
      Either<Board<Finished,Next>,Board<Started,Next>>
      move(Position position, NextTurn<M,Next> nextTurn, S isNotFinished) { /*...*/ }
  }
One thing that differs from the Scala code is that I have had to use explicit type witness values; in Scala I believe they are implicit. Also, the Scala code uses the same function argument approach as Morris', by calling Either.fmap.

This code would be very easy to circumvent, but it does demonstrate some of the desired features:
  public static void main(String[] args) {
    new Board<NotStarted,Nought>()
    .move(new Position(1,1), NextTurn.crossesTurn, new Started())
    .isRight() // The resulting board will not be Finished.
    .move(new Position(1,2), NextTurn.crossesTurn, new Started());
  }
The second move() call indicates that Cross is taking another turn, and fails to compile with the following error:
$ javac -cp src src/tictactoe/Phantom.java
src/tictactoe/Phantom.java:87:
 <Next,S>move(tictactoe.Phantom.Position,
              tictactoe.Phantom.NextTurn<tictactoe.Phantom.Cross,Next>,
              S) in 
    tictactoe.Phantom.Board<tictactoe.Phantom.Started,tictactoe.Phantom.Cross>
  cannot be applied to
    (tictactoe.Phantom.Position,
     tictactoe.Phantom.NextTurn<tictactoe.Phantom.Nought,tictactoe.Phantom.Cross>,
     tictactoe.Phantom.Started)
      .isRight().move(new Position(1,2), new NextTurn<Nought, Cross>(), new Started());
                ^
  1 error

Wednesday, April 13, 2011

SQL Combinators in Java

Combinators are another technique common in functional programming which seems to be significantly less common in object-oriented languages. This is unfortunate, because combinators work very well in an object-oriented design. What I am going to present below is a very simple use of them that hopefully displays some of their features. Other common examples include parsers and various embedded domain specific languages.

Now, I will admit that I am a crazed neo-Luddite. I am proud to say that I hate persistent objects, believe remote procedure calls are evil, and regard most of the enterprise development infrastructure and practices with a considerable amount of disdain. But the bottom line is that I do not use Hibernate, JPA, or any of the other object/relational/database/mapper hoo-has. They, in the immortal words of Del, purport to make something easier that was not difficult in the first place. In doing so, they stack another layer of stuff on top of the actual solution; witness the fun when combining OSGi modularity with Hibernate.

Instead, I am perfectly happy writing SQL statements in my cheerfully modular code, performing just the operations that I need when I need them, and enjoying the control that I get. However, there are times when that is not enough, when plain SQL statements will not do the job. One such case that has come up several times in my career is a semi-general, ad hoc query based on a HTTP request.

For example, consider a web form that displays rows from a database table. This web form submits a number of parameters, and based on those parameters the server-side code formulates a query that returns the appropriate rows, which are then returned to the client. Now, there are two ways to formulate a SQL query: parameterized and non-parameterized. The latter involves building a text string including the SQL and the values to be selected; the values must be appropriately sanitized to avoid SQL injection. I prefer the parameterized option. The thing about the parameterized option is that it needs two passes: one to build the SQL, "SELECT * FROM table WHERE whosit = ?" and one to inject "Fred" to match ? number 1 (remembering the order that the ?'s came in, too). This is where the combinators enter the picture. These combinators provide a way of building the where-clause of a SQL statement in a semi-general way.

To demonstrate, this code selects all of the rows with NAMEs equal to "Doe".
Predicate p = Predicate.stringEquals("NAME", "Doe");
statement = connection.prepareStatement("SELECT * FROM people WHERE " + p.toSql());
p.parameterize(statement);
ResultSet rs = statement.executeQuery();

The following example selects all of the rows with NAMEs equal to "Doe" and ADDRESSes LIKE "Main Street".
Predicate p = Predicate.and();
p.add(Predicate.stringEquals("NAME", "Doe"));
p.add(Predicate.stringLike("ADDRESS", "Main Street"));
statement = connection.prepareStatement("SELECT * FROM people WHERE " + p.toSql());
p.parameterize(statement);
ResultSet rs = statement.executeQuery();

Finally, this code selects all of the rows with NAMES equal to "Doe" and orders the result by "CITY", ascending.
Predicate p = Predicate.orderedByAscending(Predicate.stringEquals("NAME", "Doe"), "CITY");
statement = connection.prepareStatement("SELECT * FROM people WHERE " + p.toSql());
p.parameterize(statement);
ResultSet rs = statement.executeQuery();

The primary abstract interface to a combinator is to evaluate it. (This is one of the reasons they work so well in functional languages.) For my SQL combinator, I wanted a little richer interface:
public abstract class Predicate
{
  /**
   * Return the SQL for the given predicate, including parameter markers (?).
   * @return A String.
   */
  public abstract String toSql();
  
  /**
   * Set the parameters in the {@link PreparedStatement} associated with the parameter markers from the toSQL string.
   * This method must be called after the statement is created from the string and before the query is executed.
   * @param statement {@link PreparedStatement} which should have the values injected into it.
   * @throws SQLException If the parameterization fails.
   */
  public void parameterize(PreparedStatement statement) throws SQLException
  {
    [...]
  }

  /**
   * Test whether the {@link Predicate} represents an empty WHERE clause.
   * 
   * @return True if the predicate represents an empty WHERE clause, false otherwise.
   */
  public boolean isEmpty() { return false; }

[...]

The two key methods are toSql(), which returns the SQL predicate clause, and parameterize(), which accepts a PreparedStatement generated from the SQL and inserts the values. A base-case combinator (ignoring the empty one) is something that compares a value with a column:
  public static class StringEquals extends Predicate
  {
    private final String column;
    private final String value;
    
    public StringEquals(String column, String value)
    {
      this.column = column;
      this.value = value;
    }

    @Override
    public String toSql()
    {
      return String.format("(%s = ?)", column, operator);
    }

    @Override
    protected int setParameter(int index, PreparedStatement statement) throws SQLException
    {
      statement.setString(index, value);
      return index + 1;
    }
  }

The parameterize() method of the Predicate calls setParameter() for the Predicate structure; each setParameter returns the next index value into the SQL statement.
  [...]
  public void parameterize(PreparedStatement statement) throws SQLException
  {
    this.setParameter(1, statement);
  }
  [...]

By ensuring that the order of the ?'s in the SQL is the same as the order that setParameter() is called, the values are associated with the appropriate clauses.

Another combinator class is used to search for timestamps appearing on a given day:
  public static class OnDate extends ColumnComparison
  {
    private final String column;
    private final Date date;
    
    protected OnDate(String column, Date date)
    {
      this.column = column;
      this.date = date;
    }

    @Override
    public String toSql()
    {
      return String.format("(%s BETWEEN ? AND ?)", column);
    }

    @Override
    protected int setParameter(int index, PreparedStatement statement) throws SQLException
    {
      final Calendar calendar = Calendar.getInstance();
      calendar.setTime(date);
      statement.setDate(index, new Date(calendar.getTimeInMillis()));
      calendar.roll(Calendar.DATE, true);
      statement.setDate(index + 1, new Date(calendar.getTimeInMillis()));
      return index + 2;
    }
  }

The interesting part of combinators (and the ad hoc query) is the ability to generate a higher-level Predicate from one or more more-basic predicates. In this case, that is the purpose of And and Or:
  private static class And extends Aggregate { public And() { super(" AND "); } }
  private static class Or  extends Aggregate { public Or() { super(" OR "); } }
which are based on a general Aggregate:
  public abstract static class Aggregate extends Predicate implements Collection
  {
    protected final String join;
    protected final List subPredicates = new ArrayList();
    
    protected Aggregate(String join)
    {
      this.join = join;
    }
    
    @Override
    public String toSql()
    {
      List sqls = new ArrayList(subPredicates.size());
      // Some hot mapping action here.
      for (Predicate sub : subPredicates)
      {
        sqls.add(sub.toSql());
      }
      return StringUtils.join(sqls, join);
    }
    
    @Override
    protected int setParameter(int index, PreparedStatement statement) throws SQLException
    {
      for (Predicate sub : subPredicates)
      {
        index = sub.setParameter(index, statement);
      }
      return index;
    }
  [...]
}

The final piece is the capability of ordering the results of the query.
  public static class Ordered extends Predicate
  {
    private final String sortColumn;
    private final Predicate predicate;
    private final boolean ascending;
    
    protected Ordered(Predicate predicate, boolean ascending, String column)
    {
      this.predicate = predicate;
      this.sortColumn = column;
      this.ascending = ascending;
    }
    
    @Override
    protected int setParameter(int index, PreparedStatement statement) throws SQLException
    {
      return predicate.setParameter(index, statement);
    }

    @Override
    public String toSql()
    {
      // Default: ASC
      final String asc = ascending ? "" : " DESC";
      return predicate.toSql() + " ORDER BY " + sortColumn + asc;
    }
  }
There is a small problem here; once an Ordered has been wrapped around a Predicate, the result is no longer really a Predicate; it admits the same interface, but an Ordered instance cannot legitimately be added to an And or Or collection. I will leave the fix as an exercise for the reader.

To make using Predicates easier, I included a number of static methods that can be used to create individual instances.
  • and() returns an And aggregate which will join its sub-predicates with AND.
  • or() returns an Or aggregate which will join its sub-predicates with OR.
  • stringEquals(String, String) returns a simple StringEquals predicate joining a column with a String via '='; this comparison is case sensitive.
  • stringEqualsIgnoreCase(String, String) returns a simple predicate joining a column with a String via '='; this comparison is case-insensitive.
  • stringLike(String, String) returns a simple predicate joining a column with a SQL pattern expression via 'LIKE'; this comparison is case-insensitive.
  • onDate(String, Date) returns a simple predicate joining a column with a Date; BETWEEN is used to allow any time between DATE and the next day.
  • orderedBy(Predicate, boolean, String) returns an Ordered predicate, ordered as described by the flag and the column names.

The examples I gave are not as syntactically nice as some of the DSLs implementable in Haskell, say. (But in Java, what is?) However, it does allow me to cleanly build up a predicate in stages

Wednesday, March 23, 2011

Quote o' the Day: Agressively stupid

This is from TheServerSide.com's article, Developers still skeptical of Oracle’s stewardship of Java.

James Gosling, considered the father of Java, said during his keynote session at the Java Symposium that it is in Oracle’s own interest to “not be really aggressively stupid.” But he said it’s been clear that Oracle didn’t exactly know what it was getting into with the acquisition of Java.

Not being really aggressively stupid is a problem to which many of us can relate.

Saturday, December 25, 2010

Equational programming

It is Christmas and Dr. McGuire is too busy watching snow fall to come up with a real post, so he is just rehashing a reply to a question on the haskell-beginners mailing list as if it were new. Typical.

Jun HU wrote:
My first language is C, and I've strong intention in learning a pure functional programming language. My very first question is how to think in the functional programming way, anyone has some ideas.

I cannot claim C as my first language, but I do say it is my "natural" language; I have written more code in C than I have in any other, and I've been using it for nigh on 20 years. I also cannot claim to think in the functional programming way, but using O'Caml and Haskell has definitely changed how I program in any language. In fact, I suspect functional programming has had a bigger effect than I had previously considered. But I am definitely not a Haskell guru.

I believe the biggest difference between functional and procedural programming is thinking equationally rather than operationally. With procedural programming as in C (and most object oriented languages, which in my experience are more procedural than not), the tendency is to think operationally: "First the program does this, then that happens, then that,...." Now, back at UT Austin (I grew up in the land of Dijkstra), at the time anyway, they were fond of teaching axiomatic thinking: "At this point in the program text, the state is this; at that point in the text, the state is that...." That is definitely a significant improvement, and I attribute what decent code I have written to that approach, but it definitely is not as natural as either operational or equational thinking.

Thinking operationally (or "pretending to be the computer") is problematic because there are typically an exponential number of paths through a program. As a result, it is impossible to completely understand even middling small program. Further, operational approaches provide no particularly good strategies for developing programs.

If you can get past the initial visceral revulsion at the thought of proving programs correct (Horrors!), thinking axiomatically has significant benefits. On one hand, it reduces the exponential number of paths through the program to a linear number of positions in the program text. For example, if you have a block of program text s which, in a program state p is guaranteed to arrive at another state q, and likewise program text t going from q to r, then you have s;t going from p to r; it doesn't matter what happens in the middle. Further, axiomatic thinking does provide some guidance to writing code: if you think you need a conditional statement, then you want to go from p to some narrow, definite state q; if you need a loop, then you can look for an invariant and a termination condition. Not easy, but there is something

On the other hand, in functional languages (and also when taking functional approaches in a non-functional language), it is very important to think equationally: "This means that; this other thing is that...." This is especially true in Haskell, which is not strict; if you try to think operationally or even axiomatically, you run into difficulties when the state of a non-strict program does not resemble the state of a strict program.

Equational thinking completely does away with the "number of paths" argument. A functional definition is, or should be, small enough to understand directly, based on any underlying definitions. Further, because the state does not change (in operational terms, or equivalently in axiomatic terms, there is only one state), it becomes less important to manipulate the program as a whole. Instead each definition can be dealt with separately.

Now, equational thinking is easy for stuff like

\[\textbf{add_one}\ n = n + 1\]

But it is not immediately apparent how that extends to something that needs recursion, say. The big "Ah-ha!" moment for me was the realization that recursion and mathematical induction are the same thing (ok, it's not a surprise for anyone; the big deal was when I internalized it enough to use the idea to naturally write code). With that, you can deal with anything programmatically equational and recursive as mathematically equational and inductive. The change in thinking is as big as the difference between operational and axiomatic.

Equational thinking extends beyond basic function definitions; I think it is the key to the type system, classes, and most of the other neat language features. At this point, I am still struggling with category theory, although I have been able to make use of monads reasonably elegantly (in my (humble) opinion, of course). I think there is another step there, although I haven't made it.

This is all rather fuzzy and exceptionally difficult to put into words. However, I am willing to change my traditional definition, Programming is applied formal logic, to include at least abstract algebra and category theory at this point, though.