Monday, February 21, 2011

In search of DSL for search trees

A year ago, as I was reading Ginsberg's paper on AlphaBeta pruning, I realized Gnubridge did not implement the "deep" pruning, making our implementation quite inefficient. As I set out to add this new feature, I realized the search code was a mess (errr... has incurred substantial amount of technical debt), and put it off until I had more time to restructure the code. As it turns out, I did not have a substantial block of spare time in over a year, and as a result, Gnubridge had no updates at all. Once again, I've been humbled for not following XP's practice of small steps.

Fear not, this weekend I reentered the ring, with a plan. First, I isolated the AlpaBeta pruning logic from the main search class, DoubleDummySolver. Then I attacked the AlphaBeta pruning itself, only to realize that it was a mess as well. Which brings me to the subject of today's post: working out a Domain Specific Language (DSL) for tree manipulation.

In the pruning tests, there's a need to create scenarios, where a tree of many nodes represents choices at any given point in play. To get a better idea, take a look at the following ASCII diagram for a simple case:

/*          root       W
 *           / \
 *  (max:1) 0   1      S
 *             / \
 *   (max:0) 1_0  1_1  E        
 */

At the root, we have West to move, with two choices of cards to play, represented by nodes, 0 and 1. When choice 0 is made (and perhaps following full evaluation of node 0), we find out that West has scored 1 trick (that's the value of the max function at node 0, from standard computer science terminology). If, on the other hand, West makes choice represented by node 1, the next move belongs to his opponent, South. South, in turn also has two choices, and for ease of navigation, I chose to name them node 1_0, and node 1_1. This is to represent node's location in the tree from root: first take choice 1, then 0, arriving at node 1_0. It's probably not relevant, that at 1_0 and 1_1 East is to move, but what's important, is that after fully evaluating node 1_0, we find out that West has scored 0 tricks (value of max function is 0). It's at this stage that we have enough information to make a decision to forego the evaluation of node 1_1, which is the core idea behind Alpha Beta pruning.

As you can see, this diagram is very rich in information, which makes it difficult to represent in code, which in turn makes it hard to write and debug test cases. We need more than a dozen of these examples to describe AlphaBeta, and the diagrams get more complicated for the "deep" pruning scenarios, which is what I was hoping to implement. To get an idea of what the test cases looked like, see the straightforward representation in code of the above diagram:

 public void testOneLevelAlphaPrune() {
  Node root = new Node(null, WEST.getValue());
  Node node_00 = new Node(root, WEST.getValue());
  Node node_0 = new Node(node_00, SOUTH.getValue());
  node_0.setTricksTaken(Player.WEST_EAST, 1);
  Node node_1 = new Node(node_00, SOUTH.getValue());
  Node node_1_0 = new Node(node_1, EAST.getValue());
  @SuppressWarnings("unused")
  Node node_1_1 = new Node(node_1, EAST.getValue());
  node_1_0.setTricksTaken(Player.WEST_EAST, 0);
  
  AlphaBeta ab = new AlphaBeta();
  ab.prune(node_1_0);
  
  assertTrue(node_1_0.isAlphaPruned());
  assertTrue(node_1.isAlphaPruned());
 }

Isn't it perfectly clear that this code builds a tree as per diagram? Ouch... This suffers from the same problem as the example in the previous post, where the solution was to abstract the test code into Beharior Driven style of GIVEN... WHEN... THEN. Alas, it's not so simple in this case, because we need to construct a tree and populate it with information while at the same time exposing all the relevant details in the test.

So after a day of experimenting with various approaches, I settled for the following abstraction:

 public void testOneLevelAlphaPrune1() {
  givenMax(WEST);
  nodeWithPath("0").withTricksForMax(1).withNextTurn(SOUTH);
  nodeWithPath("1").withNextTurn(SOUTH);
  nodeWithPath("1", "0").withNextTurn(EAST).withTricksForMax(0);
  nodeWithPath("1", "1").withNextTurn(EAST);

  whenPruning(nodeWithPath("1", "0"));

  assertTrue(nodeWithPath("1", "0").isAlphaPruned());
  assertTrue(nodeWithPath("1").isAlphaPruned());
 }

I hope that's readable to developers that come after me, and certainly to me a year from now.

The complete code is in the AlphaBetaTest class for now, but it will probably be pulled into some reusable test class eventually.

Monday, February 15, 2010

BDD style tests for bidding

BDD stands for Behavior Driven Development, and it is one of those 2nd generation agile practices that I found difficult to buy into. BDD style tests follow the pattern: "given (some initial conditions), when (some action is taken), then (expect result)." This template seemed like such a no-brainer, that I could not see how it could warrant a separate name and whole movement formed around it. But then TDD itself (Test Driven Development) also is simple to define, and yet strict adherence to it forces revolutionary changes in a developer's mindset. Take a look at the following test written without BDD, which uses the latest evolution of bidding domain classes to demonstrate when to raise partner's major suit:

public void testRaisePartnersMajorDespiteStrongerMinor() {
  auction.bid(ONE_HEARTS);
  auction.bid(PASS);
  BiddingAgent ba = new BiddingAgent(auction, 
    new Hand("K,10,7,6","A,9,8,3", "A,8,6,4,2", ""));
  assertEquals(THREE_HEARTS, ba.getBid());
}

Now note how much easier it is to understand, when code follows the BDD template:

public void testRaisePartnersMajorDespiteStrongerMinor() {
  givenBidding(ONE_HEARTS, PASS);
  andPlayersCards("K,10,7,6", "A,9,8,3", "A,8,6,4,2", "");
  expectPlayerToBid(THREE_HEARTS);
}

The reason for this leap in expressiveness is that there's certain amount of useless ceremony around the domain objects, and that ceremony detracts from clarity of intent. By enforcing a strict BDD style on the tests, it's harder to be distracted by the ceremony. BDD puts focus on how the domain is modeled in human language, not in the code. The code model is decoupled from the spec by methods describing the context: givenBidding(), andPlayersCards(), and expectPlayerToBid(). One payoff of this approach is that hundreds of unit tests do not have to be modified when some aspects of the model are refactored. Another result is a set of test that can be shown to domain experts for verification. Both of these give boost to productivity.

Tuesday, February 2, 2010

Gnubridge vision statement - draft

Gnubridge has picked up a lot of momentum lately, and so I think it is time to define what it is and what it is going to be over the next few years. In many ways Gnubridge is still very immature, and so its version number is modestly kept below 1.0. This vision statement should help narrow down what it's going to take to make a 1.0 release, by giving the principles by which features should be prioritized or outright rejected. Please help shape this document by giving your input.

Gnubridge is a software program to play contract bridge. It aims to be a player's entry into the world of bridge software. This is accomplished through the following cardinal characteristics:
  • Multiplatform deployment. By using Java and staying platform-agnostic, Gnubridge is available anywhere where Java runs.
  • Gnubridge will be a strong opponent and a strong partner. What's under the hood is more important than the user interface, and enhancements in both bidding and deal playing will be given priority.
  • Simplicity through ease of user experience. It should be ready to go out of the box: no configuration, no questions to answer for the users (like Gnuchess). New options and graphic interface features should be introduced with caution, with original behavior being the default on each upgrade. It is better to support fewer features than to confuse a mainstream user with obscure options and buttons. As a result, Gnubridge should not have many intimidating menus characteristic "rich" interface applications, and advanced options should be hidden far away from the user.
  • Simplicity through deployment. Gnubridge should be distributed and run as a single file (unlike Gnuchess). The project will err on the side of including fewer features rather than making installation and management complicated. Also, no platform-specific installation packages will be supported at this point.
  • Frequent, small, releases. If a piece of functionality provides value to players, it will make a separate release. One or more releases a month should be the norm.
  • No bugs. Bugs will be fixed first. Automated tests will be written to expose bugs.

Gnubridge also has another community it serves - the developers. It should give developers a chance to demonstrate their skills, learn from each other, push the state of the art in game playing algorithms and apply agile development methodology on an open source project. Test driven development, collective code ownership, simplicity of design, and refactoring will be practiced on the project. Automated tools will provide additional feedback on the overall state of code. These will include continuous integration, static analysis, and test coverage.

Here is a brief sketch of what the future looks like.

Version 1.0

Target date: April 1, 2010.
Gameplay to implement most of American Standard convention, as covered in Pavlicek's Bridge basics. Support for doubles and redoubles. Duplicate bridge scoring.
User features: save and load a game. When restarting load a last unfinished game. Replay and rebid a game. Reset score. Force a computer to play before allotted time runs out.
Community: rss feed for Gnubridge updates.
Development: save junit results on CI server. Meet the target code coverage metrics.

Version 2.0

Target date: January 1, 2011.
Gameplay to implement American Standard and most other conventions with options. Bidding to include explanations. Double-dummy solver to include partition search. Double-dummy to include explanation of moves (for analysis mode).
User features: analysis mode allowing user to configure cards in a deal, force that certain cards are played, take back a move, and get computer recommendation.
Community: vote on features, easy feedback, easy debugging.
Development: static analysis tools, produce crap4j metrics, produce Emma plugin metrics, migrate to github, migrate CI to runcoderun.

Version 3.0

Target date: 2012
Network play: play with any number of partners against each other or computer.
Search algorithm no longer knows everyones cards.
Gnubridge enters a tournament against other computer programs.

Sunday, January 10, 2010

Frontline story part 2: expressive coding style in Gnubridge

This was my first take at implementing a portion of overcalls rule in bidding. (At the very bottom of this post is English explanation of the overcalls logic being implemented here)[1]

protected Bid prepareBid() {
  Bid result = null;
  if (calculator.getCombinedPoints() < 13) {
    result = getValidBidFor6LongSuitOrDecent5(1);
  } else if (calculator.getCombinedPoints() < 16) {
    result = getValidBidForSuit5AndLonger(1);
    if (result == null) {
      result = getValidBidFor6LongSuitOrGood5(2);
    }
  } else if (calculator.getCombinedPoints() < 19) {
    result = getValidBidForSuit5AndLonger(1);
    if (result == null) {
      result = getValidBidForSuit5AndLonger(2);
    }
  }
  return result;
}
Here is what I ended up with 2 hours later:
protected Bid prepareBid() {
  if (calculator.getCombinedPoints() < 13) {
    return firstValidBid( //
      bidSuit(1, hand.getSuitsWithAtLeastCards(6)), //
      bidSuit(1, hand.getDecent5LengthSuits()));
  } else if (calculator.getCombinedPoints() < 16) {
   return firstValidBid( //
     bidSuit(1, hand.getSuitsWithAtLeastCards(5)), //
     bidSuit(2, hand.getSuitsWithAtLeastCards(6)), //
     bidSuit(2, hand.getGood5LengthSuits()));
  } else if (calculator.getCombinedPoints() < 19) {
    return firstValidBid( //
      bidSuit(1, hand.getSuitsWithAtLeastCards(5)), //
      bidSuit(2, hand.getSuitsWithAtLeastCards(5)));
  }
  return null;
}
Was it worth the time? The first implementation seems pretty straightforward, but one level deeper, looking at the methods getValidBidFor6LongSuitOrDecent5(int x), etc., there was a lot of duplication and murkiness. Duplication of logic and multiple responsibilities can be inferred from method names. Also, I could not make it clear what the parameter passed to the method was (as you can see below, it was bid level). Take a look at the following two methods, and note how murky they are. Method name is the only thing that can explain what is happening, the code itself is doing too much to make it easy to follow.
private Bid getValidBidForSuit5AndLonger(int maxBidLevelAllowed) {
  List suits5AndLonger = hand.getSuitsWithAtLeastCards(5);
  if (suits5AndLonger.size() > 0 &&
      auction.isValid(new Bid(maxBidLevelAllowed, suits5AndLonger.get(0)))) {
    return new Bid(maxBidLevelAllowed, suits5AndLonger.get(0));
  }
  return null;
}

private Bid getValidBidFor6LongSuitOrDecent5(int maxBidLevelAllowed) {
  List goodSuits5AndLonger = hand.getSuitsWithAtLeastCards(6);
  goodSuits5AndLonger.addAll(hand.getDecent5LengthSuits());
  for (Suit suit : goodSuits5AndLonger) {
    if (auction.isValid(new Bid(maxBidLevelAllowed, suit))) {
      return new Bid(maxBidLevelAllowed, suit);
    }
  }
  return null;
}

The two additional methods not listed, followed the pattern in getValidBidFor6LongSuitOrDecent5(). There's logical duplication in what these methods are doing. It's around placing a valid bid based on a set of suits meeting certain criteria.

This code didn't just get slapped together. Up to this point I was being very meticulous, and have already taken several refactoring passes through it. And yet it was smelling.

I decided to tackle the duplication I noticed, to see whether it'd affect the top level method, prepareBid() and if it'd be overall good or bad for clarity. Whenever I saw Or criterion in method name, I split it into two methods. I then combined the methods by extracting the bid placement logic, and passing the selected suits as parameters to the generic method. The generic method ended up looking a lot like constructor of the Bid class: new Bid(1, CLUBS), which made me think I was on the right path. Take a look:

private Bid bidSuit(int bidLevel, Collection suits) {
  for (Suit suit : suits) {
    if (auction.isValid(new Bid(bidLevel, suit))) {
      return new Bid(bidLevel, suit);
    }
  }
  return null;
}
However, the top level method, prepareBid was now looking somewhat messy:
protected Bid prepareBid() {
  Bid result = null;
  if (calculator.getCombinedPoints() < 13) {
    result = bidSuit(1, hand.getSuitsWithAtLeastCards(6));
    if (result == null) {
      result = bidSuit(1, hand.getDecent5LengthSuits());
    }
  } else if (calculator.getCombinedPoints() < 16) {
    result = bidSuit(1, hand.getSuitsWithAtLeastCards(5));
    if (result == null) {
      result = bidSuit(2, hand.getSuitsWithAtLeastCards(6));
      if (result == null) {
        result = bidSuit(2, hand.getGood5LengthSuits());
      }
    }
  } else if (calculator.getCombinedPoints() < 19) {
    result = bidSuit(1, hand.getSuitsWithAtLeastCards(5));
    if (result == null) {
      result = bidSuit(2, hand.getSuitsWithAtLeastCards(5));
    }
  }
  return result;
}
The nested ifs are a smell. So are the null checks. In this case they indicate logic around choosing the first possible match, and then moving on to another. The same smell present in the original version of the method, quoted at the beginning of this post, but refactoring exposed it completely. At this point it was not difficult to wrap this logic into a separate method. And here is, once again the end result:
protected Bid prepareBid() {
  if (calculator.getCombinedPoints() < 13) {
    return firstValidBid( //
      bidSuit(1, hand.getSuitsWithAtLeastCards(6)), //
      bidSuit(1, hand.getDecent5LengthSuits()));
  } else if (calculator.getCombinedPoints() < 16) {
   return firstValidBid( //
     bidSuit(1, hand.getSuitsWithAtLeastCards(5)), //
     bidSuit(2, hand.getSuitsWithAtLeastCards(6)), //
     bidSuit(2, hand.getGood5LengthSuits()));
  } else if (calculator.getCombinedPoints() < 19) {
    return firstValidBid( //
      bidSuit(1, hand.getSuitsWithAtLeastCards(5)), //
      bidSuit(2, hand.getSuitsWithAtLeastCards(5)));
  }
  return null;
}

private Bid bidSuit(int bidLevel, Collection suits) {
  for (Suit suit : suits) {
    if (auction.isValid(new Bid(bidLevel, suit))) {
      return new Bid(bidLevel, suit);
    }
  }
  return null;
}

private Bid firstValidBid(Bid... bids) {
  for (Bid bid : bids) {
    if (bid != null) {
      return bid;
    }
  }
  return null;
}

The top level method is more readable than the version in the beginning of this post and in the process I got rid of nearly 20 lines of duplicated logic. Was it worth the extra 90 minutes that it took? The standard argument brought up is that 90% of code's lifetime is spent in maintenance mode, so we get a large return on effort investment over time. This may seem a bit academic, so let me give a couple of personal practical reasons.

First, I can now go much faster with the next steps. This rule is far from being done, and lumping along 30% duplicated logic as I was implementing more and more cases would make it harder to maintain and more likely that I'd add even more duplicated logic. Having a clean picture right now lets me easier identify when the code gets messy again, and attack it.

Second, I would like to attract more developers to the project. If a researcher wants to popularize his work, he better invest the time to prepare a good seminar to present that work. The first version of the code was pretty solid, and was passing all the tests. But it is somewhat equivalent to taking lab notes to a seminar and improvising in front of the audience, in hopes of conveying the idea. I have tried explaining complicated problems to a lecture hall of 100+ students, and it takes a lot of preparation to even get to the point where just a fraction of the audience can be reached. A good lecture will take a complicated problem, break it down into simple chunks, and convey the material in a language that the audience can understand. While it takes a long time to prepare, it is very rewarding to the speaker and the audience. I think leaving clean code behind is a necessary step in engaging developers with Gnubridge.


[1] For reference, here's domain expert's description of the logic that the above code implements (Pavlicek's Bridge Basics Online, lesson 7):

[...]you may be able to make a suit overcall. This is done by bidding your suit at the cheapest level possible.

A suit overcall at the one level requires 10-18 points (distributional points included). Further, if your strength is toward the minimum range (10-12), you should have a decent five-card suit — at least Q-J-x-x-x — or any six-card suit.

A suit overcall at the two level requires about 13-18 points, and here the suit quality is even more important. If your strength is minimum (13-15), you should have a good five-card suit — at least A-Q-J-x-x or K-Q-10-x-x — or any six-card suit.

Sunday, December 13, 2009

Frontline story from programming bidding conventions

When working on bidding overcalls last night I had one of those moments where I had to pause and take a look back. I was working on implementing the following from Pavlicek's excellent online beginners lessons (lesson 7):

If your right-hand opponent opens the bidding one of a suit and you have a five-card or longer suit (excluding the enemy suit), you may be able to make a suit overcall.

This plain English text turns out to be very information-dense.  The first part of the sentence, "If your right-hand opponent opens the bidding one of a suit," implies that one of the following bidding scenarios took place:
  • ONE-SUIT
  • PASS, ONE-SUIT
  • PASS, PASS, ONE-SUIT
 When trying to get that in code, I ended up quickly in a tangled web of statements like


if (bidCount==1 &&
    getCalls().get(0).getBid().getTrump().isSuit() &&
    getCalls().get(0).getBid().getValue() == 1
   ) {
     return true;
}


... and that was just the simplest case. Ever since I read in Kerevsky's book a story about Ward Cunningham's writing of a date constructor as november(20, 2005)[1], I have been a lot more careful about writing readable code even in obscure places. I rewrote the above a few times, and finally settled for the following:

public boolean mayOvercall() {
  if (bidCount == 1) {
    if (is1Suit(0)) {
      return true;
    }
  } else if (bidCount == 2) {
    if (isPass(0) && is1Suit(1)) {
      return true;
    }
  } else if (bidCount == 3) {
    if (isPass(0) && isPass(1) && is1Suit(2)) {
      return true;
    }
  }
  return false;
}

private boolean isPass(int i) {
  return PASS.equals(calls.get(i).getBid());
}

This is nothing terribly fancy, no domain specific language, just some appropriately named methods. The whole experience took about half an hour, and if you go back to the original specification, it is much clearer now why Gnubridge is still far from supporting Standard American bidding convention. Human language, especially when spoken by an expert, is incredibly dense. This is just an illustration of what has been a common experience to me in translating from user stories to code.

Another realization I had was how much easier it would have been to just go ahead and code it into a convoluted web of nested IFs and method calls. This was a small piece of a small bidding rule, and my time on the project is limited. I'm sure Gnubridge code is littered with this sort of unreadable garbage, and it was just a lucky call that made me rewrite it somewhat clearer. It seems like combating this sort of unreadable code is an uphill battle. I know we have coding standards, static analysis tools, peer review, etc. at our disposal to maintain code quality. And yet, based on this example, the code seems to naturally tend towards a chaotic mess. The expressive and readable code that comes from proper division of responsibilities is just an ephemeral occurrence, an unstable equilibrium about to roll back into chaos soon after attentive eye of a developer focuses away.


[1] Kerevsky, Refactoring to Patterns, chapter 2

Friday, November 27, 2009

What's NOT a good pruning strategy in Bridge - Part 2

In the previous post I have presented a counter-example to the following heuristic for selecting a move:
Among all cards that are impossible to win the current winning card, only the one with smallest rank is selected. [1]
My counter example only applied to the trick to be taken by partner. Last night I let Gnubridge look for counter examples when the trick was being taken by the opponent, and fairly quickly it came back with the following (Hearts are trump):


If East applies the strategy in question, he will play 5H, and arrive at the following position, after North plays 2H on the next trick:


East will take the next trick with 9H, but the last trick will be captured by South.

The correct play to North's JH in the first example would have been 9H, leading to the last two tricks being captured by East/West. Therefore, the above strategy is flawed, and should not be used in reducing the number of positions to analyze by double-dummy solvers.

As I understand this strategy is often taught to beginning bridge players, and it is still quite relevant in single dummy bridge, it's just that it is not always optimal. If a player possesses knowledge of distribution of all the cards remaining in play, he should avoid this mental shortcut in his analysis.

----
[1] Building a Fast Double-Dummy Bridge Solver – Ming-Sheng Chang, Ming-sheng Chang Phd Student - 1996

Sunday, October 25, 2009

What's NOT a good pruning strategy in Bridge - Part 1

In my spare time I look at academic papers for clues to optimize Gnubridge's double-dummy solver. In a frequently quoted 1996 paper, Chang proposes this pruning strategy:

Among all cards that are impossible to win the current winning card, only the one with smallest rank is selected. [1]
I have seen this strategy advocated in one other paper [2]. While it seems intuitive, it is flawed.

I implemented it in Gnubridge, and the stochastic equivalency checker quickly found counter examples. Take a look at the following position. West is to play and the trump is NT.


 If West follows the pruning strategy's advice, he'll play 4H and the pair will eventually lose the last trick when West plays 2D. If West correctly played 8H, the pair would win all tricks, by simply running the remaining hearts on East's hand.

Now the question remains - can this strategy be amended to only apply to cases when the opponent is winning the current trick and his high card cannot be beaten?