Introducing “BlueZebra” a GPL command-line Reversi AI

blueZebra
Today, we’re releasing some Open Source freeware! As part of creating our next game (Reversi for PC, Mac, and Linux on Steam), we wanted to make use of some of the great AI that’s already been written for Reversi over the years.

One of the most respected Reversi AI programs is WZebra by Gunnar Andersson. WZebra hasn’t been updated since around 2005, but it was released under GPL and included a command-line tool for solving the end of a bunch of games (scrZebra) as well as some other analysis tasks. What it didn’t include, was a way to access WZebra’s AI via command-line to easily get the best move for a specific board layout. That’s exactly what BlueZebra is: BlueLine’s customization to allow command-line access to WZebra AI.

Here is a zip of the project: download BlueZebra*. Since it is based on a GPL project, it is itself released under a GPL license. To make changes, use the Visual Studio solution and recompile. We also updated the Makefile to work with modern systems (it was made when everything was 32 bit) and made the Makefile compile on OSX, not just Linux.

Run “blueZebra.exe ?” to get help info for each of the parameters, but to give you an idea of how you can pass everything in & get back a move from the AI, here is example usage:

prompt> blueZebra.exe -cli -b 1 -e 0 -line 2 -scores 0 -depth 24 26 28 -board -----X------X------XOX-----OO-----OXO------X-------------------- -turn O
c6

“c6” is the move that the AI returned for White to play.

*: hashes of the zip file…
md5: af7733545b7bb21aed9399c5f3f08f6d
sha1: 30b7d0379f318c6974460d066944a11fcc3215fe

Adding some randomness to turn-based AI (2 of 2)

We just released a big update to Hive which included some great visual changes as well as a number of AI improvements. One of the interesting AI changes was the addition of a bit more randomness to the AI’s decisions, with minimal impact on the skill of the AI. This is the 2nd part in a 2-post series on the topic. If you haven’t read the first post of the series, I recommend that you at least skim it first.

Randomness on all moves

Even with a large number of random openings, it would be possible to run into similar situations down the road that would play out time and time again (Battlestar Galactica style). Since this repetition would lead to predictability but most randomness decreases the skill-level of the AI, we had to strike a balance. In order to make it so that full games diverge from each other significantly, but without weakening the AI too much, I came up with the concept of a “randomness quotient”. I don’t know if this already exists in Game AI or not, but it seems to be a good solution and that was the most appropriate name I could think of for it.

“Randomness Quotient” explained

The “randomness quotient” is a setting that will increase the randomness of the choices as its value decreases. Specifically, if the randomness quotient (which we will represent with the variable “Q”) is 7, then the engine will choose a ply other than the best move about 1/7 times (which is “1/Q” times). Of those instances where the best-scored ply is not chosen, then the second-best move will be chosen 1/Q times. Therefore, the odds of getting to the second best move and still choosing to consider the third-best scored move is ((1 / Q) / Q) which is the same as (1 / (Q ^ [move-number])). To figure out the opposite number (the odds of choosing move number N, rather than the odds that we’ll skip past it) we’d use a numerator of (Q-1). For example, the 8th best move would be chosen once in approximately ((Q-1) / (Q ^ 8)) moves. The more general equation is: ((Q-1) / (Q^N)) where Q is the randomness quotient, and N is the move number (as ranked by the heuristic evaluation method so that move 1 scores the best, move 2 scored the second best, etc.).

Even with a fairly high amount of randomness – such as a Q of 2, remember: low Q means high randomness – we would only see a move as bad as the 8th move in 1 out of every (1/(2^8))= 256 moves.

These occasional less-than-stellar moves are essentially simulating a human player occasionally losing concentration.

In case the math was a bit confusing there, we can show the effects of this algorithm visually. Each bar in the chart is the move-number where the moves are ranked from the best-scoring on the left to the worst-scoring on the right.

randomnessQuotient_2In the first chart, there is a very low randomness quotient of “2” which is NOT recommended in realistic play, but it’s still a great example to visualize how this works. You’ll notice the first move is chosen 1/2 of the time. Since the other 1/2 of the moves are also cut in half, then the second-best move (move index 1) is only chosen in 1/4 of the instances.

randomnessQuotient_7In this second chart, we have a higher randomness quotient of 7 which represents a more realistic setting. This setting doesn’t add a ton of randomness to any given move, but over the length of a normal game, this should introduce enough randomness to steer any similar games apart from each other. As you may have recognized, these charts are an example of exponential decay.

In our code, we used lower Q values to give more randomness to weaker levels of AI to “nerf” them a bit, while giving them more variation at the same time. The higher levels of AI still have a modestly high randomness quotient since we want a little randomness – currently Q is around 8 or 9 – but we may continue to tweak that from experience.

Randomness Quotient pseudocode

This is almost the exact code from our Minimax engine where we implement randomness based on the Randomness Quotient. It’s not that much code!

// At this point, the Minimax engine has scored some plies
// and stored them and their scores in topLevelScoresByPly.
List<KeyValuePair<Ply, double>> plyPairList = topLevelScoresByPly.ToList();

// Sort the plies descending by their scores (best score at index 0).
plyPairList.Sort((x, y) => y.Value.CompareTo(x.Value));

// Apply the RandomnessQuotient so that the best-scored ply is not guaranteed to be chosen, but is still
// exponentially more likely than the second which is exponentially more likely than the third, etc.
int plyIndex = 0;
while ((plyIndex < (plyPairList.Count - 1)) && (0 == randGen.Next((int)RandomnessQuotient)))
{
    plyIndex++;
}
if (plyIndex > 0)
{
    Logger.log(LogLevel.INFO, "RANDOMNESS! Skipping the best (0th) ply and chosing ply index " + plyIndex +" instead.");
}

// Grab the ply that the Randomness Quotient has chosen.
theChosenPly = plyPairList[plyIndex].Key;
chosenPlyScore = plyPairList[plyIndex].Value;

Conclusion

The addition of randomness to the AI engine has greatly increased the replay-value of the AI in Hive. Hopefully this concept can be useful to some other developers as well. This change was released as a free update two days ago, along with a significant batch of visual updates and about a half-dozen other AI improvements that we’ve made over the past week or so. If you have the game, check them out… if you haven’t bought Hive yet, please support us by grabbing your copy on Steam today!

As always, please let us know if you have any feedback about the changes in Hive, or if you have any questions about this post! If you’ve implemented randomness in your own AI in another interesting way, please leave a comment for the other readers – and myself – to learn from.

Thanks!

Adding some randomness to turn-based AI (1 of 2)

We make a habit of asking great Hive players to give us feedback on the AI for our Steam version of Hive . Now that the AI has become pretty formidable, a suggestion we started hearing a few times was that the AI wasn’t random enough. This allowed players to basically memorize outcomes and they found themselves replaying any moves that worked in the past, rather than playing to win.

As I mentioned in my ECGC 2013 talk, an important concern in “real-world”/commercial game AI – as opposed to just getting optimal results in a lab – is to make sure that your AI teaches players to get better at the game rather than how to get better at just beating your AI.

When to use Randomness

In perfect strategy games (those with no luck), a completely random move is highly unlikely to be good. Conversely, even with great AI: if the AI is imperfect, then having little-to-no-randomness makes it easy for players to memorize and exploit any weaknesses. Your first reaction might be to just assume we should make perfect AI. However, for sufficiently-complex games such as Hive and Chess, the possible outcomes are more numerous than the atoms in the universe, so we’re likely to be relegated to imperfect AI forever (or at least until we have decent quantum-computers).

Not only does complete randomness lead to weak decisions, but almost* any randomness generally leads to slightly worse decisions. If you aren’t using randomness, you always chose the best-scoring move. Due to this trade-off of efficacy vs. randomness, I would not recommend adding much randomness to your AI until it is already very good.

Randomness in Openings

Players who played dozens of games in a row against our AI quickly became cognizant of patterns and vulnerabilities in the AI’s openings. Once they discovered a way to make the AI have a weak opening they found themselves constantly replaying those openings and regardless of what happened after, the AI couldn’t recover from that bad of a start.

Since variation in openings was crucial to preventing the AI from getting in a rut, we added a fair amount of randomness.

How it was before:

Originally, we used a weighted probability to figure out which Hive tile to place. The second ply was always placing a Queen, off-center of the first piece (not inline). The third move and beyond were all calculated by our minimax engine.

How it is now:

Hive "C", "I", "L", and "Z" shaped openings.

Hive “C”, “I”, “L”, and “Z” shaped openings.

We use a weighted probability for the first and second pieces. The second piece is still quite likely to be a Queen (it’s a very solid choice) but it won’t always be a Queen. The location of the second placement is now completely random. In Randy Ingersoll’s book, Play Hive Like A Champion, he named the openings based on the shape of the tiles. Normal openings can be laid-out like a “C”, “I”, “L” or “Z” (there are also “F” and “J” openings, which are just rotated versions of the “L”). If a player chooses not to place their queen by the second move, then the opening is called an “X”. The “X” isn’t considered a very good opening, but all of the possible openings can now be done by the slightly random AI; the weightings are just tuned so that it will be pretty rare for the AI to play “X”.

In addition to the decent randomness on the second move, we also expanded it so that the third-move (which is computed using the minimax engine) has a high degree of randomness. 1/2 of the time, the best scoring move will not be chosen, and in 1/2 of the those instances, the second best scoring move won’t be chosen either. The next post will explain more about this type of randomness.

Randomness on all other moves!

This is a little more complicated/mathematical/technical so I’ve split it into another post to avoid tiring you out with this already-long post! It should be posted in the next day or two. edit: it’s posted now.

Conclusion

The addition of randomness to the AI engine has greatly increased the replay-value of the AI in Hive. Hopefully this concept can be useful to some other developers as well. This change is going to be released as a free update in the next couple of days, along with a significant batch of visual updates and about a half-dozen other AI improvements that we’ve made over the past week or so. Keep your eye out for the update on Steam! EDIT: The update has been released!

As always, please let us know if you have any feedback about the changes in Hive, or if you have any questions about this post! If you’ve implemented randomness in your own AI in another interesting way, please leave a comment for the other readers – and myself – to learn from.

Thanks!

*: “Almost” because you can get a small amount of randomness from randomly choosing between any moves that tie for the “best move”, but that’s unlikely to happen very often and will still yield nearly-identical moves in many cases – eg: moving one pawn in Chess instead of its mirror opposite.

Visually debugging a Minimax AI engine

While creating the Steam version of the popular board game Hive, it became clear that normal methods of on-screen debugging weren’t going to be in-depth enough.

Alpha-beta pruning of a small Minimax game tree.

Alpha-beta pruning of a small Minimax game tree.

Most AI for 2-player abstract-strategy games probably uses the tried-and-true Minimax algorithm. It works just how you’d think AI would work: you look at a tree of all of the possible board positions that you could get to (and all positions you get to from there, etc.) and score them. It can get a little more complex on top of that, but the basics are really straightforward.

While the Minimax algorithm is very general-purpose, the scoring function that you use to evaluate each position is game-specific. Debugging that scoring function (called a “heuristic evaluation function” in technical lingo) can be tricky, especially since it is very hard to see an entire game-tree at once. The small tree in the first picture above is used for teaching minimax, but is unrealistically small for real games. As an example, even a very simple game like Tic-Tac-Toe would have 3 possible moves right away (it would be 9 moves, but the board is symmetrical, so there are only 3 actual different moves). Looking at this tic-tac-toe tree, you can see that even this extremely-simple game’s tree grows quite large by the third layer.

Complexity of real Game Trees

Tic-Tac-Toe game tree

First two plies of Tic-Tac-Toe game tree

Let’s put that in perspective compared to other games: The number of moves that can be done per level is called a “branching factor“. In the tic-tac-toe example, the branching factor at the start of the game is 3. After moving, the branching factor is 2, 5, or 5, depending on which move is made. In chess, your first move can be one of 20 possibilities: 8 pawns (each with 2 different moves) and 2 knights which have 2 possible moves each. From there, the possibilities change very quickly. Due to this variability, when discussing games we tend to focus mainly on the average branching factor. The average branching factor of tic-tac-toe is 4, for chess it’s 35, the value for Hive is currently unknown but I’d estimate it between 40 and 50.

The need for a good visualizer

For a game such as Hive, to have the AI look 3 levels deep, you’d have (50^3) = 125,000 nodes on the third level of the tree. Even if we limit the branching factor (which we do in our AI) the number of nodes is quite large. At the time of this writing one of our AI levels limits to a branching factor of 30. So (30^3) = 27,000 nodes. Needless to say, that tree won’t fit on most computer screens, so it would be hard to debug it all at once.

Therefore, we need a different way to visualize what’s going on. When debugging a heuristic evaluation function, it’s important to know the score throughout the tree, how the score worked its way up each level, and it also helps to be able to see a detailed view of how the AI scored the nodes. Keep in mind that only the scores on the bottom level of the tree are actually used to be the final scores of the path to that node. However, when we limit the branching-factor, that involves giving a preliminary score to each node and sorting all of the sibling nodes in any given level of the tree, before traversing to their children. This way, even though many moves may be skipped in a given level of the tree, the odds are high that the most important moves are being evaluated. Due to this pre-scoring, it is helpful to have detailed scoring information on every node in the tree.

In addition to seeing the nodes in a current level, it would be helpful to have pointers to which layers of the tree are maximization or minimization steps so that you don’t have to keep as much information in your head. This way, you can examine any node in the tree and tell where it got its score from (eg: it’s children) and how its score is being used by its parent-node if it has one.

Our solution

Minimax AI Visualizer

Minimax AI Visualizer – click to enlarge

We wanted to be able to run the AI in our game’s debug-mode, then create a log-file and view it easily. After looking around, it seemed that a very simple solution would be to dump some JSON in a format that the Javascript InfoVis Toolkit (JIT) could load, then create a tool to render a “Space Tree” which expands and collapses as needed.

→→ Check out our Minimax AI Visualizer Tool in action. ←←

I started with the SpaceTree demo from JIT and just modified it from there. Features:

  • Loads data from a JSON file by default, but you can paste new JSON into a text-box to create a new tree (or use the text-box modify the existing tree).
  • Colorized to easily show which steps are Maximize steps or Minimize steps
  • Each node says what ply it represents to get to that node, and the final score that node ended up with.
  • The mouse scroll-wheel lets you zoom in and out
  • Hovering over a node brings up a detailed tool-tip bubble with the breakdown of each of the scoring-factors that were used to come up with the pre-scoring for a node – or the actual scoring, in the event that it’s a leaf-node.
  • Hovering over the Root Node brings up a tool-tip bubble with detailed info on the entire run of the AI: how many nodes were evaluated, how long the AI ran, how many total prune events there were, etc..
  • Each time there is a prune event (from the alpha-beta pruning), there will be one node which indicates the entire number of sibling nodes that were pruned at once.

If you want to use the same tool to visualize the progress of your own AI, all you need to do is have your code output JSON in the same format that’s used in the textarea below the graph, then paste it into that textarea and hit the “Load Tree” button.

Outcome

Being able to more quickly track-down some of the weird decisions that the AI was making, let us drastically improve the AI in only a few days of work. The example that’s embedded in the Visualizer is from before most of the changes, but that shouldn’t matter since it’s just showing how it works. We’ve had some very good players helping us debug it, and one of the recent World Champions said that the top level of AI made him force a draw. We’re getting there!

If you want to see the AI in action, check out Hive on Steam.

If you have any questions about the Visualizer or about our AI, let me know in the comments!