Skip to content
29 March 32011 / Robin Wellner

Writing a Tic-Tac-Toe Minimax

Last Sunday, I decided to write an implementation of tic-tac-toe, complete with computer opponent. I ended up implementing a minimax algorithm, which worked well enough. The relatively small amount of states in tic-tac-toe (3⁹ = 19 683, but this includes unreachable states) made this rather simple.

I pitted two minimax players against each other. Predictably enough, it resulted in a draw. Always the same, since the algorithm is deterministic:

X X O
O O X
X O X

The total running time was consistently just under 3 seconds.

That got me thinking, surely a heuristic could do things faster? So I drew up some heuristic rules the enhanced computer player would use:

  1. If the board is empty, just play the top-left corner.
  2. If you can win, play the winning move.
  3. If your opponent can win, prevent them from winning.
  4. If you can find a nice corner you can use to force a win (a number means it is empty, a period means it doesn’t matter), that is:
    X 2 3
    . . 6
    . . X

    or

    X 2 3
    . 5 .
    X . .

    (possibly rotated), do that move.

  5. If the opponent has above opportunity, prevent them from doing so.

I tried the heuristics, and sure enough, they helped. Drasticly. But only when the heuristic went first (total time: just over 0.2 seconds). Otherwise, the timing didn’t change at all. Not even just a little bit. They did nothing.

After a bit of experimenting — not too much, you can probably tell what’s going on here just by reading the previous paragraph, I just wanted to confirm it — I found out what was going on:

None of the heuristics did anything. Except for the one for the empty field. You see, the further you get in the game, the smaller the number of possible future game states gets. Reduction early on is likely to matter more. Especially if you can throw away 8/9 of the search space.

The nature of minimax did the rest: if you can win in one move, it will already be selected immediately by the algorithm. If your opponent can win in one move, the algorithm will find you lose quickly after choosing anything but that one soon enough. And those fancy heuristics for finding ways to trick the opponent into inevitable failure won’t be needed, because the opponent won’t let you get that far (remember, this is minimax vs minimax).

So I removed all but the first heuristic.

I put it up on GitHub as a Gist. If you scroll up, you can see the old implementation of the heuristic (much longer, a lot less elegant, and not a bit faster than the new one).

What have we learned? I’d say the lesson is that, when relying on a minimax as fall-back, you only need to implement heuristics for the opening moves, which will dramatically decrease the search space.

Leave a comment