Houdini 1.5a Defeats Rybka 4: 23.5-16.5
More on this site, somewhere. Assuming this result is meaningful*, it's not clear if this is good news or bad news for chess fans. Superficially, it's great news, because it costs money to buy Rybka while anyone can freely download Houdini. BUT: There seems to be a lot of cannibalizing going on among engines** (certainly such accusations are widespread), and if it's true that the free programs are ripping off code and concepts from the for-profit engines, it's likely that sooner or later the for-profit people will simply leave. Then the cannibals will have to do their own work, and with no financial incentive or notoriety to inspire them, the field will stagnate. Then it's not only a pity for the legitimate programmers whose work and financial opportunities were stolen, but for the broad chess community as well.
* There are reasons why it may not be so significant: the computers may have used truncated or common books, and a 40-game match, while not trivial, doesn't guarantee that Houdini is the stronger engine. Still, the evidence that's there, of whatever quality, is in Houdini's favor.
** As far as I know, no one has accused Houdini of pirating code from closed-source engines, but its author has acknowledged being influenced by engines that have allegedly done so.
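To put a rough number on the first footnote's caveat about a 40-game match, here is a minimal sketch in Python. It assumes the games are independent and, since the win/draw/loss breakdown isn't given here, uses the worst-case per-game variance p(1 - p), which is what you would get with no draws at all; draws can only narrow the interval. The function name is made up for illustration.

```python
import math

def score_interval(points, games, z=1.96):
    """Approximate 95% interval for the true scoring rate.

    Assumes independent games and uses p * (1 - p) as an upper bound
    on the per-game variance (exact only if there are no draws).
    """
    p = points / games
    se = math.sqrt(p * (1 - p) / games)
    return p - z * se, p + z * se

# Match result from the headline: 23.5 points out of 40 games.
low, high = score_interval(23.5, 40)
print(f"scoring rate 95% interval: {low:.3f} .. {high:.3f}")
print("interval includes an even match (0.5):", low < 0.5 < high)
```

Under these conservative assumptions the interval still contains 0.5, which is the sense in which the match alone doesn't settle the question.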
Reader Comments (12)
One reason I don't think these tournaments show as much as people think is how many games are decided by one engine blundering because it wants to continue the game after 49 moves of shuffling pieces in a drawn position. I suspect we would see a closer match and more draws if the engines were adjusted to have a lower, more accurate "contempt" factor (given that the opponent is another chess engine of similar strength).
Also, having followed the tournament, I can say the computers did use a truncated book; I'm not sure how things were selected, but it was something like 10 or 12 book moves, then real thinking. One thing that does make it fairer, though, is that each engine played both the white and black side of each of the selected openings.
I'm not too worried about chess engines stagnating, since beyond Houdini there are other strong engines that (as far as I know) haven't been accused of any plagiarism, such as Stockfish.
Is it really true the "field would stagnate"? There are plenty of examples of open-source, not-for-profit projects that have proved more popular than paid ones, outgrown them and survived. MySQL, the open-source database purchased by Oracle, grew to be one of the most popular online databases. WordPress beat all comers as the most popular blogging platform, with its developers making money in other creative ways from its popularity.
Whether Houdini is truly the better engine remains to be seen, but the fact that it's free and influenced by commercial programs by no means spells disaster for chess engines and the community as a whole.
Sean
Anything that keeps these beasts from solving chess is good news! :) I'd love to see the programmers move on to other activities and leave my chess alone!
I was more impressed by Houdini's play than by the score. It sacrificed pawns for initiative in almost every game.
"and if it's true that the free programs are ripping off code and concepts from the for-profit engines"
Interestingly, it may be the other way around!
http://www.talkchess.com/forum/viewtopic.php?t=37762
[DM: As mentioned two weeks ago, in this post: http://www.thechessmind.net/blog/2011/1/24/more-computer-chess-controversy.html. But that kind of theft (if theft there be) won't have the negative consequences mentioned in this post.]
@ Sean: There might be a way for engines to keep developing in a meaningful way, but the chess community doesn't seem to have discovered it yet. ChessBase, for instance, used to have absolute dominance with Fritz and Junior, but first Rybka and then quite a few more engines have opened up a pretty healthy gap. ChessBase continues to do fine, not by devoting big resources to engine wars but by becoming a source for instructional DVDs. Even Rybka may be jumping ship a bit, as their new thing is selling time on their 40- and 100-core clusters.
The result confirms what has been established before on various rating lists (IPON, CEGT) and by various matches at long time controls. Houdini 1.5 is at least 50 Elo stronger than Rybka 4 when both are playing on identical hardware.
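For readers who want to relate the 50 Elo figure to the 23.5-16.5 score, here is a small sketch using the standard logistic Elo formula. The function names are made up for illustration, and the conversion ignores the statistical uncertainty of a 40-game sample.

```python
import math

def expected_score(elo_diff):
    """Expected scoring rate for the stronger side under the logistic Elo model."""
    return 1 / (1 + 10 ** (-elo_diff / 400))

def elo_diff_from_score(score):
    """Elo difference implied by a scoring rate strictly between 0 and 1."""
    return 400 * math.log10(score / (1 - score))

games = 40
predicted_points = expected_score(50) * games   # what a 50 Elo edge predicts
implied_diff = elo_diff_from_score(23.5 / games)  # what the match score implies

print(f"A 50 Elo edge predicts about {predicted_points:.1f} points out of {games}")
print(f"23.5/{games} corresponds to about {implied_diff:.0f} Elo")
```

A 50 Elo edge predicts a little under 23 points out of 40, and the actual score corresponds to roughly 61 Elo, so the match result is in line with the rating lists.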
The match also demonstrates that good engine matches are at least as interesting as good human matches. There were some very exciting games with amazing multiple pawn sacrifices.
A final word on the "cannibalizing": inasmuch as no one has accused Houdini of pirating code from closed-source engines (see your footnote), it's a bit sad that 80% of your blog post is devoted to this issue, instead of discussing the much more interesting chess aspects of the match.
[DM: Sorry, I'm not with you on this one. There are interesting games in any event, and unlike tournaments with the top players, many if not most of my readers can set up engine-engine matches any time they like. If a game "wows" me, I'll often mention it, but the "cannibalizing" question is an interesting one. Note too that this is not dispensed with by noting that no one has accused Houdini of directly pirating code, because (a) there's still the larger issue (and if my conjecture is correct, engine-engine games won't progress as quickly and interestingly as you may like) and (b) Houdini may be the "nephew" of pirates.]
Yet another change on the engine throne, yet another disappointment when it comes to endgame evaluation.
[DM: Not disputing this, but maybe you can provide an example?]
When I downloaded Houdini 1.5a, I set up a blitz match against Stockfish 1.9. At the end of one game, this position appeared:
8/5b2/8/3k4/8/5PpP/6P1/6K1 w - - 0 0
I noticed a huge difference in evaluation: Stockfish gave -5.89 and Houdini -0.02, so I took a closer look. This is of course a dead draw. Then I ran Rybka 4 for a few minutes, and she also failed to evaluate it correctly (-86.76 or something like that). I thought: Wooow, is this it? Then I decided to run some more tests, so I set up this trivial draw:
8/3k4/5P2/p2pP1p1/P1pP2Pp/2P4P/5K2/8 w - - 0 0
Now Houdini thinks it's +2.20, but Stockfish needs only a few seconds to reach 0.00.
So why can Houdini solve the first position with such ease but not the other, and vice versa for Stockfish?
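If you'd like to see the two positions above without pasting them into a GUI, here is a minimal Python sketch that expands the piece-placement field of a FEN string into an 8x8 text diagram. It only reads the first field, so the trailing `0 0` counters are left as posted (a strict FEN reader may expect a full-move number of at least 1). The function and constant names are made up for illustration.

```python
def fen_to_rows(fen):
    """Expand the piece-placement field of a FEN string into 8 rows of 8 chars.

    Uppercase letters are White pieces, lowercase are Black, '.' is empty.
    Rows run from rank 8 (top) down to rank 1 (bottom).
    """
    placement = fen.split()[0]
    rows = []
    for rank in placement.split("/"):
        row = ""
        for ch in rank:
            # A digit means that many consecutive empty squares.
            row += "." * int(ch) if ch.isdigit() else ch
        if len(row) != 8:
            raise ValueError(f"rank {rank!r} does not describe 8 squares")
        rows.append(row)
    if len(rows) != 8:
        raise ValueError("FEN must describe exactly 8 ranks")
    return rows

# The two positions quoted in the comment above.
BISHOP_ENDING = "8/5b2/8/3k4/8/5PpP/6P1/6K1 w - - 0 0"
PAWN_ENDING = "8/3k4/5P2/p2pP1p1/P1pP2Pp/2P4P/5K2/8 w - - 0 0"

for fen in (BISHOP_ENDING, PAWN_ENDING):
    print("\n".join(fen_to_rows(fen)))
    print()
```

The diagrams make the blocked nature of both positions easy to see, which is exactly the kind of fortress that a search-based evaluation can misjudge.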
It seems that at least one person associated with Rybka has accused Houdini of wrongdoing. (See related thread on Chessvibes.) While I remain neutral (and indeed manifestly ignorant) about the veracity of such allegations or speculations, I applaud Dennis's compelling exposition of the most likely ramifications of pirating.
On the purely chessic aspects of this match, it seems there is much that is remarkable, such as the high percentage of decisive games, the contrasting evaluations by the programs, etc. I've only gone through a couple of games in great detail so far, but it is clear that there is a lot to be gleaned there.
First, Houdini and Rybka had won their way to this match by placing clear first and second, respectively, in an eight-engine double round-robin tournament with six other very strong engines, including the latest versions of Hiarcs, Shredder and Stockfish. As others have noted, these are clearly the two best competitive engines out there, based on hundreds if not thousands of games. Do they have flaws? Yes, they do. Humans can at times see clear winning positions that they do not.
However, the first game of the match was as good as it gets: Houdini sacs three pawns for a positional advantage that it converts to a wonderful victory. I have searched my big database (over 4 million games) and cannot find a human equivalent. I think it should be included as one of the great masterpieces of all time. But you be the judge.
[Event "TCEC - Elite Match - S1"]
[Site "http://www.tcec-chess.org"]
[Date "2011.01.28"]
[Round "1.1"]
[White "Rybka 4.0"]
[Black "Houdini 1.5a"]
[Result "0-1"]
[ECO "B22"]
[PlyCount "106"]
[EventDate "2011.??.??"]
1. e4 c5 2. c3 Nf6 3. e5 Nd5 4. Nf3 Nc6 5. Bc4 Nb6 6. Bb3 c4 7. Bc2 Qc7 8. Qe2
g5 9. e6 dxe6 10. Nxg5 Qe5 11. d4 Qxe2+ 12. Kxe2 e5 13. dxe5 Nxe5 14. Nxh7 {Novelty} Bg7
15. Ng5 Bd7 16. Na3 Nd3 17. Bxd3 cxd3+ 18. Kxd3 Na4 19. f3 a5 20. Ne4
f5 21. Nf2 b5 22. Nc2 b4 23. cxb4 Kf7 {White is three pawns up, the queens are off the board and Black has no direct attack on the white king, yet Black is winning} 24. bxa5 Rxa5 25. Kd2 Rd8 26. Nb4 Re5 27. Nfd3 Bb5 28. Re1 Nc5 29. Rxe5 Bxe5 30. f4 Bf6 31. Ke1 Nxd3+ {Black begins to win back all the material} 32. Nxd3 Bxd3 33. a4 Rc8 34. a5 Rc2 35. Bd2 Rxb2 36. a6 {the passed pawn is not fast enough to give White a chance for a draw} Be4 37. Ra3 Bxg2 38. a7 Rb1+ 39. Ke2 Ba8 40. Be1 Bd4 41. Ra2 Rb3 42. Bg3 Ke6 43. Kf1 Bc5 44. Ke2 Kd7 45. Kf1 Rb4 46. Ke1 Bd6 47. Kf2 Bxf4 48. h4 Bh6 49. Kf1 Rb1+ 50. Be1 e5 51. h5 f4 52. Rd2+ Kc7 53.
Rc2+ Kb6 0-1
I was awed that an engine could create such a game against a near-equal opponent.
[DM: While I wouldn't rave about it quite to the extent that you did, it was a fine game, and I'll be annotating it for the TCEC page soon.]
[(Snip!) Dear troll: If you want to have a conversation, offer your comments in a way that demonstrates basic respect for other human beings. A reminder on one point, btw: the recent consensus about Rybka's (allegedly, inappropriately) taking Fruit code emerged *after* I wrote this post.]