Aug-23-15 | | NeverAgain: *** Case 3 ***
The position after <26.Ne6> in Fischer vs Larsen, 1970:
[diagram]
There's an interesting discussion about the relative merits of <26...Qc4> and <26...Qa7> on the last page, starting with
Fischer vs Larsen, 1970 (kibitz #92). Some quotes, edited down to the relevant parts:
<Howard: This just in..... according to the new book Larsen: Move by Move, Larsen apparently missed a quicker win on the 26th move. Rather than play 26...Qc4 he should have played the "impossible to find" (as the book puts it) 26...Qa7!!>
<Retireborn: Larsen's 26...Qc4 is strong, as 27.Rxd6 b3 28.c3 Rxa4 gives Black a winning attack; however, after 27.b3 Black is forced to exchange queens.
┊
26...Qa7! is even stronger as it protects g7 and avoids a queen exchange; after 27.Rxd6 Bxe6 28.Rxe6 b3! Black's attack wins with a Q check on e3, while after 27.b3 Re8 works without any problems.
┊
It's very logical, but as you say difficult to see!>
<Retireborn: Interestingly, Houdini prefers 26...Qc4 at first; it takes it a minute or so to see that Qa7 is even better!>
<DWINS: Stockfish 6 agrees with Larsen and finds that 26...Qc4 is significantly stronger than 26...Qa7.
┊
I let it run to depth 30 and it evaluates 26...Qc4 as giving Black an advantage of -1.70, while 26...Qa7 yields an advantage of -0.77.>
<Nerwal: Indeed, but this is a weird case where Stockfish is completely wrong: it doesn't see 26...♕a7 27.♕g6 ♕e3+ 28.♔b1 ♗xa4 as completely winning for Black unless you feed it to it (29.♘xg7 ♗xc2+ leads to a sweet checkmate sequence, for instance). Meanwhile, even Houdini 1.5 finds this almost instantly.>
<DWINS: You're absolutely right! I have noticed this a few times before. I wonder if the developers have seen this type of behavior from Stockfish?>
Kind of alarming, to say the least.
[to be continued]
Aug-23-15 | | NeverAgain: *** Case 4 ***
The position that arises in the following line of the English Opening after
<1.Nf3 c5 2.c4 Nf6 3.Nc3 d5 4.cxd5 Nxd5 5.g3 g6 6.Bg2 Bg7 7.Qa4+ Nc6 8.Ng5 e6 9.Nge4 Nb6 10.Qb5 c4 11.Na4>:
[diagram]
WinKing's SF6 analysis of 11...Bd7:
<Analysis by Stockfish 6 64: depth=30
┊
1. (-0.51): 11...Bd7 12.Nd6+ Kf8 13.Nxb7 Nd4 14.Qa5 Qb8>
(from Wesley So (kibitz #191215))
This is what I get when I let SF6 ponder the position after <14...Qb8> on my desktop box:
= (-0.30) Depth: 36/53 00:08:01 886mN
<15.Nbc5 Bc6 16.Bxc6 Nxc6 17.Nxb6 Nxa5 18.Nbd7+ Ke7 19.Nxb8 Raxb8 20.0-0 Rhc8 21.Na4 Nc6 22.Rb1 Ke8 23.Kg2 Nb4 24.Nc3 f5 25.Rd1 Rb7 26.d4 cxd3 27.exd3 Rd7 28.Be3 Nxd3 29.Bxa7 Ra8 30.Be3 Nxb2 31.Rxd7 Kxd7 32.Rxb2 Bxc3 33.Rb7+ Ke8 34.Rxh7>
So, Stockfish follows its own recommendation for seven moves and discovers that almost half the supposed advantage magically dissipates, leaving Black with a mere = instead of an edge. For comparison, Komodo 6:
= (0.00) Depth: 28 00:16:18 850mN
<15.Nbc5 Bc6 16.Bxc6 Nxc6 17.Nxb6 Nxa5 18.Nbd7+ Ke7 19.Nxb8 Raxb8 20.0-0 Bxb2 21.Rb1 Bxc1 22.Rfxc1 Rhd8 23.Rxb8 Rxb8 24.Rc2 Rb4 25.f4 f5 26.Kf2 Kd6 27.d4 Kd5 28.e3 Rb1 29.Na4 Kd6 30.Nc3 Rh1 31.Kf3 Rf1+ 32.Ke2 Rh1 33.h4 Rh2+ 34.Kd1 Rh1+ 35.Kd2 Rg1 36.Nb5+ Kd5>
I think that at this point we can already draw some conclusions (see next post).
*** Additional notes ***
The numbers listed in Case 2 seem to confirm the following statement from IM Erik Kislik's article, linked earlier in this thread:
<Stockfish uses depth-based LMR (which you can read about on chessprogramming wikispaces if you're interested), which means that the greater the depth, the more selective it becomes. This is one explanation for why Stockfish reaches greater depths, and why each additional ply (a single move for one side) is worth much less than an additional ply for Komodo. Roughly speaking, beyond around 20 ply, each Komodo ply is almost like 2 Stockfish plies.>
http://www.chessdom.com/im-erik-kis...
The link to the Q&A session with Larry Kaufman:
http://www.qualitychess.co.uk/blog/...
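To make Kislik's point concrete, here is a minimal sketch of depth-based late move reductions inside an alpha-beta search. The reduction formula and the helpers (evaluate, ordered_moves, pos.make) are illustrative assumptions, not Stockfish's actual implementation; real engines tune their reduction tables empirically.

```python
# Illustrative sketch of depth-based Late Move Reductions (LMR):
# later moves in the move ordering are searched to a reduced depth,
# and the reduction grows with depth, so deeper searches become more
# selective. evaluate(), ordered_moves() and pos.make() are assumed.
import math

INF = float("inf")

def lmr_reduction(depth, move_number):
    if depth < 3 or move_number <= 3:
        return 0  # search the first few moves at full depth
    # grows with BOTH remaining depth and how late the move comes
    return int(math.log(depth) * math.log(move_number))

def search(pos, depth, alpha=-INF, beta=INF):
    if depth <= 0:
        return evaluate(pos)  # static evaluation at the leaves
    for move_number, move in enumerate(ordered_moves(pos), start=1):
        child = pos.make(move)
        r = lmr_reduction(depth, move_number)
        # cheap reduced-depth probe first...
        score = -search(child, depth - 1 - r, -beta, -alpha)
        # ...full-depth re-search only if the probe beats alpha
        if r and score > alpha:
            score = -search(child, depth - 1, -beta, -alpha)
        if score >= beta:
            return beta  # cutoff: opponent won't allow this line
        alpha = max(alpha, score)
    return alpha
```

The tradeoff discussed throughout this thread lives in lmr_reduction(): the bigger it is, the faster the engine reaches high nominal depths, and the more likely a quiet-looking winner like 26...Qa7 gets probed only at reduced depth and discarded.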
Aug-23-15 | | NeverAgain: *** Conclusions / TL;DR ***
I have strong reservations about continuing to use Stockfish 6 as my main analysis tool. It appears that this program's main focus is competition with other engines rather than deep game analysis. Based on the findings presented in the previous three posts, I would consider 35 ply the minimum search depth needed to produce meaningful results.
If anyone here frequents the Stockfish forums or, better yet, is in direct contact with its developers, please pass a link to this thread on. I would also appreciate it if anyone with the latest Komodo 9 could test some or all of the above cases with it and report the findings here.
Aug-24-15 | | Ron: <NeverAgain: *** Case 3 *** The position after <26.Ne6> in Fischer vs Larsen, 1970 ....>
Very interesting! I put that position into my Stockfish, and it too gives 26...Qc4 as its first choice. But plugging in the line that you gave, starting with 26...Qa7, gives a stronger evaluation for Black. I corroborate your finding here.
Aug-24-15 | | NeverAgain: Thank you for checking it out, Ron.
SF6 lists <26...Qa7> as the second choice as soon as you start the analysis, so I decided to put it into Infinite Mode and see if SF6 would change its mind after a while. I got this:
26...Qa7 (-1.80 ++) Depth: 40/69 00:30:12 3791mN
I don't know whether to call this good news or bad. At this rate, the analysis of *one* average game of 40 moves would take almost 24 hours.
Aug-24-15 | | NeverAgain: Just tested with Komodo 6 - quite similar results:
26...Qa7 (-1.33 ++) Depth: 28 00:32:56 2247mN
Aug-28-15 | | NeverAgain: Cases 2 and 3 tested with Komodo 9 on the same setup.
Yap-Tal:
45.a3 b3 = (0.19 ++) Depth: 32 00:12:39 1125mN - 8 min faster
Fischer-Larsen:
26...Qa7 (-1.63 ++) Depth: 29 00:12:02 1070mN - 20 min faster
Aug-28-15 | | zanzibar: Here's a recent position from the So - Aronian Sinquefield game, after 23.Qb1:
[diagram]
Aronian played 23...b3 which, at 20 ply, isn't even one of the three leading candidate moves for Stockfish 6. Critter 1.6a, however, rates 23...b3 as clearly best:
Critter 1.6a:
23...b3 (-2.64/20)
23...Qf4 (-1.75/20)
23...Bxh5 (-1.17/20)
Stockfish 6:
23...Bxh5 (-1.59/20)
23...Bf5 (-1.50/20)
23...Bb6 (-1.06/20)
In fact, the eval of 23...Bxh5 by Stockfish changed when I reran it to get all three lines at 20-ply depth. So, I too am becoming increasingly suspicious of Stockfish 6.
Details:
( [Critter 1.6a 32-bit] 20:-2.64 23...b3 24.Nxb3 Bb6 25.a5 Rc2 26.Qxc2 d3 27.axb6 dxc2 28.h3 e4 29.Re1 e3 30.Rxe3 Qxb6 31.Kg1 Qxb3 32.Rxb3 c1=Q+ 33.Kh2 Qc2 34.Rg3 Qxf2 35.hxg4 Qxb2 36.Rd1 Qe5 37.Rd3 f5 38.Re3 Qc7 39.Rc3 Qd8 40.gxf5 Rxf5 41.h6 Rh5+ 42.Bh3 Rxh6 )
( [Critter 1.6a 32-bit] 20:-1.75 23...Qf4 24.Qd3 Bb6 25.a5 Ba7 26.Kg1 Rc2 27.Ne4 Rxb2 28.Bg3 Qe3+ 29.Qxe3 dxe3 30.h3 Bxh5 31.Bxe5 Rc2 32.Re1 b3 33.Ng3 Bg6 34.d6 b2 35.Kh2 f6 36.Bf4 Rd2 37.Bxe3 Bxe3 38.Rxe3 b1=Q 39.Rxb1 Bxb1 )
( [Critter 1.6a 32-bit] 20:-1.17 23...Bxh5 24.Qd3 Bg6 25.Qg3 Rc2 26.Ne4 Qe7 27.h4 Rxb2 28.h5 Bf5 29.h6 g6 30.Rh5 Bxe4 31.Bxe4 Bc7 32.Rc1 Ra2 33.Kg2 Rxa4 34.Qg5 Qxg5+ 35.Rxg5 Bd6 36.Rc6 f6 37.Rg3 f5 38.Bxf5 Rxf5 39.Rxd6 )
( [Stockfish 6 64 POPCNT] 20:-1.59 23...Bxh5 24.Kg1 Qh6 25.Nf1 b3 26.Qf5 Rfe8 27.Ra3 Bd1 28.Ra1 Bc2 29.Qh3 Qxh3 30.Bxh3 f5 31.Ng3 g6 32.Be1 Bxe1 33.Rxe1 e4 34.Kf2 Rcd8 35.Bg2 Rxd5 36.h4 )
( [Stockfish 6 64 POPCNT] 20:-1.50 23...Bf5 24.Ne4 Qh6 25.h4 Qf4 26.Kg1 b3 27.Qd3 Rc2 28.Qg3 Qxg3 29.Nxg3 Bg4 30.Rb1 Bb4 31.d6 Bxd6 32.Bb7 Rb8 33.Bxa6 f5 34.h6 gxh6 )
( [Stockfish 6 64 POPCNT] 20:-1.06 23...Bb6 24.a5 Ba7 25.Kg1 Bxh5 26.b3 Qf4 27.Nc4 Bf3 28.h4 e4 29.Qf1 Bc5 30.Bxf3 Qxf3 31.Rh3 Qh5 32.Qg2 e3 33.Be1 d3 )
Aug-28-15 | | Howard: So it appears that 26...Qa7! wins for Black after all, correct? Very impressive!
Aug-28-15 | | AylerKupp: <<NeverAgain> I have strong reservations about continuing to use Stockfish 6 as my main analysis tool. It appears that this program's main focus is competition with other engines rather than deep game analysis.>
I agree with much of what you say about Stockfish 6, and I also consider d=35 to be the minimum search depth for which I would consider posting or paying serious attention to any Stockfish analysis. And, whenever possible, I prefer to wait somewhat longer for d>40. I shake my head in disbelief when sites like http://grandchesstour.com/2015-sinq... which use the chess24 presentation software bother to show Stockfish 6 analysis with search depths in the high 10s and low 20s; utterly meaningless IMO.
I suspect that the reason for Stockfish's "offbeat" evaluations is its very aggressive pruning of its search tree. The more aggressively an engine prunes its search tree, the greater the chances that it will miss better moves, particularly at lower search plies. At the higher search plies, as more branches are included in its search, some of these missed better moves are uncovered.
For some time I have advocated the use of multiple engines in analysis rather than relying on the results from a single engine (any engine, not just Stockfish), and an aggregation of their evaluations and rankings. See, for example, Kasparov vs Karpov, 1985 (kibitz #25) and The World vs Naiditsch, 2014 (kibitz #16890). Initially I just averaged their evaluations, but later I weighted them as well, using the ratings from the CCRL 40/40 engine tournaments (any other tournament such as CEGT would work as well) to give greater weight to the evaluations of the higher-rated engines prior to averaging, although in practice this usually didn't make much difference when compared to the unweighted averaging. In the upcoming Chessgames Challenge: Team White vs Team Black, 2015 (which I urge you to join) I will attempt to come up with more sophisticated evaluation (rating) and ranking aggregations. The goal is to try to reduce engine biases and come up with an evaluation that is closer to the "truth" than the evaluation from any single engine. So, if you do decide to join the game and are unfortunate enough to be assigned to Team White, you will be able to suffer from my posts like the rest of Team White.
I do think that you were too harsh in criticizing Stockfish as having as its main focus competition with other engines. I think that pretty much all engines do that, with the possible notable exception of IvanHoe, which has separate compilations for Mode_Game_Play and Mode_Analysis. After all, engines (particularly the commercial ones) earn their reputation by defeating other engines in tournaments, and I think that this is what the majority of potential (paying) customers are interested in. Besides, engine vs. engine competition provides a mechanism for indisputably determining the best engines after a sufficient number of games are played to make the results statistically significant, whereas there is no comparable method to determine which engine produces the best analysis.
I wish I could take you up on your request to duplicate your analysis of the various cases using the latest Komodo (I have the almost-most-recent Komodo 9.1), but I am currently busy with other things and simply don't have the time. Maybe I will have some time in the not too distant future.
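As a concrete illustration of the aggregation scheme <AylerKupp> describes, here is a minimal sketch of rating-weighted evaluation averaging. The ratings and evaluations are made-up numbers, not real CCRL data.

```python
# Sketch of rating-weighted evaluation aggregation as described above.
# Ratings and evaluations are illustrative, not real CCRL data.
def aggregate(evals):
    """evals: list of (engine_rating, evaluation_in_pawns) pairs."""
    total = sum(rating for rating, _ in evals)
    return sum(rating * ev for rating, ev in evals) / total

evals = [(3300, -1.70),  # hypothetical engine A
         (3250, -1.33),  # hypothetical engine B
         (3150, -1.80)]  # hypothetical engine C

unweighted = sum(ev for _, ev in evals) / len(evals)
print(f"unweighted: {unweighted:+.2f}")        # -1.61
print(f"weighted:   {aggregate(evals):+.2f}")  # also -1.61
```

It also shows why the weighting rarely changed anything: top-engine ratings are so close together that the weights are nearly equal, so the weighted and unweighted averages agree to two decimals here.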
Aug-29-15 | | NeverAgain: <zanzibar: So, I too am becoming increasingly suspicious of Stockfish 6.>
Welcome to the club. Of course, d=20 is quite useless with SF6 - even WinKing's quick analysis of games in progress uses d=30.
<Howard: So it appears that 26...Qa7! wins for Black after all, correct?>
The question wasn't whether it wins, really. The question was whether it is superior to the game continuation <26...Qc4>. The engines picking it over <26...Qc4> at deeper plies seems to corroborate that it is, and you get the credit for bringing <26...Qa7> to everyone's attention.
<AylerKupp: I agree with much of what you say about Stockfish 6 and I also consider d=35 to be the minimum search depth for which I would consider posting or paying serious attention to any Stockfish analysis. And, whenever possible, I prefer to wait somewhat longer and wait for d>40.>
d=40 seems to yield good results indeed, but at 30 minutes to one hour per move its practical usefulness is limited.
There is an interesting thread on the ChessPub forums, "My trusty engine tells me Chess Truth":
http://www.chesspub.com/cgi-bin/che...
It features a nice collection of engine-buster positions. In particular, I find this one a fascinating case:
[diagram]
Any first-category player should be able to see after a few minutes that with <1.Ba4+! Kxa4▢ 2.b3+ Kb5▢ 3.c4+ Kc6▢ 4.d5+ Kd7▢ 5.e6+ Kxd8 6.f5> White seals the pawn chains and achieves a dead-drawn position:
[diagram]
Despite being two Rooks and a Bishop to the good, Black has no way to break through the locked pawn chains. For all practical purposes, Black and White could be in parallel universes.
Now the disturbing thing is that almost none of the engines I fed the initial position to can find the draw. Even when given the position in the second diagram, they persist in claiming a decisive advantage (over -10.00) for Black. That includes Komodo 9, Stockfish 6, Rybka 4 and Houdini 4. Only Deep Shredder 12 (from 2010!) finds <1.Ba4!> and the [= 0.00] eval, and in less than 10 seconds, too.
Aug-30-15 | | NeverAgain: Another fascinating thread, this time on rybkaforums: "2 Questions from an amatuer" - the relevant discussion starts at
http://rybkaforum.net/cgi-bin/rybka...
[diagram]
There's a forced win for White that starts with this sequence:
<14.Bxh7+ Kxh7 15.Ng5+ Kg8 16.Qe2 g6 17.Rh3>
None of the current strongest engines can find the key move <17.Rh3>, even when spoon-fed <14.Bxh7+>.
Aug-30-15 | | NeverAgain: I missed the last post of that thread - SF6 finds the moves at d=41 (70 min on an 8-core laptop).
There's also a Stockfish fork made specifically for analysis - DeepFishMZ:
http://immortalchess.net/forum/show...
Binaries for the latest version (4.1):
https://www.sendspace.com/file/w195ye
I gave it the Fischer-Larsen position above as a test. Two hours later it reached d=29 and 20,000mN and was still stuck on 26...Qc4. Next was the locked-pawn-chain draw from two posts above: in the second diagram it reached d=49 and 5330mN in 30 min with the eval frozen at exactly -14.53.
I really don't see a point in this SF version, other than impressing people with crazy high mN numbers.
Aug-30-15 | | AylerKupp: <<NeverAgain> d=40 seems to yield good results indeed, but at 30min to one hour per move its practical usefulness is limited.>
Oh, I forgot to mention something that is so second nature to me: I don't use Stockfish or any other engine to play games at classical time controls. I only use it (and other engines) to conduct analyses as part of a correspondence time limit game, typically 2 days per move, or to analyze game positions that interest me, or at the request of other posters. And I often run engines overnight, so time is not that much of an issue, particularly when I sleep late. :-) So having Stockfish reach d=40 or higher is usually not a problem.
And some time ago I posted some engine analysis of your "sealer and sweeper" position from Hans Kmoch's "Pawn Power in Chess". I have always enjoyed this problem. Needless to say, no engine found the drawing continuation; perhaps they were programmed to try to avoid draws, even with the Contempt parameter at its default setting. Even starting the analysis after 4.d5+ Kd7, the engines preferred the losing 5.Bxe7 rather than the drawing 5.e6+.
Aug-30-15 | | Ron: Concerning 'engine buster' positions, I have a conjecture - perhaps it could be tested - that the Rybka Randomizer would correctly evaluate such positions. The Rybka Randomizer uses the Monte Carlo method to determine the best move: it plays many random games, and the first move with the highest winning percentage across those games is considered the best move.
For example, I saw a Rybka Randomizer run on the initial position. 1.e4 got a 59 percent winning percentage; 1.h4 got a 50 percent winning percentage. This happens to be aligned with chess theory - 1.h4 throws away White's advantage of the first move.
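The scheme Ron describes boils down to something like the following sketch of pure Monte Carlo move selection. legal_moves(), make() and random_playout() are assumed helpers, and a real implementation (Rybka's included) is certainly more sophisticated about how it randomizes the playouts.

```python
# Sketch of Monte Carlo move selection as described above: play many
# fast randomized games from each candidate move and keep the move
# with the highest winning percentage. random_playout() is an assumed
# helper returning 1.0 (win), 0.5 (draw) or 0.0 (loss) for the mover.
def monte_carlo_best_move(pos, playouts_per_move=1000):
    best_move, best_pct = None, -1.0
    for move in pos.legal_moves():
        child = pos.make(move)
        total = sum(random_playout(child)
                    for _ in range(playouts_per_move))
        pct = total / playouts_per_move
        if pct > best_pct:
            best_move, best_pct = move, pct
    return best_move, best_pct  # e.g. 1.e4 at 0.59 vs. 1.h4 at 0.50
```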
Sep-05-15 | | AylerKupp: <Ron> The accuracy of Monte Carlo methods is highly dependent on how "many" games are played. They initially converge quickly on the correct value, but the final digits of accuracy require many, many more games. So, since many games are required, I would assume that a fairly fast time control was used by the Rybka Randomizer, and in that case I am not sure how those results would correlate with classical or correspondence level time controls.
Besides, it is not necessary. There are databases containing millions of games. In two that I have access to, 365chess and ChessTempo, each with > 2M games, 1.e4 has a White winning % of 52.2% and 53.5% respectively. In <chessgames.com>'s Opening Explorer, with maybe 1M games, 1.e4 has a White winning % of 54.1%. All of these results are in the ballpark of the White winning % calculated by the Rybka Randomizer.
But you have to be careful when you're looking at small samples. 1.h4 has fewer than 200 games in any of these databases, yet its White winning percentages range from 46.7% (365chess.com) to 50% (Opening Explorer). Still, reasonably close to the White winning % calculated by the Rybka Randomizer. And going by strict winning %, the "best" opening move seems to be 1.Na3, with winning %s ranging from 52.2% (365chess.com) and 66.7% (ChessTempo) to 80.0%(!) (Opening Explorer). So probabilistic methods have their limitations.
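<AylerKupp>'s small-sample caution can be put in numbers with a standard 95% confidence interval for a winning percentage. A quick sketch; the game counts below are illustrative, not the databases' actual figures:

```python
# Why small samples mislead: a 95% normal-approximation confidence
# interval for a winning percentage. Game counts are illustrative.
import math

def win_pct_interval(score, games, z=1.96):
    p = score / games
    half = z * math.sqrt(p * (1 - p) / games)
    return p - half, p + half

# ~2M games scoring 53% for 1.e4: the interval is razor thin
print(win_pct_interval(1_060_000, 2_000_000))  # ~(0.529, 0.531)
# 5 games scoring 80% for 1.Na3: the interval is enormous
print(win_pct_interval(4, 5))                  # ~(0.45, 1.15)
```

That the second interval even spills past 100% is itself a warning that the normal approximation breaks down at such sizes; the honest conclusion from a handful of 1.Na3 games is simply "not enough data".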
Sep-06-15 | | tamar: Head over to TCEC at chessdom for the Houdini-Stockfish showdown going on now.
Sep-10-15 | | Eti fan: Amazing - Stockfish finishes with 11.0/11: http://www.chessdom.com/stockfish-w...
Sep-11-15 | | SChesshevsky: I really enjoy analyzing computer chess games, especially computer v. computer games, and here are my impressions of Stockfish's play relative to its computer buddies - though with the understanding that I have no clue as to the technicals involved with the chess programming or hardware.
It appears that Stockfish tends to play more strategically and positionally than maybe some of the other programs. It could be called almost human-like. It doesn't seem to like having hanging pieces and wildly unbalanced positions. I thought a good example was its win v. Komodo, which might've been as classical as something Fischer would've done: Stockfish vs Komodo, 2014. Another example might be its grinding win v. Komodo in a Grunfeld Exchange: Stockfish vs Komodo, 2014.
But the downside of the strategic tendency is that Stockfish might be skimping on tactical calculation. An interesting example is its Grunfeld Exchange loss v. Komodo, where White took a seemingly more aggressive "computer" tactical path (when compared to the previous game) to gain a quick passed pawn and used it to go two up with a big advantage: Komodo vs Stockfish, 2014.
Another example where Stockfish might've been outcalculated is Komodo's winning deep series of exchanges starting with the Rook sac 29.Rxe5: Komodo vs Stockfish, 2014.
I'm not sure this is how Stockfish works, but maybe its method is to calculate until it finds a position it deems "better", or at least best for the situation based on strategic and positional parameters, and then leave it at that without spending the time or energy to calculate deeper. If this efficiency is the case, it might take something like a computer v. computer blitz tourney to see just how good Stockfish's analysis really is. Though I guess the game time would have to be really quick - seconds or even milliseconds? What are the usual computer v. computer time controls? Is computer v. computer blitz even possible?
Sep-11-15 | | AylerKupp: <SChesshevsky> Chess Engine Primer (part 1 of 2)
To answer your last question first: yes, computer v. computer blitz is both possible and common. For example, see http://www.husvankempen.de/nunn/bli... which posts monthly updates of the results of computer v. computer tournaments played at 40/4 time controls, i.e. 40 moves in 4 minutes. Other time controls are possible; the same site holds computer v. computer tournaments at 40/120 time controls, 40 moves in 2 hours. Another popular and comprehensive site is http://computerchess.org.uk/ccrl/40... which holds 40/40 computer v. computer tournaments.
As far as how chess engines work, I'll try to give you a simplified description. Chess engines have two main components, the evaluation function and the search function.
The evaluation function calculates what the engine considers the position to be worth, taking into consideration factors typically used by humans: material (of course!), king safety, space, etc. It also has relative importance values (weights) for each of these factors. One of the things that makes each engine play differently is that they each have different evaluation functions, both weights and factors; one engine might give relatively high weights to positional factors and another might give relatively high weights to tactical factors. And these weights and factors may also differ depending on the phase of the game. For example, Stockfish has 2 evaluation functions, one for the opening/middlegame and one for the endgame. It transitions between the two according to the number of pieces remaining on the board and, if it is in the gray zone between middlegame and endgame, it calculates a weighted average of the 2 evaluation functions according to the number of pieces remaining on the board.
By tradition, the engine's evaluation function calculates a number which is supposed to indicate the value of the position in centipawns (one hundredth of a pawn), so an evaluation of the form [+1.50] indicates that the engine considers White to be the equivalent of 1 ½ pawns ahead, and an evaluation of [-2.75] indicates that the engine considers Black to be the equivalent of 2 ¾ pawns ahead. These are just conveniences for calculation, of course; we humans don't think in terms of ½ or ¾ pawns ahead.
The search function builds a tree of candidate moves and evaluates the position following each of those moves. For example, the initial position has 20 legal moves, 16 by the pawns and 4 by the knights. The engine will evaluate moves like 1.e4, 1.d4, and 1.Nf3 higher than moves like 1.a3, 1.h3, and 1.Nh3, since the first set results in higher mobility and space control, particularly in the center, than the other 3 moves. But that's just the first move by White. The engine will then evaluate the position after each of Black's 20 legal first moves, then evaluate the position after each of White's legal second moves, etc. What results is a tree of lines, with each node of the tree representing a position and the evaluation of that position. Using what is called the minimax algorithm, the engine will select the set of moves which represents the highest evaluations for positions reached by both White and Black. This is referred to as the <Principal Variation> and represents best play by both sides.
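The tree search and minimax back-up described above fit in a few lines when written in negamax form (the two players' minimax steps collapse into one negation). evaluate(), which returns centipawns from the side to move's point of view, legal_moves() and make() are assumed helpers; mate and stalemate handling is omitted.

```python
# Minimal negamax (minimax) sketch: score the leaves with the
# evaluation function and back the scores up the tree assuming both
# sides pick their best move. The returned move list is the
# principal variation mentioned above.
def negamax(pos, depth):
    if depth == 0:
        return evaluate(pos), []  # centipawn score at a leaf
    best_score, best_line = float("-inf"), []
    for move in pos.legal_moves():
        # negate: what is good for the opponent is bad for us
        score, line = negamax(pos.make(move), depth - 1)
        score = -score
        if score > best_score:
            best_score, best_line = score, [move] + line
    return best_score, best_line
```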
Sep-11-15 | | AylerKupp: <SChesshevsky> Chess Engine Primer (part 2 of 2)
The problem with this simpleminded approach is that the number of possible positions quickly increases to the point that it is impossible for even the world's fastest computers to create a multi-level tree and evaluate every possible position in a reasonable amount of time. So the search function "prunes" the tree by eliminating branches that will not be selected because they don't contain moves with high evaluations - the so-called alpha-beta pruning. Engines also use <heuristics> (educated guesses) to select for initial evaluation those moves that they think will result in the best lines, not all that differently from what humans do. This reduces the number of positions to be evaluated by literally billions, allowing engines to search deeper levels of the tree in a reasonable amount of time.
And the more aggressive an engine's search tree pruning is (along with the efficiency of its evaluation function), the deeper it can search. Stockfish probably has the most aggressive search tree pruning of any of the major engines, allowing it to search the deepest in a given amount of time. The tradeoff is that as a result it may prune branches of the tree that contain the best line. So far it seems to be working well for it, certainly in the current TCEC tournament.
This is a gross oversimplification of how chess engines work, but hopefully you get the idea. Each engine has different evaluation functions and search tree pruning heuristics, and that's why their playing styles (and their playing strength!) are different. There are also many other factors that affect how well an engine plays.
As far as hardware is concerned, the biggest difference is the number of processors or cores available for the engine to run on. A computer with only a single core will typically run slower than a computer with 16 cores, although not 1/16 as fast, since there is inter-core communication overhead when each core is searching and pruning a different branch of the tree. And things like the amount of memory available and the computer's clock rate affect the engine's efficiency.
If you enjoy analyzing computer chess games, you might enjoy it even more if you have a deeper understanding of how chess engines work. The WWW has many sites which explain this at various levels of detail. Enjoy!
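And the pruning from part 2, added to the same sketch: alpha is the score we are already guaranteed elsewhere, beta the best the opponent will allow, and any branch proven outside that window is skipped without searching the rest of it. The quality of ordered_moves() - the heuristics mentioned above - determines how often these cutoffs fire. Same assumed helpers as before.

```python
# The same search with alpha-beta pruning as described above. Skipping
# everything outside the (alpha, beta) window is where the savings of
# "literally billions" of positions come from.
def alphabeta(pos, depth, alpha=float("-inf"), beta=float("inf")):
    if depth == 0:
        return evaluate(pos)
    for move in ordered_moves(pos):  # best guesses first => more cutoffs
        score = -alphabeta(pos.make(move), depth - 1, -beta, -alpha)
        if score >= beta:
            return beta  # opponent would never enter this line: prune
        alpha = max(alpha, score)
    return alpha
```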
Sep-11-15 | | SChesshevsky: <AylerKupp... computer v. computer blitz is both possible and common. For example see http://www.husvankempen.de/nunn/bli... which monthly updates the results of computer v. computer tournament played at 40/4 time controls; i.e. 40 moves in 4 minutes. Other time controls are possible; the same site holds computer v. computer tournaments at 40/120 time controls; 40 moves in 2 hours. Another popular and comprehensive site is http://computerchess.org.uk/ccrl/40... which holds 40/40 computer v. computer tournaments.>
Thanks for the info. Related to computer time controls, what has always puzzled me is that in human v. computer matches the machine and the human appeared to get pretty much equivalent time. Frankly, that made no sense to me. A machine that can easily calculate and evaluate thousands or even millions of positions in a half-dozen seconds would seem to be coming into the match with an enormous "physical" advantage - one that a human would find just about impossible to match.
So if the goal is to see who is the better chess player, rather than who can simply calculate further, it would seem necessary to at least somewhat limit the machine's foresight advantage to create an equal playing field. One way would be to limit the machine's choices, I guess - maybe limiting the computer to one of the first 6 to 12 avenues it goes down, and limiting the depth of those avenues. But maybe limiting the computer's time drastically would accomplish the same thing. For instance, say for argument's sake a computer has 500 times more calculating power than a human. An equivalent computer time for 40 moves in 120 minutes would then be roughly 15 seconds (120/500 minutes × 60 sec/min ≈ 14.4 sec). Say that's too harsh; I would think 2 minutes for 40 moves should be more than enough time for a computer that knows how to play, plus its calculating power (versus one that simply knows how to calculate), to show its stuff versus a human.
Related to the same concept, using human time controls for computer v. computer games seems to make even less sense if one is looking for which program has the better chess understanding vs. which just calculates better. It seems strange that as computers get faster and more powerful, their time controls generally equal those of us lame humans. Is it that the programs can't function properly at the much quicker "computer human equivalent" speeds? I wonder what would be the chances of an Anand or Nakamura with 40/120 time vs. a top-notch program with, say, 40/2?
Sep-11-15 | | AylerKupp: <SChesshevsky> Funny that you mentioned that; I have been thinking about the same thing, trying to equalize computer vs. human strength by limiting the computer's time. That's the simplest way to reduce its search depth capability. After all, if the main advantage of a computer over a human is the computer's ability to calculate faster, then the most straightforward way to level the playing field is to restrict the computer's time. And as computer hardware gets faster and computer software more clever, the time can be reduced further.
Up to now I think that most if not all attempts at computer/human strength equalization have been to give the human material odds. Witness the recent Nakamura vs. Stockfish matches, where Stockfish gave Nakamura pawn odds yet scored one win and one draw. But giving time odds is much more flexible, since the differences can be made to be whatever we choose them to be. You can't make the computer give the human 1.5 pawn odds.
The question then is, how much of a time advantage should the human be given? I think I have a way to figure that out. Although computer and human ratings are not directly comparable, Stockfish 6 is currently rated about 3300 and Nakamura about 2800, a 500-point difference. Set up a series of matches between Stockfish 6 and an engine rated about 500 points lower, e.g. The Baron 3.29, rated 2818 in the latest CCRL engine tournament. Play a match giving The Baron 2 hours and Stockfish 6 1 hour. If The Baron wins, play another match giving The Baron 2 hours and Stockfish 1.5 hours; if The Baron loses, play another match giving The Baron 2 hours and Stockfish 0.5 hours. Repeat, adjusting the times, until each engine's winning % is a statistically significant 50%.
That's a starting point. Then refine the process by trying a human vs. Stockfish contest with those time controls and repeating the process until the human's winning % is a statistically significant 50%. This approach at least has the advantage of eliminating human playing involvement during what I think would be the most time-consuming aspect: determining the time control that would likely give the human and the computer even chances.
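<AylerKupp>'s adjust-and-repeat procedure is essentially a bisection on the stronger engine's clock. A sketch under that reading; play_match() is an assumed helper that plays enough games at the given time controls to return a statistically meaningful score fraction for the stronger engine:

```python
# Sketch of the time-odds calibration described above, as a bisection
# on the stronger engine's time. play_match(strong_min, weak_min) is
# an assumed helper returning the stronger engine's score fraction.
def calibrate_time_odds(weak_minutes=120, lo=0.0, hi=120.0, rounds=8):
    for _ in range(rounds):
        strong_minutes = (lo + hi) / 2
        if play_match(strong_minutes, weak_minutes) > 0.5:
            hi = strong_minutes  # still winning: cut its time further
        else:
            lo = strong_minutes  # now losing: give some time back
    return (lo + hi) / 2  # time at which the match is roughly even
```

The same loop, with the human in place of the weaker engine, would implement the refinement step, though each "play_match" there is far more expensive.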
| | AylerKupp: <SChesshevsky> BTW, I was just recording the results of the latest CEGT 40/20 engine tournament, and you might be interested in seeing the ratings of Houdini 1.5a as the number of cores it runs on is decreased:
Houdini 1.5ax64 6 CPU (#23), 3130
Houdini 1.5ax64 4 CPU (#27), 3116
Houdini 1.5ax64 2 CPU (#37), 3078
Houdini 1.5ax64 1 CPU (#65), 3021
As you can see, there is a relatively big rating increase when you go from 1 core to 2 cores, a smaller rating increase when you go from 2 cores to 4 cores, and an even smaller rating increase when you go from 4 cores to 6 cores. It will be interesting to eventually find out Houdini 1.5a's rating when running on 8 cores, but clearly doubling the number of cores does not result in a doubling of playing strength as measured by ratings.
Sep-13-15 | | SChesshevsky: <AylerKupp... I have been thinking about the same thing, trying to equalize computer vs. human strength by limiting the computer's time. That's the simplest way to reduce its search depth capability. After all, if the main advantage of a computer vs. a human is the computer's ability to calculate faster...>
Thanks again for your well-constructed thoughts and ideas on the computer chess subject. I think the amount of time a computer gets is a very important concept, and one that doesn't appear to get all that much attention. It could be especially important in determining which software might offer the best help in human chess analysis, which relates to many of the comments above about possible Stockfish defects.
The question for me is: as a human, do I want a program that suggests moves generally based on "out of human" parameters of calculation, or a computer that is calibrated more to human chess concepts? Which would be more helpful for my personal chess improvement? It might be that greatly reducing the time a machine can calculate (possibly based on parameter calculations as you suggest) is the only way to get a glimpse of which program has the sounder and more helpful "chess sense" versus sheer calculating power.
Here are some totally unscientific examples suggesting that computers with significantly less time may not be nearly as fearsome as at human classical or even rapid time controls.
https://www.youtube.com/watch?v=9_4...
Stockfish-Rybka: it appears that the stronger program could be much stronger at blitz time controls. An error by the weaker program, possibly time-related, seems to cause major damage relatively quickly.
https://www.youtube.com/watch?v=Qgh...
Komodo-Stockfish: interestingly, Stockfish goes with the very unpopular French Defense (as it did v. Nakamura??). I haven't looked at the game closely yet, but I wonder whether time pressure had some effect on either side's play.
https://www.youtube.com/watch?v=rdy...
ICC's Chess Beast - Nakamura: by the ratings on the board, both appear roughly equal, with Nakamura apparently having a lot of experience v. a computer in blitz. Another example where maybe a time rush causes an inaccuracy by the machine that busts it relatively quickly.
If any of this computer-time-importance idea is correct, I can see why program manufacturers might want to avoid the issue, but I wonder whether the idea could be the weakness a top GM needs for a human v. computer match win?