| Sep-19-09 | | nimh: < The worse your opponent plays, the easier it is to find perfect moves.> Wrong.
The difficulty of playing accurately depends on positions and their features, not on the opponent. To put it roughly: fewer tactics - fewer errors; less material - fewer errors; less forcedness - fewer errors. What kind of positions occur in a game is indeed partly a result of the opponent's choice and playing style, but it would be a fallacy to reason that it is somehow tied to the level of the opponent. When your opponent makes a bad move, you win not because it boosts your playing level, but because the very blunder he made has led to a position clearly favourable to you. Winning superior positions is usually like taking candy from a child. How accurately one is able to make moves on the board is subject to his knowledge, memory and brain power. Incidentally, perhaps even surprisingly, the same factors constitute one's playing skill (rating). |
|
Sep-19-09
 | | alexmagnus: <nimh> Different opponents make one face different positions. In an extreme case, if the opponent hangs a piece, every beginner finds the perfect move and captures it... That is why I said bad moves of the opponent make finding perfect moves easier. It doesn't boost your playing level in terms of skill, but it boosts your "playing level" in terms of error percentage. |
|
Sep-19-09
 | | alexmagnus: < but it would be a fallacy to reason that it is somehow tied to the level of the opponent. > It isn't a fallacy. There are not many "reasonable" ways of coming to a certain position. While endgames indeed are universal, middlegame positions may very well vary depending on the opponent's level of play (level, not style! Style makes a difference too, but two players on different levels with the same style will get to different positions). |
|
Sep-19-09
 | | alexmagnus: < In an extreme case, if the opponent hangs a piece, every beginner finds the perfect move and captures it... > Additionally, you rarely see a GM hanging a piece i.e. a GM will rarely give an opportunity to a beginner to find a perfect move without thinking while a beginner will give plenty of these opportunities to another beginner. |
|
Sep-19-09
 | | alexmagnus: Another factor is pressure: the ability of the opponent to disturb your plans. A bad player cannot disturb the plans of a good player, and so the good player plays perfectly. But in a game between two good players the plans always have to be revised, each revision being a potential source of mistakes. And a very good player will not only disturb a good player's plans but also 1) force his own plan to work and 2) make moves which disturb the good player's plan in a way the good player doesn't notice (because they are too deep and/or too nonstandard). |
|
| Sep-19-09 | | whatthefat: <alexmagnus: <whatthefat> Unlike your thought experiment, in reality chess progress is not simultaneous. Improvement in chess is a chain reaction, a continuous process.> Of course it is - I acknowledged that explicitly even - so a 2700 in the year 2008 is very likely of a similar strength to a 2700 in 2009 (correcting for inflation). But a rating system alone can never allow you to compare the objective strength of say a 2700 in 2009 to a 2700 in 1970. |
|
| Sep-19-09 | | whatthefat: By the way, you remember my bell curve example before, where the gap between #100 and #1 decreased as the playing pool became larger? Well, I'm pretty sure that if the top end of the distribution tails off like an exponential instead, then the gap remains constant as the playing pool becomes larger.
By that I mean, if prob(r) is the rating probability density function, and if for large r, prob(r) is approximately proportional to exp(-kr), then the rating gap between player #a and player #b (where a and b are sufficiently large) will remain approximately constant. This would explain Sonas' result. The proof is like so:
Suppose we have n total players, and the rating of player #a is r_a, and the rating of player #b is r_b. Let's use the notation Int[x,y] to mean the integral of prob(r) from x to y. Then Int[r_a,inf] = a/n, and Int[r_b,inf] = b/n. Now suppose we increase the size of the playing pool by a factor f. Now let's call the rating of player #a, s_a, and the rating of player #b, s_b. Then Int[s_a,inf] = a/fn, and Int[s_b,inf] = b/fn. So Int[r_a,inf] / Int[r_b,inf] = Int[s_a,inf] / Int[s_b,inf]. If the rating gap is to remain constant, then we require r_a - r_b = s_a - s_b. Substituting prob(r) = Aexp(-kr), where A and k are constants, yields this result: Int[r_a,inf] = (A/k)exp(-k*r_a) = a/n gives r_a = (1/k)ln(An/(ak)), so r_a - r_b = (1/k)ln(b/a), which doesn't depend on n, and the same expression comes out for s_a - s_b since f cancels. So what this says is that if the upper tail of the distribution scales as approximately exp(-kr) then Sonas' relation should hold. It would also explain why his curves jump off their approximately constant value once he makes player #a sufficiently lowly ranked, since the distribution will cease to follow this shape as the middle of the distribution is approached. |
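A quick numerical illustration of the claim above (not part of the original posts): the Python sketch below samples "ratings" with an exponential upper tail and checks that the gap between rank #a and rank #b hardly moves as the pool grows. The decay rate k, the 2000-point offset and the pool sizes are arbitrary assumptions, chosen only to give chess-like numbers.

```python
import numpy as np

# Check whatthefat's argument numerically: with an exp(-k*r) upper tail,
# the rating gap between rank #a and rank #b should not depend on pool size n.
# k, the 2000-point offset and the pool sizes are made-up values.

k = 0.01                  # assumed tail decay rate per rating point
a, b = 100, 1000          # compare player #100 with player #1000
rng = np.random.default_rng(0)

for n in (10_000, 100_000, 1_000_000):
    ratings = 2000 + rng.exponential(scale=1 / k, size=n)
    top = np.sort(ratings)[::-1]          # descending; top[0] is player #1
    gap = top[a - 1] - top[b - 1]
    print(f"n = {n:>9}: gap between #{a} and #{b} = {gap:6.1f}")

# Analytic prediction from the exponential tail: gap ~ ln(b/a)/k,
# independent of n (about 230 points with these made-up numbers).
print(f"predicted gap: {np.log(b / a) / k:.1f}")
```

Each run prints roughly the same gap for every pool size, which is the constancy the derivation predicts; a Gaussian tail, by contrast, would make the gap shrink as n grows, matching the bell-curve example mentioned earlier in the thread.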
|
Sep-20-09
 | | alexmagnus: Hm, the difference is then indeed independent of n (ln(b/a)/k if I calculated correctly). But why on earth should the distribution become exponential when getting into the higher regions? By the way, if this is correct then my Delo becomes justified again :). At least if used with the intention I used it - comparing dominance on the elite level. <Of course it is - I acknowledged that explicitly even - so a 2700 in the year 2008 is very likely of a similar strength to a 2700 in 2009 (correcting for inflation). But a rating system alone can never allow you to compare the objective strength of say a 2700 in 2009 to a 2700 in 1970> Why not? To be incomparable, there would have to be some "simultaneous explosions" in the progress of chess. |
|
Sep-20-09
 | | alexmagnus: <By the way, if this is correct then my Delo becomes justified again :). At least if used with the intention I used it - comparing dominance on the elite level.> It also explains why the all-time Delo top list has so many players from 1972 but no prevalence of any other year: in 1972 the sufficient number of players (so that the distribution becomes exponential at #100 or lower) had not yet been reached... |
|
Sep-20-09
 | | alexmagnus: If we start in 1976, of the pre-1976 players on the list with the "1972 bug" only two (Portisch and Petrosian) "survive" in the top-20. Here it is, the top-20 Delo from 1976 on:
1. Kasparov (Jan 2000) 2764
2. Karpov (Jan 1978, Jan 1980, Jan 1982) 2725
3. Kramnik (Oct 2001) 2714
4-5. Tal (Jan 1980) 2705
4-5. Anand (Jul 1998) 2705
6. Korchnoi (Jan 1979, Jan 1980) 2695
7. Topalov (Jul 2006) 2694
8. Ivanchuk (Jul 1991) 2685
9. Morozevich (Jul 1999) 2667
10-11. Shirov (Jul 1994) 2665
10-11. Kamsky (Jul 1996) 2665
12. Adams (Jul 2000, Oct 2000) 2661
13. Timman (Jan 1982) 2660
14-17. Portisch (Jan 1980, Jan 1981) 2655
14-17. Ljubojevic (Jan 1983) 2655
14-17. Gelfand (Jan 1991) 2655
14-17. Leko (Oct 2000) 2655
18. Carlsen (Oct 2008) 2653
19. Petrosian (Jan 1977) 2650
20. Svidler (Jan 2006) 2647 |
|
Sep-20-09
 | | alexmagnus: (1976 was chosen as the start year because it looks like it was the first year in which the necessary distribution was reached). |
|
| Sep-20-09 | | nimh: <Different opponents make one face different positions.> I was talking about different <types> of positions, not different positions. <In an extreme case, if the opponent hangs a piece, every beginner finds the perfect move and captures it...> But what if hanging a piece is a trap? In that case blindly capturing it would be idiocy. So things aren't so simple. It wouldn't happen very often either. Also, I think we should focus on general cases, not extreme ones. And you also seem not to be aware that absolute errors are pretty useless for calculating the level of play. A player who mostly plays easy positions and has a low average error figure, e.g. Capablanca, would be punished according to the criterion of <expected error>. <That is why I said bad moves of the opponent make finding perfect moves easier.> Bad moves make one's position worse and bring the opponent closer to victory; that's what makes a weaker player's life harder, nothing else.
A good position doesn't make it possible to find perfect moves; a human almost never finds them. Do you think they suddenly get 32-piece tablebases at their disposal? <It doesn't boost your playing level in terms of skill, but it boosts your "playing level" in terms of error percentage. > Playing level and skill are undoubtedly the same. There can't be any other reason why some players are better than others.
All cases where absolute error figures are flattered by simpler positions are effectively nullified by taking the difficulty of positions into account, so your statement is invalid.
Also, I didn't use the notion of 'error percentage' in my study, nor did Bratko & Guid, or Charles Sullivan.
It's the average evaluation difference between the moves actually played and the computer's choices. <It isn't a fallacy. There are not many "reasonable" ways of coming to a certain position.> There are many ways of coming to the same type of position. The accuracy of play of all players reacts similarly to changes in the type of position. It doesn't matter whether the rating is 1000 or 3000; tactical positions, for example, will always be more difficult for both. <While endgames indeed are universal, middlegame positions may very well vary depending on the opponent's level of play> Middlegame positions are also universally divided.
<(level, not style! Style makes a difference too, but two players on different levels with the same style will get to different positions).> They come to basically the same types of positions.
<Additionally, you rarely see a GM hanging a piece i.e. a GM will rarely give an opportunity to a beginner to find a perfect move without thinking while a beginner will give plenty of these opportunities to another beginner.> A beginner also rarely gives a GM chances to find perfect moves. The fact that a beginner plays a bad move doesn't necessarily mean that a GM will play the rest of the game absolutely correctly; rather, he just keeps playing at his usual level, while the beginner, due to his weak understanding of the game and insufficient calculating speed and precision, keeps making subpar moves. No wonder a GM wins. |
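For readers who want to experiment with the kind of metric nimh describes (the average evaluation difference between the moves actually played and the engine's choices), here is a rough Python sketch using the python-chess library. It is only an illustration, not the actual Guid & Bratko or Sullivan procedure; the engine path, the fixed search depth and the PGN file name are placeholder assumptions, and no correction for position difficulty is attempted.

```python
import chess
import chess.engine
import chess.pgn

# Rough sketch of an "average evaluation difference" metric: for each move by
# the chosen side, compare the engine's evaluation of its own best line with
# the evaluation after the move actually played. Illustrative only.

ENGINE_PATH = "/usr/bin/stockfish"   # hypothetical engine path
DEPTH = 15                           # arbitrary fixed search depth


def average_eval_loss(pgn_path: str, player_color: chess.Color) -> float:
    """Average centipawn difference between engine-preferred and played moves."""
    engine = chess.engine.SimpleEngine.popen_uci(ENGINE_PATH)
    try:
        with open(pgn_path) as pgn:
            game = chess.pgn.read_game(pgn)
        board = game.board()
        losses = []
        for move in game.mainline_moves():
            if board.turn == player_color:
                # Eval of the engine's best line, from the mover's point of view.
                info = engine.analyse(board, chess.engine.Limit(depth=DEPTH))
                best_cp = info["score"].pov(player_color).score(mate_score=10_000)
                # Eval after the move actually played, same point of view.
                board.push(move)
                info = engine.analyse(board, chess.engine.Limit(depth=DEPTH))
                played_cp = info["score"].pov(player_color).score(mate_score=10_000)
                losses.append(max(0, best_cp - played_cp))
            else:
                board.push(move)
        return sum(losses) / len(losses) if losses else 0.0
    finally:
        engine.quit()


# Hypothetical usage:
# print(average_eval_loss("some_game.pgn", chess.WHITE))
```

Averaging such figures over many games, and weighting by some measure of position difficulty, is where the methodological work debated in this thread actually begins.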
|
| Sep-20-09 | | nimh: <Another factor is pressure.> There certainly is pressure, but it's the pressure of being obliged to play accurately. Whenever a beginner confronts a GM, he feels, and has to feel, psychological pressure. He knows he will have to give everything he has and there will be no room for inaccuracies.
But why would he suddenly play weaker? If he gets into the same type of positions as he usually does against players of his own calibre, he's supposed to play just as well. The difference is that against a weak opponent his weak moves are answered by weak moves in return, whereas against a GM he would inevitably be pushed into the corner and mated. <The ability of the opponent to disturb your plans. A bad player cannot disturb the plans of a good player, and so the good player plays perfectly.> Humans never play perfectly; sheesh, how come you don't know such an elementary axiom? A good player plays seemingly perfectly because the countermoves of a weak player aren't backed by adequate accuracy. <But in a game between two good players the plans always have to be revised, each revision being a potential source of mistakes.> Mistakes can also be latent in the original, supposedly 'good' plans, especially when a player is too deeply convinced of the indestructibility of his plan and doesn't spot a clever tactical shot by his opponent. <And a very good player will not only disturb a good player's plans but also 1) force his own plan to work and 2) make moves which disturb the good player's plan in a way the good player doesn't notice (because they are too deep and/or too nonstandard).> This also applies to a weak player against a very weak player. The plans of the weaker player are doomed to fail because he is able neither to form them nor to carry them out with enough accuracy.
Again it has nothing to do with the level of the opponent. |
|
| Sep-20-09 | | whatthefat: <alexmagnus: Hm, the difference is then indeed independent of n (ln(b/a)/k if I calculated correctly). But why on earth should the distribution become exponential when getting into the higher regions?
By the way, if this is correct then my Delo becomes justified again :). At least if used with the intention I used it - comparing dominance on the elite level. > I guess the distribution must be of the general form prob(r) = A(r)exp(-kr) where A(r) becomes approximately constant as r becomes sufficiently large. I don't know my PDFs well enough to say which it could be though! <By the way, if this is correct then my Delo becomes justified again :). At least if used with the intention I used it - comparing dominance on the elite level. > That's right, it would justify your approach. I've emailed Sonas to see if he can verify the exponential distribution of the upper tail. <<Of course it is - I acknowledged that explicitly even - so a 2700 in the year 2008 is very likely of a similar strength to a 2700 in 2009 (correcting for inflation). But a rating system alone can never allow you to compare the objective strength of say a 2700 in 2009 to a 2700 in 1970> Why not? To be incomparable, there would have to be some "simultaneous explosions" in the progress of chess.> I don't think there needs to be a discrete jump in playing ability for there to be a problem. If objective playing ability increases at a continuous rate of x rating points per year (supposing we've already corrected for inflation), we can't compare eras that are decades apart without knowing the precise value of x. As a result, we're limited to comparing relative strength, or relative dominance. |
|
| Sep-21-09 | | metatron2: <alexmagnus: As I said, the lists are intended to compare dominance, not chess strength [..] it makes sense to choose a constant number of players (100) for the definition of "elite" [..] Of course this comparison is unfair (the better the general level of played chess on the elite level becomes, the harder it is to dominate it)> I understand that your Delo list represents distances from the #100 player and not absolute strengths, but I don't see how this changes the problem of working with absolute ranks that I mentioned: if the gap between #100 and #1 decreases (mainly) because of the tendency of the gaps to decrease as the number of players in the pool increases, then comparing distances from an absolute rank (#100) is incorrect, since newer players are being penalized simply because they played in a larger pool. It should be based on a percentage of the pool size rather than an absolute elite group size, but this is also complex to implement since the ratio (elite group size)/(pool size) doesn't seem to be constant over time. Even if we didn't have this problem, I think that the problem with your Delo list is not only that it ignores the fact that it's harder to dominate as the general level increases (that you mentioned), but also that the definition of "dominance" should be domination <over time>: momentary peak ratings are vastly affected by the player's form, the type and form of the opposition he faced during the relevant period, etc. And so you should include the time factor there as well. In other words, there is still a lot of work to be done for the Delo list to represent "dominance" in a fair way. <whatthefat: if the top end of the distribution tails off like an exponential instead, then the gap remains constant as the playing pool becomes larger. [..] This would explain Sonas' result.> Actually Sonas didn't say that the gap between #1 and #100 remains constant; he was talking about the gap between #100 and some specific much lower ranks. If the distribution's tail was indeed exponential, it would have meant that the gaps between <any> 2 players in the top 100 remain constant over time, and not just between #1 and #100, which is certainly not the case. But this was a very impressive result you presented, from the mathematical point of view... |
|
| Sep-21-09 | | metatron2: <whatthefat: Consider this thought experiment: Tomorrow, all chess players become objectively 400 Elo points stronger.> As <alex> said, this is not the way it works in real life; the entire rating system is based on the assumption of continuity, without sudden jumps. One of the common claims supporting the "rating is not supposed to measure strengths of different eras" myth is that the player pool changes over time. But again the continuity assumption covers this case as well, since the pool changes in a <continuous> way over time, without "jumps". At the core of the rating system lies the continuity assumption: at any given period of time, each active player plays only with a tiny portion of the player pool, and yet it has been proven time and again that his rating remains valid for the <entire> pool. The same principle should hold for pools that change (in a continuous way) over time. <If objective playing ability increases at a continuous rate of x rating points per year (supposing we've already corrected for inflation), we can't compare eras that are decades apart without knowing the precise value of x. As a result, we're limited to comparing relative strength, or relative dominance.> The point is that playing ability changes in a continuous way, but <not uniformly> over the entire pool. There are always total-patzer beginners who are not affected by the general level increase, and in most cases older players are not progressing either, so there are always enough "anchors" to rating values from older lists, verifying that the average rating change over time actually reflects the average change of strength over time. That is the essence of "continuity" in this context. As an example: if some older player had 2400 in the 1997 rating list, and his level reached a plateau then, while the pool's level increased by 20 points on average between 1997 and 2006, that player should still be rated around 2400 in 2006, but his overall rank should drop, meaning that compared with the pool he declined even though his absolute strength did not change. |
|
Sep-21-09
 | | alexmagnus: <Even if we didn't have this problem, I think that the problem with your Delo list is not only that it ignores the fact that it's harder to dominate as the general level increases (that you mentioned), but also that the definition of "dominance" should be domination <over time>: momentary peak ratings are vastly affected by the player's form, the type and form of the opposition he faced during the relevant period, etc. And so you should include the time factor there as well.> Well, for this kind of thing one can do something similar to what Sonas did with Chessmetrics: lists of 1-year, 5-year, 10-year etc. peaks. But the absolute peak is of relevance too; it just tells how dominant one was at his peak. Players like Fischer would quickly go down in multi-year comparisons, but that doesn't take away that in the short period of his domination he dominated very strongly... <If the distribution's tail was indeed exponential, it would have meant that the gaps between <any> 2 players in the top 100 remain constant over time, and not just between #1 and #100, which is certainly not the case> Unfortunately I don't have the lists below the top 100. I'd like to compare the distance at each 100-rank step (i.e. between #100 and #200, between #200 and #300, etc.) over time. If it turns out to be (relatively) constant at each step then there is no argument against my approach. As for elite size vs. size of the player pool: that depends on what one sees as the elite. E.g. the number of players reaching certain stages in the WC cycle is constant (assuming the cycle is played by the same rules all the time). |
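The per-100-rank comparison alexmagnus would like to run is straightforward once complete rating lists are available. A minimal Python sketch, assuming each list has already been loaded as a plain sequence of rating numbers (the load_ratings helper and the file names are hypothetical):

```python
def hundred_step_gaps(ratings, max_rank=1000):
    """Rating gaps between rank #100 and #200, #200 and #300, and so on.

    Assumes the list contains at least max_rank entries.
    """
    top = sorted(ratings, reverse=True)
    return [top[r - 1] - top[r + 100 - 1] for r in range(100, max_rank, 100)]

# Hypothetical usage, comparing two lists a decade apart:
# gaps_1990 = hundred_step_gaps(load_ratings("fide_jan1990.txt"))
# gaps_2000 = hundred_step_gaps(load_ratings("fide_jan2000.txt"))
```

If the gap at each step stays roughly the same from list to list, that would support the exponential-tail picture discussed above; if the gaps shrink as the pool grows, it would not.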
|
| Sep-21-09 | | whatthefat: <metatron2: If the distribution's tail was indeed exponential, it would have meant that the gaps between <any> 2 players in the top 100 remain constant over time, and not just between #1 and #100, which is certainly not the case. But this was a very impressive result you presented, from the mathematical point of view...> That's right, and hopefully Sonas can confirm whether that's the case. If you read the article, you'll see that he in fact found that all of the following gaps are relatively constant for a prolonged period:
* #100 to #500 (from 1975 to now)
* #100 to #1000 (from 1980 to now)
* #100 to #5000 (from 1988 to now)
* #100 to #10000 (from 1993 to now)
The fact that the lower the rank he uses, the more recently the gap has asymptoted would be consistent with this exponential tail, so long as the size of the playing pool is continually increasing (which is very likely the case). <The point is that playing ability changes in a continuous way, but <not uniformly> over the entire pool. There are always total-patzer beginners who are not affected by the general level increase, and in most cases older players are not progressing either, so there are always enough "anchors" to rating values from older lists, verifying that the average rating change over time actually reflects the average change of strength over time. That is the essence of "continuity" in this context. As an example: if some older player had 2400 in the 1997 rating list, and his level reached a plateau then, while the pool's level increased by 20 points on average between 1997 and 2006, that player should still be rated around 2400 in 2006, but his overall rank should drop, meaning that compared with the pool he declined even though his absolute strength did not change.> The points you make are generally true, but incredibly difficult to extract from a real rating system using real human players, who do not ever sit on a constant level for good (even leaving aside the large fluctuations in form that affect all players). What's more, even complete novices today are familiar with basic ideas such as the point values of the pieces, which were not commonplace 100 years ago, so without performing a carefully designed experiment it is difficult to use the level of a novice as a normalization method. I think the advent of engines provides the greatest promise of objectively comparing playing strength between eras. If a given engine is found to play on a prescribed set of hardware at a particular level today, then it can be tested again in say 10 years to see if it performs just as well (I'm not sure whether anyone is currently performing such an experiment, but if they intend to, they should start now!). Players from the past cannot be tested this way of course, but their games can be analyzed by an engine instead (one of the beauties of chess is that a game 100 years old can be analyzed just as easily as one from yesterday). |
|
Sep-21-09
 | | alexmagnus: Analysis by an engine has the flaw that the engine only shows how bad the game is, not how good it is. As I said above, an error-free game is not yet a perfect game. An engine cannot analyze the psychology of moves, nor can it exactly analyze their complexity. |
|
| Sep-21-09 | | nimh: Human chess is so weak that the bad sides alone are enough to distinguish better from worse play. |
|
Sep-21-09
 | | alexmagnus: <nimh> How about moves "in the gray zone"? Tal's sacrifices and Anderssen's 18.Bd6 in the Immortal Game come to mind. Both are incorrect yet receive a ! or even !! (in the case of Bd6) from commentators... |
|
| Sep-21-09 | | nimh: Anderssen's 18.Bd6 was given exclamation marks by Löwenthal and Steinitz, who didn't have a computer. Gray moves do not occur very often. As long as the bias is very slight and regular and we know to take it into account, there's nothing left to complain about. |
|
Oct-03-09
 | | alexmagnus: Here the average ages of the top 10 at different times will be listed soon (when my browser starts working properly; atm I have to write from an Internet cafe...). |
|
| Oct-04-09 | | frogbert: <so long as the size of playing pool is continually increasing (which is very likely the case).> whatthefat, i think one important question is what we should consider to be the player pool <of interest>. personally i would like to consider the activity of the players involved, and make figures of how much of the activity is performed by which part of the pool, in terms of both strength/rating and age (maybe both absolute and age "in the system"). ideally, even though there are great individual differences, it would be interesting to try to make figures of the improvement, sustain, and decay rate on average, for various "player type profiles" (those who actually become "really good" might display a different tendency, compared to those who "only" get their im title and then don't progress anymore) - based on the actual rating numbers (no "adjustments" applied). with a little programming (already implemented) it's quite trivial to follow an exact group of players, like all players rated in some rating range in some list, and compare to <the same players> (not those that happen to be in that range in the other list) in some later list. then this could be analyzed and broken down based on "career stage" patterns like age, system age, activity and so on, and compared to what one would expect given various assumptions about "typical development" (improve/sustain/decay), inflation, and so on. i have detailed data including number of games played for each player all the way back to 1990 - the rating lists i have from the 80s and 70s typically only list the rating numbers and nothing else. this might not matter, since the really major explosion in the number of players came even later - there were "only" around 9500 players in the fide system of july 1990, and of those ca. 4600 were marked as "active" by fide. today, the figures are 10x these sizes. however, fide's inactive flag isn't set unless the player has been inactive for a really long period (first 12 months with 0-3 games played to enter "inactive state", then 7-8 lists - which used to mean 2 years - in order to get the "permanently inactive" flag). i would like to try using more aggressive "activity" requirements, like 10-20 games per 12 months or so. rationale? i would think that the group of players we're interested in monitoring are those who actually play chess and hence make sure the rating system is meaningfully adjusted. the completely inactive players have no real influence on the rating dynamics anyway, and hence it doesn't make too much sense to count them in the pool (size) either. |
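A sketch of the cohort-following idea frogbert describes (not his actual implementation): take every player in a chosen rating band on one list, keep only those who meet an activity threshold on a later list, and measure how the same players moved. The data layout (dicts keyed by player id) and the 10-game threshold are assumptions.

```python
def cohort_change(list_a, list_b, games_b, lo, hi, min_games=10):
    """Average rating change for players rated in [lo, hi] on list_a who are
    still present and sufficiently active on list_b.

    list_a, list_b: dict of player id -> rating
    games_b:        dict of player id -> games played in list_b's period
    """
    cohort = [pid for pid, rating in list_a.items() if lo <= rating <= hi]
    followed = [pid for pid in cohort
                if pid in list_b and games_b.get(pid, 0) >= min_games]
    if not followed:
        return None, 0
    avg = sum(list_b[pid] - list_a[pid] for pid in followed) / len(followed)
    return avg, len(followed)

# Hypothetical usage: how did the 2400-2500 band of July 1990 fare by July 1995?
# change, count = cohort_change(ratings_1990, ratings_1995, games_1995, 2400, 2500)
```

Broken down by age, "system age" and activity as frogbert suggests, this kind of cohort comparison is what would separate genuine improvement or decay from inflation effects.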
|
Oct-05-09
 | | alexmagnus: <the completely inactive players have no real influence on the rating dynamics anyway> Unless one accepts the argument of inflation proponents that if one leaves the system with a lower rating than one's initial rating, then an inflationary effect happens. But I don't like that argument anyway - in my opinion, the level of activity has no influence on ratings (in the "entire system") at all. |
|