|Sep-28-19|| ||beatgiant: For example: the winner of the previous cycle's World Cup is invited to the current World Cup. You consider this an "indirect effect." Games played in Dec. 2018 influence a player's rating in Feb. 2019. You consider this a "direct effect." |
How did you reach those conclusions? The example clearly shows that recency of the effect is not the issue for you.
|Sep-29-19|| ||beatgiant: For the record, here are the qualification paths to the Candidates, plus dates of information used (month and year when I knew them or could quickly find them). I haven't traced all the paths fully back (for example, criteria for invitations to the zonals) but a good guess is that they often used ratings that predated the events.|
1. Loser of previous world championship (Nov. 2018).
2. World Cup. Invitees were: World Champion (Nov. 2018), 4 previous World Cup semifinalists (Sept. 2017), Women's Champion (Nov. 2018), Junior Champions (2017 and Sept. 2018), 22 from European Championship (2018), 24 from European Championship (2019), 4 from American Continental Championship (2018), 4 from
American Continental Championship (2019), 5 from Asian Continental Championship (2018), 5 from Asian Continental Championship (2019), 2 from African Continental Championship (2019), 26 from zonal championships (I'm unsure of date range), 18 by average rating (Aug. 2018 to July 2019), 1 from ACP tour (I'm unsure of date range) and 9 wildcards (I'm unsure of criteria). Invitees who declined were either replaced with the next runner-up in the same event, or from the runners-up on the average rating list.
3. FIDE Grand Swiss. 100 by average rating (July 2018 to June 2019), Women's Champion (Nov. 2018), Under 20 Champion (Sept. 2018), Over 50 Champion and Over 65 Champion (both Nov. 2018), 12 top finishers from the Continental Championships (2019), 1 top finisher from the ACP tour (not sure of the included date range), 3 wildcards (I'm not sure of wildcard criteria).
4. FIDE Grand Prix. 20 by average rating (Feb. 2018-Jan. 2019), 2 wildcards (I'm not sure of wildcard criteria).
5. Average rating (Feb. 2019-Jan. 2020).
6. Wildcard. Must play in two of the three (World Cup, Grand Swiss, Grand Prix - 2019), and be a runner-up in one of them or be in top 10 by average rating (Feb. 2019-Jan. 2020).
In terms of the currently known qualifiers, Caruana qualified based on the WCC (Nov. 2018), while Ding and Radjabov qualified via the World Cup, to which Ding was invited based on the previous World Cup (Sept. 2017) and Radjabov based on average rating (Aug. 2018-July 2019).
|Sep-29-19|| ||beatgiant: <AylerKupp>
So, most of the qualification paths do, in fact, depend on results from before Jan. 2019. But to get any further, we'd have to agree on how to measure the strength of the dependencies. Otherwise, you want to blow off all the dependencies except those for average ratings by labeling them as "indirect."
|Sep-29-19|| ||beatgiant: To me the bottom line is:
You think it's reasonable to expect a would-be World Championship contender to <put in the time and effort to become good enough to be eligible to participate in the Candidates Tournament> before the start of the cycle, in the case of all criteria except rating.
And I really don't understand your reason for making that distinction.
|Sep-30-19|| ||AylerKupp: <beatgiant> It's not that I don't want to consider the preconditions of the contestants in the World Cup among other events, it's just that I don't consider being a top player a <direct> precondition. It's just implied, no more than being able to breathe or think should be considered preconditions to becoming a top player. And that's what I thought we, or at least I, were talking about: <direct> preconditions. Otherwise, where do you stop in considering what could be a "precondition" to becoming a top player? Being born?|
As far as the previous cycle's World Cup players being invited to the next World Cup, I consider that a possible precondition to qualify <for the World Cup>, not to qualify <for the Candidates Tournament>, and I thought the latter is what we were talking about. Clearly being a recognized top player is a precondition to being invited to play in all these tournaments whose results could constitute a <direct> precondition to participate in the Candidates Tournament. But it is not, IMO, itself a <direct> precondition.
So we also disagree about whether most of the qualification paths for the 2020 Candidates Tournament depend <directly> on results from before Jan-2019. I don't think that we can agree on the strength of the dependencies since I don't think that there are any <direct> dependencies, because I don't think that being a top player is a <direct> dependency. So, other than your loaded term that I "blow off" all the dependencies except those for average rating, we basically agree; I just think that being a top player is an <indirect> dependency and not a <direct> dependency for qualifying for the Candidates Tournament.
The reason that I make a distinction between qualifying for the Candidates Tournament via ratings compared to all the other requirements for qualifying is that qualification via ratings is <explicitly> indicated as a possible method for qualifying for the Candidates Tournament, and the rating for the first month of the average rating period is determined <explicitly> by the player's rating prior to the first month of the average rating period, per the formulas for calculating the player's rating for the prior month. None of the other qualification criteria, except possibly the selection of a player to participate in the Candidates Tournament by its organizer, <explicitly> requires a dependency (i.e. ratings) outside the average rating calculation period.
But even this method provides 3 alternatives to selection of a player for inclusion in the Candidates Tournament provided that he finishes sufficiently high in the World Cup, the Grand Prix, and the Grand Swiss. So inclusion into the Candidates Tournament by this method <could> be based on events (ratings) occurring prior to 2019 but does not <require> it. But selection of a player for inclusion into the Candidates Tournament by rating does require it.
So that's how I make my distinction; selection for inclusion into the Candidates Tournament by rating <directly> requires events happening outside the 2019 period and none of the other methods do. And, like I said, requiring a would-be World Championship contender to put in the time and effort to become a good enough player to be considered for and to qualify for inclusion into the Candidates Tournament is an <implicit> requirement; like breathing, thinking, and being born, not an <explicit> one as described in the Rules and Regulations for the Candidates Tournament.
And that's my bottom line.
|Sep-30-19|| ||beatgiant: <AylerKupp>
OK let's spell it out for each qualification path.
Would you agree that "loser of previous World Championship" depends directly, explicitly, completely on events before Jan. 2019? If not, I'd consider further discussion useless, because I see absolutely nothing that any person can do after Jan. 2019 that would change this outcome.
Winner of the World Cup: The 128 players who got invited were clearly not anywhere close to "the 128 strongest players" by whatever means we might agree to define that. So I disagree that <being a top player> was sufficient to compete under that criterion. Most of the invitees had to accomplish specific things pre-2019 to avoid "being eliminated in round 0" (i.e. get invited).
Maybe this would hinge on how you'd define "being a top player." It might help if you'd post your definition.
FIDE Grand Swiss: For this one, your point is arguable, as long as you accept in principle the idea of average ratings, which were the main criterion.
If you concede that, we would then be discussing what's the appropriate period to use for average ratings, rather than the validity of average ratings themselves (in contrast, <devere> denies the validity of average ratings in principle, for any purpose).
FIDE Grand Prix: This one was a mixture of average ratings and pre-2019 accomplishments. So the discussions around World Cup and Grand Swiss apply similarly to this one.
|Sep-30-19|| ||beatgiant: And then let's contrast all that with the average rating qualification path.|
This criterion uses ratings from Feb. 2019 to Jan. 2020. Yes, to have a realistic shot at this, a player's rating has to be near the top in Jan. 2019, a dependency on events before then. Beyond that, the player must maintain that high position while satisfying the minimum activity requirements.
Having a rating near the top of the list in Jan. 2019, to you, is too much to ask of someone who wants to try for the world championship based on rating? Again perhaps it comes down to your definition of the concept, "being a top player."
|Oct-01-19|| ||AylerKupp: <beatgiant> Yes, of course I'll agree that losing the previous WCC depended on events before Jan-2019. Focusing on the average rating calculations I just overlooked that. My bad. So that leaves 5 of the 7 players qualifying for the Candidates Tournament based <directly> on events occurring after Jan-2019.|
Since you asked, I would define a "top player" as a player good enough to be considered for participation in the <Candidates Tournament>. Participating in one of the qualifying events for the Candidates Tournament does not necessarily make you a "top player". I think that you'll agree that Shaun Press, rated at 1954, was selected to participate in the World Cup but it would take a great deal of wishful thinking to consider him a "top player", even if he had managed to beat Ding Liren by a score of 2-0.
Or, if you want to quantify it, I would think that any player rated 2700+ could be considered a "top player". But 2700 is an arbitrary number; a case could just as easily be made for 2750+ or 2600+.
As for the FIDE Grand Swiss, qualification for it would not automatically make you a "top player" either. But it's unlikely that anyone but a "top player" will win it, and winning is the requirement for qualifying for the Candidates Tournament.
But I think that you're deviating from what I think is the difference in our positions. I think that, other than qualification for the Candidates Tournament by average rating and, now, by losing the previous WCC match, all the players likely qualified <directly> based on events happening before 2019. Even the selection of the at-large player falls into this category since it's not only a constraint that the player selected must be among those with the 10 highest average ratings; that player must also have participated in at least 2 of the Candidates Tournament qualifying events after Jan-2019.
I think that you think that the time and effort spent by some persons to become good players means that this was based on events happening before Jan-2019. That might be true, but those players did not likely qualify <directly> for the Candidates Tournament based on events happening after Jan-2019.
Look, it's clear that we have a difference of opinion on this issue. I don't think it's productive to spend any more time discussing it. If you want to stick to your opinion you're entitled to do so, just as I am entitled to stick to mine. They're just opinions after all. We'll just have to agree to disagree and move on.
|Oct-01-19|| ||beatgiant: <AylerKupp>
I think you mixed up <before> and <after> in your post, but I understood.
We don't only have differing opinions, we also start from differing premises.
What's the big difference for you, in principle, between "becoming good players" and "achieving high ratings"? Aren't the realistic candidates precisely the same as the people with ratings near the top of the list as of Jan. 2019?
And to what extent is the average rating criterion influenced by pre-2019 events? It tapers quickly if a player is active enough. But if you don't get invited to the World Cup, you definitely won't win the World Cup, which sounds much more like an explicit condition to me.
But, yes there's quite a full list of other areas for discussion, so I'll tackle one of those next, as soon as time permits.
|Oct-02-19|| ||AylerKupp: <beatgiant> I don't see any difference either between becoming a good (maybe I should have said great) player and achieving high or top ratings. Only the good players will achieve high ratings and only the great players will achieve top ratings.|
Yes, the realistic candidates will likely be precisely the same as those with ratings near the top of the list as of Jan-2019, barring the highly unusual exception of a breakthrough performance by a less-than-top young player who maybe, like Fischer, "all of a sudden he just got good". But, as with Fischer, this doesn't happen very often.
But, as <devere> has pointed out and I have created scenarios to show, and verified by tracking the performance of Caruana, So, and Kramnik in terms of average ratings prior to the 2018 Candidates Tournament, having the highest rating in Jan-2019 gives that player a great advantage in the average ratings calculation race. It does <not> necessarily taper quickly because of the averaging; the player with the higher initial rating will always have an advantage, and that player's initial rating (like Ding Liren's at the beginning of this cycle) will have a dominating advantage, even in the case of a total collapse during the year.
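The averaging effect described above can be illustrated with a small sketch (the ratings below are invented for illustration, not real player data):

```python
# Toy illustration of the average-rating race: Player A starts the
# 12-month period with a big rating lead and then declines all year;
# Player B starts lower but holds steady.  All numbers hypothetical.

def average_rating(monthly_ratings):
    """Average of the published monthly ratings over the period."""
    return sum(monthly_ratings) / len(monthly_ratings)

# Player A: 2810 at the start, losing 5 points every month.
player_a = [2810 - 5 * m for m in range(12)]   # 2810 ... 2755

# Player B: steady 2770 in every month.
player_b = [2770] * 12

avg_a = average_rating(player_a)  # 2782.5
avg_b = average_rating(player_b)  # 2770.0

# Despite A's year-long slide, A's high starting rating still
# carries the average: A wins the average-rating race.
print(avg_a, avg_b)
```

This is the point about the higher initial rating: the head start is baked into every month's contribution to the average, so even a steady decline need not surrender the lead within the period.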
Actually most of the players (92) qualifying for this World Cup were determined by their performance in zonal events and only 18 qualified by average rating. Eight others qualified by virtue of titles won in previous events. The reverse is true for the FIDE Grand Prix and the FIDE Swiss, most of the players qualifying for those two events qualify on the basis of rating.
And true, if you don't get invited to the World Cup you definitely won't win it. But again, I thought that we were talking about qualifying for the Candidates Tournament, not the World Cup. So a player that doesn't qualify for the World Cup is <indirectly> not qualified for the Candidates Tournament whereas a player that doesn't finish in the top 2 in the World Cup is <directly> not qualified for the Candidates Tournament. To me that makes a difference, to you maybe not.
That's why, besides taking into account the <explicit> qualification criteria, I make a distinction between <indirect> events (achieving a high rating, qualifying for qualifying events, etc.) that occurred before Jan-2019 and <direct> events (top performance in qualifying events occurring after Jan-2019), although the Jan-2019 date is somewhat arbitrary; it's just practical. Also, if you want to have the <current> (i.e. at the start of an event) best players participating in the Candidates Tournament and the WCC match, then recent performances (the qualifying events for the Candidates Tournament) are a better indicator than events that happened earlier (e.g. the qualifying events to the qualifying events, averaged ratings). But, of course, that's just an indicator, not a guarantee. It just increases the chances that the participants in the Candidates Tournament and the WCC Challenger are the best current players.
And, BTW, in case you have any doubt, I enjoy discussing things with you. You have your opinions and you back them up with solid reasoning and/or facts. That's unlike many others who post here who express an opinion and either try to pass it off as fact or refuse to indicate the basis for their opinion. You are like a breath of fresh air.
Speaking of other, although related, areas of discussion, have you had a chance to look at my suggestion of replacing the average rating qualification criterion for the Candidates Tournament with a TPR-like calculation? If you remove the rating term from the TPR calculation, then it becomes only a calculation involving actual results between a player and his opponents. Truly memoryless, since it does not involve calculations outside the TPR calculation period.
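As a sketch of one possible reading of this idea (the post doesn't spell out the exact formula, so the linear TPR form and the sample numbers below are my assumptions):

```python
# The common linear TPR approximation is
#     TPR = avg(opponent ratings) + 800 * (p - 0.5)
# where p is the fraction of points scored.  Dropping the rating
# term leaves only p: a measure built purely from results inside
# the calculation period, hence "memoryless".

def linear_tpr(opponent_ratings, points):
    """Linear tournament performance rating approximation."""
    games = len(opponent_ratings)
    p = points / games
    return sum(opponent_ratings) / games + 800 * (p - 0.5)

def rating_free_score(points, games):
    """Pure-results measure: fraction of points scored."""
    return points / games

# Hypothetical example: 6.5/9 against a 2700-average field.
print(linear_tpr([2700] * 9, 6.5))   # 2700 + 800*(6.5/9 - 0.5)
print(rating_free_score(6.5, 9))
```

Note that even the rating-free version still depends on who the opponents were when comparing across events, which is where the difficulty of a fully memoryless criterion lies.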
Not that I think it will ever happen or will make any difference, since I think that selecting participants for the Candidates Tournament on the basis of average ratings is on the way out, as witnessed by the reduction of the number of players selected this way from 2 to 1. I think that for the next WCC cycle, particularly if this year's FIDE Swiss is a success, the top 2 players in the FIDE Swiss will qualify for the next Candidates Tournament. But that, of course, is just my opinion. :-)
|Apr-15-21|| ||alexmagnus: <AylerKupp>
As promised, here is a list of all Armageddon games from the World Cups and pre-split FIDE World Championships:
2002 (4 white wins, 1 black win - note that it started in 2001):
Gyimesi-Conquest (rd.1, white win, game not on CG.com)
Sutovsky vs A Zapata, 2001 (white win, marked as blitz here, but the 7th game in 2002 was Armageddon, as easily confirmed by the final score 4:3 available in any source)
L Bruzon Batista vs Nisipeanu, 2001 (white win, same)
Ehlvest vs Smirin, 2001 (white win, same)
Topalov vs Shirov, 2001 (black win, same)
2005 (3 white, 4 black):
Zvjaginsev vs Shulman, 2005 (draw)
Navara vs P Nikolic, 2005 (draw)
Z Izoria vs Erenburg, 2005 (black win)
E Najer vs Mamedyarov, 2005 (white win)
Khalifman vs Shulman, 2005 (draw)
Tiviakov vs Korneev, 2005 (white win)
Van Wely vs A Moiseenko, 2005
2007 (3 white, 1 black):
M Roiz vs V Akobian, 2007 (white win)
Vitiugov vs Sakaev, 2007 (black win)
Zhou Jianchao vs A Volokitin, 2007 (white win)
K Georgiev vs Kasimdzhanov, 2007 (white win)
2009 had no Armageddon games
2011 (2 white, 0 black):
Y Drozdovskij vs Motylev, 2011 (white win)
L Dominguez vs I Lysyj, 2011 (white win)
2013 (2 white, 1 black):
Tomashevsky vs A Ramirez Alvarez, 2013 (white win)
H Melkumyan vs Granda Zuniga, 2013 (black win)
Dubov vs Ponomariov, 2013 (white win)
2015 (1 white, 2 black)
M Bartel vs G Sargissian, 2015 (draw)
Adams vs V Laznicka, 2015 (white win)
Nepomniachtchi vs Nakamura, 2015 (black win)
2017 (1 white, 0 black)
Aronian vs Vachier-Lagrave, 2017 (white win)
2019 (1 white, 1 black)
E Najer vs A Giri, 2019 (black win)
Yu Yangyi vs Vitiugov, 2019 (white win)
Still accusing me of being a liar? Or can you apologize like a man?
|Apr-15-21|| ||alexmagnus: For completeness, here is the same for the women's tournaments:|
2004 (1 white, 4 black):
J Dworakowska vs Zhaoqin Peng, 2004 (draw)
T Nguyen vs K Kachiani-Gersinska, 2004 (draw)
Xu Yuanyuan vs T Vasilevich, 2004 (black win)
L Mkrtchian vs M Socko, 2004 (white win)
E Atalik vs Lagno, 2004 (black win)
2006 had no Armageddon games
2008 (2 white, 0 black):
E Paehtz vs I Kadimova, 2008 (white win)
M Socko vs S Foisor, 2008 (white win)
2010 (0 white, 1 black)
M Muzychuk vs D Harika, 2010 (black win)
2012 (0 white, 1 black)
C Foisor vs M Muzychuk, 2012 (draw)
2015 (1 white, 1 black)
M A Gomes vs T Kosintseva, 2015 (black win)
Goryachkina vs L Mkrtchian, 2015 (white win)
2017 (1 white, 3 black)
A Bodnaruk vs M Hejazipour, 2017 (white win)
A Ushenina vs Tan Zhongyi, 2017 (draw)
N Batsiashvili vs Khurtsidze, 2017 (black win)
D Harika vs Tan Zhongyi, 2017 (black win)
2018 (1 white, 0 black)
Lagno vs N Pogonina, 2018 (white win)
|Apr-15-21|| ||alexmagnus: The overall score for women, btw, is 10-6 for Black, not 10-5 as said on that page, which miscounted. But the individual numbers are all correct.|
|Apr-15-21|| ||alexmagnus: So, <AylerKupp>, next time you are doubting any of my statements, ask for a source politely. Because I always do have one. No need to accuse me of making things up, saying I'm presenting "information" in quotes, and getting a presumption of incredibility.|
I don't know what exactly made you think of me as not credible and accuse me of lying with such a sure attitude. All facts I've ever presented are easily fact-checked within minutes to hours. The Armageddon records I mentioned today, for example: just open the tournaments' Wikipedia pages and search for matches that ended with an odd number of games. Then find the corresponding games to see who had White. You found it "incredible" that I did it in one day? It actually took me less, as half of that calculation I had already done previously (see my other post, where I said White had more wins <last time I counted>).
I <never>, I repeat, <never> make things up. If you have a hard time believing a fact, ask politely for a source (though as I said, those facts are usually very easily fact-checked). Don't talk to me like I'm some escaped serial killer; I've done literally nothing to deserve such an attitude.
Maybe if you change your attitude to a bit more objectivity, you'll enjoy discussions with me too.
|Apr-16-21|| ||alexmagnus: Actually I noticed that I missed one tournament, men's FIDE 2004 - and that one featured lots of Armageddon games, mostly white wins...|
Here it is (9 white, 3 black):
G Sargissian vs Tiviakov, 2004 (black win)
Ni Hua vs Vladimirov, 2004 (white win)
R Felgaer vs Jobava, 2004 (white win)
Mamedyarov vs Lputian, 2004 (white win)
J Ye vs Ni Hua, 2004 (white win)
Radjabov vs P H Nielsen, 2004 (white win)
L Dominguez vs V Malakhov, 2004 (white win)
H Hamdouchi vs Kudrin, 2004 (white win)
P Smirnov vs Aronian, 2004 (white win)
Dreev vs Sakaev, 2004 (white win)
Nisipeanu vs Kharlov, 2004 (black win)
L Dominguez vs Radjabov, 2004 (draw)
|Apr-16-21|| ||alexmagnus: Addendum: after being somewhat puzzled by the heavy white tilt in 2004, I looked up what the time control was.|
Turns out, in 2002, 2004, 2005 and 2007 they played not 5 vs 4 but 6 vs 5. In 2009 it was changed to 5 vs 4.
That makes the men's score 26-13 for white <with> these four tournaments and 7-4 for white without them.
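Those totals can be checked by tallying the per-tournament scores listed in the posts above (in these counts a drawn Armageddon goes to Black, who has draw odds):

```python
# Per-tournament (white wins, black wins) from the lists above;
# draws are already counted as Black wins via draw odds.
men = {
    2002: (4, 1), 2004: (9, 3), 2005: (3, 4), 2007: (3, 1),
    2009: (0, 0), 2011: (2, 0), 2013: (2, 1), 2015: (1, 2),
    2017: (1, 0), 2019: (1, 1),
}
six_vs_five = {2002, 2004, 2005, 2007}  # years played at 6 vs 5

total = tuple(sum(x) for x in zip(*men.values()))
without = tuple(sum(x) for x in zip(*(v for y, v in men.items()
                                      if y not in six_vs_five)))
print(total)    # → (26, 13): 26-13 for White overall
print(without)  # → (7, 4): 7-4 for White at 5 vs 4 only
```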
|Apr-16-21|| ||AylerKupp: <alexmagnus> Well, this time you were right and I was wrong, and for that I profusely apologize. But consider that when you make a statement that I believe to be wrong based on the data that I have (and we each had different data), without any substantiation, that does not inspire confidence in your statement. That doesn't excuse my statements, but it did raise a red flag to which I inappropriately responded in a knee-jerk reaction.|
And I did try to verify your statements by looking at the Wikipedia pages for the FIDE Championship tournaments as well as a sampling of others for quite some time. I couldn't figure out how to get information about Armageddon games, something that you were able to figure out. So, since I spent some time doing so and couldn't figure out how to do it, I didn't see how you could do the same thing for a much greater number of games in such a short period of time. The fact that you were more clever than I was didn't enter my mind. And I'm sorry for that instance of conceited behavior which I don't think (I may be wrong) that I typically have.
The fact that you didn't mention any of the Norway tournament games which I think provided the most recent and ample data on Armageddon games also raised a red flag. Why you didn't is not important, what's important to realize is that we were both looking at small samples of data which, when analyzed, led us to different conclusions. And while that doesn't excuse my accusations, at least I hope that you realize that I believed that I did have some justification for what I said.
Maybe some day someone will generate a database of all Armageddon games and then we'll know for sure. But for now, I still believe that a 5:4 White Time Control Ratio is not fair and favors Black. And given the lack of data, the most telling reason is that the top level players, when given a choice of colors in Armageddon games, tend to prefer Black. They should know best; after all, for us this is a matter of curiosity and nothing more, for them it's their livelihood.
And I'm sorry but you do have a history of making statements without any apparent justification. For example, you recently indicated that "simulating a human Armageddon game with engines is like simulating bird flight with airplanes – not worth the paper it is written on." I know that this is just your opinion and as I've said many times, we're all entitled to have one. Why do you feel that way? I know that you don't think that you have to justify it and you're correct in that, but I just don't understand your reasons for thinking that. Again, that doesn't justify my statements.
|Apr-16-21|| ||alexmagnus: Ok <AylerKupp>, first, apology accepted. I just hope angry rants like the one you produced yesterday won't be repeated - as I said, if I make a factual statement (that is, not an opinion), I don't make it up. And if you ask for a source or some other kind of confirmation, I'll gladly provide it (a pity one cannot provide pictures here when the source is a book, which I occasionally cite too). |
As for Armageddon. The reason I didn't mention Norway is precisely the format of that tournament. In a knockout (where Armageddon is typically used), losing an Armageddon means elimination. In Norway it was just another game, one of many. It creates a different level of pressure.
Pressure and psychology are also the reason why I rejected that paper in such a harsh way: no matter how you tweak the engine, you cannot perfectly simulate human decisions made under pressure and in a "must win"/"draw is enough" situation. It's my opinion, not a fact - but I think that in sports - <any> sports, chess included - psychology plays just as large a role as pure skill (the latter <can> be simulated by a computer).
My data suggested white is favored. And I also explained why - contrary to these data - players tend to choose black (even in the World Cup itself) - overestimating their own ability to draw. This, in turn, comes from the fact that they easily get draws in "normal" tournament situations, when they need a draw and the opponent has a mindset of "a win would be nice, but I don't mind a draw". This kind of a "must draw" situation is much more common than the Armageddon situation, where the opponent <must> win. Which creates a skewed view on one's drawing chances.
As I said, this is an opinion. But at least I hope now you understand what this opinion is based on - both in terms of its factual basis <and> the "subjective" part of it.
|Apr-16-21|| ||beatgiant: <AylerKupp>,<alexmagnus>
First, a big bow to <AylerKupp> for the rare event of making a public apology on chessgames.com. I can easily point to other recent examples where I think this should have happened but it did not, even among kibitzers who are otherwise well behaved.|
As for overestimating our own or underestimating other people's understanding, that's just an occupational hazard as a kibitzer. I've often been wrong and learned new things from other people.
This discussion itself is actually highly interesting and worthy of a good quality discussion at length.
I think it's not obvious what would constitute "fairness" in a situation like this. Can we really convert the advantages of draw odds and time odds into some common metric based on game statistics?
An academic researcher might argue that, for the measurement to be valid, we need to remove all the extraneous factors (e.g. players' disparate skill levels at speed chess and at drawing with Black, sporting and psychological effects, etc.) and that is the attraction of calibrating with engines.
But all those factors make a big difference in practice, and that is the attraction of looking at the historical data for real world application.
And individual differences among players mean that the break even point of time odds versus draw odds will vary a lot from one player to the next, which is the attraction of auction-based methods.
So there are good arguments for all of those approaches. And it partly comes down to value judgements about what we mean when we use the term "fairness."
|Apr-17-21|| ||AylerKupp: <alexmagnus> First, thank you for accepting my apology. I don't see your distinction between Armageddon games played in a knockout tournament as tiebreakers and Armageddon games played in a non-knockout tournament as tiebreakers. Sure, the results of Armageddon games played in a knockout tournament have greater consequences but it's still an Armageddon game with White having additional time and Black having draw odds. And both players are under the same pressure.|
<no matter how you tweak the engine, you cannot perfectly simulate human decisions made under pressure and in a "must win"/"draw is enough" situation.>
I agree; that's why I suggested using engine vs. engine games to initially <estimate> what a fair White Time Control Ratio (WCTR) should be. See, for example, World Cup (2019) (kibitz #798), Norway Chess (2019) (kibitz #176) and Norway Chess (2019) (kibitz #134). I defined a "fair" WCTR as one in which the chances of either player winning under the Black draw odds condition are roughly 50/50.
The main reason for using engine vs. engine games to obtain an initial estimate of a fair WCTR is merely convenience. A large number of games would be needed to establish statistically significant results, and various WCTRs would need to be tried. But engines don't get tired and they don't object to having to play a large number of games in succession, so running these games would not require a lot of human effort.
Then, after the best initial estimate has been determined, it needs to be validated, like all chess engine-related results. So only then should a human vs. human Armageddon/Sudden Death (I prefer the latter term; it seems more descriptive and in line with the term used in similar situations in other sports) set of games be played. And because they would likely all be short duration games, probably played at Blitz time controls, they should take a relatively small amount of time and human effort. Should the results be consistent with those obtained in engine vs. engine tournaments, then the resulting WCTR could be more accurately used in tiebreaks.
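The search for a fair WCTR could be sketched like this, with a toy win-probability curve standing in for actual engine vs. engine Armageddon results (the curve and its constants are invented purely for illustration):

```python
# Sketch of the estimation loop: find the time ratio at which
# White (who must win) scores 50% against Black's draw odds.
# p_white_wins is a hypothetical stand-in for the observed score
# over many engine vs. engine games at a given ratio.

import math

def p_white_wins(wctr):
    # Invented logistic curve: White's winning chance rises with
    # the time ratio; 1.35 is an arbitrary crossover point.
    return 1 / (1 + math.exp(-3.0 * (wctr - 1.35)))

def fair_wctr(lo=1.0, hi=2.0, tol=1e-6):
    # Bisection on the (monotonic) score curve for the 50% point.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if p_white_wins(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(fair_wctr(), 3))  # → 1.35 under this toy model
```

In practice each evaluation of the curve would be a large batch of engine games, which is exactly why automating it with engines first, and validating with humans afterward, saves so much effort.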
Who knows, maybe FIDE will establish Sudden Death (their term) games as an official category and publish ratings for them just like they started to do for Blitz and Rapid games several years ago. But I doubt it, although it would be interesting if they did since it would provide a more objective indication as to who are the best Armageddon players.
I should add that I'm not in favor of using faster time controls, whether both players have the same time available or different times. IMO games played at Classic/Standard time controls (FIDE uses the latter terminology) are a different game than games played at faster time controls, so I don't consider their results relevant in determining who was the better player in an event held using slower time controls. But that's a different subject.
BTW, as I was looking for the links used above I came across the following one, Norway Chess (2020) (kibitz #343), which reminded me that the Norway tournaments used a 10:7 (1.43) WCTR rather than the more usual 5:4 WCTR (1.25) so White had a greater time advantage than usual. So the results in the Norway tournaments are not directly comparable to the results in the World Cup and other tournaments. Still, even with White's greater time advantage, Black won the majority of the games.
|Apr-17-21|| ||AylerKupp: <beatgiant> Well, FWIW, I defined what I mean by "fairness" above and, while not perfect, I think that it's at least reasonable. And while "fairness" means that, on the <average>, White and Black will <most likely> each win 50% of the time, for any given game or even a small number of games, it doesn't mean that. The better player with the attributes you mentioned (disparate skill levels at speed chess, drawing with Black, etc.) should still be more likely to win any given game.|
|Apr-17-21|| ||beatgiant: <AylerKupp>
We have at least three different definitions of fairness here, and those are driving the three different approaches to achieve it.
There's the definition you stated, which is (correct me if wrong) "a setting is fair when, in practice, 50% of the games played with that setting are won by each color among a certain population of players."
Under this definition, we may have to do a new survey every few years to recalibrate, since the ratio drifts over time as players evolve their skills. For example, maybe we should hold special calibration tournaments to generate a large enough sample of games to test various settings.
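The empirical definition above can be made concrete with a simple statistical check: compute White's observed score over the surveyed games and a confidence interval around it, and call the setting "fair" if 50% falls inside that interval. A minimal sketch (the game counts are hypothetical, and the normal approximation is only one of several reasonable interval choices):

```python
import math

def white_score_ci(white_wins, black_wins, games, z=1.96):
    """Observed White score (no draws in Armageddon, so score = win rate)
    and a normal-approximation 95% confidence interval around it."""
    p = white_wins / games
    half = z * math.sqrt(p * (1 - p) / games)
    return p, (p - half, p + half)

# Hypothetical survey: 520 White wins out of 1000 Armageddon games
p, (lo, hi) = white_score_ci(520, 480, 1000)
print(f"White score {p:.3f}, 95% CI ({lo:.3f}, {hi:.3f})")
# Under this definition the setting looks "fair" if the interval contains 0.5
```

This also shows why recalibration is expensive: narrowing the interval enough to detect a small drift requires thousands of games at each candidate setting.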
There's the assumption in the paper with the computer simulations, which is something like "Based on a standard idealization of a chess player, establish the value in Elo points of given time odds and of the draw odds. The setting is fair when those values are equal."
With this definition, calibration is cheap and easily repeatable, but the idealization can be too far from what's really possible for humans. To get this right, maybe we'd really want to develop special purpose software to much more closely model human chess behavior. (And before you ask, sorry I've already got another project to work on this summer ;-)
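The Elo-based definition can be sketched numerically: under the standard logistic Elo model, any expected score maps to an equivalent rating gap, so the draw odds can be priced in Elo points and compared against the Elo value of the time handicap. The win/draw probabilities below are assumptions for illustration, not figures from the paper:

```python
import math

def elo_equivalent(score):
    """Rating gap whose logistic expected score equals `score`
    (inverse of E(d) = 1 / (1 + 10**(-d/400)))."""
    return 400 * math.log10(score / (1 - score))

# Assumed speed-chess stats for two equal-rated players (illustrative only):
p_white_win, p_draw = 0.40, 0.35
# With draw odds, a draw counts as a Black win:
black_effective = (1 - p_white_win - p_draw) + p_draw
print(f"Draw odds worth about {elo_equivalent(black_effective):.0f} Elo to Black")
# By this definition, a time-odds setting is fair when its own Elo value
# (estimated from the idealized player model) matches this number.
```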
And finally, there's the economic approach, where the definition is something like, "The rule is fair if neither player in the game would prefer to switch colors."
That has the advantages that no calibration is required and it's customized to each pair of players. But it means every Armageddon would have slightly different time limits, adding administrative complexity and making it harder to compare performances (in case we ever wanted to have something like an Armageddon rating).
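One concrete mechanism for the economic definition is a sealed-bid auction, a variant of which some recent events have used: each player bids the Black clock time they would accept in exchange for draw odds, and the lower bidder plays Black with the time they bid. A minimal sketch (player names and the 5-minute White clock are illustrative assumptions):

```python
def armageddon_bid(bid_a, bid_b, players=("A", "B"), white_time=300):
    """Sealed-bid color allocation: each player bids the Black clock
    time (seconds) they would accept along with draw odds; the lower
    bidder plays Black with that time, so neither prefers to switch."""
    if bid_a == bid_b:
        raise ValueError("tie: re-bid or break by lot")
    low_bid, black = min((bid_a, players[0]), (bid_b, players[1]))
    white = players[1] if black == players[0] else players[0]
    return {"white": white, "black": black,
            "white_time": white_time, "black_time": low_bid}

print(armageddon_bid(240, 255))
```

The varying per-game Black clock this produces is exactly the administrative complexity noted above.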
|Apr-18-21|| ||alexmagnus: <IMO a game played at Classical/Standard time controls (FIDE uses the latter term) is a different game than one played at faster time controls, so I don't consider their results relevant in determining who was the better player in an event held using slower time controls.>|
In a knockout there is no other choice though. And as the record shows, World Cup winners actually do well in the Candidates (Gelfand and Karjakin even won the Candidates, though in Gelfand's case it was of course to his advantage that the Candidates themselves were a knockout too), so it appears that the World Cup, for all its "randomness" involving mini-matches and quick time controls, is a fair and valid qualification tournament. Also, knockouts seem to be not really "random", given how many players reached the final or semifinal multiple times.
The subject of using fast time controls to decide a classical event is of course somewhat controversial. But I think this controversy arises more from traditionalism. Rapid (and even blitz) chess surely has more to do with classical chess than a penalty shootout has to do with soccer. It's closer to the overtime in basketball.
And, finally, there is this historical tidbit: no one has yet gained the world championship title (as opposed to defended it) having played even one rapid game <on his entire way to the title> - that is, in the entire qualification chain down to the earliest tournament to which they qualified by rating or invitation. Of course, it is bound to happen some day, but it just shows that quick time control games are not as influential in the WC cycle as people often make them out to be.
|Apr-18-21|| ||beatgiant: <alexmagnus>
<In a knockout there is no other choice though.>
I've proposed before that one could devise a form of chess with all draw rules replaced with rules giving the win to one of the players: no draw by agreement, a player repeating a game position for a third time loses, a player who first makes 50 moves without a pawn move or capture loses, the last player to have sufficient mating material wins, the player giving stalemate wins.
In this version, unlike Armageddon, it's not at all obvious a priori whether the starting situation is better for White or for Black. At first glance, the change in the 50 moves rule seems to favor Black, but the change in the insufficient material rule seems to favor White. If it quickly does become obvious that the rules strongly favor one of the sides (e.g. lopsided results for one color in computer vs computer matches), then tweak the rules until that stops being true. (Here, I'm making a hand-waving assumption that it will be possible to do that.)
If it's not clear which side has the advantage in the starting position, then we wouldn't need to compensate the other side with time odds and the whole discussion about setting a fair tradeoff for that would become moot. And as an interesting by-product, the new game would have an entirely different endgame theory, so we'd open up a whole new field of discovery.
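The five rule changes proposed above amount to an adjudication table mapping every classical draw condition to a winner. A minimal sketch of that mapping (the event names and function are illustrative, not standard notation; `mover` is the player who triggered the condition with their move):

```python
def adjudicate(event, mover, opponent, last_with_mating_material=None):
    """Winner under the no-draws variant when a classical
    draw condition occurs."""
    if event == "threefold_repetition":   # player repeating for the 3rd time loses
        return opponent
    if event == "fifty_move_rule":        # player completing the 50th such move loses
        return opponent
    if event == "stalemate":              # player *giving* stalemate wins
        return mover
    if event == "insufficient_material":  # last side to have mating material wins
        return last_with_mating_material
    raise ValueError(f"unknown event: {event}")

print(adjudicate("stalemate", mover="White", opponent="Black"))
```

(Draws by agreement need no clause here; they are simply not offered.)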
|Apr-18-21|| ||beatgiant: <AylerKupp>
About averaging of ratings over time:
What <devere> originally claimed was that ratings are cumulative ("like the score during a baseball game" was the analogy) and this is why we can't average them. That's clearly wrong and I already rebutted it in great detail.
I do acknowledge that if the minimum activity requirement is too low, then average ratings will unfairly favor those who sit on their early leads.
But, I claim that if the activity requirement is sufficient to get an accurate rating each month (I'm not sure what the number is, but for the sake of argument suppose it is 20 games per month), then there's absolutely nothing wrong with using the averages.
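The averaging rule being claimed can be sketched directly: a player's monthly ratings count toward the average only in months that meet the minimum activity requirement (20 games per month is the arguendo figure above). A minimal, hypothetical illustration:

```python
def average_rating(monthly, min_games=20):
    """monthly: list of (rating, games_played) pairs, one per rating period.
    Periods below the activity threshold are excluded from the average."""
    active = [rating for rating, games in monthly if games >= min_games]
    if not active:
        return None  # player fails the activity requirement entirely
    return sum(active) / len(active)

# Hypothetical periods: an early high rating defended with almost no play
# (2780 on 2 games) is excluded rather than letting the player sit on it.
months = [(2780, 25), (2780, 2), (2760, 22), (2750, 21)]
print(average_rating(months))
```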
Before I put in a substantial amount of work discussing it point by point, I'd like to start by clarifying where precisely we agree and disagree.
Do you agree that ratings are not cumulative like the score during a baseball game? Or you think they are?
Do you agree that if the minimum activity requirement were high enough, there would be nothing wrong with averaging them? Or you still think it would unfairly favor those starting out with a high rating?