The chess games of Arpad Elo

Arpad Elo

A Elo

Number of games in database: 22
Years covered: 1935 to 1957
Overall record: +4 -15 =3 (25.0%)*
* Overall winning percentage = (wins+draws/2) / total games.

Repertoire Explorer
Most played openings
C27 Vienna Game (2 games)

ARPAD ELO
(born Aug-25-1903, died Nov-05-1992, 89 years old) Hungary (federation/nationality United States of America)

Arpad Emrick Elo, born in Hungary in 1903, emigrated to the US and became a professor of physics at Marquette University in Milwaukee, Wisconsin. Many times Wisconsin State Chess Champion, he is best known for developing a mathematical rating system for players that FIDE adopted in 1970.
Wikipedia article: Arpad Elo

Last updated: 2021-01-25 20:47:36

page 1 of 1; 22 games

Game		Result	Moves	Year	Event/Locale	Opening
1. A Elo vs G Eastman		1-0	27	1935	36th ACF Congress. Prelim C	C27 Vienna Game
2. A Elo vs T Barron		1-0	21	1935	36th ACF Congress. Prelim C	B00 Uncommon King's Pawn Opening
3. A Elo vs A Simonson		0-1	37	1935	36th ACF Congress. Championship Final	C11 French
4. A Elo vs Fine		½-½	58	1935	36th ACF Congress. Championship Final	C01 French, Exchange
5. A Dake vs A Elo		1-0	53	1935	36th ACF Congress. Championship Final	A09 Reti Opening
6. W Ruth vs A Elo		1-0	30	1935	36th ACF Congress. Championship Final	A45 Queen's Pawn Game
7. Kashdan vs A Elo		1-0	24	1935	36th ACF Congress. Championship Final	D49 Queen's Gambit Declined Semi-Slav, Meran
8. A Elo vs Santasiere		0-1	52	1935	36th ACF Congress. Championship Final	B29 Sicilian, Nimzovich-Rubinstein
9. A Elo vs A Dake		½-½	31	1936	Milwaukee City Championship	B84 Sicilian, Scheveningen
10. Santasiere vs A Elo		1-0	39	1937	ACF Congress	A21 English
11. A Elo vs P Litwinsky		1-0	31	1937	38th ACF Congress. Preliminary 4	B54 Sicilian
12. A Elo vs E Marchand		0-1	33	1937	ACF Congress	B73 Sicilian, Dragon, Classical
13. A Elo vs A Roddy		0-1	41	1940	41st US Open. Prelim 2	B58 Sicilian
14. A Elo vs Fine		½-½	23	1940	41st US Open. Prelim 2	C27 Vienna Game
15. A Elo vs Fine		0-1	35	1940	41st US Open	B20 Sicilian
16. J C Thompson vs A Elo		1-0	38	1940	41st US Open	C86 Ruy Lopez, Worrall Attack
17. A Elo vs H Burdge		1-0	23	1940	41st US Open	D51 Queen's Gambit Declined
18. A Elo vs H Steiner		0-1	33	1940	41st US Open	A04 Reti Opening
19. W Shipman vs A Elo		1-0	27	1946	47th US Open	C39 King's Gambit Accepted
20. O Ulvestad vs A Elo		1-0	19	1946	47th US Open	E10 Queen's Pawn Game
21. I A Horowitz vs A Elo		1-0	41	1953	54th US Open	D32 Queen's Gambit Declined, Tarrasch
22. A Elo vs Fischer		0-1	49	1957	New Western Open	B93 Sicilian, Najdorf, 6.f4


Mar-24-16		offramp: <Gregor Samsa Mendel: <offramp>--Apparently we Yanks have mootated the meaning of mootness: https://en.wikipedia.org/wiki/Mootn...> That is bizarre and disturbing. It means that British and American people reading the same text will have opposite views on what has been written. I will think it means "open to debate" and a Yank will think it means "pointless to debate". Perhaps it is clearer if one uses a word such as "irrelevant", although that is less pretentious.

Mar-24-16		zanzibar: I think we should table this dangerous discussion.

Mar-25-16		AylerKupp: <Tiggler> I wouldn't be too hard on Dr. Elo. After all, he was working at a time when there wasn't easy access to computers and, whatever there was, was expensive. So it's natural for Elo to make many assumptions and simplifications in order to make the calculations easier. Still, using √2/2 = 0.7 instead of 0.707 seems excessive, as that would make the SD = 404 (approximately) instead of 400. And 404 should not be that much more difficult to use in the calculations than 400. For another view on the accuracy of the Elo tables, see http://recherche.enac.fr/~alliot/el.... Your comment about Elo's assumption that the standard deviation of the distribution is the same for all players gave me pause for some thought. Clearly that was a necessary simplification for Elo but it would seem possible today to calculate a performance distribution for each rated player (or at least the top ones), and use each player's distribution in calculating their tailored t-distribution (perhaps another good use of the letter "t"!). I don't consider this concept absurd at all. For example, based on the current Candidates Tournament, I would assume that the SD in Giri's performance (all draws) distribution would be much different than Nakamura's or Anand's (5 decisive games each). Of course, I don't know if it would make a significant difference in the results. The reason I've been considering all of this is that I'm trying to develop a predictor for game results in the Candidates Tournament for User: golden executive's contest. I had been doing reasonably well (my goal was a 75% correct prediction) until the last round (70%), when I used my "hunch" instead of some of the model's predictions and I was wrong while the model was right. My enthusiasm is greatly tempered by the realization that if I had simply predicted that each game would end in a draw I would have been correct 71.1% of the time, even with the Nakamura – Anand result included.
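
The arithmetic behind the 0.7 versus 0.707 remark can be checked with a short sketch. It assumes a normal model in which each player's performance has SD = 200, so the difference between two performances has SD = 200·√2; rounding 1/√2 to 0.7 gives 200/0.7 = 2000/7 instead. The function name `expected_score` and the 200-point sample gap are illustrative choices, not anything taken from Dr. Elo's book.

```python
from math import sqrt
from statistics import NormalDist

PLAYER_SD = 200.0  # assumed per-player performance SD (illustrative)


def expected_score(rating_diff, diff_sd):
    """Expected score of the higher-rated player under a normal model."""
    return NormalDist(mu=0.0, sigma=diff_sd).cdf(rating_diff)


exact_sd = PLAYER_SD * sqrt(2)   # SD of the difference, using 1/sqrt(2) = 0.7071...
rounded_sd = PLAYER_SD / 0.7     # SD of the difference, using 1/sqrt(2) rounded to 0.7
scale = rounded_sd / exact_sd    # how much the rounding stretches the rating scale

print(f"exact SD of difference:   {exact_sd:.2f}")
print(f"rounded SD of difference: {rounded_sd:.2f}")
print(f"a nominal 400 becomes:    {400 * scale:.2f}")
print(f"E(200-point gap), exact:   {expected_score(200, exact_sd):.4f}")
print(f"E(200-point gap), rounded: {expected_score(200, rounded_sd):.4f}")
```

The two expected scores differ only in the third decimal place, which is below the two-digit resolution of the published tables.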

Mar-25-16		offramp: It's all moot, isn't it?

Mar-26-16		Tiggler: <AylerKupp> Sorry to be pedantic (though you would not be the one who would complain of this), but you cannot have tailored t-distributions for each player. The t-distribution is for the difference of two samples from the same normal distribution. <offramp> Yes indeed, quite moot: worthy of debate.

Mar-27-16		luftforlife: In American usage, the adjective "moot" enjoys three denotations: first, "open to question; subject to discussion; debatable; unsettled"; also, "subjected to discussion; controversial, disputed"; second, "deprived of practical significance; made abstract or purely academic"; third, "concerned with a hypothetical situation." Webster's Third New International Dictionary (Springfield, Mass.: Merriam-Webster Inc. 1993), 1468. The second denotation does not connote, and is neither equivalent with, nor tantamount to, irrelevance per se (for such a moot point retains its academic relevance, its fitness for abstract consideration, or both), but rather connotes a change in status that can, in the legal context at least, lead to a change in treatment -- to unfitness for further consideration, thwarting and thereby pretermitting practical, concrete, specific, and final resolution, disposition, or decision, of a case turning on, and fatally infected by, such a moot point -- due to limitations of power.

Mar-27-16		perfidious: <luftforlife> Used as an adjective, you are correct; however, that is not the full story. While as a noun, the word is comparatively uncommonly used, as a verb that is not the case, though of course Over Here 'debate' is much more often employed. http://www.merriam-webster.com/dict...

Mar-27-16		luftforlife: <perfidious>: Thanks for your comment, and I take your point. I focussed on the American adjectival form and usage chiefly to point up (and to contrast with irrelevance per se) the American denotation "deprived of practical significance" -- a necessary and sensible accretion to meaning as it has arisen and as it has been applied as a term of art by our Supreme Court in its construction of our Constitution and its limitations on the federal judicial power, but one that has, on our shores, overspilled the narrow confines of that usage, and that has, in more general American usage, come to acquire connotations that dull, obscure, and even subvert not only the other American adjectival denotations, but also the essential and vital British origins, meanings, and past and present uses of the word in all its forms. I appreciate <offramp's> incisive comments and reminders in this regard. Your comment and the others above my own are illuminating and edifying. Kind regards.

Aug-25-16		TheFocus: Happy birthday, Arpad Elo.

Sep-30-16		alexmagnus: The average rating of the women's top 100 is now the lowest since it has been published as a top 100 (and not as a top 50). It is 11 points below the all-time high from April 2015. (The even lower number from July 2013 on the FIDE site is wrong: in that month, FIDE accidentally published the top 120 instead of the top 100.) The open top 100, on the other hand, has been extremely stable in recent years.

May-07-17		AylerKupp: Calculating P(Win), P(Draw), and P(Loss) – Articles found (part 1 of 2)
The FIDE scoring tables (https://www.fide.com/fide/handbook...., Table 8.1b) indicate the probability of a win [ P(W/D) ] OR a loss [ P(L/D) ] for a player based on the rating difference between that player and his opponent. In certain situations it is desired to determine the probability of a win, a draw, or a loss [ P(W), P(D), P(L) ] for a player, again based on the rating difference between that player and his opponent. Surprisingly, this is not easy to do, and additional information is needed.
In his book, "The Rating of Chessplayers, Past and Present", Dr. Elo says "All data entering the rating system consist of total points scored in actually played game ... Discrimination as to how any point score is composed between wins, draws, and losses is beside the point." I think that the real reason that he ignored the effect of draws is that he developed his system when there weren't any cheap computers easily available to do calculations, and no on-line game databases containing the necessary information. And so he remarked that "Any consideration of draws in rating theory requires information on the probabilities of draws, as well as wins and losses, between individual players, information which is not readily available. Its accumulation would be inordinately laborious, and there has been little demand for it." Well, maybe not then.
Over a period of time I've been able to come up with only 3 articles describing methods for attempting to extract P(D) from P(W/D) and P(L/D), and in one of them the author threw up his hands when he realized that he needed additional information. These articles are:
1. "Individual Chess Game Probabilities based on Match Results". Written by Charles Roberson in 2012, the link no longer works. His method relates the P(W), P(D), and P(L) to the probabilities of a player winning or losing a match. I don't find it very convincing because he makes statements like "E(D) = Expected game draw percentage = Match play probability of losing" without substantiation, and when you plot P(W), P(D), and P(L) on the same chart you get a very sharp slope change in P(W) and P(L) that just doesn't look right.
2. "Bayesian Elo Rating" by Remi Coulom, written in 2004. (https://www.remi-coulom.fr/Bayesian...). His method calculates P(W), P(D), and P(L) using Bayes' Theorem by choosing a prior likelihood distribution over Elo ratings and computing a posterior distribution as a function of the observed results. Whatever that means. Seriously, as Dr. Elo said, calculating P(W), P(D), P(L) from P(W/D), P(L/D) requires additional information, and the author estimates a Draw Likelihood by simulation. I think that this number represents the Expected Drawing Percentage [ E(D%) ] x standard deviation (SD) but I'm not sure. With E(D%) known then E(W%) and E(L%) can be calculated and from them P(W) and P(L). With P(W) and P(L) known then P(D) = 1 – P(W) – P(L) since all results are mutually exclusive. When you plot P(W), P(D), and P(L) you get nice smooth curves which is what you hope for. I think.
As a bonus, the article addresses and quantifies White's opening advantage, something that neither Dr. Elo nor FIDE address, although Dr. Elo mentioned it in his book, dismissing it with the comment "Any incorporation of colors into the rating system, however, would again inordinately expand the bookkeeping requirements with small prospect of any utility for it, in the final analysis." IMO, wrong again, Dr. Elo, even though it's understandable given the lack of accessible computers when he developed his system.

May-07-17		AylerKupp: Calculating P(Win), P(Draw), and P(Loss) – Articles found (part 2 of 2)
3. "How to calculate probabilities of Win, draw and loss based on the ELO system" written in 2014 (https://math.stackexchange.com/ques...) with no user name given. The author attempts to calculate P(D) by looking at the expected score (EA, EB) in a game between 2 players (A and B) and, since FIDE considers a draw to be 1/2 White win and 1/2 Black win, uses the formulas:
EA = P(A wins) + 1/2 P(Draw) + 0 P(A loses) = P(A wins) + 1/2 P(Draw)
EB = P(B wins) + 1/2 P(Draw) + 0 P(B loses) = P(B wins) + 1/2 P(Draw)
But then he realized that he needed additional information (which he would have known had he read Dr. Elo's book) and gave up, asking for suggestions. Which he didn't get. Still, I used his method to calculate P(W) and P(L) using the P(D) calculated in articles 1 and 2 above. But this yielded no new information: the chart using the P(D) from Article 1 looks just like the chart in Article 1, and the chart using the P(D) from Article 2 looks just like the chart in Article 2.
I've created a spreadsheet describing the above in more detail as well as additional information such as:
1. The Percentage Expectancy Table (which is the same as FIDE's Table 8.1b, called the Scoring Probability) listed in Dr. Elo's book is wrong if a SD = 200 is used as Dr. Elo indicates he used. However, as user <Tiggler> pointed out, if a SD = 2000/7 is used instead, then the numbers match perfectly. I suppose this was another simplification made by Dr. Elo.
2. The FIDE Scoring Probability table (as well as Dr. Elo's Percentage Expectancy Table) only has 2 significant digits. As a result, each P(W/D) covers a range of rating differentials (RDiffs). It's easier to deal with probabilities if each rating differential has a unique probability associated with it, and this is listed in one of the spreadsheet tabs. Five significant digits are needed in order to uniquely associate each RDiff with a probability.
3. The probabilities calculated using the Match Results method are listed and plotted.
4. The probabilities calculated using the Bayes method are listed and plotted. This one is particularly interesting because you can see the effect of incorporating White's first move advantage into the probabilities. It also shows how to incorporate different White Advantage and Draw Likelihoods using data derived from the Opening Explorer database, the ChessTempo database, or any other database that provided a percentage of White wins, draws, and losses.
You can download this spreadsheet from http://www.mediafire.com/file/m2skk.... The file is about 2.4 GB. You will need Excel 2003 or later to view it.
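
The expected-score formulas from article 3 can be turned around once a draw probability is supplied from somewhere else, which is exactly the "additional information" the post says is missing. A minimal sketch follows; the function name `split_expected_score` and the sample numbers are hypothetical and do not come from the spreadsheet.

```python
def split_expected_score(expected, p_draw):
    """Split an expected score E into (P(win), P(draw), P(loss)).

    Uses E = P(win) + P(draw)/2 and P(win) + P(draw) + P(loss) = 1.
    The draw probability must come from an outside source (a database
    or a model); the expected score alone does not determine it.
    """
    p_win = expected - p_draw / 2.0
    p_loss = 1.0 - expected - p_draw / 2.0
    if p_win < 0.0 or p_loss < 0.0:
        raise ValueError("draw probability too large for this expected score")
    return p_win, p_draw, p_loss


# Illustrative values only: an expected score of 0.64 with half of games drawn.
p_win, p_draw, p_loss = split_expected_score(0.64, 0.50)
print(p_win, p_draw, p_loss)
```

The check on negative values shows why the draw probability cannot be chosen freely: it can be at most twice the smaller of E and 1 - E.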

May-09-17		AylerKupp: Calculating P(Win), P(Draw), and P(Loss) – The Area method
I was not satisfied with the results obtained by attempting to calculate P(W), P(D), and P(L) based on the articles I found. The Bayesian method seemed the most promising since it yielded the expected, or at least hoped-for, smooth curves. But, since neither the games database used nor the simulation was made available, the probabilities could not be modified to reflect the different P(W)s, P(D)s, and P(L)s at different player levels (both players rated 2200+, both players rated 2300+, etc.), since the EloDraw parameter was not known. And it was also not clear to me how the factor to incorporate White's opening advantage (EloAdvantage) was calculated. Besides, the resulting P(D) simply seemed too low, particularly at the higher player rating levels.
Then I had an epiphany. P(D) is the area under the P(D) curve, and the Draw percentage is based on the ratio of this area to the total area, i.e.
A[ P(D) ]% = A[ P(D) ] / ( A[ P(W) ] + A[ P(D) ] + A[ P(L) ] )
So I could iterate and find the value of EloDraw that resulted in A[ P(D) ]% being equal to the draw percentage observed in the games database, filtered to include only the player rating levels desired. And A[ P(D) ] was easy to calculate: since we are effectively dealing with the discrete probability distribution of a random variable (i.e. the results of games), it was just the sum of all the P(D)s x 1601 (the spread of the distribution, ±800 + 1 in this case), since the width of each value of the sample is = 1. And the spread is not actually needed since we are calculating ratios, so the spread cancels out. Then, once the value of EloDraw is known, P(W) and P(L) can be calculated.
I've updated the spreadsheet to add the description of the Area method and a tab to calculate and plot P(W), P(D), and P(L) for the set of ChessTempo win, draw, and loss percentages corresponding to both players rated 2200+ and 2600+. You can download this updated spreadsheet from here: http://www.mediafire.com/file/syrgd.... To make the distinction clearer, I changed the names of the parameters EloAdvantage and EloDraw to WhiteAdvantage and DrawLikelihood respectively, since they no longer have anything to do with Elo distributions, including FIDE's P(W/D) and P(L/D). Using this method you can calculate the P(W)s, P(D)s, and P(L)s using the White win, draw, and loss (Black win) percentages from any games database and using any probability distribution that you think is the most accurate.
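
The iteration described above can be sketched as a simple bisection. This is not the spreadsheet's implementation, which is not shown in the thread. It assumes a logistic win/draw/loss model with two parameters in the spirit of the Bayesian Elo article (the exact functional form here is an assumption), weights every rating difference from -800 to +800 equally as the post describes, and takes WhiteAdvantage as a given input; the value 30.0 below is an arbitrary placeholder.

```python
def f(x):
    """Logistic curve on the usual 400-point scale (assumed model)."""
    return 1.0 / (1.0 + 10.0 ** (x / 400.0))


def wdl(diff, white_advantage, draw_likelihood):
    """Return (P(W), P(D), P(L)) for White; diff = White rating - Black rating."""
    p_win = f(-diff - white_advantage + draw_likelihood)
    p_loss = f(diff + white_advantage + draw_likelihood)
    return p_win, 1.0 - p_win - p_loss, p_loss


def draw_area_share(white_advantage, draw_likelihood, spread=800):
    """Area under P(D) divided by the total area over -spread..+spread."""
    diffs = range(-spread, spread + 1)  # 1601 points when spread is 800
    total_draw = sum(wdl(d, white_advantage, draw_likelihood)[1] for d in diffs)
    return total_draw / len(diffs)      # P(W) + P(D) + P(L) = 1 at every point


def fit_draw_likelihood(target, white_advantage, spread=800, tol=1e-9):
    """Bisect for the DrawLikelihood whose area share equals the target draw rate."""
    lo, hi = 0.0, 4000.0  # the share grows from 0 towards 1 over this range
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if draw_area_share(white_advantage, mid, spread) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Example: match an observed draw rate of 52.4% (placeholder WhiteAdvantage).
fitted = fit_draw_likelihood(0.524, white_advantage=30.0)
print(round(fitted, 2), wdl(0, 30.0, fitted))
```

Bisection works here because, in this assumed model, the draw share rises steadily as DrawLikelihood increases, so there is exactly one value that matches any observed draw rate between 0 and 1.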

May-11-17		AylerKupp: OK, FWIW, I downloaded the ChessTempo data for games where both players were rated 2700+ (a very time consuming procedure, effectively 29 screen captures) and I got some interesting results:
Total number of games = 14,502 (an increase of about 180 games since 2-05-17)
White wins 4,167 games (28.7%)
Draws = 7,603 games (52.4%)
White loses 2,732 games (18.8%)
White's advantage = 9.9%
I filtered the data according to information in the Event column and I discarded games earlier than 1990 to be consistent with the KingBase data. These are the numbers for different types of games:
Classic 8,666 games (59.8%)
Blitz 3,205 games (22.1%)
Rapid 1,776 games (12.2%)
Blindfold 639 games (4.4%)
Exhibition 2 games (<0.1%)
Simultaneous 1 game (<0.1%)
Too Old 213 games (1.5%)
For Classic time control games only, here are the statistics:
White wins 2,141 games (24.7%)
Draws = 5,322 games (61.4%) (!)
White loses 1,203 games (13.9%)
White's advantage = 10.8%
So the incidence of draws for Classic time control games when both players are rated 2700+ is greater than when all games are considered. Which makes sense; I would think (I didn't calculate it) that the likelihood of errors is higher at faster time controls, never mind blindfold games.
As a check, here are the statistics for the recently completed Gashimov Memorial:
Total number of games = 45
White won 9 games (20.0%)
Draws = 29 games (64.4%)
White loses 7 games (15.6%)
White's advantage = 4.4%
Not too inconsistent, keeping in mind that this is a very small number of games so a substantial deviation from the means is expected. I doubt that I'll repeat it with the data for players rated 2600+ since there are 67,506 of those games and that would require 135 screen captures! I think I'll wait until I set up the KingBase data.
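
The percentages in the post above can be reproduced from the raw counts with a few lines. The sketch assumes that "White's advantage" means White's win percentage minus White's loss percentage, which is inferred from the figures rather than stated in the post; the function name `summarize` is hypothetical.

```python
def summarize(white_wins, draws, white_losses):
    """Turn raw result counts into the percentages quoted in the post."""
    total = white_wins + draws + white_losses
    win_pct = 100.0 * white_wins / total
    draw_pct = 100.0 * draws / total
    loss_pct = 100.0 * white_losses / total
    return {
        "total": total,
        "win_pct": win_pct,
        "draw_pct": draw_pct,
        "loss_pct": loss_pct,
        "white_advantage": win_pct - loss_pct,  # assumed definition
    }


samples = {
    "All 2700+ games": (4167, 7603, 2732),
    "Classic only": (2141, 5322, 1203),
    "Gashimov Memorial": (9, 29, 7),
}
for label, counts in samples.items():
    s = summarize(*counts)
    print(f"{label}: {s['total']} games, draws {s['draw_pct']:.1f}%, "
          f"White's advantage {s['white_advantage']:.1f}%")
```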

Mar-07-19		Sally Simpson: * ‘I met the eponymous professor [Arpad E. Elo] during the chess olympics at Nice in 1974. He was besieged with requests by players wanting the rules bent to accommodate their own requests for international titles. When the last of the supplicants had gone, Professor Elo said to me: “I think I have created a monster.” I think so too.’ (Bill Hartston, NOW! magazine, 1-7 August 1980, page 82.) C.N. 6742 *

May-13-20		MissScarlett: La Crosse Tribune, August 17th 1928, p.6: <Moon Establishes His Innocence In Trial For Assault Milwaukee, Wis. — (AP) — The moon and its phases Friday freed Paul Saunders from conviction on a charge carrying a maximum sentence of 30 years in the state prison. Through the testimony of W.P. Stewart, federal meteorologist and Arpad E. Elo, astronomist at Marquette university, Saunders proved to the satisfaction of Municipal Judge George Shaughnessy that the darkness of the night made identification of the assailant of Mrs. Jessie Forbes impossible. Mrs. Forbes complained that she was attacked by a pajama clad prowler in the bedroom of her home and identified Saunders, declaring she caught a glimpse of his face by moonlight. Saunders called upon Stewart and Elo to establish that on the night in question, the moon was in its last quarter and was so low in the sky that combined with the trees near the Forbes home shut off any light. Saunders' wife testified he was home that night, and Judge Shaughnessy then ordered a verdict of “not guilty.”> Frankly, a ridiculous defence. I'd have blamed a one-armed man and legged it.

Nov-17-21		Whitehat1963: One of my quibbles with Elo ratings is that they don’t take circumstances into consideration. For example, if two players are mathematically eliminated from placing high in an important tournament, and are playing in the last or next to last round, they will most likely draw, regardless of their respective ratings. Also, if a higher-rated player is playing black, he is far more likely to accept a draw offer than if he is playing white, especially in the late rounds of a tournament. I don’t know that a mathematical formula can account for such circumstances, but failing to account for such circumstances makes Elo ratings far from perfect.

Nov-17-21		Whitehat1963: Another problem with Elo ratings is that someone like, say, Alireza Firouzja can increase his rating by playing aggressively against a slew of lower-rated players but playing solely for draws against higher-rated players. There are no guarantees of defeating ANY player, of course, but playing very well against a bunch of lower-rated players is far easier than beating, or even drawing, against players rated in the top five or 10 in the world.

Nov-18-21		keypusher: <whitehat1963> <Another problem with Elo ratings is that someone like, say, Alireza Firouzja can increase his rating by playing aggressively against a slew of lower-rated players but playing solely for draws against higher-rated players.> Firouzja doesn't do that, but anyway: if you can beat people rated 2600 and draw with people rated 2750, your rating ought to be 2750. If your rating is anything other than (roughly) 2750, the rating system is screwed up. There are only two ways to get a high Elo rating: be really strong, or hide a copy of Stockfish in your shoe. <I don’t know that a mathematical formula can account for such circumstances, but failing to account for such circumstances makes Elo ratings far from perfect.> Luckily there are precisely zero people in the history of the world who think Elo ratings are perfect.

Nov-18-21		perfidious: Dr Elo had it about right at Nice 1974: 'I think I have created a monster'.

Jan-27-22		Whitehat1963: What would happen to players’ Elo ratings if draws were not part of the equation? I am thinking about Carlsen at the Tata Steel tournament right now. It is an elite tournament. He has four wins and six draws, no losses. His Elo rating on the live list has increased by one point. In terms of his rating, it is hardly worth the risk to play at all.

Aug-25-22		Captain Hindsight: Elo is old news at Tinder.

Dec-08-23		Caissanist: FIDE set to make major changes to the ratings algorithms on January 1. A lot of the changes appear to be pushed by Jeff Sonas, the statistician behind Chessmetrics: https://www.chess.com/news/view/fid... .

Dec-08-23		0ZeR0: <Caissanist> Very informative article, thanks for sharing. I'm no statistician but it will be interesting to see how the proposed changes will affect the rating list.

Dec-08-23		sudoplatov: I developed a simple rating system for the NFL. It does assume an essentially level playing field; it also helps if the system is closed. It's surprisingly hard to program (too much input data) and I haven't extended it. Method. At the beginning of a season (the lack of continuity of team membership makes it less useful over several years), each team has a rating of zero. After each round (Thursday to Monday in the NFL) a team gets 2 points for a win and -2 for a loss, 0 for a tie. To account for strength-of-schedule, each team is given a bonus of 1 for each win by the teams the given team beat and -1 for each loss by each team it loses to. Each game played may affect every other team indirectly. I got about 70% from mid-season on. Nate Silver (I think) has an article on randomness in the NFL indicating that about 50% of a team's result is "random." I have intended to do a study on the aging of results but just never took the time to do so. My idea was to date each game and have a decay of some amount (like 15/16 for the NFL) for each week. Thus games played several weeks earlier wouldn't count as much; the direct scores and strength terms could have different aging. I think it may work for chess tournaments. Maybe I can try this for the big tournaments from 1895 to 1905 or so. The results of similarly structured tournaments should carry over pretty well: Hastings 1895, St Petes 1896, Nuremberg 1896, etc. The method is designed to measure relative strength; I haven't looked at generating a quantitative win formula, but I think it shouldn't be too hard.
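
The method in the post above can be sketched roughly as follows. This is one possible reading of the description, not the poster's program: ties are left out because they score 0, the strength-of-schedule bonus uses each opponent's season totals, and a single decay factor is applied to both the direct score and the bonus (the post suggests they could age differently). The function name `rate` and the four-team schedule are made up for illustration.

```python
from collections import defaultdict


def rate(games, decay=1.0):
    """Rate teams from a list of (week, winner, loser) results.

    +2 for a win, -2 for a loss, +1 per win by each beaten opponent,
    -1 per loss by each opponent lost to. Older games are multiplied
    by decay ** (weeks before the latest week); decay=1.0 disables aging.
    """
    wins = defaultdict(int)
    losses = defaultdict(int)
    for _, winner, loser in games:
        wins[winner] += 1
        losses[loser] += 1

    latest_week = max(week for week, _, _ in games)
    rating = defaultdict(float)
    for week, winner, loser in games:
        weight = decay ** (latest_week - week)
        rating[winner] += weight * (2 + wins[loser])    # direct score + bonus
        rating[loser] -= weight * (2 + losses[winner])  # direct score + penalty
    return dict(rating)


# Hypothetical two-week schedule with four teams.
games = [(1, "A", "B"), (1, "C", "D"), (2, "A", "C"), (2, "B", "D")]
print(rate(games))                 # no aging
print(rate(games, decay=15 / 16))  # week-1 games count slightly less
```

Because every game adds each opponent's full record to the bonus, a later result between two other teams changes the ratings of everyone who played them, which matches the remark that each game may affect every other team indirectly.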

Copyright 2001-2025, Chessgames Services LLC