< Earlier Kibitzing · PAGE 18 OF 18 ·
|Jun-11-18|| ||Gypsy: <Tiggler> Hello my friend!|
I see I was right: This is a rabbit hole. Very dangerous. One may think that he/she can safely just nibble at the edge, and then quickly disappear down the rabbit hole for centuries before coming up for a bit of air.
|Jun-11-18|| ||Tiggler: OK <Gypsy>, I know you are always too busy to get involved in my labyrinth.|
If you do get stuck here, and become too large to get out, I'll show you a tunnel to <AylerKupp>'s warren, where you will also get lost, but at least you will shrink to a manageable size.
Here it is:
And this is a partial repost from there to tempt you:
<Tiggler: <AylerKupp> I have to tell you of a curious result that I discovered, and which <Gypsy> helped me to understand a few years ago.
Here it is:
If all players perform on average according to their current rating with random variation in accordance with the distribution used to generate their new ratings, then the population rating distribution must necessarily diverge. The population distribution will assume a Gaussian shape and will keep a constant mean, but the standard deviation of the population ratings diverges and will NECESSARILY increase without limit.
A stable distribution of the population rating is only possible if the higher-rated player in each game consistently underperforms by a small margin. Otherwise the top ratings must inflate.
This is easy to prove with a simple simulation. The reason for it is also known, and I'll find the links that help to put together the explanation if you are interested.>
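A minimal sketch of such a simulation, in plain Python (the K-factor, population size, and round count are my arbitrary choices, not anything from the original posts): every player performs exactly at Elo expectation, the updates are zero-sum so the mean stays fixed, yet the spread of ratings keeps growing.

```python
import random
import statistics

def expected(ra, rb):
    # standard Elo expectation of the player rated ra vs the player rated rb
    return 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))

def simulate(n_players=200, rounds=2000, k=16, seed=1):
    random.seed(seed)
    ratings = [1500.0] * n_players
    stds = []
    for _ in range(rounds):
        order = list(range(n_players))
        random.shuffle(order)                    # random pairings
        for i in range(0, n_players, 2):
            a, b = order[i], order[i + 1]
            e = expected(ratings[a], ratings[b])
            # each player performs exactly at expectation: P(win) = e
            s = 1.0 if random.random() < e else 0.0
            ratings[a] += k * (s - e)            # zero-sum update,
            ratings[b] -= k * (s - e)            # so the mean is conserved
        stds.append(statistics.pstdev(ratings))
    return stds

stds = simulate()
print(stds[99], stds[-1])   # the spread keeps growing over time
```

Each rating performs an unbiased random walk, so the population standard deviation grows roughly like the square root of the number of rounds, which is the divergence described above.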
|Jun-11-18|| ||Gypsy: <Tiggler> The 'power rating' problem - a more predictive handle on individual games - interests me ... a lot. |
I am collecting my thoughts and may start posting here about the topic 'without further warning'. (Just forgive me if I also go mum for a period of time. And I will not have the cycles to really program anything involved.)
|Jun-11-18|| ||Gypsy: I mean, there is a ton of juicy topics you have touched upon here. But if there is one topic that stands above the rest, it would be the power-rating problem.|
|Jun-11-18|| ||Gypsy: < ... After reading it he told his student: “Do you realize what this means? If you’re right that means that Maxwell’s equations are incorrect!” The student’s reply was: “Well, if that’s the way it goes, that’s the way it goes.” > |
That is just wonderful!
|Jun-11-18|| ||OhioChessFan: Reminds me of an awful joke that I somehow find hilarious. |
An off Broadway production of Hamlet had an unfortunate run of actors in the lead role falling ill, and they desperately turned to a second rate actor. He was every bit as bad as they feared, and the crowd began jeering his performance. He stopped mid-speech and said, "Hey, don't blame me. I didn't write this crap!"
|Jun-13-18|| ||Jambow: <could be greatly improved when it came to trying to predict the outcome of a single game.>|
The ELO formula already gives single game predictions. Not super meaningful really, but they already do exist. The fundamental problem is that with limited outcomes of only 0, .5, or 1 as a result, the standard deviation is simply too large to be meaningful for a single game. As the number of games goes up, the predictions become somewhat meaningful. Since there are human factors that can't possibly be taken into account, even then they are of only limited usefulness.
Now a short term trend direction correction factor might improve things a bit and I suppose <AK> this is what you use the TPR's for?
For us to see if this could potentially be useful, the results should have been compared to single game odds from ELO predictions for starters. Did yours do better, worse, or on par? Two tournaments also means little. If you had lots of results and your improved system consistently outperformed ELO, which I doubt, then perhaps you are on to something.
We have developed a coin flip algorithm that takes into account the size, weight, and density of metals used, as well as the air friction coefficient of both sides of the coin. Historically we examined data and determined that coins land on heads 50.00002% of the time. We did a coin toss and predicted the results with 100% accuracy. After this undeniable success we are scheduled for countless dissertations, and are sending our sales reps into the field. Would you like to purchase an advance copy before the price goes through the roof? You get my point.
|Jun-13-18|| ||Gypsy: <... We have developed a coin flip algorithm that takes into account the size, weight, and density of metals used, as well as the air friction coefficient of both sides of the coin. Historically we examined data and determined that coins land on heads 50.00002% of the time. ...> |
Sounds like you have at least a passing familiarity with Persi Diaconis' results.
|Jun-13-18|| ||Gypsy: <... The ELO formula already gives single game predictions. Not super meaningful really, but they already do exist. The fundamental problem is that with limited outcomes of only 0, .5, or 1 as a result, the standard deviation is simply too large to be meaningful for a single game. ...> |
Well, the way I look at things, Elo already gives quite meaningful predictions: Take, for instance, a game between a 2700 rated player and a 2200 rated player. (A common occurrence during the early rounds of Chess Olympiad.) We would not have a problem to know which way to bet, would we?
What I find most notable is that Elo is this effective even though it uses only the tournament x-tables. It disregards completely the actual content of the games ... the moves.
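For concreteness, the usual Elo expectation for that 2700 vs 2200 pairing can be computed in one line (a plain Python sketch using the standard logistic-with-400 convention):

```python
def elo_expected(ra, rb):
    # expected score of the player rated ra against one rated rb
    return 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))

print(round(elo_expected(2700, 2200), 3))   # roughly 0.95 for the 2700
```

A 500-point gap gives the stronger player about a 95% expected score, so indeed there is no question which way to bet.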
|Jun-13-18|| ||Gypsy: Since we have the 2018 World Cup just starting, let's think about the soccer/football game for a moment: Using Elo for a power-rating of teams means that we ignore everything about the game other than W/D/L. At the other end of the detail spectrum, predicting a game would mean an assessment of each individual player and the key match-ups (Can Gerard Pique contain Edin Džeko?). But soccer also has something in between - the score. |
Can one make a fairly good predictive power-rating system for soccer based on the scores of the individual games?
|Jun-13-18|| ||Gypsy: http://lastplanetranking.blogspot.c...|
Here is a link to one such rating system. It focuses on junior-age soccer in the Pacific NW, USA (Washington, Oregon, Idaho), with some teams from California, BC, and such also in the mix.
Computationally, the ratings are produced by the so-called RAS algorithm for what, in the econometric world, is called the matrix-balancing problem. (The algorithm is due to Sir Richard Stone, https://en.wikipedia.org/wiki/Richa...) The problem poses a large nonlinear but convex optimization problem, and the algorithm converges to a solution that is based on an implicit relative-entropy objective function. Thus, the solution is 'optimal' from the information-theoretical standpoint.
Lastplanet gives more recent links and uses.
|Jun-13-18|| ||Tiggler: <Gypsy> <RAS algorithm for the so called, in the econometric world, matrix-balancing problem ..... The problem poses a large nonlinear but convex optimization problem and the algorithm converges a solution that is based on an implicit relative-entropy objective function. >|
Can you provide any links where one might find some details? I did not find <lastplanet> at all useful.
|Jun-14-18|| ||Gypsy: <Tiggler> I know I am explaining things piecemeal. (Sorry about that, I am still doing a fair bit of thinking aloud.)|
RAS ... actually denotes matrix multiplication
A .... is a given (non-negative) m by n matrix.
R & S are diagonal matrices (of the right dimensions) ... these are the variables of the problem and it is from the optimal R & S that we extract the ratings at the end. (In the case of rating soccer teams, one of the diagonal matrices yields the ratings of offensive skills, the other yields the ratings of defensive skills.)
The Matrix Balancing Problem goes as follows:
Given two non-negative matrices A and G with the same non-zero support, find such diagonal matrices R & S that the matrix product RAS resembles G as closely as possible. (Use relative entropy as the measure of closeness.)
A beauty of the RAS Algorithm is that the said entropy enters the computations only implicitly. The actual RAS solution process is fairly simple:
First find the diagonal matrix R1, so that (R1*A) has the same row-sums as G.
Second, find the diagonal matrix S1, so that (R1*A*S1) has the same column sums as G.
Note that the multiplication of (R1*A) on the right by S1, has probably messed up the previously established row-sum property. Thus:
Find the diagonal matrix R2, so that (R2*R1*A*S1) has the same row-sums as G; and then find S2, so that (R2*R1*A*S1*S2) has the same column sums as G.
And so on and so forth. The process converges in the sense that the row-sums and column-sums stabilize; and the product Rn*..*R3*R2*R1 converges to R, and the product S1*S2*S3*...*Sn converges to S.
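The alternating row/column scaling just described can be sketched in a few lines of plain Python (small dense matrices assumed; this is my illustration of the iteration, not <lastplanet>'s actual code):

```python
def ras(A, G, iters=500):
    # Alternately rescale rows and columns of A so that r[i]*A[i][j]*s[j]
    # acquires the row sums and column sums of G.
    m, n = len(A), len(A[0])
    r = [1.0] * m
    s = [1.0] * n
    row_t = [sum(row) for row in G]
    col_t = [sum(G[i][j] for i in range(m)) for j in range(n)]
    for _ in range(iters):
        for i in range(m):                                     # the R_k step
            r[i] = row_t[i] / sum(A[i][j] * s[j] for j in range(n))
        for j in range(n):                                     # the S_k step
            s[j] = col_t[j] / sum(r[i] * A[i][j] for i in range(m))
    return r, s

A = [[2.0, 1.0], [1.0, 2.0]]    # made-up example matrices
G = [[3.0, 1.0], [2.0, 4.0]]
r, s = ras(A, G)
B = [[r[i] * A[i][j] * s[j] for j in range(2)] for i in range(2)]
```

After convergence, B = RAS has (to numerical precision) the same row sums and column sums as G; note that the relative-entropy objective never appears explicitly, exactly as described above.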
|Jun-14-18|| ||Gypsy: Since I got all the way here, the next order of business must be how, in great Pele's name, does one go about setting up the results of a soccer tournament as a matrix balancing problem:|
Assume that the teams i and j have played k games with the aggregate score x:y. Then, let
(1) A(i,j) = A(j,i) = k
(2a) G(i,j) = x
(2b) G(j,i) = y
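In code, with a made-up mini-league of three teams (all the numbers below are hypothetical, just to show the construction of A and G):

```python
# (team_i, team_j, games_k, aggregate_goals_x, aggregate_goals_y)
results = [(0, 1, 2, 5, 3),
           (0, 2, 1, 2, 2),
           (1, 2, 3, 4, 7)]
n = 3
A = [[0.0] * n for _ in range(n)]   # games-played matrix
G = [[0.0] * n for _ in range(n)]   # goal matrix
for i, j, k, x, y in results:
    A[i][j] = A[j][i] = k           # (1)  k games between i and j
    G[i][j] = x                     # (2a) goals scored by i against j
    G[j][i] = y                     # (2b) goals scored by j against i
```

A is symmetric (both teams played the same k games), while G is not, since the two teams generally score differently.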
|Jun-14-18|| ||Gypsy: A couple of side-remarks here:|
(1) The G, in matrix balancing, has been called the 'goal matrix' way before I first saw it used in connection with this particular soccer application.
(2) Other metrics besides entropy can be used. When I first worked on a matrix balancing problem, we looked at the Frobenius metric for the distance between matrices. That is the same as the L2 metric if the matrices are viewed as vectors. Frobenius did reasonably well, but the entropy-based RAS produced clearly superior estimates.
(3) Relative entropy often goes under the name Kullback-Leibler divergence
(4) Lastplanet got to her soccer rating scheme via a model that assumes that, in soccer, scoring is an approximate Poisson process. The arrival rate of goals is, on one hand, increased by the skill of the offense and, on the other hand, tempered down by the skill of the defense. From the modeling standpoint, that feels like a satisfying assumption; but the matrix balancing formulation does not really use the Poisson arrival assumption in any crucial way.
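For completeness, the relative entropy (Kullback-Leibler divergence) between two non-negative matrices, in the generalized unnormalized form one would use as the measure of closeness of RAS to G (a minimal sketch; it assumes M is positive wherever G is):

```python
import math

def rel_entropy(G, M):
    # Generalized KL divergence for non-negative matrices:
    # D(G||M) = sum over entries of g*log(g/m) - g + m  (zero iff G == M)
    d = 0.0
    for grow, mrow in zip(G, M):
        for g, m in zip(grow, mrow):
            d += (g * math.log(g / m) - g + m) if g > 0 else m
    return d
```

Unlike the Frobenius distance, this measure is tailored to non-negative, multiplicatively-scaled quantities, which is one way to see why it suits the matrix-balancing setting.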
|Jun-14-18|| ||Gypsy: To close up for today, the reason I decided to bring up all this esoteric (baroque, really) background stuff about matrix balancing is that I see a fair chance to pivot from all this back to chess.|
The preliminary plan/hope is to use an engine, say, Stockfish, as an oracle that evaluates games and then turn those evaluations into a soccer-like score.
If we can do that last step in a meaningful way, then something useful may yet rise from my abuse of your patience.
But I do have to take a break for now.
|Jun-14-18|| ||Jambow: <Well, the way I look at things, Elo already gives quite meaningful predictions: Take, for instance, a game between a 2700 rated player and a 2200 rated player. (A common occurrence during the early rounds of Chess Olympiad.) We would not have a problem to know which way to bet, would we?> |
Good point, and I nearly added the disclaimer to my initial statement that the difference in ELO is important. So in that sense it is indeed very meaningful. Point being that to know if <AylerKupp>'s formula is useful, we would need to compare the long term results with significant data. If he got 75% success but the difference in ELO was large, that may not mean much. Likewise if it has a very small sample size, then it also has little value. Apples to apples is his algorithm vs. ELO...
|Jun-14-18|| ||Jambow: Now if you were to predict outcomes and give weighted odds in, say, a head to head between certain players, like Nakamura vs Carlsen, ELO would NOT be a good predictor for betting if you chose Nakamura, would it...|
|Jun-14-18|| ||Tiggler: <Gypsy> A lot of content to digest in your last two posts. I think I'll get there.|
|Jun-14-18|| ||Tiggler: I am thinking about a problem that might be analogous, possibly even homologous, to the soccer one posed by <Gypsy>.|
In optical science one posits an object vector (that may represent a matrix), a transfer matrix, and an image vector (or matrix). The process is infected with noise (often Poisson in nature), so that the image and the estimate of the object are both random variables. The image processing problem consists in finding some optimal estimate of the object, given a sample image and a (model or experiment based) transfer matrix.
The transfer matrix and the noise may be characterized in a signal processing sense by the transinformation.
One popular method is the maximum-likelihood expectation-maximization (MLEM) algorithm, which maximizes the likelihood for a Poisson data model. Sometimes, in optics, it is called the Richardson-Lucy algorithm.
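A bare-bones MLEM/Richardson-Lucy iteration for a tiny noiseless example (plain Python, just to show the multiplicative update; real applications add stopping rules and regularization, and the transfer matrix here is made up):

```python
def mlem(H, y, iters=2000):
    # Multiplicative MLEM update: x <- x * (H^T (y / Hx)) / (H^T 1)
    m, n = len(H), len(H[0])
    x = [1.0] * n                     # flat initial estimate
    norm = [sum(H[i][j] for i in range(m)) for j in range(n)]
    for _ in range(iters):
        pred = [sum(H[i][j] * x[j] for j in range(n)) for i in range(m)]
        ratio = [y[i] / pred[i] for i in range(m)]
        for j in range(n):
            x[j] *= sum(H[i][j] * ratio[i] for i in range(m)) / norm[j]
    return x

H = [[0.8, 0.2], [0.2, 0.8]]          # hypothetical transfer matrix (blur)
x_true = [3.0, 1.0]                   # hypothetical object
y = [0.8*3 + 0.2*1, 0.2*3 + 0.8*1]    # noiseless image of x_true
x_hat = mlem(H, y)                    # recovers approximately [3, 1]
```

The update preserves non-negativity automatically, one reason the method is popular for Poisson data such as photon counts.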
|Jun-15-18|| ||Gypsy: <all> Not ignoring your points, just swamped by a friend's wedding tomorrow|
|Jun-19-18|| ||Gypsy: <Jambow> We completely agree on that: Elo predicts 'diddly squat' when the difference of ratings is small. However, I don't want us to lose sight of how well Elo does in general, despite the fact that it throws practically all information about each game away; all of it, except the final result.|
|Jun-19-18|| ||Gypsy: <Tiggler: I am thinking about a problem that might be analogous, ...> I think you are correct. Yours is one of the 'deconvolution' type of problems. They go under many monikers: independent component analysis (different from principal component analysis), projection pursuit, and so on ...|
For these problems, some entropy based approach is usually a theoretically satisfying way to go. Sometimes, something like kurtosis (the 4th statistical cumulant, k4) is in fact used, but that is because it is a sort of poor man's entropy (in this case, it tends to be the first term of entropy cumulant expansion that matters).
Another place where entropy based methods seem to pop up a lot is when you deal with vectors in the first orthant and the world you are looking at is sort-of multiplicative: the quantities scale up or down a lot, and 'information' seems to be of importance (or at least of use).
|Jun-19-18|| ||Gypsy: I should clarify that <lastplanet> is not my own alter-ego but the alter-ego of my spousal unit. While we do most things together, we each work on our own math problems. Fortunately, when she creates something nifty, like the <lastplanet> site, I am the person to brag to. :-) Thus, I know the basics of the way computations work there.|
|Jun-19-18|| ||Gypsy: Here is a basic scheme for turning the moves (also a 'score') of a chess-game or chess-game fragment into a (soccer-like) score. It utilizes a chess-engine as an auxiliary oracle. |
Let
X = [X0, X1, X2, ...]
be the sequence of position-evaluations by an engine, say, Stockfish. Let
x = diff(X) = [X1-X0, X2-X1, X3-X2, ....]
and let
y = x^+ = max[x(i), 0] ... for all i
z = x^- = max[-x(i), 0] ... for all i
be the positive part and the negative part of x, respectively. (Note that both x^+ & x^- are non-negative sequences, that x = (x^+) - (x^-), and that the indicated max operations have been taken element-wise over the entire sequence/vector.)
We define the game/game-segment (running) score by
Y = cumsum(y)
Z = cumsum(z)
and the overall score as
Y(end) : Z(end)
(Numerous variations of, as well as some objections to, the basic scheme will readily spring to mind, I suspect. We shall examine them later.)
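The whole scheme fits in a few lines of Python (the evaluation numbers below are made up for illustration; any engine's evaluation sequence would do):

```python
def soccer_score(X):
    # Split the evaluation increments into 'goals for' (Y) and 'goals against' (Z)
    x = [b - a for a, b in zip(X, X[1:])]    # x = diff(X)
    y = [max(v, 0.0) for v in x]             # x^+ , the positive part
    z = [max(-v, 0.0) for v in x]            # x^- , the negative part
    return sum(y), sum(z)                    # Y(end) and Z(end)

X = [0.2, 0.5, 0.1, 0.8, -0.3]    # hypothetical engine evaluations
Y, Z = soccer_score(X)
print(f"{Y:.1f} : {Z:.1f}")       # the overall score Y(end) : Z(end)
```

Each upward swing of the evaluation counts toward one side's 'goals' and each downward swing toward the other's, giving the soccer-like aggregate score that the matrix-balancing machinery above can then consume.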