ARCHIVED POSTS
< Earlier Kibitzing · PAGE 733 OF 1118 ·
Later Kibitzing> |
| Aug-22-14 | | Shams: Out of curiosity, how many collections must a game be included in before it reaches "notable game" status? I ask because this sublime game does not appear in a list of the White player's notable games (indeed, there is no such list.) F Parr vs G Wheatcroft, 1938
By my count it is in twenty-six collections. If only a couple more collections are needed, I can get to work on them. :) |
|
Aug-22-14
 | | SwitchingQuylthulg: <Shams> The limit, I believe, is 2. The problem here is that the notable games list is part of a special header that only some players have. To qualify for the special header, a player must have at least 25 games in the database, a criterion that Frank Parr (with 104) passes with flying colors. However, they also need to have either a highest rating of at least 2320 (don't ask) or the Chessgames Star of Megaimportance, and Parr doesn't have either of those. No special header, no notable games list; it wouldn't change anything if you added that game to another 100 collections. |
|
| Aug-22-14 | | Shams: <Switch> Harumph. Sounds a bit Old World Aristocratic to me*. Parr was clearly above FM strength in his heyday. Well, what can you do. [*or star-bellied Sneetchy] |
|
| Aug-23-14 | | zanzibar: These kinds of errors are getting harder to catch... how did it slip past <CG>'s internal consistency checks? A game with an orphaned name apparently, see following comment: N A Hussein vs J D Gemy, 2014 |
|
| Aug-23-14 | | zanzibar: Another one...
F Amonatov vs A R Saleh Salem, 2014 Admittedly the OLM is a huge tournament and difficult to error check. But there is still a lot of cleanup here - I'll have to collate the rest (if any) and post the cumulative results. * * * *
OK, there are indeed lots of them, and so I can't list them all at this time. Is it possible to ask <CG> to ensure that all player names in a downloadable game match a player name in playerlist.txt? (One could ask <CG> to make the player id available in the download, but in that case one would still require that the id in any downloadable game also be in playerlist.txt.) This tournament has been over for a bit, so one might hope it is stable and correct at this point. We seem to have the players correct, but in doing so, orphaned the PGN headers. (Where stable and correct means matching the FIDE downloadable version, I suppose.) |
|
| Aug-23-14 | | zanzibar: Well, using the latest playerlist.txt and Tromso Open 2014 I find 175 non-matched player names. In a sense, cleaning up the playerlist without cleaning up the game PGN has made my job harder. |
|
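The consistency check zanzibar asks for above can be sketched roughly as follows. This is a minimal illustration, not <CG>'s or zanzibar's actual code: the layout of playerlist.txt is not shown in the thread, so the known names are passed in as a plain set, and `unmatched_names` is a hypothetical helper name.

```python
import re

# Matches PGN tag pairs such as [White "A R Saleh Salem"].
TAG_RE = re.compile(r'^\[(White|Black)\s+"(.*)"\]\s*$', re.MULTILINE)

def unmatched_names(pgn_text, known_names):
    """Return the sorted White/Black names that are absent from known_names."""
    found = {name for _tag, name in TAG_RE.findall(pgn_text)}
    return sorted(found - set(known_names))

# Illustrative data only (assumption): a tiny stand-in for the player list
# and a PGN header whose Black name differs from the listed one.
known = {"F Amonatov", "A R Saleh Salem"}
pgn = '''[Event "Example"]
[White "F Amonatov"]
[Black "AR Saleh Salem"]
'''
print(unmatched_names(pgn, known))
```

Any name the function reports is an "orphan" in the sense used above: it appears in a downloaded PGN header but has no matching player-list entry.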
| Aug-23-14 | | chessmoron: Less than ONE WEEK away from Sinquefield Cup 2014. Time to put a page up. <CG>. Thanks. |
|
Aug-23-14
 | | chessgames.com: <zanzibar> Sorry if I'm being dense but what's the problem with F Amonatov vs A R Saleh Salem, 2014? I see that it came in as "AR Saleh Salem" but our translation table contains that entry and cleverly linked it to A R Saleh Salem. That is the player with Black, right? So what's the problem? |
|
Aug-23-14
 | | chessgames.com: Regarding N A Hussein vs J D Gemy, 2014:
You ask <How can White be <A H Al-Ali Noah> in the PGN (the game download and view) but <Noah A Hussein> here?> There is more than one explanation for problems like this; sometimes it can be traced to our own software, but in this case I think the answer is human error with player editing. Look at the page for Noah A Hussein. Look at the FIDE card. It reads "Noah, A .H. Al-Ali". So an incorrect FIDE number caused a game that should have gone to Al-Ali to instead go to Noah Hussein. The software is not perfect, but you can't hang this one on the algorithm. It did everything perfectly; FIDE said it was "Noah, A .H. Al-Ali #4800249", and it really was. So our software looks up #4800249 and finds Noah Hussein. A FIDE # match trumps a name match, so the identification was regarded as complete and authoritative. This means other games from Noah Hussein might belong to this fellow as well. |
|
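The precedence rule described above (a FIDE number match trumps a name match) might look something like the sketch below. The function and table names are hypothetical, and the name-table entry is an assumption made for illustration; only the FIDE number and the two player names come from the post.

```python
def identify(fide_id, name, by_fide_id, by_name):
    """Resolve a player record: FIDE-number match first, name match second."""
    if fide_id is not None and fide_id in by_fide_id:
        return by_fide_id[fide_id]   # treated as authoritative
    return by_name.get(name)         # fall back to the name table

# The FIDE number is recorded on the wrong player record, as in the post.
by_fide_id = {4800249: "Noah A Hussein"}
# Hypothetical translation-table entry (assumption, not confirmed by the thread).
by_name = {"Noah, A .H. Al-Ali": "A H Al-Ali Noah"}

print(identify(4800249, "Noah, A .H. Al-Ali", by_fide_id, by_name))
```

Under these assumptions the lookup returns the Noah A Hussein record even though the incoming name would have matched the other player, which is the failure mode described: the algorithm behaves as designed and the bad data wins.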
Aug-23-14
 | | chessgames.com: <chessmoron> Agreed, that's overdue. This is going to be huge. And by the way, yes of course we'll be covering the top board live each day. |
|
| Aug-23-14 | | zanzibar: <chessgames> If I navigate to a page for a game, and it shows: <<Player A> vs <Player B>> shouldn't the PGN for the game be:
<[White "Player A"]> <[Black "Player B"]> Similarly, if I navigate to a page for <Player A> shouldn't all the games listed for this player have PGN with <Player A>? Unless you make available the controlling info (like the CG id or FIDE id) we lowly users can only work off the names. Now, I have the software to be able to work off the FIDE id, or the CG id - but most users don't have this capability. So even if <CG> did make the id info available, it's the player name that is referenced after downloading a game. In general, the average user's chess database program (SCID or Chessbase) will also only work off the PGN names for the players. For example, how can anyone get an accurate cross-table from a tournament where the same player is listed under different names in the PGN? I've been accused of being thick before, but I think it's a problem for <CG> not to be consistent with a player's name. Not for the <CG> collections or pages, which use the id anyways, but downstream - for the users after a PGN download. |
|
| Aug-23-14 | | zanzibar: <chessgames> - to continue... Now, according to your 2nd post the FIDE card is supposed to be <complete and authoritative>. OK, now take a look at my post for this player:
Said Ahmed Ali Jidal Fadhil (Said Ahmed Ali Jidal Fadhil) All his games come from Tromso Open 2014, so there really should only be one version for the name, based on his FIDE card name since <CG> has no pre-existing name for the player. Instead there are three different versions of the name for the same player:
Names
<
Said Ahmed Ali Jidal Fadhil (CG version)
Ahmed Ali Jidal Fadhil Said
>
Games
<
2014.08.02 (R1.81) 1-0 Carlos Paul Abreu Jean -- Said Ahmed Ali Jidal Fadhil (A41) 29
2014.08.03 (R2.51) 0-1 Said Ahmed Ali Jidal Fadhil -- Leykun Mesfin (E60) 40
2014.08.05 (R4.56) 0-1 Ahmed Ali Jidal Fadhil Said -- Khalil Bengherabi (E09) 45
2014.08.06 (R5.63) 1-0 Uaychai Kongsee -- Ahmed Ali Jidal Fadhil Said (B07) 47
2014.08.08 (R6.73) 0-1 Cheda -- Said Ahmed Ali Jidal Fadhil (B07) 44
2014.08.10 (R8.61) = Charles Sidney Eichab -- Said Ahmed Ali Jidal Fadhil (A10) 41
2014.08.11 (R9.85) 1-0 Igor Yarmonov -- Said Ahmed Ali Jidal Fadhil (A46) 56
2014.08.14 (R11.59) = Ahmed Ali Jidal Fadhil Said -- Eduardo A Pascoal (E01) 44
>
And if you look at the <CG> names used for Said Ahmed Ali Jidal Fadhil, it's curious that <CG> seems to be flipping the names randomly with round number. (Correct - R1, 2, 6, 7, 8 Incorrect - R4, 5, 11)
Maybe the first couple of rounds might have gotten misnamed when <CG> was working without the new FIDE id tag. But I pointed out the correction by round 2 or 3. So how could this happen? |
|
| Aug-23-14 | | zanzibar: And while I'm on a roll, here is a list of <CG> players with three names in the PGN download from the <Tromso Open OLM 2014>:
<
Al Amri Salim
2 ['Al Amri, Salim', 'Salim Al Amri']
Alejandro Montalvo
2 ['Alejandro Montalvo Rosario', 'Montalvo Rosario, Alejandro']
Alexis Murillo
2 ['Alexis Murillo Tsijli', 'Murillo Tsijli, Alexis']
Andres Guerrero
2 ['Andres Guerrero Vargas', 'Guerrero Vargas, Andres']
Bernal Gonzalez
2 ['Bernal Gonzalez Acosta', 'Gonzalez Acosta, Bernal']
Gondo Simplice Armel De
2 ['De Gondo, Simplice Armel', 'Simplice Armel De Gondo']
Herman Ho Hou-Meng
2 ['Herman Hou-Meng Ho', 'Ho Hou-Meng, Herman']
Kouko Hubert Ble
2 ['Ble Kouko, Hubert', 'Hubert Ble Kouko']
Lisandro Munoz
2 ['Jose Lisandro Munoz Santana', 'Munoz Santana, Jose Lisandro']
Luis Esquivel
2 ['Esquivel Golcher, Luis', 'Luis Esquivel Golcher']
Moawia Ahmed Holi Ali
2 ['Ahmed Holi Ali, Moawia', 'Holi Ali Moawia Ahmed']
Orlando Santana
2 ['Orlando Santana Otero', 'Santana Otero, Orlando']
Salim Al Amri
2 ['Al Amri Salim', 'Al Amri, Salim']
> |
|
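Most of the variants listed above differ only in word order and comma placement, so one simple way to find them is to compare names by an order-insensitive key. The sketch below is an assumption about how such a grouping could be done, not zanzibar's actual program; `name_key` and `group_variants` are hypothetical helpers.

```python
from collections import defaultdict

def name_key(name):
    """Order-insensitive key: commas dropped, lowercased, tokens sorted."""
    return tuple(sorted(name.replace(",", " ").lower().split()))

def group_variants(names):
    """Map each key to its distinct spellings, keeping only keys with 2+ spellings."""
    groups = defaultdict(set)
    for n in names:
        groups[name_key(n)].add(n)
    return {k: sorted(v) for k, v in groups.items() if len(v) > 1}

# Names taken from the list above, plus one player with a single spelling.
names = ["Al Amri, Salim", "Salim Al Amri",
         "Ble Kouko, Hubert", "Hubert Ble Kouko",
         "Igor Yarmonov"]
for variants in group_variants(names).values():
    print(len(variants), variants)
```

A key like this only catches reorderings. Spelling differences or dropped name parts (such as a first name missing from one variant) would need a fuzzier comparison or, better, the FIDE id.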
Aug-23-14
 | | chessgames.com: <So how could this happen?> I'm not sure, but I can certainly suggest a possible way for it to happen. Consider: call the three names for one of these players A', A'', and A''' (Said Ahmed Ali Jidal Fadhil comes to mind).
Round one: FIDE sends us A' without a FIDE number (that we know of) and the software creates a new player record.
Round two: FIDE sends us A'' without a FIDE number (that we know of) and the software creates a new player record.
Round three: FIDE sends us A''' with a FIDE number and the software still creates a new record because the name is unique.
Round four through end: FIDE continues to send us variations of A', A'', and A''' but now we have a working FIDE number so they all get assigned to the A''' record.
Now we end up with three player records with three different names. A three-way merge is performed (I can look this up in the Librarian logfiles to see if in fact Ahmed was triple-merged) and we end up with one player, but the PGN will show all three variations of the name until pgnfix is run next. Then you examine the player and see what you saw.
Then pgnfix is run and everything is normalized again. I am not 100% sure that's what happened but it would explain everything, right? |
|
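The round-by-round scenario above can be simulated in a few lines. This is only a toy model of the hypothesis in the post, with a made-up FIDE number and a hypothetical `ingest` function; it says nothing about how the real intake software is written.

```python
def ingest(feed):
    """Assign each (name, fide_id) entry to a record; return record -> names filed."""
    records = {}   # record key (its first name) -> names filed under it
    by_fide = {}   # FIDE number -> record key
    for name, fide_id in feed:
        if fide_id is not None and fide_id in by_fide:
            record = by_fide[fide_id]        # known FIDE number wins
        else:
            record = name                    # otherwise file by exact name
            records.setdefault(record, [])   # creates a new record if unseen
            if fide_id is not None:
                by_fide[fide_id] = record
        records[record].append(name)
    return records

# Rounds 1-2 arrive without a number, round 3 brings one (1 is a placeholder),
# and later rounds keep varying the name but carry the number.
feed = [("A'", None), ("A''", None), ("A'''", 1), ("A'", 1), ("A''", 1)]
result = ingest(feed)
print(len(result), result)
```

The simulation ends with three records, and the third one holds games filed under all three spellings, which is the state that would then need a merge followed by a name-normalizing pass.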
| Aug-23-14 | | zanzibar: <Round four through end: FIDE continues to send us variations of A' A'' and A''' but now we have a working FIDE number so they all get assigned to the A''' record.> Yes, this is possible, FIDE botched up the names, especially in the early rounds of the tournament. But as I noted for <Said Ahmed Ali Jidal Fadhil> your scenario still doesn't explain the correct R1,2,6,7,8 interlaced with the incorrect R4,5 and 11 results for the name. Unless the <CG> round numbers aren't to be trusted? I still haven't gotten the names tamed to the point to be able to do a secondary comparison of results like the rounds or movelists. I've had to rewrite a lot of my program to do error corrections, most of which must be automated for such a big tournament (>1k games is too many to guide by hand). This was a big job, since I really had to assume both <FIDE> and <CG> could be inconsistent with themselves when it comes to player names. Only the FIDE id could be trusted really. And then, since I was starting to duplicate so much code everywhere, I decided it was finally time to break the programs up into modules. Until you get some good prototypes working it's hard to anticipate what is needed, design-wise. And modularizing everything takes some time to do even with a good design. I was hoping to finally put it all back together for a working pass with the latest downloads - and found another working assumption I made had failed (i.e. all <CG> user names in a PGN download would match a <CG> playerlist record). The final point is that pgnfix should have indeed normalized everything by now, since the tournament ended what, two weeks ago? |
|
Aug-23-14
 | | chessgames.com: <The final point is that pgnfix should have indeed normalized everything by now, since the tournament ended what, two weeks ago?> That is indeed perplexing. I have an idea but it doesn't quite match the evidence. It might have something to do with a human error that occurred yesterday that impacted about 20-30 games. Let me say this much: once fixpgn is run it should be impossible to find a discrepancy between the names in the PGN and the names you see as a blue link on the game pages. That's one of fixpgn's big jobs, to normalize names. (It actually does a list of over 40 things to PGN data, but normalizing the names is one of the most obvious and important changes.) So for right now, it should be impossible for you to find a discrepancy with the possible exception of the very newest games, e.g. the French Championships (2014). I don't have a good answer for why it was possible for you to find a discrepancy yesterday or earlier today. Like I said, I have a hunch the human error the other day might play a role. At least then we can chalk it up to a fluke. I apologize if this is tripping up your own analysis work. |
|
| Aug-23-14 | | zanzibar: <chessgames> No problem then, if we agree on the expected behavior (said with relief). I don't have a big rush on getting the corrected PGN - today, tomorrow or the next day are all fine. I suppose it's a good sanity check that I should have in place anyways, to check for unexpected CG names. Maybe I should factor out my name matching routine so that it can work against either the CG names or the FIDE names. PS- Some of the unexpected <CG> names look as if the <FIDE> names were used verbatim - since they have commas in PGN White/Black names. |
|
| Aug-23-14 | | zanzibar: A quick followup - as it seems <CG> has corrected some (all?) of the 2014 Tromso Open OLM. Specifically I was looking at <Mohammad Younus>, a Pakistani player whose <CG> name didn't match any of the PGN files I downloaded the other day from the tournament (previously only <Younus Mohammad> and <Muhammad Younus Younus> appeared). Checking today, all his 2014 tournament games look OK. But.... I also peeked at his 2006 Olympiad games, which have PGN tags listing him as <Muhammad Younus Younus>. The same issue arises, not just for the one tournament, but for all tournaments for a given player. At the risk of being redundant...
<Shouldn't all games listed under a <CG> player's name have the same name as used on the profile page used in the PGN?> E.g.
Muhammad Younus <Player Profile> http://www.chessgames.com/perl/nph-... <Muhammad Younus Younus> as Black. Can it be that nobody has complained about this before?! |
|
Aug-23-14
 | | chessgames.com: I misspoke when I said it would be impossible for you to find a discrepancy. The only way you could find a discrepancy is if you located a game of a player whose name recently underwent a change, and who had games dating way back (long before recent Olympiads etc.) And indeed <Younus Mohammad> is just such a case, because we just fixed his record a few days ago. As you know we had both <Younus Mohammad> and <Mohammad Younus Younus> in the database. The Librarian noted that FIDE prefers the single-Younus version, so that's the one we settled on. <Can it be that nobody has complained about this before?!> I don't recall the details but I would imagine that the new "Younus Younus" entry just appeared with the Olympiad, so until recently there was nothing to complain about, right? The only reason why you see a discrepancy on game #1418283 is not because the PGN has been inconsistent for years, but because the name changed in the past few days. Reconciling that is what fixpgn does, and I'm running it right now, from start to finish. |
|
| Aug-24-14 | | zanzibar: <chessgames>
<<Can it be that nobody has complained about this before?!> I don't recall the details but I would imagine that the new "Younus Younus" entry just appeared with the Olympiad, so until recently there was nothing to complain about, right? The only reason why you see a discrepancy on game #1418283 is not because the PGN has been inconsistent for years, but because the name changed in the past few days.> I agree with your analysis for this case.
But try to rephrase the question more generally - after a little setup. After you do the big fixpgn run the database will be properly normalized. But how often is that done? Not daily apparently, as the Younus example shows. (I don't know exactly how regularly fixpgn is run over the entire data set; I have the impression that it was targeted at the new games from recent tournaments.) I think the Younus example shows that <CG> might also want to run fixpgn immediately over the set of games of a player whose name is being changed. My question was for Younus, but is really more general - since I assume that <CG> has modified other player records in a similar manner. So again, while a whole-data-set fixpgn run fixes this problem, the Younus example shows that such a run isn't done on a daily basis. Which means that <CG> can be inconsistent with itself for time-frames large enough for people to see it. Again, it seems the simple fix is to formalize the procedures when changing player data, i.e. to immediately normalize the subset affected by the changes. |
|
Aug-24-14
 | | chessgames.com: I know what you're saying, and I imagine you know what the expression "dirty bit" means? It's just a bit that flags records that need to be tended to by some other process, in this case fixpgn. New entries would automatically have their dirty-bit set, but other instances could set the bit as well. If the Librarian changes a player name, we set the dirty-bit for all of their games. Then fixpgn would simply process every game with a dirty-bit and stop. Running fixpgn "on the fly" isn't a hot idea, because it loads a lot of stuff into RAM. So it's blazingly fast when it fixes 10,000 consecutive records at once but very inefficient when you tell it to tend to a specific game. The dirty-bit sounds great in principle—but in practice, there are so many events that can require a game to be changed, it would be almost impossible to set the dirty-bit properly in every single case. You are very name-focused, but fixpgn has to worry about many other changes: the definition of an ECO code, or if it spots a 0-0 (zero-zero) castle and wants to replace it with the more proper O-O, or if an editor added or deleted a PGN comment and thus the moves require better word-wrapping. So the dirty-bit might help but I would still imagine from time to time we would have to run the "full scan" from start to finish. And that would still be a little better than what we have now. |
|
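The "dirty bit" idea described above could be sketched along these lines. Everything here is a hypothetical illustration: the `Game` record, the field names, and the two functions are assumptions, and the real fixpgn does many other things besides names.

```python
from dataclasses import dataclass

@dataclass
class Game:
    gid: int
    white_id: int
    black_id: int
    white: str          # name as stored in the PGN
    black: str
    dirty: bool = True  # new entries start out dirty

def rename_player(pid, new_name, canonical, games):
    """Change a canonical name and flag every game of that player."""
    canonical[pid] = new_name
    for g in games:
        if pid in (g.white_id, g.black_id):
            g.dirty = True

def process_dirty(games, canonical):
    """Batch pass over flagged games only; returns how many were touched."""
    touched = 0
    for g in games:
        if g.dirty:
            g.white, g.black = canonical[g.white_id], canonical[g.black_id]
            g.dirty = False
            touched += 1
    return touched

canonical = {1: "Player A", 2: "Player B", 3: "Player C"}
games = [Game(10, 1, 2, "Player A", "Player B", dirty=False),
         Game(11, 2, 3, "Player B", "Player C", dirty=False)]
rename_player(1, "Player A Renamed", canonical, games)
n = process_dirty(games, canonical)
print(n, games[0].white)
```

The sketch also shows the weakness raised in the post: only code paths that remember to set the flag get picked up, so a change made elsewhere (an ECO redefinition, say) would still need an occasional full scan.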
| Aug-24-14 | | zanzibar: <You are very name-focused, but fixpgn has to worry about many other changes: the definition of an ECO code, or if it spots a 0-0 (zero-zero) castle and wants to replace it with the more proper O-O, or if an editor added or deleted a PGN comment and thus the moves require better word-wrapping.> I admit to the name focus, since so much hinges on the name and finding the right name is such a challenge. I think doing the ECO code can be a challenge, but it's not so important in the end, as <perfidious> and <phoney> point out it can do funny things sometimes - which we note and discuss, but no real damage results. (Doing the ECO code probably is the toughest thing to do code-wise, but I don't know, never thinking much about it. I normally let SCID do this for me) The cleaning of PGN is important, but I think somewhat mechanical and straight-forward. Maybe one of the easier tasks. But the name is important, maybe taking precedence. It can mess up all the stats, the crosstables, etc. If I don't know who played the game it loses much of its value. If I could be so bold, maybe fixpgn is too heavy for the light-weight name-synchronization task? In this one case you just want to propagate a librarian name-change to the White/Black pgn fields where you already know the gid of the games involved. Not a hard task, and one where I would think doing it asap would be best. (Yes, I know the dirty-bit concept - which would suggest to me running fixpgn on the fly when a user goes to download the stale cached version of the PGN. Far, far, beyond the level of sophistication I was thinking!) |
|
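The light-weight name synchronization suggested above amounts to rewriting one tag pair in a game's PGN text. A minimal sketch, assuming the PGN is available as a string and using a hypothetical `set_tag` helper and placeholder names:

```python
import re

def set_tag(pgn_text, tag, value):
    """Replace the value of one PGN tag pair, leaving the rest of the text alone.

    Assumes value contains no double quotes (no escaping is done here).
    """
    pattern = re.compile(r'^\[' + re.escape(tag) + r'\s+".*"\]', re.MULTILINE)
    # A function replacement avoids backslash handling in the new value.
    return pattern.sub(lambda m: f'[{tag} "{value}"]', pgn_text, count=1)

# Placeholder game text (illustrative only).
pgn = '[White "Player A"]\n[Black "Player B Old Spelling"]\n\n1. e4 e5 *\n'
fixed = set_tag(pgn, "Black", "Player B")
print(fixed)
```

Applied to each game id already known for the renamed player, this touches only the header line and leaves movetext, comments, and other tags as they were.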
| Aug-25-14 | | Boomie: <CG>
On the Search page:
<Riga Technical University Open
Riga, Latvia
Aug 15-24
Alexey Aleksandrov wins with 8/12, beating out Rapport, Malkumyan, Fridman, et al.> However, this "result" is due to duplicate games entered for Aleksandrov. Notice there were only 9 rounds in the tournament. In his game list:
8. A Aleksandrov vs A Kveinys ½-½ 15 2014 Riga Technical University Open C95 Ruy Lopez, Closed, Breyer
9. A Aleksandrov vs A Kveinys ½-½ 15 2014 Riga Technical University Open C95 Ruy Lopez, Closed, Breyer
10. A Aleksandrov vs A Kveinys ½-½ 15 2014 Riga Technical University Open C95 Ruy Lopez, Closed, Breyer
11. A Aleksandrov vs A Kveinys ½-½ 15 2014 Riga Technical University Open C95 Ruy Lopez, Closed, Breyer |
|
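Duplicates like the four identical entries above can be caught by keying each game on the fields that should be unique together. The sketch below assumes games are plain dicts with hypothetical field names; the movetext is a placeholder because the moves are not given in the post.

```python
def drop_duplicates(games):
    """Keep the first copy of each game, comparing players, event, result and moves."""
    seen, unique = set(), []
    for g in games:
        key = (g["white"], g["black"], g["event"], g["result"], g["moves"])
        if key not in seen:
            seen.add(key)
            unique.append(g)
    return unique

# Four copies of the same entry, as in the game list above.
entry = {"white": "A Aleksandrov", "black": "A Kveinys",
         "event": "Riga Technical University Open",
         "result": "1/2-1/2", "moves": "(movetext omitted)"}
games = [dict(entry) for _ in range(4)]
print(len(drop_duplicates(games)))
```

A cheaper sanity check in the same spirit is to flag any player whose game count in an event exceeds the number of rounds, which is what gave the duplicates away here (12 games in a 9-round tournament).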
Aug-25-14
 | | Chessical: <Chessgames.com>: A plea. Would it be possible to find time to include the following in the database so that I may submit: Game Collection: Hastings (1981/82)
1. Ree vs Lein
2. Littlewood vs Mestel
3. M Rivas Pastor vs Lein
4. Chandler vs Ree
5. Lein vs Littlewood
All the above were submitted Saturday, 26th July 2014. Many thanks. |
|
Aug-25-14
 | | chessgames.com: <Boomie> We're terribly sorry--we should have noticed that. It's now corrected. <Chessical> No problem, it will be given priority. |
|