ARCHIVED POSTS
< Earlier Kibitzing · PAGE 704 OF 1118 ·
Later Kibitzing> |
| May-18-14 | | zanzibar: By the way, while I'm here. Is it possible to get a list of all player's names on <CG>? Maybe with pid? And are all living player's bios linked to their FIDE card? And for listing uncertain deaths, can you use circa, or a date like 1890(?) with best estimate? |
|
May-18-14
 | | chessgames.com: <john barleycorn: this page looks strange - 2 players a result and no game> Oh, I see what you've done. You included "year=2000" in your search, so it correctly gives no games and also shows their head-to-head score. The remedy for that anomaly would probably be to state "Year search: 2000" at the header to make it perfectly clear what that page represents. <zanzibar> I'm looking at the code now. You are right, the Har-Zvi into HARZVI transformation is a bit odd. I'm trying to remember the rationale for doing that. I remember something about players who had names that could be written with-or-without an apostrophe, and the dash got swept up in the same kind of logic. Just to clarify something, the advanced search results on names will do one of three things:
1. Find a player with that exact name and assume that's what you want.
2. Find several players and list them.
3. Admit that it doesn't really know who you are looking for and give you every game that includes a player name even similar to what you typed (even if it only matches a first name).
For the notable players like Smirin, the simplest way is to use the pulldown and locate him by player ID#. (Pro tip: there is an option in your Chessgames Preferences Page that allows you to see a much longer list of players.) |
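The Har-Zvi into HARZVI transformation mentioned above can be pictured with a small Python sketch. The site's actual code is Perl and is not shown in this thread, so the function name and the exact set of stripped characters are assumptions made only for illustration.

```python
import re

def normalize_name(name: str) -> str:
    """Hypothetical normalization: drop apostrophes and dashes, then uppercase.

    This only mirrors the behavior described in the post; it is not the
    site's real rule.
    """
    return re.sub(r"['’\-]", "", name).upper()

print(normalize_name("Har-Zvi"))  # HARZVI
print(normalize_name("O'Kelly"))  # OKELLY
```

A rule like this makes a name match whether or not the apostrophe is typed, at the cost of also fusing the two halves of a hyphenated name.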
|
May-18-14
 | | chessgames.com: <zanzibar>
<Is it possible to get a list of all player's names on <CG>? Maybe with pid?> Yes, it's possible. A very elementary perl script could output that. Nobody's ever asked before. <And are all living player's bios linked to their FIDE card?> Probably not. Some of our more enthusiastic game uploaders wait to see when their submissions get processed and immediately assign FIDE ID numbers to the new player entries. However it's hard to imagine there isn't a single person who got overlooked. Now that FIDE is keeping track of people under 2000 the number of entries in their database has exploded. <And for listing uncertain deaths, can you use circa, or a date like 1890(?) with best estimate?> Whether births or deaths, the database allows for uncertain months and days, but not uncertain years. Specifically, it accepts: YYYY-MM-DD
YYYY-MM-00
YYYY-00-00
In other words, if it's circa 1890 then you'll have to put in 1890-00-00. Then it's only proper to explain in the biography that the year of his death is not known with certainty, perhaps right up at the top where the birth and death years are displayed. |
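As a rough illustration of the three accepted date forms, here is a minimal Python sketch of a parser that treats 00 as "unknown". The names PartialDate and parse_partial_date are hypothetical, and the validation rules beyond the three listed forms are assumptions.

```python
import re
from typing import NamedTuple, Optional

class PartialDate(NamedTuple):
    year: int
    month: Optional[int]  # None when the month is stored as 00
    day: Optional[int]    # None when the day is stored as 00

_DATE_RE = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")

def parse_partial_date(text: str) -> PartialDate:
    """Parse YYYY-MM-DD, YYYY-MM-00 or YYYY-00-00; the year must be given."""
    match = _DATE_RE.match(text)
    if not match:
        raise ValueError(f"not in YYYY-MM-DD form: {text!r}")
    year, month, day = (int(group) for group in match.groups())
    if month == 0 and day != 0:
        # YYYY-00-DD is not one of the accepted forms.
        raise ValueError("a known day requires a known month")
    if month > 12 or day > 31:
        raise ValueError("month or day out of range")
    return PartialDate(year, month or None, day or None)

print(parse_partial_date("1890-00-00"))
# PartialDate(year=1890, month=None, day=None)
```

Note that nothing in this representation can mark the year itself as uncertain, which is why the post suggests explaining a "circa" year in the biography text.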
|
| May-18-14 | | zanzibar: <chessgames.com> & others. Many thanks. It's a little too late, and I'm a little too tired to add more today. |
|
May-19-14
 | | Stonehenge: Duplicate games, both have incomplete game scores:
M van 't Kruijs vs Anderssen, 1861 and M van 't Kruijs vs Anderssen, 1861 Full score:
[Event "Amsterdam"]
[Site "Amsterdam"]
[Date "1861.07.13"]
[Round "?"]
[White "van 't Kruijs, Maarten"]
[Black "Anderssen, Adolf"]
[Result "1-0"]
[ECO "A00"]
1. a3 e5 2. c4 Bc5 3. Nc3 a5 4. e3 Nc6 5. Nge2 d6 6. d4 Bb6 7. Na4 Ba7 8. d5 Nce7 9. b4 f5 10. Nec3 Nf6 11. Be2 O-O 12. Nb5 Bb8 13. Nbc3 c6 14. b5 cxb5 15. cxb5 b6 16. Bc4 Ng6 17. O-O f4 18. Bd3 Nd7 19. Qh5 Qe8 20. Ne4 Bc7 21. exf4 exf4 22. Bb2 Nde5 23. Ng5 h6 24. Bxg6 Nxg6 25. Rfe1 Ne5 26. Qxe8 Rxe8 27. Bxe5 dxe5 28. Ne4 Bd7 29. Nac3 a4 30. Rab1 Rec8 31. d6 Bd8 32. Rb4 Kf7 33. f3 Ke6 34. Rd1 g5 35. g4 fxg3 36. hxg3 h5 37. Kf2 Ra7 38. Ke3 Be8 39. f4 exf4+ 40. gxf4 g4 41. Rd5 Bf6 42. f5+ Kf7 43. Nxa4 Rb8 44. Nxf6 Kxf6 45. Kf4 Rg7 46. Nc3 Bd7 47. Ne4+ Kf7 48. f6 Rgg8 49. Rxh5 Rh8 50. Re5 Rbd8 51. Re7+ Kf8 52. a4 Rh1 53. Rb2 Be8 54. Rd2 Bf7 55. Kxg4 Bb3 56. Ng5 Ra8 57. Nh7+ Kg8 58. Ng5 Rxa4+ 59. Kf5 Rf1+ 60. Kg6 Ra8 61. Rg7+ Kf8 62. d7 1-0 |
|
May-19-14
 | | chessgames.com: M van 't Kruijs vs Anderssen, 1861 happened to have a correction slip from 2008 with exactly those extra moves. Thanks. |
|
| May-19-14 | | zanzibar: This was discussed on the bistro, but I'm not sure I was quite clear enough about it. Doing an <Advanced Search> with two players: (1) <Van der Werf>
(2) <Winants>
(Note: any year)
should yield 0 hits, since <CG> has no games between these opponents. Oops! Good thing I checked - looks like my submission from last night was already processed. Good work! Let's, ahem, carry on.
http://www.chessgames.com/perl/chess.pl?yearcomp=exactly&year=&playercomp=either&pid=&player=van+der+werf&pid2=&player2=winants&movescomp=exactly&moves=&opening=&eco=&result=
http://www.chessgames.com/perl/ches... This yields 21 games. That's 20 games more than should be there. OK, <CG> is being generous with <Van der>, unfortunately complicating the lives of millions of Dutch. But, heck, they asked for it with these multi-word last names! (Still, the <CG> behavior is a little non-intuitive I think). The usual "geek-solution" is to "quote" the search term to eliminate wild-carding and force an exact match. That doesn't work here though, as <CG> just strips out the quotes and returns the same results. OK - what surprises me is that <Werf> is last, and without a comma in the search term, should be given primacy as the search match. Now, it's been suggested that I just search on a piece of the name, which would be <Werf>. But what if the name was <Van der Wolff>, then I would be picking up a lot of spurious hits. OK, what's my point? Well, my main point, besides documenting some of this behavior, is to advocate for the more exact search with quotes. You could still have extra characters in the name, but, if doing an exact search term, the exact search term must also be present. Yes? |
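For anyone scripting such searches, the query string can be assembled rather than typed by hand. The following Python sketch copies the parameter names from the URL quoted in this post; the function name is hypothetical, and whether the site accepts every combination of values is an assumption.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.chessgames.com/perl/chess.pl"

def advanced_search_url(player: str, player2: str = "", year: str = "") -> str:
    """Build an Advanced Search URL for two player names (hypothetical helper)."""
    params = {
        "yearcomp": "exactly",
        "year": year,
        "playercomp": "either",
        "pid": "",
        "player": player,
        "pid2": "",
        "player2": player2,
        "movescomp": "exactly",
        "moves": "",
        "opening": "",
        "eco": "",
        "result": "",
    }
    # urlencode turns spaces into "+" and keeps the dictionary order.
    return BASE_URL + "?" + urlencode(params)

print(advanced_search_url("van der werf", "winants"))
```

Leaving `year` empty corresponds to the "any year" search described in the post.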
|
May-19-14
 | | chessgames.com: <the <CG> behavior is a little non-intuitive I think> That's an understatement, and it's a problem we've wrestled with for years. One thought was to make an "implicit AND" so that "Van Der Werf" gets translated into (MATCHES "VAN") AND (MATCHES "DER") AND (MATCHES "WERF") as opposed to what we have now
(MATCHES "VAN") OR (MATCHES "DER") OR (MATCHES "WERF") But then we run into problems when people do searches on "Gary Kasparov" and find no results, because we spell it "Garry". And we get searches like that all the time. Even a search for "Bobby Fischer" technically doesn't match our database since we have him listed as "Robert James". A hack which might help would be to strip out all very common strings, like VAN, VON, DER, etc. before the search is performed. Not much of a solution but it would help in this case and similar ones. Implementing quoted-string searches is very complicated and I don't know if we want to take that step. |
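The trade-off between the OR match, the "implicit AND", and stripping common strings can be seen in a toy Python sketch. The player list and the PARTICLES set are sample data chosen for illustration, and whole-word matching is an assumption; the real search may match differently.

```python
# Illustrative particle list; the post only says "VAN, VON, DER, etc."
PARTICLES = {"VAN", "VON", "DER"}

# Sample data, not a dump of the real database.
PLAYERS = [
    "Mark van der Werf",
    "John van der Wiel",
    "Loek van Wely",
    "Luc Winants",
    "Garry Kasparov",
]

def tokens(text):
    return text.upper().split()

def search(query, players, mode="or", strip_particles=False):
    """Return players whose name contains any ("or") or all ("and") query words."""
    terms = tokens(query)
    if strip_particles:
        kept = [t for t in terms if t not in PARTICLES]
        terms = kept or terms  # fall back if the query was only particles
    combine = all if mode == "and" else any
    return [p for p in players if combine(t in tokens(p) for t in terms)]

print(search("Van der Werf", PLAYERS))
# ['Mark van der Werf', 'John van der Wiel', 'Loek van Wely']
print(search("Van der Werf", PLAYERS, strip_particles=True))
# ['Mark van der Werf']
print(search("Gary Kasparov", PLAYERS, mode="and"))
# []
```

The last call shows the problem raised in the post: a strict AND finds nothing for "Gary Kasparov", while the particle-stripping variant keeps the forgiving OR behavior and still narrows "Van der Werf" to one player.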
|
May-19-14
 | | SwitchingQuylthulg: How the Advanced Search works seems to depend on whether it's considered as a player search or a game search. If you only input one player, e.g. "Mark Werf", with nothing else (year, opponent, result etc.) given, it's to some extent interpreted as an AND search: thus, you get Mark van der Werf and no one else. (More precisely, it's interpreted as "Mark" plus at least one of the following words, but with just the one following word that boils down to the same thing.) If there are multiple results for that single input, the player listing and the game listing will be based on different logics: the player listing will be based on the above logic and the game listing will be based on AND/OR logic. Thus, a search for "geert peeters danneel" (http://www.chessgames.com/perl/ches...) will give Geert Peeters and Geert Danneel as player results while listing all games by anyone named Geert, Peeters or Danneel. <chessgames.com: But then we run into problems when people do searches on "Gary Kasparov" and find no results, because we spell it "Garry". And we get searches like that all the time. Even a search for "Bobby Fischer" technically doesn't match our database since we have him listed as "Robert James".> Build a database of valid search terms for each player (or players for each valid search term) and include "Gary" as a valid search term for Garry Kasparov. A database like that would solve any number of spelling issues (the same way redirects do at Wikipedia); it would no longer matter if you spelled Kļaviņš as Klavins, Kliavin, Kljavinsh or Klyavin, you'd still get him as a result. And if you let editors help build that database, it would be reasonably complete pretty soon and without requiring that much effort from your side. |
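A database of valid search terms per player could look roughly like the following Python sketch. The class AliasIndex and its methods are hypothetical names, and an in-memory dictionary stands in for whatever table the site would actually use.

```python
from collections import defaultdict

class AliasIndex:
    """Maps every accepted search term to the canonical player name(s)."""

    def __init__(self):
        self._by_term = defaultdict(set)

    def add(self, canonical, *aliases):
        # The canonical spelling is itself a valid search term.
        for term in (canonical, *aliases):
            self._by_term[term.casefold()].add(canonical)

    def lookup(self, term):
        return sorted(self._by_term.get(term.casefold(), ()))

index = AliasIndex()
index.add("Garry Kasparov", "Gary Kasparov")
index.add("Kļaviņš", "Klavins", "Kliavin", "Kljavinsh", "Klyavin")

print(index.lookup("gary kasparov"))  # ['Garry Kasparov']
print(index.lookup("Klyavin"))        # ['Kļaviņš']
```

Because a term maps to a set, one alias can point at several players, which is where a listing step like "find several players and list them" would take over.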
|
| May-19-14 | | zanzibar: <chessgames> yes, there are a number of approaches. I'm not sure if the quoted string approach would be so complicated, but I'd have to review my MySQL. A bit rusty in that department, I am. The common conjunction omission (van/der/von/y/de) is similar to what Google does. It's actually a good idea for a minimal fix of the current system I think. Another hack approach, that jumps out to me anyways, is to allow a comma in the search string. Then <collation name>, <prenom> would really map to (MATCHES "<collation name>"). Since, er, non-experts would be unlikely to use a comma they wouldn't be exposed to the stricter searching behavior. Etc. etc. We won't solve the problem here.
You're aware of it, and I'll pop in here when I find "illuminating" examples of it from time to time. If you do come up with some other ideas though, please let us know! Thanks. |
|
May-19-14
 | | chessgames.com: I could chat all day about how quoted string searching works, the first search engine to allow for it on the net was Altavista and it was regarded at the time as revolutionary. What it boils down to is creating a dictionary of every name that appears, and then making a table that says "in this player, A follows B" where A and B are both names. If you are dealing with a lexicon of size N then the number of rows in this table will be at most N^2 (but realistically much lower since some names are unlikely to follow others.) Then if you do a search for "A B C" (with quotes) you end up finding records WHERE (B FOLLOWS A) AND (C FOLLOWS B). If you quote 10 things you'll have a search with 9 clauses, no big deal. All of this could be done (I actually implemented a system like that years ago) but the number of people who would even use quoted searching is very few, and without addressing some more fundamental problems I think it would be a waste of effort. Not to mention the countless gigabytes to store N^2 names. <Build a database of valid search terms for each player (or players for each valid search term) and include "Gary" as a valid search term for Garry Kasparov. A database like that would solve any number of spelling issues (the same way redirects do at Wikipedia); it would no longer matter if you spelled Kļaviņš as Klavins, Kliavin, Kljavinsh or Klyavin, you'd still get him as a result.> This is the true solution.
In a way, we've already collected a wealth of data on the subject. Every time we get a game attributed to a new way to spell Nimzowitsch, it gets added to a table. Then if we ever import a game with a player spelled exactly like that it assumes it's Nimzowitsch and we don't have to go through the player merging process yet again. For example, suppose somebody is named James Smythe and variations of his name appear as "Jim Smith", "Jim Smyth", "James Smith", etc. A table could know about every name in the database and which ones are "the same", that "Jimmy" and "James" and "Jim" are all the same name. "Petroff" is the same as "Petrov", "Anatoli" is the same as "Anatoly", and on and on. That way if somebody enters any combination of (James, Jimmy, Jim, Jimbo) + (Smith, Smythe) they get the player. It could even include common misspellings: "Fischer" is the same as "Fisher". |
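The "A follows B" table for quoted searches described in this post can be sketched in Python as below. The class name, the use of whole words as tokens, and the player IDs 1 and 2 are all assumptions for illustration; the system the poster implemented years ago is not shown.

```python
from collections import defaultdict

class PhraseIndex:
    """Toy quoted-string search built on a table of adjacent word pairs."""

    def __init__(self):
        # (earlier_word, later_word) -> ids where later directly follows earlier
        self._follows = defaultdict(set)
        # word -> ids containing that word (used for one-word phrases)
        self._words = defaultdict(set)

    def add(self, pid, name):
        words = name.upper().split()
        for word in words:
            self._words[word].add(pid)
        for earlier, later in zip(words, words[1:]):
            self._follows[(earlier, later)].add(pid)

    def search_quoted(self, phrase):
        words = phrase.upper().split()
        if not words:
            return set()
        if len(words) == 1:
            return set(self._words.get(words[0], ()))
        result = None
        # A phrase of n words becomes n - 1 FOLLOWS clauses joined by AND.
        for pair in zip(words, words[1:]):
            ids = self._follows.get(pair, set())
            result = set(ids) if result is None else result & ids
        return result

idx = PhraseIndex()
idx.add(1, "Mark van der Werf")  # 1 and 2 are made-up ids
idx.add(2, "John van der Wiel")

print(idx.search_quoted("van der Werf"))  # {1}
print(idx.search_quoted("Werf van"))      # set()
```

One caveat of the pair-only approach: a name that contains each required pair in separate places would match even though the full phrase never appears contiguously, so a final check against the stored name may still be needed.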
|
May-19-14
 | | chessgames.com: There are some hacks in place on the EZ Search, but only for a handful of specific and notoriously hard-to-spell players. For example, even these half-hearted spelling attempts work:
search "Dzinzihashvilli" correctly goes to Roman Dzindzichashvili
search "Nimtzovisch" correctly goes to Aron Nimzowitsch
And so forth.
I stopped adding these hacks, because clearly adding a few clauses for each hard-to-spell player name is not the way to solve this problem. A more general solution like the one suggested by Switching is needed. |
|
May-19-14
 | | chessgames.com: OK, here it is with what we now call the "Vandervon Hack". A search on <van der Werf> http://www.chessgames.com/perl/ches... Exactly the right output. |
|
| May-19-14 | | zanzibar: <chessgames> just tuned in. And yes, that does look better. Doing the AdvSearch on <Van der Wiel> vs <Van Wely> looks good too: http://www.chessgames.com/perl/ches... Ironically, I was just thinking of hacking my tournament search builder to switch over to the pids, having used the preference option to get the longer list in the drop-down box (after your suggestion). I edited the html source to strip out the name/pid pairs, and am building a python data structure to map SCID name->pid. Then I'll use the pid-pair of the two players in a game to get the more specific game(s) list from <CG> in my automated (i.e. software driven) searches. I think the <Vandervon Hack> is productive, and hopefully won't break any expected behavior. It shouldn't, at least until M. Von der Van gets a FIDE card! Aside - just curious: doing an AdvSearch on <Byrne> I see a gold star next to Donald's name. What does it mean? Also, shouldn't you indicate which Byrne is given preference? Finally (for now!) - why is it <Li, Chao> on the one hand, and <Bu Xiangzhi> on the other? |
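The name->pid mapping mentioned in this post could be built along the following lines. This Python sketch assumes the drop-down is ordinary `<option value="PID">Name</option>` markup, which is a guess about the page source; the sample HTML and its pid values are invented.

```python
from html.parser import HTMLParser

class OptionCollector(HTMLParser):
    """Collects {name: pid} from <option value="pid">name</option> elements."""

    def __init__(self):
        super().__init__()
        self.name_to_pid = {}
        self._current_pid = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "option":
            self._flush()  # tolerate options that were never closed
            self._current_pid = dict(attrs).get("value")
            self._text = []

    def handle_data(self, data):
        if self._current_pid is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("option", "select"):
            self._flush()

    def _flush(self):
        # Keep only options whose value is a plain number.
        if self._current_pid and self._current_pid.isdigit():
            name = "".join(self._text).strip()
            if name:
                self.name_to_pid[name] = int(self._current_pid)
        self._current_pid = None
        self._text = []

def extract_pids(html):
    parser = OptionCollector()
    parser.feed(html)
    parser.close()
    return parser.name_to_pid

# Invented sample; the pids are placeholders, not real Chessgames IDs.
SAMPLE = """
<select name="pid">
  <option value="">(any player)</option>
  <option value="1001">Li, Chao</option>
  <option value="1002">Bu Xiangzhi
</select>
"""
print(extract_pids(SAMPLE))
# {'Li, Chao': 1001, 'Bu Xiangzhi': 1002}
```

Mapping from SCID spellings to the names in this dictionary is a separate step, since the two sources may format names differently (as the <Li, Chao> versus <Bu Xiangzhi> question suggests).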
|
| May-20-14 | | zanzibar: <RE: Nimzowitsch's name> (Or, why is it the <Nimzovich Defense> and not the <Nimzowitsch Defense>?) The most detailed explanation I've been able to find of how all the different spellings arose is here: http://home.swipnet.se/~w-148618/sp...
The footnote at the bottom about the otherwise mysterious J. Hannak finally allowed me to find out who Lasker's biographer was. (Hannak wrote what is probably the most famous biography of Lasker, yet try to find out who exactly Hannak was. It's not easy.) |
|
May-20-14
 | | chessgames.com: That star by Byrne's name is the <Chessgames Star of Megaimportance>. It means that the player doesn't have a rating, but we recognize them as an important player. (That name is pejorative as some very obscure and not very strong players have the star.) It was explained ages ago in a post addressed to Stonehenge found here: chessgames.com chessforum. |
|
| May-20-14 | | zanzibar: <chessgames> thanks, an entertaining read from the past. |
|
| May-20-14 | | LIFE Master AJ: BUG
The new PGN player. <default?> The last move is too far to the right. Is it possible to re-position that area? The reason that I ask is that playing through the game now has the bloody board dancing all over the screen, it is MOST annoying! |
|
May-20-14
 | | Stonehenge: It's also kinda hard to find players like Emre Can and Martin Severin From. Searching for Can or From doesn't give any results. Also 'May' doesn't find May Aung Hlaing. |
|
May-20-14
 | | SwitchingQuylthulg: <Stonehenge> Yeah, that would easily get my vote for most annoying search-related problem; I don't see the point in having the Advanced Search filter stop words. It's not just a couple players, either - almost 200 players are affected by this issue, and that's without considering names shorter than 3 characters (which are unsearchable whether or not they're also stop words). |
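The effect described in the last two posts can be reproduced with a toy filter in Python. The stop-word set below contains only the three words reported in this thread, and the filtering order is an assumption; the site's real list and logic are not shown here.

```python
# Only the words reported as unsearchable in this thread; the real list is longer.
STOP_WORDS = {"can", "from", "may"}
MIN_LENGTH = 3  # names shorter than this are described as unsearchable

def searchable_terms(query):
    """Return the query words that would survive the (assumed) filter."""
    return [
        word
        for word in query.lower().split()
        if len(word) >= MIN_LENGTH and word not in STOP_WORDS
    ]

print(searchable_terms("Can"))              # []
print(searchable_terms("Emre Can"))         # ['emre']
print(searchable_terms("May Aung Hlaing"))  # ['aung', 'hlaing']
```

A query reduced to an empty list has nothing left to match, which is consistent with searches for Can, From or May returning no players.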
|
| May-20-14 | | zanzibar: <RE: Player/Game lookup> If <CG> provided a page (or pages) with a list of all players similar to the lists FIDE provides, and also allowed CG-pid entry in a name field - much more exact and specific searches could be done. Again this is similar to FIDE's lookup. A number is easy for software to identify and parse, and would uniquely identify a player. A master list would allow any user interested enough to dig up the pid. FIDE provides this (listing name/title/rating/Federation/dob/etc): http://ratings.fide.com/download.ph...
Maybe <CG> could do the same. I'm actually working on some python code that will do <CG>/<FIDE> lookups as well. |
|
May-20-14
 | | chessgames.com: It would be almost trivial to allow numbers in name searches. We did something like that last year when we allowed GIDs in the EZ Search (to aid with the Holiday Present Hunt). Not many people would be advanced enough to use the feature but those who did would certainly appreciate it. About allowing a download of a list of all players, with PIDs and other relevant information: that's a distinct possibility. However, we have always been somewhat guarded about dumping out our database en masse, as no doubt competitors will download it and use it as a starting point for their own ventures. But the player list is very specific data that doesn't serve much purpose without the rest of the database. <LIFE Master AJ: BUG The new PGN player. <default?> The last move is too far to the right. Is it possible to re-position that area?> I'm not sure what you are saying; can you describe it in more detail or perhaps obtain a screen shot of what you are seeing? |
|
| May-20-14 | | zanzibar: <chessgames> yes, doing the pid lookup would be easy, and appreciated by some - even though it would be low volume. (You already can do it via the url)
The user download would be convenient - but maybe just presenting a way to display a list of all players online is just as good (it could be divided in pages as other sites do). The issue would be one of sorting mostly (sortable columns being the slick way to go these days). A secondary issue is what columns to display - name, country, title(s), dob, ATH rating, active period, number of games come to mind at first thought. |
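The idea of accepting a pid in the name field, raised in the last two posts, amounts to a check like the one in this Python sketch. The function name and the returned tuple shape are hypothetical.

```python
def interpret_player_field(text):
    """Treat an all-digit entry as a player ID, anything else as a name."""
    value = text.strip()
    if value.isdigit():
        return ("pid", int(value))
    return ("name", value)

print(interpret_player_field(" 12345 "))       # ('pid', 12345)
print(interpret_player_field("van der Werf"))  # ('name', 'van der Werf')
```

Since no player name consists only of digits, the numeric branch would not change results for ordinary name searches.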
|
May-20-14
 | | Domdaniel: While we're on the subject of players who don't show up in searches ... I recently tried to search for a player named <O'Hare> - who has several games in the database, and is a corr IM.
But whether I tried <O'Hare> or <Hare> I didn't get his games - in either case I was sent to games by <Hare>, not what I was looking for. What's the solution, please? |
|
May-20-14
 | | Domdaniel: The player I refer to above is Ciaran O'Hare -- I beat him in an OTB game back in 1976, and he returned the favour later. And yes, search by first name turns up the right person - but this doesn't solve the surname problem. |
|