chessgames.com
Chessgames.com User Profile Chessforum

chessgames.com
Member since Jun-19-02
no bio

Chessgames.com Full Member

   chessgames.com has kibitzed 13275 times to chessgames   [more...]
   Feb-15-21 chessgames.com chessforum (replies)
 
chessgames.com: Dear Chessgames.com members: We've recently become aware of a technical difficulty with the "engine" server, which is used for game/move analysis. It appears that a hardware failure may be responsible for making the analysis engine unavailable. We're actively ...
 
   Jan-22-21 Santa Claus (replies)
 
chessgames.com: Dear chessgames members: Santa Claus <finally> got around to sending us his list of lucky winners for this year's "Dear Santa" contest! We thank Santa for his diligence, and have learned that his tardiness in providing his list was <unavoidable> due to ...
 
   May-31-20 Chessgames Bookie chessforum (replies)
 
chessgames.com: <♕♔♕ Bettors and Worse ♕♔♕> As we start this year's ChessBookie cycle with the Summer Leg, I would first like to thank our fearless new Bookie <jingohanson>, who made it possible to continue the game. Next, I hereby announce in ...
 
   Mar-14-20 World Championship Candidates (2020/21) (replies)
 
chessgames.com: Everybody please keep the political bickering off this page.
 
   Feb-22-20 Kibitzer's Café (replies)
 
chessgames.com: May I humbly request a change from REM, <Hazz> You decide. :)
 
   Mar-12-19 Spring Chess Classic (A) (2019) (replies)
 
chessgames.com: We've added the games through Round 9 for the St. Louis Spring Chess Classic (Group A).
 
   Mar-08-19 Prague Chess Festival (Challengers) (2019) (replies)
 
chessgames.com: Games have now been added for the Prague Chess Festival Masters and Challengers sections, and we'll include the Open section results as they become available. For news & details, see the official site at http://praguechessfestival.com/
 
   Mar-08-19 Prague Chess Festival (Masters) (2019) (replies)
 
chessgames.com: Games have now been added for the Prague Chess Festival Masters and Challengers sections, and we'll include the Open section results as they become available. For news & details, see the official site at http://praguechessfestival.com/
 
   Mar-08-19 World Team Chess Championship (Women) (2019) (replies)
 
chessgames.com: Games have now been added for Rounds 1-3 of both the Open and Women's sections of the 2019 FIDE World Team Chess Championship. For news & details, see the official site at http://wteams.astana2019.fide.com/e...
 
   Mar-08-19 World Team Chess Championship (2019) (replies)
 
chessgames.com: Games have now been added for Rounds 1-3 of both the Open and Women's sections of the 2019 FIDE World Team Chess Championship. For news & details, see the official site at http://wteams.astana2019.fide.com/e...
 
(replies) indicates a reply to the comment.

Chessgames Member Support Forum

Kibitzer's Corner
ARCHIVED POSTS
< Earlier Kibitzing  · PAGE 819 OF 1118 ·  Later Kibitzing>
May-19-15
Premium Chessgames Member
  chessgames.com: <Can we think about storing players' names in the <CG> database in sorting order?>

I think about it quite a bit. It's easier said than done.

The core issue here is that, to the surprise of some, we do not have fields called "first name", "last name", "middle name", "maiden name", etc. We just have two fields, "long name" and "short name". The long name is what we show on the player page, and on links to that page, and when they've become player of the day, etc. The short name is what you see in a game list.

The logic behind that design is that how things LOOK is what impresses people. If a list looks like gibberish, people think the data are a mess.

There are some chess database sites, one in particular that I truly admire, that show game titles like <Karpov, Anatoly - Kasparov, Garry>. We hold ourselves to a higher standard. We want it to read <Karpov-Kasparov> in the game lists and links. It's concise, and unambiguous.

So that's the motive: make it look good, and if it doesn't look good we can change the string until it does. It's such a dead stupid system it can't go wrong.

Your request is much more involved. Let's try to formalize it. Perhaps the schema could work like this:

<Let us store names in two fields, called $firstname and $lastname. Then when the PGN presents the player name, it can present the string "$lastname, $firstname".>

A decent start.
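
A minimal Python sketch of that two-field idea, for illustration only. The class name, field names, and the single-name fallback are assumptions made here; none of it reflects how the <CG> database is actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlayerName:
    """Hypothetical two-field record; not the actual Chessgames schema."""
    firstname: str
    lastname: str

    def pgn_name(self) -> str:
        # Present the name as "$lastname, $firstname".
        if self.firstname and self.lastname:
            return f"{self.lastname}, {self.firstname}"
        # Assumed fallback for players known by a single name.
        return self.lastname or self.firstname


karpov = PlayerName(firstname="Anatoly", lastname="Karpov")
print(karpov.pgn_name())
```

Even this tiny version already needs a convention (the fallback branch) for names that do not fit the first/last mold.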

Of course, we know that the world isn't as simple as "last name, first name". However, we have to try something, so the first name / last name approach seems sensible. To do this, we immediately have to start making conventions for handling names that aren't easily stuffed into the firstname/lastname format.

(If I haven't shared this with you in the past, it's a good read: "40 Falsehoods Programmers Believe About Names" http://www.kalzumeus.com/2010/06/17...)

An obvious starting question would be "what about middle names?" We could decree something like "The middle name, when known, should be appended to the first name after a space." Then we'd make some rule about Spanish names, how to handle Carlos Jesús Torre Repetto. And then Asian names, which present a new twist on what we call a "$lastname" — humorously, it would be the one that comes first.

Then there are a few people like "Garcia" who have no first name, and even some players like "Judy" who have no last name. And then there are computers. And people known only as "Dr. H". We can go down the list and come up with a way to stuff this data into a two-field schema, but the conventions that we have to cook up to make this thing fly will be pretty dense and at times arbitrary.

However, even having done that, we still have the display issue. I don't trust software programmed with first name, middle name, last name, maiden name, etc. to always get it right. In fact, without some special provision it would automatically call our previous world champion "GM Viswanathan". Will it know when to use the Spanish maternal surname and when it's excessive?

I'm not saying that it's a bad idea; I think it's probably a partial solution to the ideal representation. I think the near-perfect system might have representations for all sorts of names, and titles, and flags to tell it that it's an Asian name or a Myanmar name or a computer, and on top of that it should *still* have fields that are strictly for display purposes, input lovingly by hand.

In short, I don't see changes to this part of the database to be made incrementally. We don't start today by adding a last-name and then later make a provision for Spanish or Asian names; if we are going to overhaul the system it should be done all at once.

May-19-15  zanzibar: <chessgames> Just to be clear, I think a first step is much easier.

The name presented outside a list could be exactly the same. In fact, outside the PGN, it could always look exactly the same.

The <display name> is very simply formed from the <collation name>.

A) No comma in <collation name>, then

<display name> = <collation name>

and you're done.

B) If

<collation name> = <part1, part2>

then

<display name> = <part2 part1>

Completely mechanical and uncomplicated. All the intelligence is done by the biographers in forming the <collation name>.
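
As a rough sketch, rules A and B might look like this in Python. The function name is made up here, and splitting on the first comma only is an assumption for names that contain more than one comma.

```python
def display_name(collation_name: str) -> str:
    """Derive the display name from the collation name via the comma rule."""
    if "," not in collation_name:
        # Rule A: no comma, so the display name equals the collation name.
        return collation_name.strip()
    # Rule B: "part1, part2" becomes "part2 part1".
    # Assumption: only the first comma is treated as the separator.
    part1, part2 = collation_name.split(",", 1)
    return f"{part2.strip()} {part1.strip()}"


print(display_name("Karpov, Anatoly"))
```

All the judgment stays in how the collation name is written; the code itself never has to know whether a name is Spanish, Chinese, or anything else.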

You can sort a <CG> list of players via the <collation name> and present it on the page via the <display name> if you wish.

I'm just asking <CG> to make the intelligence available via the <PGN> by propagating the <collation name> if possible.

There would be one extra step in the stack, the same uncomplicated mechanical step everywhere.

You would then exactly mirror FIDE, and all the rest.

E.g. Chinese names could be westernized or not, according to the player's implicit wishes (i.e. we assume that FIDE entered their version of the name accordingly).

Spanish names would be sorted on the patronymic as they should be.

There isn't need of additional fields in this scheme. Everything is done via the comma, and the vast majority of cases are covered.

Example 1:

Display: <Dr. J. S. Smith Garcia Jr.>

Collate: <Smith Garcia Jr., Dr. J. S.>

Example 2:

Display: <Joe Tsai Wu>

Collate: <Wu, Joe Tsai>

Example 3:

Display: <Wu Joe Tsai>

Collate: <Wu Joe Tsai>
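
To show what sorting on the collation name buys, here is a small Python sketch using the three examples above. The helper name and the plain case-insensitive comparison are assumptions; proper locale-aware collation is not attempted.

```python
# (display, collate) pairs taken from the three examples above.
players = [
    ("Joe Tsai Wu", "Wu, Joe Tsai"),
    ("Dr. J. S. Smith Garcia Jr.", "Smith Garcia Jr., Dr. J. S."),
    ("Wu Joe Tsai", "Wu Joe Tsai"),
]


def collation_key(collate: str) -> str:
    # Case-insensitive comparison only; a simplification for this sketch.
    return collate.casefold()


# Sorting on the display string groups players by title or given name.
by_display = sorted(players, key=lambda p: p[0].casefold())
# Sorting on the collation string groups them by surname, while the
# display string is still what gets shown on the page.
by_collate = sorted(players, key=lambda p: collation_key(p[1]))

for display, _ in by_collate:
    print(display)
```

Sorted by collation name, both Wu entries end up next to each other; sorted by display name, they are separated.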

May-20-15
Premium Chessgames Member
  SwitchingQuylthulg: <zanzibar: I'll present some arguments pro-and-con later.

Ideally, there would be a preference switch for downloads on the fly.

Speaking of switches, do you ever use the ZIP file downloads <Switch>?

Those, being statically generated, would have to make a choice or be duplicated. So, it's relevant.>

Yes, I use the Zipfile Archive. If the PGN there starts giving names in collated-style rather than display-style, that will mess with several of my programs.

If that happened I'd probably write a whole new program to change every name in the PGN back to display-style before I did anything else... which is essentially what I'm suggesting you do, just in the other direction.

Note that I'm not opposed to creating a new <collated-style> field for every name, if somebody's prepared to put in the effort; though I'm not sure who that somebody would be, unless it's you. Collating every name accurately would involve humongous amounts of work (if you just copy from FIDE or other databases, you'll copy a lot of errors... and many names are not in any other database). And since <display-style> is what people see, very few users would ever notice that humongous effort, and even fewer would benefit from it in any way.

I am opposed to giving collated names in the PGN, though, even if somebody does put in all that work. With a good index of collated names - which would be a necessity - changing display-style to collated-style would be every bit as easy for you as doing the opposite, especially if <cg> adds pids in the PGN.
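
A hedged sketch of that conversion in Python, assuming an index keyed by pid exists. The pids, names, and function name below are invented for illustration, and how a pid would actually be carried in the PGN is left open.

```python
# Hypothetical index: pid -> (display-style, collated-style).
# These pids and names are illustrative, not real Chessgames records.
INDEX = {
    1001: ("Joe Tsai Wu", "Wu, Joe Tsai"),
    1002: ("Wu Joe Tsai", "Wu Joe Tsai"),
}

# Secondary lookup by display name, for PGN that carries no pid.
BY_DISPLAY = {display: collated for display, collated in INDEX.values()}


def to_collated(display, pid=None):
    """Return the collated-style name for a player, preferring the pid."""
    if pid is not None and pid in INDEX:
        return INDEX[pid][1]
    # Fall back to the display string; unknown names pass through unchanged.
    return BY_DISPLAY.get(display, display)


print(to_collated("Joe Tsai Wu", pid=1001))
```

The pid lookup is the robust path: two different players can share a display name, but not a pid.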

<zanzibar: <Abdel> I think everybody gets a vote, btw.

You just have to <"care"> enough to pipe up.

Of course, I'm well aware that there are different weights attached to each vote. No doubt <Switch> is in the heavy-weight bracket, given his long-term commitment and contributions round these parts.>

Ultimately, this is a "one man, one vote" system: Daniel Freeman is the man, and he has the vote. In this particular case I suspect that's a good thing :)

May-20-15
Premium Chessgames Member
  Tabanus: <downloads on the fly> No thanks.
May-20-15  zanzibar: <Brutus:
There is a tide in the affairs of men.
Which, taken at the flood, leads on to fortune;
Omitted, all the voyage of their life
Is bound in shallows and in miseries.
On such a full sea are we now afloat,
And we must take the current when it serves,
Or lose our ventures.

Julius Caesar Act 4, scene 3, 218–224>

Of course, flood or no flood, Brutus didn't fare too well, did he?

May-20-15  zanzibar: And yes, flies should be swatted upon, not downloaded on.
May-20-15  zanzibar: <Switch> yes, the obvious first step if converting would be to use the FIDE id to look up the FIDE name.

Do you, or does anybody, have an idea of how many <CG> players with FIDE id's have a different name from the FIDE display-name?

May-20-15  zanzibar: <Does anybody track the changes to the <CG> database?>

I mean, besides <CG>?

And even then, if I were to ask you how many players have had their name changed, say from August 2014 to today, could anybody tell me?

Or how many players have been deleted?

Or how many mergers?

<Switch> do you have an idea of the scale of these changes?
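
One way to put numbers on those questions is to diff two snapshots keyed by pid. The Python sketch below assumes each snapshot has already been loaded into a dict of pid to (name, FIDE id); the record type, the function name, and the rule "a vanished pid whose FIDE id survives under another pid counts as a merger" are all assumptions made for illustration.

```python
from typing import Dict, NamedTuple, Optional


class Player(NamedTuple):
    name: str
    fide_id: Optional[int]


def diff_snapshots(old: Dict[int, Player], new: Dict[int, Player]):
    """Return (renamed pids, deleted pids, {merged old pid: surviving pid})."""
    # Same pid in both snapshots, but the name string changed.
    renamed = sorted(pid for pid in old.keys() & new.keys()
                     if old[pid].name != new[pid].name)
    # Pids that no longer exist in the newer snapshot.
    gone = set(old) - set(new)
    # FIDE ids present in the newer snapshot, mapped to their pid.
    fide_to_pid = {p.fide_id: pid for pid, p in new.items()
                   if p.fide_id is not None}
    # Assumed heuristic: a vanished pid whose FIDE id lives on was merged.
    merged = {pid: fide_to_pid[old[pid].fide_id] for pid in gone
              if old[pid].fide_id in fide_to_pid}
    deleted = sorted(gone - set(merged))
    return renamed, deleted, merged


# Made-up records, purely to exercise the function.
old = {1: Player("A B", 10), 2: Player("C D", 20), 3: Player("E F", None)}
new = {1: Player("A X", 10), 5: Player("C Z", 20)}
print(diff_snapshots(old, new))
```

Players without a FIDE id cannot be classified this way, so a vanished pid with no FIDE id is simply reported as deleted.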

* * * * *

If we don't keep the collated-name in the central location, then all the satellite mirrors will be constantly having to chase their tails.

Time after time. There is a definitive issue of stability in this scheme. And everybody who wishes to use collation-style must repeat the work - which isn't simple or mechanical.

But... if <CG> disseminates the collation names, we do the work centrally, and everybody benefits (if they wish to).

Those who don't care about sorting, or correlating with other databases for integrity checks, only need to add a simple mechanical transform step to their code. A step that is almost trivial.

Switch over, and you can almost directly compare a PGN from <CG> with those from <TWIC>, <FIDE>, <MillBase>, <ChessBase>, etc, etc.

Why is <CG> different?

(What about <ChessAssistant>? Anybody know?)

May-20-15  zanzibar: By the way, I'm studying the evolution of <CG>'s database as viewed by a comparison of two different snapshots of playerlist.txt.

Specifically, comparing Aug 23, 2014 to May 15, 2015 - about 3/4 of a year.

I've noticed many players whose name has changed but who no longer have a gender assignment.

<Q- Is this gender depletion a new <CG> policy, or an oversight?>

.

May-20-15  zanzibar: I've heard text is dead, ergo...

https://www.youtube.com/watch?v=Sc7...

May-20-15
Premium Chessgames Member
  chessgames.com: I've got a few more things to say about the collation request, but let me start here instead:

<I've noticed many players whose name has changed but who no longer have a gender assignment. <Q- Is this gender depletion a new <CG> policy, or an oversight?>>

Gender depletion should never happen unless an editor unchecked the gender selector. Could you please give an example or two? The line from the old file compared to the line in the new file.

May-20-15  zanzibar: If you want an example of what is involved in "chasing the tail" of <CG>'s database, please see my rough draft here:

https://zanchess.wordpress.com/2015...

It's far from finished, but the list at the bottom (pre-format python output) can be immediately consulted.

There's a bit of a free-for-all involved with the name changing for married women, which is done far from consistently.

And I'm not exactly sure who decides which pid gets preserved during a merge.

Also there seem to be termites here as well, happily munching various biographical information that seems to disappear in the merges.

Does anybody care to discuss the post?

Has anybody done any similar examination of changes to the <CG> database?

May-20-15  zanzibar: <chessgames> see my post, there's plenty to examine there.

There should be a <gender> (M/F) just before the <rating/#games> at the end of each line.

Doublets are listed with the old snapshot above the newer one.

I took an early lunch, and will be off-line till the end of the day most likely.

(I kinda hope <chessgames> would hold off on writing about the collation issue just yet, in the hope that others might write a posting with their viewpoint - though it seems unlikely)

May-20-15
Premium Chessgames Member
  chessgames.com: <zanzibar> So players who used to be "female" are set as "no gender assigned?" If true, that's very troubling. Conceivably one of the player records might sloppily not have the "F" set, but our editors are usually very careful about such things, and I can't imagine that happening for more than one or two records.

When you wrote "see my post" I assume you mostly mean the link you provided. I'm looking now.

Just to pick one example:

<old> 127442 / $ 14507323 Kristina Solic (1986.12.19) HR -- Y F 2237 / 77

<new> 133694 / $ 14507323 Kristina Saric (1986.12.19) HR -- Y 2261 / 94

This is surely one of the players that Sargon merged recently during the maiden-name cleanup. So the "F" in the first line indicates that the player is female, and the lack of an "F" indicates "no gender assigned".
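
For anyone who wants to check such lines mechanically, here is a hedged Python sketch that parses the two lines quoted above. The regular expression is inferred from those two samples only; the group names are invented, the meaning of the "$", "--" and "Y" tokens is not assumed, and lines shaped differently (for example, players without a FIDE id) will simply not match.

```python
import re
from typing import Optional

# Inferred from the two sample lines only; not an official format description.
LINE_RE = re.compile(
    r"^(?P<pid>\d+) / \$ (?P<fide_id>\d+) (?P<name>.+?) "
    r"\((?P<born>[\d.]+)\) (?P<middle>.*?)"
    r"(?: (?P<gender>[MF]))? (?P<rating>\d+) / (?P<games>\d+)$"
)


def parse_line(line: str) -> Optional[dict]:
    """Parse one listing line; return None if it does not fit the pattern."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    for key in ("pid", "fide_id", "rating", "games"):
        record[key] = int(record[key])
    return record


old = parse_line("127442 / $ 14507323 Kristina Solic (1986.12.19) HR -- Y F 2237 / 77")
new = parse_line("133694 / $ 14507323 Kristina Saric (1986.12.19) HR -- Y 2261 / 94")
print(old["gender"], new["gender"])
```

The optional gender group sits just before the rating, so a missing flag shows up as None rather than shifting the other fields.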

But I'm confused, because if I go to the page of the new record #133694 Kristina Saric it has her marked as female. I know you can't see that as an editor, but you have access to http://www.chessgames.com/playerlis... where you can see her "F" flag set.

I did another spot-check on Rena Graf (nee Mamedova) and the same thing. You claim that she doesn't have her gender flag set, but I see otherwise.

So either your software has a bug or these records have been changed between May 15 and today. I'm hoping for the sake of the database that your software has a bug.

I also have some comments about what you call "M0" and "M1" but let's put this to rest first.

May-20-15
Premium Chessgames Member
  SwitchingQuylthulg: <zanzibar: <Switch> yes, the obvious first step if converting would be to use the FIDE id to look up the FIDE name.>

Yes, it's the obvious first step, and a very easy cheap way to introduce a lot of new errors into the database. FIDE's handling of non-Western names is all over the place, and contains many inconsistencies and outright mistakes.

And if you <did> copy all their data (and all their errors), you'd still have to come up with collated-style names for the ~33000 players without a known FIDE card.

<But... if <CG> disseminates the collation names, we do the work centrally, and everybody benefits (if they wish to).>

Like I said, I'm not opposed to <CG> disseminating the collated-style names; just to <CG> using them in PGN files. But acquiring the collated-style names in the first place (or at least, acquiring reasonably accurate collated-style names in the first place) is a lot harder than you seem to appreciate; so hard I don't remotely like the cost-benefit ratio.

<Those who don't care about sorting, or correlating with other databases for integrity checks, only need to add a simple mechanical transform step to their code. A step that is almost trivial.>

As I already noted, if collated-style names are included in the playerlist, the <opposite> step will also be almost trivial.

May-20-15
Premium Chessgames Member
  chessgames.com: Switching's post almost entirely reflects my feelings on the matter.

Nobody thinks that it's a bad thing to offer collated names, but there is an effort-to-reward ratio that we have to take into account today. We've promised to get the Sandbox up and running, at least in beta, we want to expand editor abilities to involve changing events and sites and maybe even players, we want a more rigorous and transparent logging system, and meanwhile we are scrambling to make a mobile-friendly version of our homepage.

But only one person is requesting collated names, and that's as hard a task as the aforementioned ones. It's not a bad idea, just neither urgent nor easy.

May-20-15
Premium Chessgames Member
  Annie K.: <cg> hmmm... there are two more things I wish you'd add to your higher priority to-do list:

1. mobile-friendly version of Guess-the-Move

2. improvements to the Kibitzing Search

Plz? :)

May-20-15  Abdel Irada: <Ultimately, this is a "one man, one vote" system: Daniel Freeman is the man, and he has the vote.>

I am aware of that, and did not mean the word "vote" literally.

All I'm suggesting is that standardization offers benefits that may not be immediately obvious, but may become so in future. Things progress, sites change, sites are even sold or merged; in case of any such eventuality, I think you may find using standard list-order will *eventually* turn out to be enormously helpful.

May-20-15  Abdel Irada: <2. improvements to the Kibitzing Search>

They might include making it possible to search for kibitzes in a specific forum.

May-20-15  Abdel Irada: <Q- Is this gender depletion a new <CG> policy, or an oversight?>

Psst! Rumor has it Daniel is trying to neuter us all. It begins with a little inconspicuous gender depletion, and next thing you know, no one *has* a gender anymore.

Pass it on, but don't let the admins find out we're on to their skulduggery!

May-20-15  zanzibar: I'll double-check my code and hand-check it against the text file source. I may have screwed up; I rushed it out straight off because I wanted to show how volatile the <CG> database is, in order to emphasize how difficult it is to track the database locally (vs. keeping things like collate-order central).

I'm feeling bad already, and apologize in advance - I certainly don't want to waste anyone's time chasing red herrings.

Let me redo, double-check and repost.

(I normally didn't print out the gender in my CG class __str__ function. But seeing all the deleted records, I recognized many were due to merging maiden and married name records together. So a change on that day could easily have messed me up: one version using the old class function, etc.

To make up for it I looked into the Rosangela dos Ramos matter you posted on the bistro!)

May-20-15  zanzibar: For the record - this screenshot suggests <Chess Assistant> also uses <collating-style> naming:

http://img.informer.com/screenshots...

(NIC online does list display-style names, but doesn't seem to allow sorting on any field. In fact, due to advertising overlay, it's difficult to read the game listings)

May-20-15  zanzibar: OK, I did a redux of the redux and the gender issue was indeed a red herring.

Other than a few Males who became Females (not that way, they were females all along - <CG> just had to catch up) there is no problem with the gender.

But that doesn't detract from the main point - that there is plenty to talk about in the post - and lots of issues that should be explored.

I'd like to hear <chessgames> points about M0 and M1.

I'm on the road now, and will update my blog post and then come back here with some specifics.

The systematic approach is in the blog, the case-by-case approach is just to serve as examples, and as a sanity check on the diagnostics.

https://zanchess.wordpress.com/2015... (ignore gender for now).

May-21-15  zanzibar: The question is where to begin?

First of all, the condition of the database is improving, and while I can't lay any claim to the credit of implementing the changes, I'd like to think I was part of the team.

Biographer Bistro

That was about the time of my last snapshot.

A more careful analysis of the <CG> playerlist snapshot I had from then reveals 52 FIDE id's used by multiple <CG> "players".

A great many of these were women who got married, so that <CG> had two entries for them - under their maiden name, and under their married name.

A scan of the most recent <CG> snapshot that I've looked at shows that there are no duplicate (or triplicate) FIDE id's in the data.
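
A scan like that can be sketched in a few lines of Python. This assumes the snapshot has already been reduced to (pid, FIDE id, name) rows; the function name and the sample rows are made up for illustration.

```python
from collections import defaultdict


def shared_fide_ids(players):
    """Group rows by FIDE id and keep only ids used by more than one pid.

    players: iterable of (pid, fide_id, name); fide_id may be None.
    """
    by_fide = defaultdict(list)
    for pid, fide_id, name in players:
        if fide_id is not None:
            by_fide[fide_id].append((pid, name))
    return {fid: rows for fid, rows in by_fide.items() if len(rows) > 1}


# Made-up rows: one FIDE id shared by a maiden-name and a married-name entry.
snapshot = [
    (1, 500001, "Jane Maiden"),
    (2, 500001, "Jane Married"),
    (3, 500002, "John Single"),
    (4, None, "No Card"),
]
print(shared_fide_ids(snapshot))
```

An empty result for a snapshot means every FIDE id in it belongs to exactly one <CG> player.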

Congratulations - this is good progress.

Victory cannot be declared just yet, but it is a hopeful development. Perhaps I should leave it there for tonight.

May-21-15  zanzibar: Err... what happened to the kibitz # in my link above?

(It's reply=7551)


Copyright 2001-2025, Chessgames Services LLC