CG Librarian
Member since May-07-11 · Last seen Apr-20-15
<"I adjust">

Hi, I'm your friendly Chessgames database librarian. My job is to make the database better by processing correction slips.

I'll be using this forum to ask for help when something needs more research. You can use this forum for the same purpose, to get others' input on possible errors or duplicates.

If you've submitted a correction there's no need to post here about it. I will see the correction slip and it will be fixed as soon as possible. Thanks!

A few notes:

1. As you probably know, there's currently a long backlog of corrections. From now on, new corrections will get priority, while I also chip away at the older ones. If you submit a new slip on something that isn't fixed yet, that will bump it up to the top and it will get fixed faster. Please be judicious about this.

2. Probably due to the backlog, volunteer bio editors have started to put "aka" and a duplicate player link in bios. Please don't do this, just submit a correction slip. Similarly, if there's a problem with a player name (such as first and last name reversed), submit a slip on it rather than putting it in the bio, so it can get changed in the database.

3. We also try to delete kibitzes about errors that would be confusing once the error is fixed, so the more you keep the corrections to the correction slip, the less extra work for me and the faster I can correct the database.

   CG Librarian has kibitzed 21 times to chessgames   [more...]
   Jun-04-12 CG Librarian chessforum (replies)
CG Librarian: OK, here are a few things I wanted to mention: 1. I got a copy of Chess Personalia (quite a while ago now) :) 2. The reason CG put a hyphen in Spanish double last names was so the database software didn't get confused about what the last name was (for things like the Player ...
   Nov-03-11 European Team Championship (2011) (replies)
CG Librarian: <Slaven MNE> You're right. We also had the wrong Georgiev. I think the error must have gotten propagated from the official site.
   Aug-08-11 World Junior Championships (2011) (replies)
CG Librarian: Here's the situation with incorrect game scores for this tournament: we first received many truncated games, then the correct versions. I've removed all the incorrect duplicate games that affect the leaderboards. If you see more please submit a correction slip on them.
   May-28-11 World Championship Candidates Final (2011) (replies)
CG Librarian: <alexmagnus: Actually if you do the search now you get +9 -5 =27. One Gelfand win from 1990s, present just a week ago, now magically disappeared... Maybe it was attributed to some different players.> Hello, I just saw this. The stats changed because I merged away a ...
   May-08-11 chessforum (replies)
CG Librarian: <Domdaniel: Welcome, o Eager and Bright database administrator person.> Thanks, and hello everyone! My chessforum is now available for correction-related comments. I'm sure I'll also be posting things that need additional research, so check back often.
(replies) indicates a reply to the comment.

Kibitzer's Corner
Premium Chessgames Member I've been thinking about the issue of how to include IDs in Chessgames PGN and I am starting to lean towards a somewhat radical solution. Here's the proposal:

For each game in the database, we maintain not one but two chunks of PGN. One in the format you know today, and the other with a group of new tags:


(and maybe a few others I've forgotten.)

When you download/view PGN from Chessgames, whether you see the normal format or the expanded format would depend on a setting in your preferences.

The Game table is rather large and this will make it much larger so I have to weigh some technical considerations before suddenly bloating the size of the actual files. For one thing, I'd like to know that there is more than one user who would appreciate this change.

Premium Chessgames Member <illegal PGN> For our purposes any PGN that uses features beyond {simple comments} is illegal. We don't want variations, or nested variations, or even symbols like $1 and $2.

It's rare that we provide annotated PGN, and when we do we want it at the lowest common denominator. Not all PGN viewers can handle RAV notation, etc. The notes usually come from a book, where {simple comments} are all that is required.

Sadly some games with variations slip into the database from time to time. These games need to either be cleaned up or deleted.
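A cleanup pass for such games could start with a check like the following. This is only a minimal sketch in Python: the function name `illegal_features` is made up for illustration, it looks at movetext only, and it ignores PGN's ";" rest-of-line comments.

```python
import re

COMMENT = re.compile(r"\{[^}]*\}")   # {simple comments} are allowed
NAG = re.compile(r"\$\d+")           # numeric annotation glyphs such as $1, $2


def illegal_features(movetext: str) -> list:
    """Return the disallowed PGN features found in a game's movetext."""
    # Drop brace comments first, so that parentheses or "$" inside a
    # comment are not mistaken for variations or NAGs.
    stripped = COMMENT.sub(" ", movetext)
    found = []
    if "(" in stripped or ")" in stripped:
        found.append("variation")
    if NAG.search(stripped):
        found.append("NAG")
    return found
```

A game whose list comes back empty passes the check; anything else would need to be cleaned up or deleted.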

Jun-16-15  zanzibar: <chessgames> let me prepare a proposal on the enhanced identifiers, and why it shouldn't be just a z-thing.

I envision it as integral to our new form of tournament building (also needing a writeup).

* * * * *

As for the stripped down PGN, I agree that is best for <CG>. Similar to <MillBase>'s approach in that matter.

I checked the abbrev RAV just to be sure it was what I thought -

RAV = Recursive Annotation Variations

which I found here:

So given that even Tim Harding takes so much time and effort to write and talk about PGN, I don't feel so bad doing a little bit myself.

Of course his article is from 2003. I hope to have largely finished with the topic in twelve years time.

Jun-17-15  zanzibar: I'll describe some of the requires for the <CG> database design, as I see them.

(1) Backlinks - CG_id = (tid, gid, w_pid, b_pid)
(2) Safeguarding tournaments via tid locking.
(3) Baselining tournaments upon creation.

* * * *

(1) Each game should include at least four id's which are a kind of back-link - (tid, gid, w_pid, b_pid).

When a game gets created in an unattached state, the tid is 0 or -1. Some indicator that the game is in full edit mode.

Any editor could change the PGN headers, provided the changes are logged somewhere (I would hope in the comment stream). That includes moves/etc.

Editors cannot directly change the id data. They can do so indirectly, e.g. by changing the name of a player. But software could intervene to ensure such changes are valid. Etc.

(2) Once a game is attached to a tournament, software ensures that it, and all other games belonging to the tournament, are PGN normalized.

At this point editors are no longer able to freely "hand-edit" the games. The rules are stricter, whatever they might be. This is to ensure the integrity of a tournament.

Hopefully, once a tournament is created, all edits to it are vetted through the Bistro. Perhaps a template for admin edits should require a url to a Bistro comment. <CG> should work out the details.

The main idea here is simple, once a game has a valid tid, only software that has the built-in checks necessary to ensure tournaments never become denormalized can change the data.

<No more hand editing, as a general rule, no matter the temptation of convenience.>

So, whatever the powers an admin may have, there are limits when it comes to tournament games (reasonable limits, much like those in the Magna Carta). Changes that affect a tournament game must only be done via update routines. Routines that understand the requirements for tournament integrity.
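To make the locking idea concrete, here is a rough Python sketch. The `Game` class, `hand_edit` method and `LockedGameError` are invented names, not anything <CG> actually has; the only rule taken from the text is that a game with a valid tid refuses hand edits, and that edits to unattached games are logged.

```python
from dataclasses import dataclass, field


class LockedGameError(Exception):
    """Raised when a tournament game is edited outside an update routine."""


@dataclass
class Game:
    gid: int
    tid: int = 0                      # 0 or -1 means unattached, full edit mode
    headers: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def hand_edit(self, editor: str, tag: str, value: str) -> None:
        """Free-form header edit, allowed only while the game is unattached."""
        if self.tid > 0:
            raise LockedGameError(
                f"game {self.gid} belongs to tournament {self.tid}"
            )
        # Every change is logged: (who, tag, old value, new value).
        self.log.append((editor, tag, self.headers.get(tag), value))
        self.headers[tag] = value
```

Anything that changes a locked game would go through separate update routines that know the tournament-integrity rules; those are not sketched here.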

(3) In recognition of how difficult it is to change a tournament, there should be additional requirements before a tournament is created.

I feel that a redundant table of basic tournament data should be kept, recording certain basic facts (e.g. N_players, N_games, N_missing, xtabs(?), bracket dates, location).

Perhaps a snapshot of the PGN of the initial tournament should be kept - both for reference, and for backup in the event of some catastrophe.
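A baseline record of this kind might look like the sketch below (Python). The field names and the `n_expected` parameter are assumptions for illustration; it also assumes at least one game and fully known dates in the PGN "YYYY.MM.DD" form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Baseline:
    n_players: int
    n_games: int
    n_missing: int
    first_date: str
    last_date: str


def make_baseline(games: list, n_expected: int) -> Baseline:
    """Summarise a tournament at creation time.

    Each game is a dict of PGN headers. "YYYY.MM.DD" dates sort
    correctly as plain text, so no date parsing is needed.
    """
    players = {g["White"] for g in games} | {g["Black"] for g in games}
    dates = sorted(g["Date"] for g in games)
    return Baseline(
        n_players=len(players),
        n_games=len(games),
        n_missing=n_expected - len(games),
        first_date=dates[0],
        last_date=dates[-1],
    )
```

Because the record is frozen, a later audit can recompute it from the live games and compare the two with `==`.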

Jun-17-15  zanzibar: <Tournament Creation via PGN bulk/batch submission>

This is a very important step, as I see it.

The handcrafted methods utilized previously just can't ever meet the production demands.

Take a look at this graph:

The blue shows the tournaments/year contained in <MillBase> vs. the red, which shows the same for <CG>.

And that gap doesn't even account for all the work currently being undertaken to repair the <CG> tournaments.

The full background can be read here:

Ultimately, we should be able to take a normalized PGN file containing all the games from a tournament, and just submit it to <CG> for promotion.

Of course, if games from the tournament already exist on <CG> this involves some difficult questions concerning merging data.

Let's agree, however, that this is a design goal. And instead, let's pick a less ambitious interim goal.

Jun-17-15  zanzibar: <Interim Design Goal>

(1) Bulk correction of Round/Date tags for all games in a tournament.

(2) Simultaneous submission of missing games.

* * * * *

(A) Working assumption - all games have (tid,gid,w_pid,b_pid) in PGN download (however it gets encoded).

The idea is to allow those who wish to build a tournament via SCID or ChessBase.

<(I) Initial Collection Build.>

First step, build a collection of the pre-existing tournament games on <CG> - then download an incomplete tournament.

(I)(a) Of course, the user could also have used the ZIP downloads to find the incomplete tournament.

<(II) Missing games addition>

The user then adds the missing games from whatever source necessary. The tournament-to-be is now complete.

(II)(a) It is assumed the user has normalized the E/S/ED tags for all the games.

(II)(b) Perhaps there should be an option to do so when downloading a collection from <CG>. (Please note)

<(III) Round/Dating>

The biographer working at home now can hand-edit, or however, the game data to update the Round and Dates in the PGN.

This is usually the focus of much of the work of a biographer. It could still be done by hand, the old way, online. But now we also provide the biographer the option of using his/her own homebrew tools (like SCID, ChessBase or a python program, etc)

<(IV) Crosstables Generation>

Actually, this step may involve something new - stubs, and so it might actually belong to step II. The idea is that a valid xtab must be able to be produced before a tournament is bulk submitted to <CG>.

Of course, we could consider this a null-step, if we disregard such a strict, but useful, step. I would not recommend such a policy in general.

But one must be practical.

Now the tournament PGN is submitted to <CG> and a complete tournament appears, only missing an intro.

(End of User Requirements/Design)

(<CG> Requires/Design in next post)

* * * * *


Nobody is forcing the adoption of the above. It is an alternative to the current methodology only. However, there is very little difference between working game by game on <CG> vs. game by game in your favorite database program. I would assume most biographers would eventually adapt to this style, no matter their feelings about it today.

For those inclined to use more advanced features, this new approach opens previously locked doors. So it should really be a win-win situation.

Jun-17-15  zanzibar: <CG Design/Requirements Bulk Submission>

<(1) Normalized data check>

This is a basic check, but the E/S/ED headers must be normalized. Else bounce the submission back to the user.
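As a sketch of what this check could be in Python: it assumes E/S/ED means the Event, Site and EventDate tags, and that "normalized" means every game in the submission carries identical strings for all three. The function name is made up.

```python
ESED_TAGS = ("Event", "Site", "EventDate")


def esed_mismatches(games: list) -> dict:
    """Map each E/S/ED tag to the set of distinct values it takes.

    Only tags with more than one value are returned, so an empty dict
    means the submission is normalized.
    """
    seen = {tag: set() for tag in ESED_TAGS}
    for headers in games:
        for tag in ESED_TAGS:
            # A missing tag counts as the empty string.
            seen[tag].add(headers.get(tag, ""))
    return {tag: values for tag, values in seen.items() if len(values) > 1}
```

A non-empty result is what would get bounced back to the user, and it already lists the conflicting spellings.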

<(2) E/S/ED/Round/Date updating of existing <CG> games>

We are assuming that <CG_id> is contained in some of the games. <CG> could check the pgn to be exactly the same for all such games, else bounce the submission.

(This requirement should ultimately be relaxed to allow movelist corrections - but we are doing an interim design at the moment).

So, E/S/ED may be overwritten by the normalized submission. Also the Round and Date headers containing valuable biographical data.

<(3) Creation of new <CG> games>

Those games lacking a <CG_id>, or whose <CG_id> has (tid,gid) = (0,0), are created as new games on <CG>, with the other E/S/ED/Round/Date headers copied over.
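Steps (2) and (3) amount to splitting the submission in two. A minimal Python sketch, assuming each game is a dict of PGN headers and that <CG_id> is a dotted string in the order tid.gid.w_pid.b_pid (the helper names are invented):

```python
def parse_cg_id(headers: dict) -> tuple:
    """Read the dotted CG_id tag, falling back to (0, 0, 0, 0)."""
    try:
        parts = tuple(int(k) for k in headers["CG_id"].split("."))
    except (KeyError, ValueError):
        return (0, 0, 0, 0)
    return parts if len(parts) == 4 else (0, 0, 0, 0)


def split_submission(games: list) -> tuple:
    """Separate games that update existing records from games to create."""
    updates, creations = [], []
    for headers in games:
        tid, gid, _w_pid, _b_pid = parse_cg_id(headers)
        # No CG_id, or (tid, gid) == (0, 0): a game <CG> has not seen yet.
        if (tid, gid) == (0, 0):
            creations.append(headers)
        else:
            updates.append(headers)
    return updates, creations
```

The `updates` list is what gets its E/S/ED/Round/Date headers overwritten; the `creations` list becomes new games.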

* * * * *

This bulk submission can then be promoted to a tournament, with a tid assigned by <CG>.

Or... we could allow a submission, much like the above, where no gid=0 is allowed, but tid maps to an existing tid. Then the above could be used to supply either the Round/Date data to pre-existing <CG> tournament games. Or even E/S/ED/R/D data.

That's all really pretty straight-forward when you break it down.

* * * * *

Consider the advantage of this method to correct the Round/Dates of a pre-existing tournament. Just download the games, and use your database to update the PGN. Then you can check your results via a crosstable as you work. When the results look good, just upload the PGN.

No fuss, no muss. More powerful, accurate and efficient than doing each game online, when you can't even see a xtab until you download anyways.

Premium Chessgames Member
  gauer: In addition to the Winboard-style or UCI- style engines, don't forget to also think about the container class displayers to show games: (i) none, (ii) pgn4web, (iii) chessviewerdeluxe, (iv) cvd2, (v) skjbase, (vi) mychess, (vii) mistybeach and (viii) chesstutor - where the 1st 2 display well on my machine. 3 & 4 once were favourites when they would load (without dependency on whatever the bust is with Java nowadays), allowing a variation board and display of date, site, etc alongside the game mentioned on his developer forum (yes, if you get it to load, the variations do display).

Yes, chessbase has its own ("extended pgn" - maybe not "real" pgn - but there's also a few versions of any one pgn "standard" itself, usually explained in help files of different programs, etc) export format, which CVD doesn't exactly complain much about when using online. Some similar Java viewers would also allow a list of games to select from within the viewer window itself (maybe a good way of handling groupings tid games).

I liked the idea of using the comments nesting (can be created with chessbase 9 as one source) so that a 10-fold rep (or 3-fold or 2-fold) of a position at the ending does not have to appear more than a couple of times on the pgn display viewer. I liked it even more for that case when the comment nesting was not used for <ending> positions in chessbase, but when strange things like colour-reversal or ECO code transpositions might've occurred in chessbase. Some of my own personal notes (unshareable) in Opening Explorer positions also include similar comments, as one jumps from one node to a next one (I can even think of QID positions where a transposition at move ply+2 does a "transposes" to a ply+0 position, simply because a B or R has moved twice along a diagonal instead of once to reach a transposed opening book position (I 1st saw this type of example in "chess, black & white", (or similar title) by Kaufman - a big example of an opening book debugger - so there's some homework for you)). The only "trouble" when using a "transposes" word pointer is that it doesn't remember the move number increment after the transposition - but it sure is nice for merges with copying and pasting chessbase pgn into the same nesting scheme. Try it with some of your own annotating next time you add some personal notes with chessbase from a book covering 2 or more games all on the same opening.

I'd usually strip off: plycount, eventdate, result, and ECO (ECO can change from program to program!) headers when saving them to notepad, since (most) can be auto-regenerated through a pgn upload utility. I'd rather rely on the good stuff at/after the final ply-count in the pgn's move list, like whether its plycount matches anything in the header, whether it has a result or # or both following the move list, and whether it matches the header and the result makes sense (was it really a <black> mates in 2 puzzle, or is it a checkmate for <white>?). And I'd only rely on the roster tag headers if the move list is too empty/ambiguous to draw a conclusion from.

So to me, if a pgn tid set lets me create a crosstable primarily first, and displays a game or partial games move list result (perhaps empty unrecorded move list games included), and only then talks about things like consistency of dates or locales, then I am usually happy.

<drafted on 15-June-15 before replied about their RAV stance, that pattern which I don't usually submit with anymore - part 2 of 2 of the draft continued...>

Premium Chessgames Member
  gauer: Pgn should also have the necessities to extend to include things like: (i) annotations / time-recordings / draw-offer comments (I supply these as comments, although chessbase also has other ways of associating punctuation like "?!" to move lists outside comments), (ii) computer score assessments on a per-ply basis, to a certain depth, for a particular annotating engine, (iii) whether a node can display on its board-viewer of what can or should happen with the viewer when it encounters move-branching within a tree, (iv) what happens when it hits a leaf of a tree, (v) etc (in that last case, when you do have a tree with such nesting, and you pivot to promote one or another to a primary variation in chessbase, then you can see why a leaf might want to hang onto a result if the game result of the main var was instead pivoted to an analysis point result (like a computer engine assessment)).

What's normally needed to build a (swiss - I already mentioned why one might want team "tags" or similar (or maybe better yet, find a way to <not> include them in the pgn source itself, while allowing an offline program to find out what is really going on) for some non-swisses...) is the player names (for each black and white player in a round), the round number (not necessarily the date), a result look-up, and to see which if any players had an empty game result to another player or had a bye.

If we can build off of that, then I think we're getting somewhere.

Premium Chessgames Member
  gauer: CCA's (there's no direct link I can see into a page like the U2000 standings, so just click to it...) has quite a few "r/e" memos beside players who have had to somehow need re-pairing(s), sometimes due to section switches mid-tournament, at the expense of paying an entry fee twice by taking a withdrawal and re/entry mid-tournament again. Not sure that FIDE allows such "wild-west" pairing algorithms, which are common tricks that non-FIDE arbiters might use to pair swisses with for the otherwise unpaired player(s).

I couldn't possibly imagine how various pairing software varieties handle such situations, but now that is in the business of trying to display crosstables that could be generated for display amongst other pairing software, it might be useful to know how to display those irregularities (or when a tid is assigning things like 0 or 1 or 1/2 point byes, etc). If pairing is being done "live" in round k of an n round tournament (0<k<n), then the format might want the arbiter to get to pair to round k+1 via the format flexibility (some pairing rules rely on a group theory rule (or isomorphism to a rule by thinking of it in a math way) called associativity when trying to regenerate a table from one round to the next - can be a hard problem in general when ELO style FIDE algorithms aren't quite adhered to).

That's the main reason why I emphasize that if we treat pgn (or x-pgn styles, or RAV or whatever) as a <data structure>, we primarily need to think about how it interacts with other <programs> first, and then maybe things like engines, or displaying moves or fen-string positions, and then worry about what the end user might see - like whether the game headers are human-readable and the moves can be read in his language notation of choice, and only then display the extra goodies, like the event-dates, plycount, ECO and annotation tags. It could also provide navigation points into particular opening explorer positions (could you imagine a case of an "opening explorer notes" note where at an early ply like ply-7, all annotations of all 750,000 games are merged into that note for that move?!).

Maybe <most> tids would have a blank/no team header, if chosen as an extension for use to display team pairings like: - where the author gives <opinions> about swisses.

[Event " Casino de Barcelona (2007) "]
[Site "Barcelona ESP"]
[Date "2007.10.18"]
[EventDate "2007.10.18"]
[Round "1"]
[White " Leinier Dominguez Perez "]
[Black " Marc Narciso Dublan "]
[ECO " French (C11) "]
[WhiteElo " "]
[BlackElo " "]
[WhiteTeam "-"]
[BlackTeam "-"]

1. e4 e6 2. d4 d5 3. c3 ♞f6 4. e5 ♘fd7 5. f4 c5 6. ♘f3 ♞c6 7. ♗e3 cxd4 .... etc 1-0

I'm not going to repeat the whole pgn of today's GOTD. But the Unicode and addition of the URLs to the pgn header might make for a good discussion of whether this has any benefits of a more universal display across languages, pending that we believe that most programs from 2010 can comprehend what the pgn standard couldn't quite handle a quarter century earlier. I'll leave that style of "pgn" open for discussion.

Anyways, this week, I'm preparing more for a weekend tournament and am busy with other work.

Jun-18-15  zanzibar: <gauer> It would be helpful if you could supply an outline to these extensive posts.

And some motivation/roadmap to what your primary goals are, at the outset.

I just tried to read through the posts, and as usual have to go back to the start to reread it all (maybe more than once), just to get oriented.

Can we agree that your discussion is far beyond the immediate goal of getting <CG>'s current PGN database in consistent, correct shape at the 7-tag level?

Premium Chessgames Member
  gauer: Back about 5+ years ago when using chessbase 9 mega (which DOES support crosstable generation and RAV, but does NOT include a round by round pairing generating algorithm, nor even any options in it to support non-<FIDE>-swiss pairings so well - you should've seen the errors I generated in that program), I would help in a local Kitchener tournament or 2 in a live round-by-round basis to generate 0-ply pgn headers for each swiss-sys pairing which an arbiter generated, and sometimes used a variant of the standard FIDE swiss algorithm (since the Canadian fed has a different hierarchy for doing swiss pair-up/down rules than FIDE's would've used).

So adding in the names of all players (like uses a pid), I would even try to colour-pair an unpaired pid for a round to a fictional "ghost"-bye player in a header for a round, to have a crosstable generate correctly (so as to get a proper display sum in it for all but the ghost players). Chessbase's "T" command (its help error warned of "warning: experimental function"!) was very buggy doing it this way, when adding just the pids, 0-ply games, a round, result and event/section name - and fill in the other stuff later on.

2-5 days later I could add in the moves (a handheld was a handy way to capture game scoresheet results), and do a save, and hopefully watch the "T" command not crash. And later distribute the pgn to the TD afterwards. Much of the time it would actually work, but once in a while it would either look for an exit to a long-running loop. So it seems like chessbase was only expecting one of a few "valid" pairing options to display to crosstable via the T command.

Maybe in a newer chessbase release, there is support for doing or import/export of pairing rules from 3rd party pairings algorithms?! That was one of the biggest things I missed in seeking a 3-digit price-tag program, and would be great to see if pgn formats with 0-move lengths would be able to create tables or pairings that would allow a tournament creator to build on with - for doing things like displaying things like posting results to a newsgroup or place like in a real-time basis?!

If the moves of empty game results can be found later, then I'm all for adding them in - when the 2 scoresheets agree and are wanted to be published.

Doing things like spell-checking pid names or matching the correct rating might be something to be done later, within a future step of a tid update, after it gets finished?! Those were the types of problems I was encountering when I did some tournaments back then.

Does SCID support any pairing generation algorithms - or crosstable display formats (incl team formats)?!

Jun-18-15  zanzibar: <gauer> to answer the question at the end ...

<SCID> doesn't do any pairing algorithms afaik.

It does offer several crosstable (xtab) formats -

RR which it calls "all-play-all"

Swiss where it gives color + results vs Round


Knockout (which understands compound Round notation).

The latter gets a little confused with 3rd & 4th place continuations, but displays all matches correctly, just not necessarily sorted properly.

It does not do Scheveningen or other Team match-ups.

(Scheveningen are displayed somewhat properly as RR's with lots of missing pairings.)

Premium Chessgames Member
  gauer: One may want to see G Neumann vs Steinitz, 1870 and the notes to see such situations where RAV notation could be useful. But it is also okay to display it in the current format.

1 d4 Nf6 2 c4 e6 3 Nf3 b6 4 g3 Bb4+ 5 Nc3 Ba6 6 e3 Be7 7 b3 Bb7

1 d4 Nf6 2 c4 e6 3 Nf3 b6 4 g3 Bb7 5 Nc3 Bb4 6 e3 Bd6 7 b3 Be7 is one of many such ways (okay, I didn't check to see how critical it is to look at playing the extra Bishop moves in the variation - was only using it as an example...) to see a non-standard opening transposition between opening explorer nodes. In one case, each bishop moves twice; in the other, one bishop moves once, and the other thrice, to get the same node.

Jun-18-15  zanzibar: <gauer> transpositions between different opening move sequences is a matter that should be factored out entirely from the issue of accurate game data.

Of course RAV notation is very useful, not just for opening transpositions, but also for any analysis worth its salt.

But now I'm wondering what the real thrust of the discussion is?

Certainly, at some point <CG> should support detailed RAV's in user comments, much like <CT> already does. That is, if it's ever planned to allow user analysis to be easily played over online.

Users can just post valid RAV (PGN) excerpts already in a post, although it requires a cut-and-paste into SCID or similar.

I don't see posting annotated games as a major focus of <CG>, at least presently. ChessBase and others already have a big lead in that department.

Jun-19-15  zanzibar: I'm a little rusty with SCID command line interface (it's unfortunately in TCL). But I'm fairly sure it has the ability to produce a Swiss or RR xtab.

Would having such an ability be useful for <CG> tournaments?

It's a rhetorical question of course, from my viewpoint.

Premium Chessgames Member
  gauer: I think it <would> be useful to have a consistent output crosstable format as what chessbase or SCID (or maybe what or swisssys or swissperfect, etc might export to Excel in some sort of comma separated value format), and has a basic editor tool embedded (not the fixed-width font; instead a leaderboard generator of the html crosstable) into its tid pages (not sure if it would export something similar that swisssys or swissperfect could read/use).

I'm not sure whether <pairing> programs are in general able to <just display> (maybe 0-move games) pgn's from a round if its output is being exported to a projector in the hall - but probably it's safe to assume they (ie swisssys, swissperfect - sometimes <does include> a pgn link on its output page) wouldn't have much for <pgn editor tools> like chessbase or SCID.

And I also <do> like the chessbase format of using a round.table compound format option of being able to export with, in the round header (or at least have an Xpgn header to carry a table/board header, along with the round - team pairings use both the table and board as well as the round, or specify the seat position, and the country team pairings for the round). Maybe this data could be carried as extra, optional drop-down fields to fill out on the gid/tid pages, instead of inserted as a compound round structure (or at least I'm not sure why we'd want to remove the extra data, once we know we had the seating positions for a round - usually generated by a rule related to FIDE ratings ranking).

I tried answering that one on my forum a day or so ago, but today I'm gone to a tournament (so I'll be ignoring those bistro posts all weekend) - the only major difference was that the 3rd digit number was to indicate whether a game was a replayed in the Australian championship (the software cared about having 2 decimals in the round field, but specifying a (maybe large) number after the round decimal point seems to allow for arbitrarily large numbers - which to the human reader can be read in a string format). Again, it might be interesting in the meantime to see just how flexible pairing software can be, if you had not looked at some of the repairings in those CCA tournaments.
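For what it's worth, a compound round value sorts wrongly as plain text ("10.1" comes before "2.3"), so any tool consuming it needs a numeric key. A small Python sketch, with `round_key` being an invented name and the meaning of each component (round, table/board, replay number) left to the convention in use:

```python
def round_key(round_tag: str) -> tuple:
    """Turn a compound Round value such as "3.12.1" into (3, 12, 1).

    Non-numeric values such as "?" or "-" yield an empty tuple, which
    sorts ahead of every numbered round.
    """
    try:
        return tuple(int(part) for part in round_tag.split("."))
    except ValueError:
        return ()
```

Used as `sorted(rounds, key=round_key)`, this keeps round 2 ahead of round 10 however many decimal parts follow.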

Jun-20-15  zanzibar: <gauer> the topic is crosstables (xtabs).

A big fat "yes" vote from me on the importance of having a reference xtab, and being able to dynamically generate xtabs from tournament games.

Most people like leader-sorted RR's for the main display. But for purposes of crosschecking I think a Swiss is essential. No tournament should be accepted without a valid Swiss, showing color-pairings and rounds.

Also, I think name sorted versions of all the tables should be available somehow, again for checking. Since tiebreak rules can be so complicated, sorting on collation-names (i.e. surname, prenom) is important.

All xtab software should be able to produce name sorted tables, and then we have a canonical format to check.

Informant did that, having a column for placement.

How <CG> exposes these reference tables is a detail. Maybe by a dropdown selector, or a link to a page dedicated to the tables, etc.

Being able to dynamically generate xtabs from the actual games in a snapshot of the tournament is very valuable for checking for inconsistencies that can happen after the intro page is published.

The kind of errors that shouldn't happen but do.

And unless you're willing to write thousands of lines of code to find errors, the very best general-purpose tool is the Swiss xtab.

Everybody can hunt out bugs if they learn how to use the Swiss+RR xtabs from a tournament PGN.

Premium Chessgames Member
  gauer: Are there any sort of standard columns that we should and should not want to include in (i) round-by-round or (ii) leaderboard crosstables? Are there any <optional> columns that we'd want to import and export via csv (comma-separated-value) format?

The problem is that if we do allow 0-move game stubs from which to build a crosstable strictly automatically, we need to know how to handle players paired to an unpaired bye (ghost pairing). If, on the other hand, we do some of the editing of the table by hand, then we don't necessarily need all games present when voting in a tournament, but the scores of byes and unplayed games need to be inserted manually (especially when they are different from the FIDE usage recommendations of when to assign a bye of a certain score - pairings programs have about 5-10 symbols other than win/loss/draw, and also carry the colour/floats in the pairing to the next round, and maybe it would be good for our tables to know how to handle these as well, for some not-quite-perfect-swiss algorithm - swisssys has a number of choices, too).

Not sure if it's worth's time to carry that excess baggage - but it might've helped in reconstructing some of the older US open due-colour pairings, etc - if it was better known of how the pairing variations were working as we later found more new games of it.

Jun-20-15  zanzibar: <gauer> Non-scoring byes need no stub. There simply isn't any pairing for the player in the round, and the xtab just inserts "...".

Scoring byes need a fictitious player or maybe more, let's name them BYEn, where n is used for enumeration purposes only.

No need for these players to have a bio, or to have their statistics tracked.
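A rough sketch of the BYEn idea, just to make it concrete. The dict layout of a game, the choice to give the real player White, and the helper names are all assumptions for illustration, not anything <CG> actually uses.

```python
import re

# BYE1, BYE2, ... ; the number only enumerates, it carries no meaning.
BYE_NAME = re.compile(r"^BYE\d+$")

def is_bye(name):
    """True for the fictitious bye players, False for real names."""
    return bool(BYE_NAME.match(name))

def scoring_bye_stub(player, rnd, n, full_point=True):
    """Build a 0-move stub pairing `player` with the fictitious BYEn."""
    result = "1-0" if full_point else "1/2-1/2"
    return {"Round": rnd, "White": player, "Black": f"BYE{n}",
            "Result": result, "moves": []}

def real_players(games):
    """Names that should get a bio and statistics (bye players excluded)."""
    names = set()
    for g in games:
        names.update(p for p in (g["White"], g["Black"]) if not is_bye(p))
    return names

stub = scoring_bye_stub("Smith", rnd=3, n=1)
print(stub["Black"], real_players([stub]))
```

A non-scoring bye gets no stub at all, so the player simply has no entry for that round.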

As far as the extra symbols used by pairing programs, I'd have to gain some familiarity to say anything further.

As for the extra columns, I like the SCID system, imperfect though it may be. It includes the player's age, nationality (federation), ELO, and titles. Also, S-B tiebreak scoring.
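For reference, S-B (Sonneborn-Berger) adds up the final scores of the opponents a player beat plus half the final scores of the opponents drawn with. A minimal sketch, assuming a hypothetical {player: [(opponent, points_scored), ...]} layout that is not SCID's own:

```python
def sonneborn_berger(results, totals):
    """results: {player: [(opponent, points scored against them), ...]}
    totals:  {player: final tournament score}
    A win counts the opponent's full score, a draw half of it, a loss nothing.
    """
    return {
        player: sum(pts * totals[opp] for opp, pts in games)
        for player, games in results.items()
    }

# Toy 3-player round robin: A beat B, A drew C, B beat C.
results = {
    "A": [("B", 1.0), ("C", 0.5)],
    "B": [("A", 0.0), ("C", 1.0)],
    "C": [("A", 0.5), ("B", 0.0)],
}
totals = {p: sum(pts for _, pts in games) for p, games in results.items()}
sb = sonneborn_berger(results, totals)
print(sb)
```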

Premium Chessgames Member <I'll describe some of the requirements for the <CG> database design, as I see them.>

I have some opinions on that post and I'll get back to you soon.

Jun-22-15  zanzibar: <chessgames> Good, it would be nice to get some discussion going with an eye towards implementing some of the ideas.

Just a quick word on the rationale for the terser <CG_id> tag.

Basically, this tag isn't intended for people to edit by hand. It's intended for transfer of information from one program to another in the PGN.

Therefore, it was designed for easy parsing, and to avoid taking up too much space in the PGN.

It is much easier to look for the one tag (<CG_id>) than to have to parse out <ChessgamesWhiteID>, <ChessgamesBlackID>, <ChessgamesTournamentID>, <ChessgamesGameID>.

The information is naturally a tuple, and all I have to do is process all the tags normally, then:


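try:
    # "cg_id" is a stand-in name; the original assignment target did not survive
    cg_id = tuple(int(k) for k in g.CG_id.split("."))
except (AttributeError, ValueError):
    # tag missing or not made of integers
    cg_id = (0, 0, 0, 0)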


If people understand the tuple that's fine, but the tag is intended to mark off pre-existing <CG> games which a biographer downloads to begin the tournament build process.

These pre-existing games are only edited for Round/Date information. Missing games are added, but without the <CG_id> tag.

It is also more compact, so it takes less file space. Additionally, if you know the format you can search with an editor and have all the information right there, one line per game.

It has a lot of advantages.
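To show the "one line per game" point, here is a small sketch that pulls every <CG_id> tag out of a PGN file's text. It relies only on the standard PGN tag-pair syntax and on the value being four dot-separated integers; the helper names and the sample ids are invented, and nothing is assumed about which position holds which id.

```python
import re

# A PGN tag pair on its own line: [CG_id "n.n.n.n"]
CG_ID_TAG = re.compile(r'^\[CG_id\s+"([^"]*)"\]\s*$', re.MULTILINE)

def parse_cg_id(value):
    """Turn "n.n.n.n" into a 4-tuple of ints, or zeros if malformed."""
    try:
        ids = tuple(int(k) for k in value.split("."))
    except ValueError:
        return (0, 0, 0, 0)
    return ids if len(ids) == 4 else (0, 0, 0, 0)

def cg_ids_in_pgn(pgn_text):
    """One tuple per CG_id tag found, in file order."""
    return [parse_cg_id(v) for v in CG_ID_TAG.findall(pgn_text)]

# Sample text with made-up ids; the second tag is deliberately malformed.
sample = '''[Event "Example"]
[CG_id "11.22.33.44"]

1. e4 e5 *

[Event "Example"]
[CG_id "oops"]

1. d4 d5 *
'''
print(cg_ids_in_pgn(sample))
```

Games added by the biographer carry no tag, so they simply do not show up in the list.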

Jun-23-15  zanzibar: I did a series introducing Swiss error checking - which could include RR and maybe KO tournaments as well.

Basically, we are looking for a player showing up twice within a round:

Biographer Bistro (kibitz #11707)
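The basic check can be sketched in a few lines. The (round, white, black) tuple layout and the function name are assumptions for illustration only.

```python
from collections import Counter, defaultdict

def repeated_players(games):
    """games: iterable of (round, white, black).
    Returns {round: [players who appear more than once in that round]},
    listing only the rounds that have a problem.
    """
    seen = defaultdict(Counter)
    for rnd, white, black in games:
        seen[rnd][white] += 1
        seen[rnd][black] += 1
    problems = {}
    for rnd, counts in seen.items():
        dupes = sorted(p for p, n in counts.items() if n > 1)
        if dupes:
            problems[rnd] = dupes
    return problems

# Toy data: A and C each show up twice in round 1.
games = [(1, "A", "B"), (1, "C", "D"), (1, "A", "C"), (2, "A", "D")]
print(repeated_players(games))
```

An empty result does not prove the tournament is right (a missing game leaves no trace here), but any hit is a definite error.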

Having introduced the problem via some easy examples let's tackle a more difficult case - <Baku Open (2013)>:

Baku Open (2013)

The question is what is going on in R5?

Swiss can be difficult due to missing games. But here we have 69 players for a 9-round tournament (with an odd field, one player goes unpaired each round, leaving 68 at the boards):

Expected N_games = 68*9/2 = 306

In fact, <CG> has 302 games, whereas TWIC gives 308 games. So the coverage is almost complete.

(Note: TWIC gives 72 players, since it includes 3 R1 dropouts maybe(?))
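The arithmetic, as a tiny sketch. It assumes nobody withdraws and that an odd field leaves exactly one player unpaired per round; the function name is made up.

```python
def expected_games(n_players, n_rounds):
    """Games in a Swiss with no withdrawals: floor(n/2) boards per round."""
    return (n_players // 2) * n_rounds

expected = expected_games(69, 9)
print(expected)                  # 306
print(302 - expected)            # <CG> is 4 games short of that
print(308 - expected)            # TWIC has 2 more than that
```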

We can look at the games/round:

>>> [ len(G[r]) for r in range(1,10)]

[33, 34, 34, 34, 41, 26, 33, 34, 33]


And it appears that R5 has "stolen" games from R6.

How many? Could be 7 or 8 depending.
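Comparing each round against a full round of 34 boards makes the "7 or 8" visible. A small sketch using the counts listed above (the variable names are my own):

```python
# Games per round as found in the <CG> snapshot.
counts = {1: 33, 2: 34, 3: 34, 4: 34, 5: 41, 6: 26, 7: 33, 8: 34, 9: 33}

full_round = 69 // 2  # 34 boards when one of 69 players is unpaired

# Positive = too many games in the round, negative = too few.
surplus = {r: n - full_round for r, n in counts.items() if n != full_round}
print(surplus)
```

R5 is 7 over and R6 is 8 under, so the two numbers do not match exactly. One reading is that 8 games moved and R5 is also missing one of its own, but that has to be confirmed from the games themselves.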

Now here's the trick. List out R5 games, sorted by gid to see a possible bias:

Note that the games @g

<[1732270, 1732271, 1732275, 1732276, 1732277, 1732278, 1732286, 1732287]>

are injected in between R5.10/@g1732256 and R5.11/@g1732798.

Move those out of R5 and into R6, and one gets a correct Swiss.
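A sketch of that last step: move a list of gids from one round to another, then confirm no player appears twice in either round. The {round: [game dicts]} layout, the key names, and the toy gids are assumptions for illustration; this is not the real Baku data.

```python
def move_games(G, gids, src, dst):
    """Return a copy of G with the listed gids moved from round src to dst."""
    wanted = set(gids)
    fixed = {r: list(games) for r, games in G.items()}
    moving = [g for g in fixed[src] if g["gid"] in wanted]
    fixed[src] = [g for g in fixed[src] if g["gid"] not in wanted]
    fixed.setdefault(dst, []).extend(moving)
    return fixed

def round_is_clean(games):
    """True if no player appears in more than one game of the round."""
    players = [p for g in games for p in (g["white"], g["black"])]
    return len(players) == len(set(players))

# Toy example: game 3 is filed under R5 but belongs in R6.
G = {
    5: [{"gid": 1, "white": "A", "black": "B"},
        {"gid": 2, "white": "C", "black": "D"},
        {"gid": 3, "white": "A", "black": "C"}],
    6: [{"gid": 4, "white": "B", "black": "D"}],
}
fixed = move_games(G, [3], src=5, dst=6)
print(round_is_clean(G[5]), round_is_clean(fixed[5]), round_is_clean(fixed[6]))
```

Working on a copy keeps the original snapshot around in case the guess about which games to move turns out wrong.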

* * * * *

So, the question becomes, how is <CG> going to clean all that up?

(1) Doing it by hand as above seems much too time-consuming.

(2) Not doing any fix is an option.

(3) Redoing all the tournaments from a source like TWIC or the official sites seems likely to be the best and most efficient option.

(4) ???


Jun-24-15  zanzibar: <chessgames> I am really wondering what the best approach to revamping the <CG> promoted tournaments should be...

I am truly beginning to think that the best approach is to jettison the current batch of non-biographer tournaments/games and just rebuild the post-2000 set of tournaments anew.

So I'm led to a series of questions (as usual):

Are you aware of the issues?

Assuming yes, do you have a rough idea of the scope of the problem?

And in consequence, what might be your suggestions on remedies?

While we're here, have you also given any thought to the need for biographical documentation for these tournaments?

Jun-29-15  zanzibar: Doing the ICCF stuff jammed me up yesterday (and some today)... as they use unicode in their ratings list.

Some of the PGN I got from <crawfb5> also uses unicode (or more accurately, some extended character set), whereas the other source I used stuck with ascii.

So doing the ICCF provided me a good excuse to actually bite the bullet and see how involved handling unicode names is.

The answer is that it's quite involved, and very confusing at first.

Ok, it's not so bad once you realize who uses what encoding, but the whole business has gotten so many people so confused that it's hard to find good targeted advice on the web.

And in the end, I think my earlier conclusion from my first utf-8 attempt with SCID still stands - it's not really worth the effort.

Still, this time through a little more progress was made, and I'll provide some notes and code later.

One thing is for sure, it's necessary to maintain ascii compatibility at some level, so a player's name would have to track both ascii and unicode versions of the long/short <CG> names.
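One possible way to keep the two versions side by side, as a minimal sketch: store the unicode name and derive the ascii one by stripping accents. The function names and the record layout are invented for the example, and this is only one folding scheme among several.

```python
import unicodedata

def ascii_fold(name):
    """Strip accents by decomposing characters and dropping non-ascii parts."""
    decomposed = unicodedata.normalize("NFKD", name)
    return decomposed.encode("ascii", "ignore").decode("ascii")

def name_record(name):
    """Keep both spellings together so either can be used for lookups."""
    return {"unicode": name, "ascii": ascii_fold(name)}

rec = name_record("Réti, Richard")
print(rec["unicode"], "|", rec["ascii"])
```

Beware that letters with no decomposition (ø and ł, for instance) are dropped outright rather than mapped to o and l, so a hand-made substitution table would be needed on top of this.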

The other conclusion is that some unicode encoding has to be used to manage the byte-streams, and it isn't always utf-8. So beware.
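A sketch of the "it isn't always utf-8" problem: try a short list of candidate encodings in order and report which one fit. The candidate list and the function name are assumptions; the right list depends on where the files come from.

```python
def decode_bytes(raw, encodings=("utf-8", "cp1252", "latin-1")):
    """Return (text, encoding) for the first candidate that decodes cleanly."""
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings fit")

# The same name as bytes from two differently encoded sources.
as_utf8 = "Réti".encode("utf-8")
as_latin = "Réti".encode("latin-1")
print(decode_bytes(as_utf8))
print(decode_bytes(as_latin))
```

A clean decode only shows that the bytes are legal in that encoding, not that the guess is right, so the output still needs a sanity check against known names.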
