Feb-16-19
 | | AylerKupp: <<keypusher> Yes, I think AK was just trying to define the four terms I was asking about (thanks, AK!), not say that Lc0 had prior knowledge or whatever.> Correct as I mentioned to <alexmagnus> above. I hope that I succeeded to at least some extent. Along those lines you and others might find this interesting. https://www.chessprogramming.org/Ne... gives a list (likely incomplete, but it's a start) of neural network-based chess engines in alphabetical order. I thought it would be more interesting to sort the list in chronological order and this is what I came up with: 1989 Morph
1994 SAL
1995 NeuroChess
1996 Alexs
1998 Chessterfield
1999 Octavius
2000 ChessMaps
2004 Tempo
2005 Hermann
2005 Scorpio
2006 Blondie25
2007 Stoofvlees
2011 Arminius
2014 Deep Pink
2015 Giraffe
2015 Zurichess
2016 Spawkfish
2017 AlphaZero
2017 Gosu
2018 Leela Chess Zero
(unk) Golch
The dates are approximate; they refer either to the time a paper describing that particular engine was published or to the time the engine first appeared in a tournament or match. Still, I think it's interesting to see where AlphaZero and LeelaC0 fit in the chronology of neural network-based engines. I think that Giraffe is of special interest. Its developer, Matthew Lai, developed it as part of his Master's thesis and continued to improve it. But he was hired by Google's DeepMind and decided that it would be a conflict of interest to continue developing it, given his position as part of the AlphaZero development team and the trade-secret information he obtained there. Of particular interest, I think, is Lai's paper "Giraffe: Using Deep Reinforcement Learning to Play Chess", of which, unfortunately, I could no longer find a *.pdf version. And I think that the following is also interesting: https://motherboard.vice.com/en_us/... AlphaZero was not the first application of reinforcement learning to chess (see, for example, http://citeseerx.ist.psu.edu/viewdo..., published in 2008), but it's certainly the best known one. I think that reinforcement learning in chess, as opposed to supervised training, where the engine's neural network is trained, like LeelaC0's, by feeding it a set of games as input, holds the most promise for uncovering new chess principles because it's not constrained by the biases inherent in games played by human players. But again, what do I know? |
|
Feb-16-19 | | nok: Yep. As much as Google would like us to think otherwise, twas clear from the beginning that A0 was warmed-up ideas backed w/ big money, i.e. big hardware. AlphaZero - Stockfish (2017) (kibitz #186) |
|
Feb-16-19
 | | alexmagnus: <If true (and I believe it is) then that's a meaningless statement since <all> neural networks start with zero knowledge> Not quite. I'd say training data are a kind of knowledge. The original AlphaGo (the one that beat Lee Sedol) was trained on top human games as opposed to self-play. That is, its knowledge consisted of the rules of Go <and> those games, even though in the beginning it could make no sense of the games. Then (again, with Go) they created a network that trained purely by self-play, from zero knowledge. Thus AlphaGo Zero was born, and it turned out to crush the original AlphaGo. Then that algorithm was made general-purpose so it could learn chess, Go and shogi. And <this> is AlphaZero. |
|
Feb-17-19
 | | keypusher: <AK> incidentally, I think you noted earlier that A0 appeared to plateau around 3500. One of the developers is quoted in <Game Changer> saying that in Go AlphaZero continued to improve via self-play for a long time, but in chess improvement pretty much stopped after a while, probably because the draw rate got very high once A0 got good. In Go (and shogi) draws remain rare even at very high levels of play, I guess. |
|
Feb-17-19
 | | keypusher: <AlexMagnus> You were right — the focus for DeepMind was Go, not chess. Demis Hassabis, the CEO, has a dramatic account of the development of AlphaGo and the Lee Sedol match. They had to announce the match to coincide with the publication of a Nature paper, at a time when Sedol was definitely still stronger than AlphaGo. They expected, based on the program’s rate of improvement, that it would surpass Sedol before the match, but they couldn’t be sure. Even when the match arrived, they thought the program was stronger, but they were worried about “overfit” — that AlphaGo had gotten really good at beating itself, but might not be able to handle Sedol. There was also a glitch in the program that caused the one defeat AlphaGo suffered in the match. You have to feel for Sedol. Based on the previously published AlphaGo games, he had no reason to think it could beat him. |
|
Feb-17-19
 | | Ron: Game Changer: AlphaZero's Groundbreaking Chess Strategies and the Promise of AI, a recently published book about AlphaZero: https://www.amazon.com/Game-Changer... |
|
Feb-17-19
 | | alexmagnus: <You have to feel for Sedol. Based on the previously published AlphaGo games, he had no reason to think it could beat him.> Yes, based on the Fan Hui match, it was expected AlphaGo would need one-stone odds to compete with players of Lee Sedol's level. |
|
Feb-17-19
 | | keypusher: <Ron> I recommend the book. I’ll try to comment on it more intelligently at some point. I suspect a better book could be written (perhaps, has been written) about DeepMind’s assault on Go. But I did find it fascinating. |
|
Feb-17-19
 | | AylerKupp: <<alexmagnus> Not quite. I'd say training data are a kind of knowledge.> I thought that's what I said, or at least what I tried/meant to say. All neural networks need to be trained since they start out with zero or near-zero knowledge (in chess I would consider position representation and the rules of the game as part of the near-zero knowledge) and then become trained by analyzing their training sets. Generating a good training set is not a trivial task since it needs to consider many aspects. See https://arxiv.org/pdf/1509.01549.pdf, section 4.3. And different approaches, all valid, are used to generate this training set:
(a) Most engines: the training set is derived from master and grandmaster-level games.
(b) AlphaZero: the training set is derived from self-play.
(c) LeelaC0: the training set is derived partly from self-play and partly from games against opponents, i.e. games played online between LeelaC0 and opponents on the Internet.
I think that (b) is best because it removes any human biases and allows the neural network to uncover game-playing possibilities that might not have previously occurred to even the best human players. It's hard to see how this can be done with (a) or even (c). |
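[For concreteness, a minimal sketch in Python of how a game record from any of the three sources above becomes policy/value training examples. This is my own illustration, not the actual AlphaZero or LeelaC0 pipeline; the tuple layout and the use of the raw move list as a stand-in for the position encoding are assumptions made only for this example.]

    def game_to_examples(moves, result):
        # moves: the move sequence of one game; result: +1, 0 or -1 from White's point of view
        examples = []
        value = result
        for ply, move in enumerate(moves):
            position = moves[:ply]      # the move sequence so far stands in for the position encoding
            policy_target = move        # the move actually played; self-play setups can use MCTS visit counts instead
            examples.append((position, policy_target, value))
            value = -value              # the value target flips with the side to move
        return examples

[In (a) the games come from human masters, in (b) entirely from the engine's own self-play, and in (c) from a mix; the example format is the same either way.]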
|
Feb-17-19
 | | AylerKupp: <<keypusher> I think you noted earlier that A0 appeared to plateau around 3500. One of the developers is quoted in <Game Changer> saying that in Go AlphaZero continued to improve via self-play for a long time, but in chess improvement pretty much stopped after a while, probably because the draw rate got very high once A0 got good.> Well, I can only address chess since I'm not at all familiar with Go or Shogi. In the figures shown in the original paper, "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm" (https://arxiv.org/pdf/1712.01815.pdf), as well as the follow-up article, "A general reinforcement learning algorithm that masters chess, shogi and Go through self-play" (https://deepmind.com/documents/260/...), AlphaZero's rating reached a minor plateau between 40K and 100K training steps and then increased, but it pretty much flattened out after about 150K training steps. The figure for Go was a little different: its playing strength increased rapidly until about 180K training steps, more slowly until about 380K training steps, and much more slowly after that. So it seems that there is some sort of playing strength barrier <with the current configuration of AlphaZero>. But I don't think that it has anything to do with the increased number of draws as AlphaZero got better but more with the limitations of the AlphaZero configuration. Performance of deep neural networks can usually be improved by adding more hidden layers, so this apparent barrier could be overcome by doing that. This will, of course, require time as well as possibly additional and more powerful hardware. I personally don't see the point of Google/DeepMind doing that since they seem to have made their point, but you never know. I plan on getting a copy of "Game Changer" once it's available in paperback around mid-March. I hope it's not yet another rah-rah blabbering about the "inherent superiority" (which <could> be true, I just don't think it's been proven) of the neural network-based approach to chess engine development and that it provides a more detailed description of the process of developing AlphaZero and the hardware configuration used in the AlphaZero vs. Stockfish matches. Any opinion as to whether these topics were adequately addressed? |
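[As a rough illustration of the "more hidden layers" point: the depth of a residual tower is just a parameter one could raise. This is my own PyTorch sketch, not DeepMind's code; the 19-block, 256-filter and 119-input-plane figures are my reading of the published architecture and should be treated as assumptions.]

    import torch.nn as nn

    class ResBlock(nn.Module):
        def __init__(self, channels):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels), nn.ReLU(),
                nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels))
            self.relu = nn.ReLU()

        def forward(self, x):
            return self.relu(self.body(x) + x)  # skip connection around two convolutional layers

    def make_tower(n_blocks=19, channels=256, in_planes=119):
        # Raising n_blocks (or channels) is the "add more hidden layers" experiment.
        stem = nn.Sequential(nn.Conv2d(in_planes, channels, 3, padding=1),
                             nn.BatchNorm2d(channels), nn.ReLU())
        return nn.Sequential(stem, *[ResBlock(channels) for _ in range(n_blocks)])

[Whether extra depth actually moves the plateau is exactly the open question raised above; the sketch only shows that the knob exists.]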
|
Feb-17-19
 | | keypusher: <AK> I got it (in paperback) two days ago and I’m enjoying it enormously. I certainly wouldn’t tell you not to buy it. But I’m not sure you would enjoy it as much as I do. Sadler (like me, frankly) clearly fell in love with how A0 plays, and that colors the book, although he takes pains to point out SF’s strengths too. Also, this book was clearly done in cooperation with DeepMind, so there’s no criticism of anything DeepMind did. I think A0’s biggest fan would admit that the conditions of the 2017 games, for example, were unfair to SF. But that’s simply not discussed. I think you would find the discussion of the development and functioning of the neural network very simplistic. I doubt you would learn anything. To be fair to the authors, I think they decided that the likely audience for the book would not be able to absorb a more complicated discussion, and they’re probably right. But that does limit the utility of this part of the book for people with backgrounds like yours. There is an appendix that discusses the hardware issue more clearly (I think) than the Science paper. You’ll be interested in that. But I’ll probably post info from that appendix on this page well before mid-March, so you may want to watch this space. :-) |
|
Feb-18-19 | | SugarDom: But Lee Sedol was not the strongest human Go player at that time. |
|
Feb-18-19 | | MrMelad: <AylerKupp: Generating a good training set is not a trivial task since it needs to consider many aspects> Well, I suppose generating *the best* training data set can be a bit challenging, but all the choices you mentioned and those in the article are <good> choices for training data sets per se. When it comes to deep learning and neural networks, if you have a big data set with results (for example, hundreds of thousands of quality high-level games with the result - 1-0, 0-1 or 1/2-1/2) - it is a very good data set. And if you can trivially generate it, like Leela with Stockfish or AZ with random self-play, it is even better. Generating data for most problems in deep learning is not so trivial. Consider, for example, trying to develop a network to determine if an image contains a dog. Even if you have many thousands of images, if nobody has marked your images for containing dogs, it will not be very trivial to "generate" your data. |
|
Feb-18-19
 | | alexmagnus: < But Lee Sedol was not the strongest human Go player at that time.> World number 5 IIRC. |
|
Feb-18-19
 | | AylerKupp: <<MrMelad> When it comes to deep learning and neural networks, if you have a big data set with results (for example, hundreds of thousands of quality high-level games with the result - 1-0, 0-1 or 1/2-1/2) - it is a very good data set. And if you can trivially generate it, like Leela with Stockfish or AZ with random self-play, it is even better.> I was a systems engineering manager and, as such, I was typically responsible for systems test. Likewise, when I was a software engineer I had to construct test sets. Over time I determined that a good test set (which I think would be similar to a good learning set) has 2 basic components:
(1) A synthetically generated test set. The objective of this test set is to test for as many conditions as possible using the smallest number of test cases. This is not easy to do since you (or at least I) tend to overlook some of the conditions that need to be tested for. But it's an important component in terms of maximizing the number of conditions you test for with a minimum number of test cases. That's important because the more test cases you have, the longer the tests take, and time, as usual, costs money.
(2) A real-life test set. It's simply impossible (it certainly was for me) to devise a synthetic test set that covers all the conditions that you need to test for. I think that it's always important to also include real-life data in your test set because it will point out test conditions that you would never think of. Fortunately this is usually relatively easy to generate, although not efficient in covering all the conditions that need to be covered with a small number of positions. An example I learned about (fortunately I was not involved in developing the test set) was the case of an infrared imaging missile seeker that a group in my company developed and went to test in Germany. It was winter time and there was a lot of snow on the ground. It never occurred to the algorithm developers that the ground would appear hotter than the expected targets because of the reflection of the sun. It invalidated <all> the algorithms they had developed and they had to cancel the test.
I've never been involved in developing a learning set for training a neural network but I suspect it would have similar characteristics to a test set. You can develop a training set for a chess engine with specific positions to teach the neural network as many features of the domain as you can, but I will go out on a limb and say that you are <guaranteed> not to address all the needed features. So I think that you would need to complement this learning set with a brute force-like set of, in chess, master and grandmaster-level games. And I would think that a good learning set would involve both approaches. I would say that self-play falls into category (2). Yes, the more games involved in the learning process, the greater the chance that all the necessary positions will be covered. But it might be best for the AlphaZero and LeelaC0 teams to review the games their engines play to uncover situations where the engine did not handle the position as well as it could have, and include these positions in their learning set. This is analogous to developing a training set for a neural network to recognize dogs. You need to define the characteristics of a dog, from elementary to more complex details, in the different hidden layers of the network and organize them hierarchically.
Even things like color are important because it's just as important for a network to reject, say, a green dog as to recognize a brown one. |
|
Feb-19-19 | | MrMelad: <AylerKupp> Interesting post. <Likewise, when I was a software engineer I had to construct test sets. Over time I determined that a good test set (which I think would be similar to a good learning set)> I don't know in what manner you mean "similar". In deep learning the emphasis is usually on huge data sets (the maximum amount of data) with some "ground truth" attached to them (also called "gold"), while automatic tests usually require the minimum amount of data that covers the test requirements. For example, for an automatic test (especially an elaborate system-level test), running over huge similar data sets for the small chance of encountering an unexpected behavior is infeasible - you are usually much better off mapping the different possibilities and finding (or generating) small data sets that cover those. In deep learning the approach is completely different - the more data you have, the more robust and reliable your neural network will become as it gets more and more reinforced with new data (of course there can be a limit to improvement, like the limit the AZ team encountered and reported). <You can develop a training set for a chess engine with specific positions to teach the neural network as many features of the domain as you can, but I will go out on a limb and say that you are <guaranteed> not to address all the needed features> Actually, the major advantage of the AZ approach was that it didn't rely on mapping "features" of chess, but specifically on the assumption that random play will eventually cover all features. I think this is a big advantage since it didn't soil the algorithm with assumptions that might later prove to be false. <So I think that you would need to complement this learning set with a brute force-like set of, in chess, master and grandmaster-level games. And I would think that a good learning set would involve both approaches> I have to disagree on that; another advantage, I think, was that it <didn't> use any human games at all. This is an advantage because:
1. There aren't that many available games of GM level - a few hundred thousand are not very relevant when your data set is tens of millions of games.
2. If you soil your data set with non-optimal solutions (human play) and reinforce the network based on those, it can degrade the overall level of the network.
3. The network eventually reached a certain limit, which means it didn't really matter what they used for the initial data; random play was completely sufficient for that end. |
|
Feb-19-19 | | MrMelad: <But it might be best for the AlphaZero and LeelaC0 teams to review the games their engines play to uncover situations where the engine did not handle the position as well as it could have, and include these positions in their learning set> This can be a good idea if indeed the games had a certain bias toward some sorts of positions and lacked others, but it seems that they tried to avoid this by not always choosing the best move according to the network; instead there was some random factor in the move selection, which, I think, would eventually lead to all types of positions and test all types of strategic approaches. |
|
Feb-20-19
 | | Penguincw: Some information about AlphaZero: https://en.chessbase.com/post/the-a.... |
|
Feb-22-19
 | | AylerKupp: <MrMelad> Thanks for your comments. Perhaps I'm a bit of an old fogey wedded to old-fashioned ideas as well as ignorant of the best ways to train a neural network-driven solution to a problem. So be it; I'm both open to and eager to learn. By "similar" I just meant that both test case datasets and training data sets can require a large amount of data to ensure adequate coverage of all the possibilities. And in test case applications it's important to have a measure of the test coverage, i.e. an estimate of the percentage of faults that remain undetected with the test case set as currently implemented. Only when the percentage of undetected faults is sufficiently small (it will never be zero) will the test case set be considered "adequate". And since a test case dataset will be used many times to test similar items, it's important that the test case dataset be reasonably efficient, i.e. the average number of faults detected per test case should be adequately high. But this is not a concern when training a chess engine network since the network will (presumably) be trained only once, and the training set is not directly involved in playing the games. Yes, in the case of training a neural network the more data you have, the more robust and reliable your network can be, <provided> that the new data is not redundant with the old data. And perhaps you do want redundancy in order to reinforce the learning of the network. I just don't know. The assumption that purely random play will <eventually> cover all features is probably a good one, provided that we have a handle on how long "eventually" is and can afford to wait that long. And as you provide the network with more and more data (what is sometimes referred to as "naive" training), there is the danger of overfitting and making the network <too> responsive to new data. I don't think that the argument that using master- and grandmaster-level games to provide the initial data to train the network will "soil" it holds water. After all, what other than deliberately bad play will "soil" your data? "Eventually", if the initial training set is augmented by self-play, the human-induced errors will be overwhelmed by the reinforcement process. But I think that using the results of human (or even other engines') play would provide better initial play, and then the training can resort to self-play once novelties are played, since obviously there would not be any record of best play against moves that have never been played before. But, again, I could either be completely wrong, or the amount of time required to train a neural network-based engine with self-play may not be long enough to be of great importance. As far as the neural network reaching a certain limit of capability, it could mean that the number of features detected was constrained by the number of hidden layers in the network. If you don't have enough layers and neurons in the network you simply will not be able to extract the required number of features to realize optimum play. Expanding the network to cover these additional features would likely require additional modeling and hardware, which may or may not be of major importance. But the field is a relatively new one and I suspect that enhancements will be coming along fast and furious. |
|
Feb-22-19
 | | AylerKupp: <<Penguincw> Some information about AlphaZero> (part 1 of 2) There has unfortunately been so much hype concerning AlphaZero that it has acquired a cult following, and articles about it contain a lot of fallacies and inaccuracies by people who should know better. Here are some examples from https://en.chessbase.com/post/the-a... <What he did not tell me at the time was that they were already developing a chess engine that was unlike anything anyone had ever seen before.> False. Neural network-based chess engines have been around for at least 20 years as I indicated in AlphaZero (Computer) (kibitz #412). Sure, the early versions could only handle a small chess subset (like endgames with a much reduced piece count), or played abominably, or both. But that's not new or unique. Early "classic" chess engines at comparable stages of development also either played with a board subset (e.g. a 6x6 board) or played abominably, or both. <The DeepMind neural network took a radically different path> Yes, radically different from "classic" chess engines, which I define as having a hand-crafted evaluation function, a search tree expanded via iterative deepening and quiescence search, best moves evaluated by the minimax algorithm on the search tree branches, and search tree reduction by, at a minimum, alpha-beta pruning. AlphaZero implemented chess playing knowledge via reinforcement training of its neural network and position evaluation via Monte Carlo Tree Search (MCTS), but both had been used before. Chess playing knowledge by reinforcement training was probably first implemented in the chess engine Giraffe back in 2015 (http://www.talkchess.com/forum3/vie... ) by Matthew Lai, who subsequently joined the DeepMind team. Reinforcement training itself goes back at least to 1993 (http://www.ideanest.com/vegos/Monte...). MCTS was implemented in a Go subset by Bernd Brügmann (http://www.ideanest.com/vegos/Monte...) also as far back as 1993. <In the end AlphaZero played a test match against an open source engine named Stockfish, one of the top three or four brute force engines in the world.> I would hardly call any of the top "classical" chess engines (Stockfish, Houdini, Komodo, etc.) "brute force". A true "brute force" approach (as was done initially during chess engine development) would involve determining all the legal moves for the side to move and evaluating <all> the branches in the search tree to determine the absolute best move to play. But because the search tree expands exponentially, with an estimated branching factor above 30 (the average number of legal moves from any position), this approach is not feasible beyond a relatively small search depth, even if the fastest supercomputers were assigned the task. Instead, besides the alpha-beta pruning algorithm to reduce the number of search tree branches to be evaluated, modern classic chess engines use a variety of heuristics to reduce the number of search tree branches to be investigated even further. It is this reduction of search tree branches, and not improvements in computer hardware, that has resulted in the increased search depths achieved by modern classic chess engines, particularly Stockfish, in a reasonable amount of time. Hardly "brute force". |
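[To make the "classic engine" skeleton above concrete, here is a bare-bones negamax search with alpha-beta pruning, sketched in Python. evaluate(), legal_moves(), make() and unmake() are hypothetical stand-ins; real engines layer iterative deepening, quiescence search and many further pruning and move-ordering heuristics on top of this core.]

    def alphabeta(pos, depth, alpha, beta):
        # pos, evaluate, legal_moves, make, unmake are placeholders for an engine's own machinery
        if depth == 0:
            return evaluate(pos)              # hand-crafted evaluation, from the side to move's viewpoint
        best = -float("inf")
        for move in legal_moves(pos):         # branching factor of roughly 30+ moves per middlegame position
            make(pos, move)
            score = -alphabeta(pos, depth - 1, -beta, -alpha)
            unmake(pos, move)
            best = max(best, score)
            alpha = max(alpha, score)
            if alpha >= beta:                 # cutoff: remaining moves at this node cannot change the result
                break
        return best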
|
Feb-22-19
 | | AylerKupp: <<Penguincw> Some information about AlphaZero> (part 2 of 2) <In the 100 games that were played against Stockfish, AlphaZero won 25 as white, three as black, and drew the remaining 72 games. All games were played without recourse to an openings book. In addition a series of twelve 100-game matches were played, starting from the 12 most popular human openings. AlphaZero won 290, drew 886 and lost 24 games. Some in the traditional computer chess community call the match conditions "unfair" (no opening books or only constrained openings), but I conclude that without doubt AlphaZero is the strongest entity that has ever played chess.> Too bad that the author's conclusion did not address the overwhelming computational advantage (about 100X) that the hardware AlphaZero ran on had over the hardware Stockfish ran on in both their matches. Data provided by DeepMind in https://deepmind.com/documents/260/... (Figure 2) show what happened when the time allowed to AlphaZero was reduced in an attempt to make the computational capabilities of the hardware that AlphaZero ran on comparable to the computational capabilities of the hardware that Stockfish ran on. When AlphaZero's time was reduced to 1/30 of the time allowed for Stockfish, Stockfish (approximately +25 =64 -11) outperformed AlphaZero, a 57% vs. 43% scoring %. And when AlphaZero's time was reduced to 1/100 of the time allowed for Stockfish, making the computational capabilities of the two sets of hardware approximately equal, Stockfish (approximately +38 =58 -4) significantly outperformed AlphaZero, a 67% vs. 33% scoring %. This was almost the reverse of the scoring % that AlphaZero achieved against Stockfish (approximately +28 =30 -2; 64% vs. 36%) when both engines were allowed the same amount of time. So I would say that the author's conclusion, without a doubt no less, that AlphaZero is the strongest entity that has ever played chess, is not substantiated by the facts provided by DeepMind when the disparity in the hardware's computational capability used in the match is taken into account. To put it another way, the winner of the 2018 Formula 1 championship was Lewis Hamilton. I believe without a doubt that on a racetrack I could beat him handily if I was driving a Ferrari 488 or McLaren 720S and he was driving a Prius running on only 3 cylinders. So I would call the results of AlphaZero's two matches against Stockfish inconclusive at best. Don't get me wrong. I believe that the techniques used in AlphaZero, although not entirely original, represent a great advance in the development of neural network-based chess engines. And I'm sure that much of the implementation of AlphaZero is original even though the principles might not be. And the use of reinforcement learning is a significant means of possibly discovering new chess principles, as a result of the learning not being biased by human approaches, compared to the previously used neural network training based mostly on analyzing master and grandmaster games. But I think that the main thing AlphaZero has demonstrated is that neural network-based chess engines are more than capable of defeating the best classical chess engines running on conventional multiprocessor systems when hosted on specialized processors (TPUs and GPUs) that can most effectively support the high degree of parallelism inherent in the use of neural networks. |
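[For reference, the scoring percentages quoted above follow from counting each draw as half a point. A small Python check, using the approximate win/draw/loss figures read off DeepMind's plot:]

    def score_pct(wins, draws, losses):
        games = wins + draws + losses
        return 100.0 * (wins + 0.5 * draws) / games

    print(round(score_pct(25, 64, 11)))   # Stockfish with AlphaZero at 1/30 time: ~57(%)
    print(round(score_pct(38, 58, 4)))    # Stockfish with AlphaZero at 1/100 time: ~67(%)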
|
Feb-22-19 | | SChesshevsky: <...another advantage, I think, was that it didn't use any human games at all. This is an advantage because:...> Though in AZ's training it wasn't fed any human games, I'm wondering if in fact it did end up using human games. Given the massive number of games it played and its goal of finding the best winning approach, I wonder how many games it played in training that nearly or exactly matched actual GM versus GM games. I wouldn't be shocked if, once AZ got a general idea of what clearly doesn't work and an inkling of what does work, the games it ended up playing were very similar to GM versus GM games. So when putting that task on autopilot and running it on super-fast, massively powered computers for a relatively long period of computer time, I'd assume it would likely have generated at least some games that have been played before and many more that are significantly similar. More useful than its exhibitions with SF might be seeing a wide swath of the self-training games that AZ played when it reached its highest level, before venturing out to play others. Those might give a more accurate hint as to what new, if anything, AZ has found out about chess. |
|
Feb-23-19
 | | alexmagnus: Meanwhile LCZero lost the TCEC Superfinal to Stockfish by one point... By the way, this was the second drawiest Superfinal ever, with a draw rate of 81%, and the only Superfinal decided by a single point. |
|
Feb-23-19
 | | keypusher: <AK>
Some follow-up from <Game Changer>: <During training 5,000 TPUs were used to generate self-play games, and 16 second-generation TPUs were used to train the neural networks. These computing resources minimise the time taken to complete the training. By contrast, when playing Stockfish AlphaZero used a single machine with 4 first-generation TPUs.> As I think you noted, you’d expect them to use second-generation TPUs for play. The book doesn’t say why they didn’t. It’s been posted several times, but one more time: <Stockfish was configured according to its 2016 TCEC world championship superfinal settings: 44 threads on 44 cores, a hash size of 32 GB, Syzygy endgame tablebases.> Matches:
A thousand games from the initial position in Jan. 2018. This is the one A0 won +155 -6 =839. A0 gave Sadler and Regan a file of 110 games, 80 with A0 as White. These are all in the CG database. <[O]ne game was selected at random for each unique opening sequence of 30 plies; all AlphaZero losses were also included.> A match using the TCEC openings. One hundred games were provided to Sadler and Regan, with the score +17 -8 =75. These are also in the database. <[O]ne game as White and one game as Black were selected at random from the match starting from each opening position.> Not sure how many games were played in total; I will check. <For the Science publication, a series of matches starting from common opening positions as specified by a popular chess website. 2000 games were made available to us.> Not in the database, and I don’t know if they are available anywhere. <140 games between AlphaZero and Stockfish played on our request from specific opening positions. We wished to understand which opening scheme AlphaZero would choose as Black against 1.c4 and against 1.d4, 2.c4 systems (which Stockfish rarely played) and also to understand which systems AlphaZero would choose as White when faced with a King’s Indian or Gruenfeld. These games were played at a much faster time control of 18 minutes per player per game with 1.5 seconds added per move. For each of the games above we received the moves of the game and also evaluations by both AlphaZero and Stockfish after each move.> There are some games in the book that are not in the database, but I don’t know if they’re from the 140 or the 2000. |
|
Feb-24-19
 | | AylerKupp: <keypusher> Thanks! I just got "Game Changer" in the mail today but I haven't had the time to start reading it yet. There seem to be some discrepancies in the descriptions of the hardware resources used in the two articles and the book. In "Mastering Chess and Shogi by Self-Play" they only state that in the 100-game matches AlphaZero ran on a single machine with 4 TPUs, without indicating whether they were 1st or 2nd generation, and that Stockfish used 64 threads (without specifying the number of cores) and a 1 GB hash table. So this must have referred to the 1st match. In "A general reinforcement learning algorithm that masters chess, shogi and Go through self-play" they indicate that AlphaZero used a single machine with four first-generation TPUs and 44 CPU cores, while on p.4 they say that Stockfish used 64 CPUs (cores?), with no mention of hash table size. But on p.19 they say that Stockfish used 44 threads on 44 cores (two 2.2GHz Intel Xeon Broadwell CPUs with 22 cores each), a hash size of 32GB, and Syzygy endgame tablebases. So this must have referred to the 2nd match. Now in "Game Changer" they also say that for the match AlphaZero still used a single machine with 4 1st-generation TPUs and Stockfish used 44 threads on 44 cores and a 32 GB hash table, plus Syzygy endgame tablebases. So they must also have been referring to the 2nd match. So I'm still not 100% sure of the hardware configurations used in the 1st and 2nd matches. Oh well. |
|