Feb-27-19 | | MrMelad: <I am well aware that the method I used, comparing TFlops, is the <worst> method to use when trying to compare the performance of 2 computers with significantly different architectures. … A much better way is to compare synthetic benchmarks similar to the application that you're going to use. But because the architectures of the CPUs and GPUs/TPUs are so different, no such synthetic benchmarks exist that I know of (although I didn't do much research on the matter) that can satisfactorily compare the performance of the 2 machines.>

There's a reason why people don't compare much between the two: they are fundamentally different. GPUs (and TPUs) are built around the fact that in some applications (like rendering and manipulating 3D objects on a 2D screen) there is an enormous number of simple mathematical calculations that can be performed in parallel. The GPU can solve some mathematical operations faster than a CPU and can perform a much larger number of them per second. However, the software has to be specifically written to take advantage of it. Memory has to be uploaded to the GPU in large chunks, and the code that manipulates it is usually written in a different language and framework (such as CUDA or OpenGL) than the one used for the main application. GPU applications are much harder to debug; when they are optimized they are usually much harder to read, and considerations like memory speed can regularly take precedence over code clarity. Just to give a hint at how complicated things can become: using an if statement is seriously discouraged when writing code for a GPU, so it completely changes how one thinks about conditional programming (see the sketch after this post).

This is why I don't think there's a way to compare CPUs and GPUs directly. It's not a matter of a factor. If I know my algorithm will run on a GPU I design it differently from the start, so attempts to measure how long it takes to run on a CPU are simply not relevant. To quote the highest-ranked answer from 2015 in your Stack Exchange link, <GPU has not proven its usefulness in chess programming> - it just emphasizes the point that Stockfish wouldn't know what to do with a GPU in the first place, and the fact that AZ can utilize it is not an "unfair" advantage but a technological difference.

<Well, computational capability can be considered an "energy unit" and much more relevant than, say, watts. And "computational capability" is not equivalent to "CPU power", it applies to any entity that is capable of doing computations.>

But it's impossible to compare GPU computational capability with CPU computational capability. If your algorithm is made mostly of nested if statements, even a device with 10 TPUs might not be faster than a single-core CPU.

<DeepMind themselves suggested equalization by reducing the time allocated to AlphaZero to make its move>

That's a good point, and AZ still heavily outperformed Stockfish on a 10/1 advantage.

<What, if anything, is novel about the current AlphaZero implementation?>

Everything! 😊 The hardware, the software, the choice not to use human data, the random play, the reinforcement process, the best-player-in-batch selection process, its influence over creating the next generation of engines (Leela), etc…
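To make the branch-avoidance point above concrete, here is a minimal sketch in NumPy. It is an analogy rather than real GPU kernel code: the principle it shows (replace per-element if statements with one uniform data-parallel operation) is the same one CUDA programmers follow to avoid warp divergence. Function names and data are purely illustrative.

```python
import numpy as np

x = np.random.randn(1_000_000).astype(np.float32)

# Branchy, element-by-element style: one data-dependent 'if' per
# element. On a GPU this shape causes warp divergence: threads in a
# warp that take different sides of the branch execute serially.
def relu_branchy(x):
    out = np.empty_like(x)
    for i in range(len(x)):
        out[i] = x[i] if x[i] > 0.0 else 0.0
    return out

# Branch-free, data-parallel style: one uniform operation over the
# whole array, the shape GPU (and SIMD CPU) hardware is built for.
def relu_branchless(x):
    return np.maximum(x, 0.0)
```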
Feb-27-19
 | | AylerKupp: <<Sally Simpson> <If Stockfish has access to a tablebase and AlphaZero does not then why in this position (White to play) from Stockfish vs AlphaZero, 2018 play on for another 25 moves. ... Would Stockfish not be announcing mate in 'x' moves when ever it played?>

Engines using the UCI GUI/engine communication protocol (most recent engines do) just report evaluations; they don't make any decisions to continue to play or not play, nor do they announce mate or resign. If you are using the UCI interface, only the program operators or the GUI make those kinds of decisions. I don't know what GUI they were using during the AlphaZero vs. Stockfish match. I use Arena 3.5, and when I use it to conduct engine vs. engine tournaments it can optionally adjudicate the games as follows, each option being user-settable:

(a) Won for one side or the other if the evaluation crosses a specified threshold for a specific number of consecutive plies. The default threshold value in Arena is [± 9.00] (i.e. a queen up or down) for 3 consecutive moves, and I have never bothered to change it. It is reflected by indicating in the game termination status that the losing side has resigned.

(b) Drawn if either 200 moves have been played, or 150 moves have been played with an evaluation of [0.00]. For the latter I would assume that means <consecutive>, but I don't really know.

(c) Won for one side or the other, or Drawn, if the tablebase returns a <Win> or <Draw> indication.

Options (a) and (b) are optionally enabled/disabled as a pair, and option (c) is optionally enabled or disabled separately. Some complications occur because neither the Nalimov nor the Gaviota tablebases pay attention to the 50-move rule, so they may not return a <Draw> indication correctly. The Syzygy tablebases have an option to observe the 50-move rule, but it must be enabled in the engine's configuration file. Other GUIs, which I am not familiar with, probably have different adjudication options or specify/enable/disable them in different ways.

<Would Stockfish not be announcing mate in 'x' moves when ever it played?>

No engine using the Syzygy tablebases (and those are the only tablebases that Stockfish supports) would return a mate in 'x' moves indication, because the Syzygy tablebases do not contain Distance-to-Mate (DTM) information. That's the reason they are much smaller than the Nalimov, Gaviota, and Lomonosov tablebases containing the same number of pieces. These engines can only announce mate in 'x' if the mate is discovered during the minimax evaluation of their search tree. So I'm assuming that you used an off-line program such as http://chessok.com/?page_id=361 or http://www.k4it.de/index.php?topic=... to determine the mate in 71 for the position that you displayed.

Now a question for you: what is the AlphaZero 'Repertoire Explorer'?
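A minimal sketch of the win-adjudication rule (a) described above, assuming evaluations are reported in pawns from White's point of view. The function and its defaults mirror the description in the post, not Arena's actual code:

```python
def adjudicate(evals, threshold=9.0, plies=3):
    """Arena-style rule (a): declare a win once the evaluation has
    stayed beyond +/- threshold for `plies` consecutive plies.
    evals: evaluations in pawns from White's point of view."""
    run, side = 0, 0
    for e in evals:
        s = 1 if e >= threshold else (-1 if e <= -threshold else 0)
        run = run + 1 if (s != 0 and s == side) else (1 if s != 0 else 0)
        side = s
        if run >= plies:
            return "1-0" if side > 0 else "0-1"
    return None  # keep playing

print(adjudicate([0.3, 9.5, 10.2, 11.0]))  # -> 1-0
```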
Feb-27-19
 | | Sally Simpson: ***
Hi AylerKupp,
Thank you for your reply. That seems to have cleared it up. Think I got the gist of it. "What is the AlphaZero 'Repertoire Explorer'?"
At the top of this page on the left hand side you will see 'Repertoire Explorer'. As the game in question was a KID, which I play, I was curious to see what else had been tried. Yet it only shows two games. ***
Feb-27-19
 | | keypusher: <MrMelad> <AK> I'm going to stop saying this, but really, it's a wonderful discussion. From down here it seems like you're both making good points, despite my paid-up membership in the A0 cult. :-)

<Sally> <AK> <As the game in question was a KID, which I play, I was curious to see what else had been tried. Yet it only shows two games.>

For whatever reason only the ten games released in 2017 have been loaded into the Repertoire Explorer. Of course, for 100 of the 210 games from 2018 in the database, the opening was pre-selected. I've discussed A0's opening preferences in other posts, which I won't repeat. There's a fuller discussion in <Game Changer>. Even in preselected openings A0's choices when it takes over can be fascinating, viz.:
AlphaZero vs Stockfish, 2018 (Caro-Kann)
AlphaZero vs Stockfish, 2018 (Evans Gambit)
Stockfish vs AlphaZero, 2018 (Neo-Gruenfeld(?))
Feb-27-19
 | | keypusher: <sally simpson> Other King's Indians:
AlphaZero vs Stockfish, 2018
AlphaZero vs Stockfish, 2018
Stockfish vs AlphaZero, 2018
AlphaZero vs Stockfish, 2018 (a good one, agadmator did a video about it)
Stockfish vs AlphaZero, 2018
Stockfish vs AlphaZero, 2018
Stockfish vs AlphaZero, 2018
Stockfish vs AlphaZero, 2018
AlphaZero vs Stockfish, 2018
Feb-27-19
 | | Sally Simpson: ***
Thanks K.P. Thought I'd better have a look at what the Alpha and Stuckfish (sic...after seeing it was a typing error) were doing, because the human sheep will follow and I'll have to set up some new 2-move traps for them to fall into. A tad disappointed to see it is because a carbon-based unit has not got around to loading the repertoire bank. Was hoping Alpha had infiltrated the system and was hiding things.

Recall a book Tony Miles (and Eric Moskow) did on the Yugoslav v The Dragon, and a wonderful combination not given in the book but just waiting to be played. I speculated (around about 1980) that Tony tucked it up his sleeve for OTB use rather than spill the beans. (Back then I trusted nothing I saw or who wrote it. Especially if it was written by me!)

Here. (White to play)

(diagram)

Miles and Moskow look at 21.g5 Nh5, which holds, and 21.Rh2 Rg8, which also defends. How about 21.Nf5? (Surely Miles and Moskow looked at this.) If the very plausible 21...Rg8, then 22.Qxh7+ Nxh7 23.Rxh7+ Kxh7 24.Rh1 mate.

(diagram)

Even if you find a flaw you would still put this beautiful idea in the book. ***
Feb-27-19
 | | AylerKupp: <<MrMelad> <I think it's a terminology issue that is bothering me here. Stockfish was not limited to lower computational capability; if he would've been given access to this capability he wouldn't have been able to use it. If you are merely trying to imply that AZ had a computational advantage then I agree, but it was not achieved by limiting Stockfish, only by better utilizing a different type of technology.>

I reviewed what I presented in AlphaZero (Computer) (kibitz #458) and I made a mistake there; there was no way that Stockfish 8 could achieve similar computational capability by adding CPUs. So it doesn't seem like we will be able to reach agreement here. As far as I'm concerned, unless the 2 systems have similar computational capability, any comparison of the relative performance of the 2 engines in terms of results achieved is totally meaningless. If the match were done today with Stockfish 10, which can support 512 cores, and a system could be configured with 512 cores running at a faster clock rate, then this system's computational capability would be just a little bit less than a system for AlphaZero configured with 1 core and one 1st-generation TPU. But that's equally meaningless with regards to the results in the AlphaZero vs. Stockfish matches.

<There's a reason why people don't compare much between the two: they are fundamentally different. ... So attempts to measure how long it takes to run on a CPU are simply not relevant.>

They are fundamentally different, but it is not only relevant to compare the two, you <must> do it. Otherwise how would you know which architecture is best suited for a given application? And I showed you a way to do it, although I will be the first to agree that it's not very accurate. But even if it is off by a factor of 2X either way, it's certainly better than to simply throw up your hands and say that it can't be done. I've had similar discussions with others before. And, yes, I'm very aware that in order to achieve optimum performance from a highly parallel architecture you must avoid out-of-sequence execution, just like you would have to do with highly pipelined machine architectures. You must know your application in order to determine what kind of architecture is best suited for its implementation.

<That's a good point, and AZ still heavily outperformed Stockfish on a 10/1 advantage.>

I think that you meant either a 10/1 time advantage for <Stockfish> or a 1/10 time <dis>advantage for AlphaZero. And yes, AlphaZero outperformed Stockfish with a 1/10 time disadvantage (+15, -3, =77, Score % = 55.5%) per the data, although we might quibble whether that is "heavily" or not. But as I pointed out in AlphaZero (Computer) (kibitz #432), if AlphaZero had a 1/33 time disadvantage, DeepMind's data shows that Stockfish would have outperformed AlphaZero somewhat (+18, =64, -10; Score % = 57.0%). And if AlphaZero had a 1/100 disadvantage, then DeepMind's data shows Stockfish would have outperformed AlphaZero by a margin similar to AlphaZero's margin in the actual matches (+24, =58, -4, Score % = 67.0%). By graphing the performance results shown in Fig. 2 and interpolating between a 1/33 and a 1/100 AlphaZero disadvantage to estimate the results when AlphaZero would have a 1/80 disadvantage (which is the inverse of my most recent calculation of the performance advantage that AlphaZero had in the actual matches), it projects to an estimated Stockfish score of (+37, =58, -5, Score % = 67.0%), compared to AlphaZero's score of (+30, =68, -2, Score % = 64%) in the actual matches. Almost the reverse results, and definitely "heavily".
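The interpolation step described above can be sketched numerically. The score figures are the ones quoted in the post; interpolating linearly in the log of the time-handicap factor is an assumption about how to read the curve, so the output is illustrative rather than a reproduction of the actual Fig. 2 reading:

```python
import math

# Stockfish score % vs AlphaZero's time handicap, per the figures
# quoted above (at 1/10 AlphaZero scored 55.5%, i.e. Stockfish 44.5%).
handicap = [10, 33, 100]
sf_score = [44.5, 57.0, 67.0]

def interp_log(x, xs, ys):
    """Piecewise-linear interpolation in log(handicap)."""
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= x <= x1:
            t = (math.log(x) - math.log(x0)) / (math.log(x1) - math.log(x0))
            return y0 + t * (y1 - y0)
    raise ValueError("handicap outside tabulated range")

print(round(interp_log(80, handicap, sf_score), 1))  # ~65.0 at a 1/80 handicap
```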
Feb-27-19
 | | AylerKupp: <<MrMelad> <To quote the highest-ranked answer from 2015 in your Stack Exchange link, <GPU has not proven its usefulness in chess programming> - it just emphasizes the point that Stockfish wouldn't know what to do with a GPU in the first place, and the fact that AZ can utilize it is not an "unfair" advantage but a technological difference.>

Of course, that was in 2015 and this is 2019; things change quickly in the computer business. But I do agree that it certainly hasn't been proven yet, although I think that's mainly because insufficient effort has been applied due to the lack of adequate $ponsors. And I'm not sure why you or anyone else would not consider a technical difference that gives your engine a significant computational advantage over another an "unfair" advantage, but here we clearly differ.

<Everything! The hardware, the software, the choice not to use human data, the random play, the reinforcement process, the best-player-in-batch selection process, its influence over creating the next generation of engines (Leela), etc…>

I pointed out in various earlier posts, and gave substantiation, that none of those items you mentioned other than the proprietary hardware were original innovations by AlphaZero; they were all done by others and sometimes significantly before AlphaZero did them. What AlphaZero did do was integrate them better than any other person or team and achieve superior results – mostly due to its computational capability advantage. Don't let your enthusiasm for AlphaZero overly influence your opinions. Go ahead and check the links I provided to verify what I said. As far as its influence over creating the next generation of engines, I suspect that even if AlphaZero had never existed, engines such as LeelaC0 using GPUs would eventually have been created, influenced by Matthew Lai's work on the Giraffe chess engine. But that's just conjecture on my part, and AlphaZero certainly accelerated the process.
Feb-27-19 | | nok: <none of those items you mentioned other than the proprietary hardware were original innovations by AlphaZero; they were all done by others and sometimes significantly before AlphaZero did them. What AlphaZero did do was integrate them better than any other person or team and achieve superior results – mostly due to its computational capability advantage.> Right on, Ayler.
AlphaZero (Computer) (kibitz #413)
Feb-28-19 | | MrMelad: <AylerKupp> I don't think GPUs should be measured in terms of CPU cycles, and I've made a case for it which you haven't directly addressed. I said <It's not a matter of a factor> yet you keep returning to estimations (which you admit are wrong) and try to settle the uncertainties with a factor. So be it; I appreciate your input on this point and graciously disagree.

On AlphaZero being "novel" and innovative:

<Of course, that was in 2015 and this is 2019; things change quickly in the computer business>

And who changed this specific thing? Who drew heavy attention to reinforcement learning, GPUs and TPUs in chess engines? That's right: it was DeepMind and AlphaZero. I call it novel and ground-breaking to influence the attitude and direction of an entire field. An example of its influence over the field: at the SPIE 2019 conference that ended a few days ago, DeepMind gave a <keynote> speech regarding reinforcement learning in many other fields. DeepMind and Google have become world leaders in the field of reinforcement learning.

<none of those items you mentioned other than the proprietary hardware were original innovations by AlphaZero; they were all done by others and sometimes significantly before AlphaZero did them>

The MIT book on Deep Learning by Goodfellow, Bengio and Courville from 2016 (https://www.goodreads.com/book/show...) mentions the field of reinforcement learning as a <crowning achievement> of deep learning and <mentions DeepMind specifically> as a major contributor, possibly the most important:

<Another crowning achievement of deep learning is its extension to the domain of *reinforcement learning*. In the context of reinforcement learning, an autonomous agent must learn to perform a task by trial and error, without any guidance from the human operator. DeepMind demonstrated that a reinforcement learning system based on deep learning is capable of learning to play Atari video games, reaching human-level performance on many tasks (Mnih et al. 2015)>

This important book lists DeepMind as <pivotal> in proving that reinforcement learning is viable and central to its future research. The book was written even before AlphaGo and AlphaZero became public; imagine how the next edition will describe DeepMind's <super-human> achievements. Besides, the algorithms and methods of deep learning can be traced back to the 1940s; does that mean any research in the field is not original? Proving an algorithm can work is just as important as theorizing that it would. Also from the book:
<Deep learning only "appears" to be new, because it was relatively unpopular for several years preceding its current popularity> ...
<Deep learning has become more useful as the amount of available training data has increased> ...
<Deep learning models have grown in size over time as computer infrastructure (both hardware and software) for deep learning has improved> ...
<One may wonder why deep learning has only recently become recognized as a crucial technology even though the first experiments with artificial neural networks were conducted in the 1950s. ... The learning algorithms reaching human performance on complex tasks today are nearly identical to the learning algorithms that struggled to solve toy problems in the 1980s> ...
<The most important new development is that today we can provide these algorithms with the resources they need to succeed.>

Providing the resources for deep learning to succeed is a <crucial> step, and dismissing these achievements as just <warmed-up ideas backed w/ big money ie. big hardware>, as <nok> would like us to think, is ridiculous, especially when the ideas themselves can be traced back to… DeepMind!
Feb-28-19 | | MrMelad: Saying AlphaZero is not innovative is like saying:

The kid who yelled that the emperor has no clothes contributed nothing new, as the emperor was already naked when it was pointed out.

NASA's first moon landing wasn't innovative, as most of the science used was invented hundreds of years earlier and Jules Verne predicted it.

Nothing significantly innovative has occurred in the computer industry over the last 80 years, since von Neumann helped create the von Neumann architecture.

Garry Kasparov never had a significant innovation in chess, since most of his ideas were based on concepts developed many years earlier.

Why would any of these <obviously false> statements be any less wrong than claiming AlphaZero did nothing new?
Feb-28-19
 | | Sally Simpson: ***
"Why would any of these <obviously false> statements be any less wrong than claiming AlphaZero did nothing new?" The program 'Kimo' was doing everything Alpha does today. "On analysis levels Kimo often finds incredible moves, which other top programs will not be able to see or understand. The openings book is tiny by today's standard (just 30,000 positions, compared to many millions for the other programs), but of such high quality that we have yet to see Kimo come out of book with an inferior position." https://en.chessbase.com/post/kimo-... A month later it was reported that soon after that ChessBase item the developers realised there was more money to made enhancing the guidance system of cruise missiles so left chess to blow things up. *** |
|
Feb-28-19 | | MrMelad: <Sally Simpson> Kimo might be classified as a deep learning architecture, but:

1. It is not reinforcement learning: <Kimo's algorithms are based not on a brute force search but rather on chess knowledge derived from around 20,000 high quality games>

2. It doesn't use a GPU.

3. It most probably uses a different hidden-layer network model and gradient descent convergence (not stochastic) - see the sketch after this post.

It is a somewhat similar idea, and it sounds like it was ahead of its time, but to say <The program 'Kimo' was doing everything Alpha does today> is not understanding what AlphaZero does. Actually, point 1 alone is enough to distinguish between the two. AlphaZero's main innovation, and the whole point, was its <reinforcement learning> approach.
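On the stochastic vs. non-stochastic distinction in point 3, a toy least-squares sketch of the two update rules. The data and names are illustrative; this is neither Kimo's nor AlphaZero's actual training code:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 4))
y = rng.normal(size=256)
w = np.zeros(4)
lr = 0.01

def grad(w, Xb, yb):
    """Gradient of mean squared error on the batch (Xb, yb)."""
    return 2.0 * Xb.T @ (Xb @ w - yb) / len(yb)

# Full-batch gradient descent: each step uses all the data, so the
# descent direction is deterministic.
w_batch = w - lr * grad(w, X, y)

# Stochastic gradient descent: each step uses one sampled example,
# a noisy but cheap estimate, the variant modern deep learning
# training relies on.
i = rng.integers(len(y))
w_sgd = w - lr * grad(w, X[i:i+1], y[i:i+1])
```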
Feb-28-19
 | | keypusher: <AK> <MrMelad> I think somewhere in <Game Changer> it compares the hardware for A0 to CPU architecture...or maybe I'm mixing that up with something I saw elsewhere. I'll check, I don't have the book with me. I'm curious about Lc0, which still doesn't have its own page. Here's an excerpt from an article about a tournament it won recently. http://blog.lczero.org/2019/02/leel... <TCEC CUP-2 was a tournament running with the 32 top engines of TCEC running on normal TCEC hardware of 43 cores of 2 x Intel Xeon E5 2699 v4 at 2.8 GHz.
Leela was running on a Nvidia RTX 2080 Ti plus a RTX 2080 with the 32742 net and Lc0 v20.1.>

No idea how what Leela was using compares to what the other engines were using. Lc0 was developed using distributed computing resources, I understand. But does that mean that the distributed resources took the place of the 5,000 first-generation TPUs used to play the 44 million practice games and the 64 second-generation TPUs used to train the neural networks?

There's a reference in the Leela blog article to a souped-up Stockfish (SF dev, 176 threads) used to kibitz the tournament games. I think the same SF was used to kibitz the recent TCEC final, in which SF beat Lc0 50.5-49.5. It seems like preselected openings are a near-necessity to generate meaningful numbers of decisive games in these events, which makes me think they should try Fischer Random or similar setups instead (but maybe that would lead to an even higher percentage of draws, at least for standard alpha/beta type engines). Since the NN devices were trained from the standard starting position, I wonder how they would do.

<Sally> Nice combination. I'll bet Miles saw it.
Feb-28-19 | | MrMelad: <Sally> <keypusher> I don't know if this point was already mentioned, but it occurred to me that the deep learning approach has an advantage in the opening phase over the classical heuristic alpha-beta approach.

Even if the heuristics reduce the possible choices that need to be examined by a large factor, there are still close to infinitely many choices left. In contrast, the deep learning approach is much closer to what a human would do, in that it looks for patterns and general rules that can be captured as the network's hidden features, discovered from accumulated experience. This may explain the relative success in the opening phase of even a primitive deep learning construct such as Kimo, and it makes AZ's openings among its most significant results.
Feb-28-19 | | Tiggler: This 1981 paper provides early insights into associative learning and the representation of advantageous responses in an associative memory. https://scholarworks.umass.edu/cgi/... Drive reinforcement systems, such as AlphaZero, are descended from the work of Barto and Sutton.
Feb-28-19
 | | Sally Simpson: ***
Agree Kimo did not have a learning capability, but its approach was novel. There are billions of different chess positions, but 99.999% of them are stupid and would never be seen on a chessboard.

Kimo used the 'joke theory'. There are only 30 jokes, and every new joke you hear is a variation of one of the 30. Kimo reckoned you only need 30,000 GM games on hand to play good chess.

It played in chunks. At any position Kimo searched its 30,000-game DB looking for matching pawn structures, took a snapshot of the game ten moves in advance, and if it liked what it saw it played for that. If there was a forced deviation it searched for another chunk. Humans snapshot all the time, but rarely 10 moves deep unless it's an ending. It never stored or learned anything, but it could run on an average home computer, provided you were willing to part with a huge pile of money, and it was very, very good.

One funny weakness was mating in the endgame. If it saw a mate in 10 it stopped and went for it, when in some cases, due to the way it searched, it had missed a mate in 3 or 4. It still mated, but it took longer; it could not see the difference between a mate in 10 and a mate in 3. As long as it won it did not matter.

The programmers used the pawn structure recognition routine in the map reading program in Ferranti missiles. At the testing stage it thought the north coast of Skye was the coast of the Benbecula missile range (they do look similar) and blew up a moored boat. Thankfully nobody was on board at the time. ***
Feb-28-19
 | | AylerKupp: <<MrMelad> I don't think GPUs should be measured in terms of CPU cycles and I've made a case for it, which you haven't directly addressed.>

I agree, and I never said that; those are your words, not mine. It obviously makes no sense to compare GPUs/TPUs, or even CPUs, on the basis of cycles, since those are highly dependent on the computer's architecture. I didn't address it because it seemed so obvious that I didn't think I needed to. What I said was that CPUs/GPUs could be <approximately> compared on the basis of <operations per second>, specifically floating point operations per second, where the data (representation and precision) and operations are defined by IEEE Std 754, which all CPUs/GPUs will likely support. Google's TPU doesn't, converting floating point data to lower-precision integers, and that's one reason why it's faster than otherwise comparable GPUs; fewer bits to push around after the initial quantization. Flops, or equivalent Flops in the case of TPUs if you include the necessary quantization step, are fundamental units defining computational capability. Different architectures allow flops to be performed in one or more cycles depending on the degree of parallelism that the architecture provides. But they do reflect the amount of calculation that can be performed in one unit of time.

<yet you keep returning to estimations (which you admit are wrong) and try to settle the uncertainties with a factor.>

Again, I never said that they were wrong; I said that they were not necessarily <accurate>, which is not the same thing. And when you are dealing with quantities that you know are inaccurate, it's not only usual but proper to try to identify the degree of inaccuracy (e.g. ± X%). Which is what I tried to do.

<And who changed this specific thing? Who drew heavy attention to reinforcement learning, GPUs and TPUs in chess engines? That's right: it was DeepMind and AlphaZero. I call it novel and ground-breaking to influence the attitude and direction of an entire field.>

Sigh, we're obviously not communicating, but I suppose a lot depends on your definition of "innovative", which I tend to interpret as "original". I indicated, and provided evidence, that the DeepMind team did not invent any of these items (neural network applications to chess playing engines, reinforcement learning, use of parallel processors (GPUs), etc.). No doubt they came up with <refinements> in many of these areas, and these <refinements> were certainly original. But the concepts were not. You did say that Goodfellow et al. identified DeepMind as a major <contributor>, possibly the most important, in the field of deep reinforcement learning, and I would tend to agree with all of that. But not the <originator> of deep reinforcement learning. Use of TPUs was certainly original, but that's hardly surprising since they were developed and optimized by Google for a specific purpose, and they are the only ones who have them. But the principles behind GPGPUs and TPUs are similar, except that TPUs are more specific-purpose and hence more effective in this type of application (see the sketch after this post).

What I do think DeepMind did that was significant (and perhaps even innovative) was to integrate all these areas more effectively than any other person or team. This was a major accomplishment and I certainly applaud it. During my career I have been called "original" by many of my associates, but I disagreed with them; what I thought I was best at was taking various independent concepts and integrating them in a manner that had never (to my knowledge) been used before. Not that different from what the DeepMind team did. And they certainly popularized the entire concept as a result of the overwhelming defeat of Stockfish by AlphaZero in their matches, driven, of course, by AlphaZero's overwhelming superiority in computational capability.

But look, our discussion is unfortunately degenerating into a "he said, I said" type of thing, and I don't think that serves any constructive purpose. You seem to be set in your awe of AlphaZero and I doubt that I can change that with mere facts. So we'll just have to agree to disagree on what was original and what wasn't, and what was shown by the matches and what wasn't. I don't want to continue this discussion; it's a waste of time.
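A minimal sketch of the quantization idea mentioned above: mapping float32 values onto 8-bit integers so the arithmetic can run on narrow integer units, which is the principle behind the first-generation TPU's speed. The helper functions are hypothetical, not Google's actual scheme:

```python
import numpy as np

def quantize_int8(x):
    """Symmetric affine quantization sketch: float32 -> int8 plus a scale."""
    scale = max(float(np.abs(x).max()), 1e-12) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

x = np.random.randn(8).astype(np.float32)
q, scale = quantize_int8(x)
print(np.max(np.abs(x - dequantize(q, scale))))  # small rounding error
```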
Mar-01-19
 | | AylerKupp: <<keypusher> No idea how what Leela was using compares to what the other engines were using.> This is what I've been able to find out:
In <TCEC Season 12>, held from Apr-2018 to Jul-2018, all engines ran on identical hardware, an Intel dual-CPU, 22-cores/CPU Xeon E5 2699 v4 2.8 GHz server. I estimated its computational performance at ~1.5 TFlops by using the average theoretical peak performance for a single-CPU Xeon E5 2699 v4 @ 2.8 GHz server when executing AVX2 and FMA instructions per https://www.microway.com/knowledge-.... LeelaC0 made its first appearance this season and was the first chess playing neural network (CPNN) to do so. It competed in Division 4, the lowest. It did not have any GPU support and ran on the same hardware as the other engines. To show that it was capable of competing against the "big guys", 3 series of mini-matches were conducted prior to the event, and LeelaC0 performed reasonably well. But in the actual tournament it did very poorly, finishing in last place in Division 4 (and last place overall) with a score of 2.0/28 and a rating of 2714. See http://www.chessdom.com/breaking-le... and http://mytcecexperience.blogspot.co... for details.

In <TCEC Season 13>, held from Aug-2018 to Nov-2018, the same CPU server was used, but CPNNs were allowed to use a GPU server consisting of:

(a) One Intel i5 2600K server with one 4-core CPU. I estimate its computational performance at ~0.124 TFlops per https://boinc.bakerlab.org/rosetta/....

(b) One Nvidia GeForce RTX 2080 with 1 GPU. I (or rather Nvidia) estimate its performance at ~10.1 TFlops per https://www.anandtech.com/show/1324....

(c) One Nvidia GeForce RTX 2080 Ti with 1 GPU. I/Nvidia estimate its performance at ~13.4 TFlops per the same link as above.

So the total computational performance capability of the GPU server is ~23.5 TFlops, or ~16X the computational performance of the CPU server. So I was wrong in my original assumption that LeelaC0's hardware had approximately the same computational performance capability as the other engines' hardware. LeelaC0 again competed in Division 4 due to its lackluster performance in TCEC Season 12. But this time, helped by an extra month's development and no doubt by its access to the GPU server and a substantial computational capability advantage over its competitors, it won Division 4 with a score of 20.0/28 points and achieved a rating of 3219. This was 1.5 points ahead of DeusX (18.5/28 points), another CPNN which, perhaps not coincidentally, also had access to the GPU server. The highest performing "classical" engine was Wasp 3.2 with 18.0/28 points. See https://www.reddit.com/r/chess/comm....

In <TCEC Season 14>, held from Nov-2018 to Feb-2019, the same CPU and GPU servers were used as in TCEC Season 13. As a result of its fine performance in TCEC Season 13, LeelaC0 was eventually promoted to the Premier division, where it competed against Houdini, Komodo (both regular and MCTS), Stockfish, Fire, Andscacs, and Ethereal. It captured 1st place in the Premier division with a score of 19.5/28, 2.0 points ahead of Komodo MCTS, and achieved a rating of 3227. It qualified for the Superfinal, where it was barely edged out by Stockfish, which won by a score of +10, =81, -9, leaving LeelaC0 in 2nd place overall. LeelaC0 was able to achieve a rating of 3404, but I'm not sure if that was at the beginning or the end of the Superfinal match against Stockfish.
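The hardware comparison above reduces to simple arithmetic; the peak-TFlops figures are the estimates quoted in the post (the ~23.5 quoted apparently counts only the two GPUs):

```python
cpu_server = 1.5                  # dual Xeon E5-2699 v4, AVX2/FMA estimate
gpu_server = 0.124 + 10.1 + 13.4  # i5 host + RTX 2080 + RTX 2080 Ti
print(round(gpu_server, 1), round(gpu_server / cpu_server, 1))
# -> 23.6 TFlops, ~15.7x the CPU server
```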
Mar-01-19 | | disasterion: Hi <Sally> A long way back up the thread you talk about this position:

(diagram)

and the possibility of 21.Nf5, hoping for 21...Rg8, when, as you point out, White has a forced mate. Can't Black just take the knight? After 21...Bxf5 I don't see what progress White can make.
Mar-01-19
 | | Sally Simpson: ***
Hi disasterion,
It's not a forced mate after 21.Nf5, and 21...Bxf5 does look the best move. However, it would be an unsettling time as Black: faced with a TN, they have to discover Bxf5 OTB (they would have to spot the Queen & Rook sac), and then White would have the advantage of knowing the position after Bxf5 and playing accordingly.

Of course, it is humorous speculation on my part. The chances are they missed the cute Queen/Rook sac. If they saw it then it's too pretty to leave out.

I would often do this with opening books: find a trappy move not mentioned, then wonder why. Typical chess player's paranoia.

These days the players have lost their paranoia and passed it onto computers. I'm one of the old school and still think everyone is out to get me. ***
Mar-01-19 | | disasterion: Thanks Sally. Good points, and it's a nice trap. There's nothing wrong with a bit of old school paranoia...
Mar-01-19 | | scholes: Comparing AlphaZero to Kimo is like comparing a Ford Model T to a Ferrari. Leela Chess Zero has already been training with a new network architecture, which has much better performance than the AlphaZero architecture and was discovered only a few months after the AlphaZero 2017 paper was published. If there were no Elo limit in chess, that architecture would be 150 Elo better than the network in the TCEC finals. But it seems Stockfish is very close to perfect chess, so it remains to be seen how much the new architecture will gain.
Mar-01-19
 | | AylerKupp: <<scholes> Leela Chess Zero has already been training with a new network architecture, which has much better performance than the AlphaZero architecture.>

Could you please elaborate? Where are the neural network architectures of AlphaZero and Leela Chess Zero described in sufficient detail to conclude that one provides better performance than the other? And what is it about the difference that provides the better performance; more layers? More neurons per layer? Other? And what is this "Elo limit in chess"? And how can you predict how much better (in terms of rating point gain) Leela Chess Zero's new architecture would be than the previous architecture? Can you predict what the results of a match between the new and previous architectures would be without actually conducting such a match?
Mar-01-19
 | | AylerKupp: <<Sally Simpson> These days the players have lost their paranoia and passed it onto computers. I'm one of the old school and still think everyone is out to get me.>

That reminds me of the old joke: "Just because you're not paranoid doesn't mean that everyone is not out to get you."