< Earlier Kibitzing · PAGE 14 OF 39 ·
Later Kibitzing> |
Dec-29-18
 | | WannaBe: <DrDum> was nice enough to share this link with me, and I'll post it here: https://www.newyorker.com/science/e... |
|
Dec-30-18 | | Ladolcevita: It's amazing Alpha Zero totally trumps the Stockfish, but it's also kind of curious and perplexing to me... If one program can be defeated by another, does that mean that current software hasn't grasped the absolute truth of chess yet? Or is there even such a thing as an absolute truth of chess, such that once an "Alpha Ultimate" had attained it, it would never lose a single game? I mean, is chess truly exhaustible? The current engines' defeats clearly show that it hasn't been exhausted yet. |
|
Jan-04-19
 | | keypusher: Chessgames has now uploaded 100 more games, all with openings taken from the 2016 TCEC event. |
|
Jan-04-19
 | | AylerKupp: <<Ladolcevita> It's amazing Alpha Zero totally trumps the Stockfish, but it's also kind of curious and perplexing to me...> Given that AlphaZero was running on hardware that was more than 100 times faster than the hardware that Stockfish was running on (see my post on AlphaZero vs Stockfish, 2017), the only thing that's perplexing to me is why AlphaZero did not win more games than it did. And what might have happened if the hardware that AlphaZero was running on was "only" 30 times faster than the hardware that Stockfish was running on can be seen in https://www.chess.com/news/view/upd... |
|
Jan-04-19 | | john barleycorn: I am always happy to see that a human commentator can "explain" and "comment" those games. can s/he? really? |
|
Jan-04-19
 | | keypusher: <AylerKupp> The Science paper contains a somewhat more detailed description of the hardware used for Stockfish, which I’ll quote in case you haven’t seen it: <Stockfish was configured according to its 2016 TCEC world championship superfinal settings — 44 threads on 44 cores (two 2.2 GHz Intel Xeon Broadwell CPUs with 22 cores), a hash size of 32 GBs, Syzygy endgame tablebases, a 3 hrs time control with 15 additional seconds per move.> I understand very little of that, but anyway, both SF8 and SF9 were tested with this configuration. You’ve said several times that A0’s hardware was 100 times faster than SF’s — how can you tell? And does it matter? Deep Mind characterizes the hardware for SF as optimal for SF. Is it? How does computing speed affect performance if (as is claimed) A0 is looking at a fraction of the number of positions per second that SF is examining? I don’t know much, obviously, but I’ve fallen in love with A0’s chess, so I am probably being a bit defensive. |
|
Jan-04-19
 | | keypusher: <john barleycorn: I am always happy to see that a human commentator can "explain" and "comment" those games. can s/he? really?> Really no different from Chernev annotating a Capablanca game, is it? At least the commentators have strong engines to help them. Anyway, Sadler, Daniel King, agadmator and doubtless others have managed to do good videos about the games. |
|
Jan-04-19 | | scholes: It's good Google didn't run SF on a GPU as well. It would have scored zero points. Ayler Kupp, in September 2018 Nvidia released a new lineup of consumer GPUs which are 4-5 times faster for machine learning applications. Now on a single RTX 2070 GPU (500$??) Leela gets nearly 40 percent of AlphaZero's speed. So the claim that A0 was running on unfair hardware is no longer true. |
|
Jan-04-19 | | ChessHigherCat: Those two articles cited by <Murky> and <Wannabe> above show how informative articles by well-informed, open-minded people can be as compared to endless speculative quibbling by speculative quibblers about unfair hardware advantages and paranoid theories of Google's concealed draws, etc. That was never what was interesting anyway, what's exciting is a new era in computing based on self-teaching and neural networks. |
|
Jan-05-19 | | WorstPlayerEver: <CHC>
<open-minded people> are those who think their brain works like some kind of parachute, I assume. Boy, <unfair hardware advantages..> what does that even mean? That humans, since their arses got beaten, simply refuse to play the darn better players? However, they also seem to have lost their intellectual status, which they will simply avoid by making cliche statements, such as: "It's no fun to face the engines!" Now -this topic is the living proof- suddenly chess has to be <fun>. Kaboom! Not by means of gathering any intellectual property in a decent way, but simply by means of a total lack of ambition concerning the intellectual input. Otherwise Google Club Masters have to spend more than a rotten peanut on chess. I'd rather have seen the 8 chessmen problem solved, but hey... that would have spoiled all the F.U.N.! Obviously... Good grief.. how awfol! |
|
Jan-05-19 | | WorstPlayerEver: PS the main theme in this thread btw is <decency> You know the thing that guys like <Hitler> once claimed to be theirs, the thing that all the <keypushers> in this world now claim to be theirs. For no apparent reason, but that's okay. Needless to say that <decency> is the center of all political activities nowadays. Literally; an example:
A. Building a wall is decent.
B. Building a wall is not decent.
This is called: multiple choices regarded decency. Another example:
A. <This> is meant for the <Rogoff> page.
B. <This> is <on topic>. Last but not least: intellectual corruption by making false associations; A. You have an <open mind>
B. You are not decent.
The implications of all these famous incorrect associations are so clear that one must wonder why babies are not allowed to vote. While I bet they can do it. |
|
Jan-05-19 | | ChessHigherCat: <<WPE> <open-minded people> are those who think their brain works like some kind of parachute, I assume.> Well, looking at all the data without trying to force into it conspiracy theories could be seen as a sort of parachute to prevent crash-landing in an abyss of paranoia. As to the rest, I plead: Does not compute, does not compute, does not... |
|
Jan-05-19
 | | Diademas: <There is a distinct difference between having an open mind and having a hole in your head > -Norwegian proverb. |
|
Jan-05-19
 | | AylerKupp: <<keypusher> I understand very little of that, but anyway, both SF8 and SF9 were tested with this configuration.> Yes, I'm familiar with the Stockfish configuration as described by the Science paper and it's the same as described in the original Deep Mind paper (https://arxiv.org/pdf/1712.01815.pdf). The AlphaZero configuration as described is a little different; in the original Deep Mind paper it's described as a "single machine with 4 TPUs" (with no indication as to whether they were 1st or 2nd generation TPUs) while in the Science paper it's described as a "single machine with 4 1st generation TPUs and 44 CPU cores". Yet Deep Mind had available at least 64 2nd generation TPUs which they used to train AlphaZero's neural network as described in the original Deep Mind paper. So it's unclear to me whether they used 1st or 2nd generation TPUs in the second match, although I don't see why they wouldn't use 2nd generation TPUs since those have greater processing capability. <You’ve said several times that A0’s hardware was 100 times faster than SF’s — how can you tell? And does it matter?> I can tell <approximately> by looking at the least common denominator between the hardware, teraflops (10^12 floating point operations per second), to get <an idea> of how much faster one computer is than another. These numbers are readily available. Now I will say that this is the <worst> possible way to try to compare the performance of two computer systems, but it's the only one available, and my estimate could easily be off by a factor of 2 – either way. A much better way is to run benchmarks that simulate the applications that you're going to be evaluating, chess engines in this case. But the methods used by AlphaZero and Stockfish are so different that there aren't any benchmarks that properly simulate the two; the types of applications that run on GPUs and TPUs are typically very different from the types of applications that run on general purpose CPUs.
If you were to rehost Stockfish to run on a GPU it might not run any faster than it runs on a CPU and might even run slower. As far as AlphaZero's hardware being faster than Stockfish's hardware, it matters quite a lot. It's like running a match between 2 engines of comparable playing strength, say Stockfish 10 and Komodo 12.3, with Stockfish running on a machine with 1 CPU and Komodo running on a machine with 100 CPUs, both with similar clock speeds. I wouldn't be surprised if Komodo won, say, 70% of the games with the other 30% being drawn. But that's just a guess; I have no data to support that statement. But think of it this way. If you were to arrange a series of races on a given circuit between a Ferrari and a Prius, which car do you think would win the majority of the races? And I said "majority" rather than "all" to account for the possibility of mechanical breakdowns and driver errors on the part of the Ferrari; i.e. the probability of a non-win by a Ferrari is not zero. |
|
Jan-05-19
 | | AylerKupp: <<keypusher> How does computing speed affect performance if (as is claimed) A0 is looking at a fraction of the number of positions per second that SF is examining?> In a "classic" engine (i.e. an engine that uses alpha-beta pruning rather than neural networks) the faster the computer system the deeper it can search in a given amount of time. And the deeper it can search the less likely it is to succumb to the horizon effect, the better it can evaluate a position, and the better it will typically play. Of course there are many other factors that are involved in determining a chess engine's playing strength. As for AlphaZero looking at only a fraction of the positions per second that Stockfish does, I think that's irrelevant, although Deep Mind seems to want to make a big deal out of it. All it means is that AlphaZero is only able to look at a much smaller number of positions in the same amount of time. There are many approaches to determining the best move to be played. In a "classic" chess engine you can evaluate a lot of positions using a very simple evaluation function (which takes less time to execute) or you can use a fairly complicated evaluation function which will take more time to execute and therefore will not allow the engine to look at as many positions in the same amount of time. The key to playing strength is to find the right balance. AlphaZero does not have an evaluation function in the classical sense; it runs as many simulated games from a given position as it has time for, and its position "evaluation" is the statistical result of the games run from that position for each move that it considers. The real key is the set of heuristics used to most effectively prune the search tree to eliminate from consideration any moves that are considered to be sub-optimal. After all, the best way to save time in evaluating a branch of an engine's search tree is not to evaluate the positions in that branch at all!
I think that the key to AlphaZero's playing strength is that it seems to be able to do a very good job of selecting the optimal moves to consider playing from each position, and the fact that by running the simulations to the end of the games it is not affected by the horizon effect (I don't think). And I don't remember seeing a description of how AlphaZero selects the set of moves that it will consider as the basis for conducting its game simulations. <I don’t know much, obviously, but I’ve fallen in love with A0’s chess, so I am probably being a bit defensive.> Don't be afraid to admit to having fallen in love and don't feel defensive about it. AlphaZero's use of reinforcement learning is a revolutionary approach to game playing that seems to hold great promise, although there are two areas that concern me about its ability to get much better. See my post AlphaZero (Computer) (kibitz #321). |
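The point made above about pruning (the cheapest branch to evaluate is the one that is never evaluated) can be illustrated with a minimal sketch of alpha-beta search on a toy game tree. Everything here is an assumption made for illustration: the tree, the leaf scores, and the function names are invented, and this is not how Stockfish or AlphaZero is actually implemented. It only shows that alpha-beta returns the same value as plain minimax while looking at fewer leaf positions.

```python
import math

# Toy game tree (hypothetical): a list is an internal node, an int is a leaf score.
# The root is a maximizing node; its three children are minimizing nodes.
TREE = [[3, 5], [2, 9], [0, 1]]

def minimax(node, maximizing, counter):
    """Plain minimax: visits every leaf. counter[0] counts leaf evaluations."""
    if isinstance(node, int):
        counter[0] += 1
        return node
    values = [minimax(child, not maximizing, counter) for child in node]
    return max(values) if maximizing else min(values)

def alphabeta(node, maximizing, counter, alpha=-math.inf, beta=math.inf):
    """Minimax with alpha-beta pruning: skips branches that cannot change the result."""
    if isinstance(node, int):
        counter[0] += 1
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, counter, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:  # the opponent already has a better option elsewhere
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, counter, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:  # the maximizer already has a better option elsewhere
            break
    return best

full_count, pruned_count = [0], [0]
full_value = minimax(TREE, True, full_count)
pruned_value = alphabeta(TREE, True, pruned_count)
print(full_value, full_count[0], pruned_value, pruned_count[0])
```

On this toy tree both searches agree on the root value, but the pruned search stops examining the second and third branches as soon as each is shown to be worse than the first. Real engines add move ordering and many other heuristics on top of this idea.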
|
Jan-05-19
 | | AylerKupp: <<scholes> It's good Google didn't run sf also on GPU. It would have scored zero points.> And what's your basis for that statement?
<Ayler Kupp, in September 2018 Nvidia released a new lineup of consumer GPUs which are 4-5 times faster for machine learning applications.> These two AlphaZero vs. Stockfish matches are history. So the availability of new consumer GPUs which are faster than previous consumer GPUs will not change the results of those matches. If either AlphaZero or Stockfish were running on a future and much faster quantum computer, that would not affect the results of those matches either. <Now on a single RTX 2070 GPU (500$??) Leela gets nearly 40 percent of AlphaZero's speed. So the claim that A0 was running on unfair hardware is no longer true.> All my comments are related to the fact that, since AlphaZero was running on much more powerful hardware than Stockfish in these matches, the results of the matches cannot indicate any inherent superiority of one approach vs. another in developing a superior chess playing engine. Nothing more. |
|
Jan-06-19 | | WorstPlayerEver: <CHC>
Boy, every time you try to think you seem to avoid conspiracy theories.. It's kinda funny.
<Well looking at all the data without trying to force into it conspiracy theories could be seen as a sort of parachute to prevent crash-landing in an abyss of paranoia.> Do you even know what data is?
https://www.quora.com/How-many-game... Still... there is not a single output known from these games. We do speak of a few hundred megabytes, btw. |
|
Jan-06-19 | | ChessHigherCat: <WorstPlayerEver: <CHC> Boy, every time you try to think you seem to avoid conspiracy theories..>
And that's a bad thing?
<Do you even know what data is?
https://www.quora.com/How-many-game...
Still... there is not a single output known from these games> Okay, I forgot who I was talking to: when I said look at all the data, I assumed it was obvious that I meant all the relevant data, excluding the data that are irrelevant (such as your digressions on the wall) and imaginary data (such as your enlightening statements on the total absence of Soviet tanks at the start of Operation Barbarossa and Stalin's lack of warning signs of the invasion). As usual, you're completely incapable of formulating a logical argument, so it's up to me to try and fill in the vast cavities in the cerebral cobwebs, but if you're trying to say that the training data are essential to determining which system plays better, I beg to differ. In fact, a comparison would be impossible because SF doesn't have any self-training data. In any case, the final score of the match is by far the most relevant and convincing evidence. Similarly, the exact content (moves) of the draws is of peripheral interest at best, especially since we can't even analyze them using SF on the principle that the same party can't be both judge and party to the proceedings. |
|
Jan-06-19
 | | keypusher: <Ayler Kupp> Thanks for your responses. Re one of your concerns about AlphaZero -- that it seems to reach 3500 Elo rather quickly, then plateaus -- I wonder if it's bumping up against a draw ceiling in chess. One thing that stands out in the Science article is how common draws are in chess compared to shogi or Go. Duncan Suttles, who used to do quite well with extreme hypermodern strategies, complained that chess was too indeterminate -- it took too many mistakes to lose. (I never felt that way myself...) I don't know if Elo ratings are comparable across the games, but AlphaZero's Elo in Go seems to be more like 5000. Maybe that's because draws are rare in Go. |
|
Jan-06-19
 | | AylerKupp: <<keypusher> Re one of your concerns about AlphaZero -- that it seems to reach 3500 Elo rather quickly, then plateaus -- I wonder if it's bumping up against a draw ceiling in chess.> Perhaps. It's hard to tell at the scale of the charts what AlphaZero's Elo rating gain was when going from about 150K training steps (when it first reached the approximate 3400 Elo rating) to 700K steps. But I can say that in the latest (Jan-04-19) CEGT 40/4 tournament (6 secs/move) Stockfish 10 4CPU gained 138 Elo rating points (about 4%) over Stockfish 8 4CPU when running on comparable hardware. So it seems that Stockfish is still capable of improving, at least at fast time controls, although I don't know how long it can keep this up. I don't think Elo ratings are comparable across games since they are based on relative performances, and of course you can't compare AlphaZero's performance against Stockfish with AlphaZero's performance against Elmo or AlphaZero's performance against AlphaGoZero. And I don't know enough about either Shogi or Go to even make a guess. |
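To put the Elo figures in this exchange into perspective, here is a minimal sketch using the standard logistic Elo expectation formula. Assumptions: the function names are invented for illustration, and rating lists such as CEGT may compute their numbers with their own tools and corrections, so this is only a rough translation between rating gaps and expected scores, not a reconstruction of anyone's actual calculation.

```python
import math

def expected_score(rating_diff):
    """Expected score (win = 1, draw = 0.5, loss = 0) for the side that is
    rated rating_diff points higher, under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))

def rating_diff_from_score(score):
    """Inverse of expected_score: the rating gap implied by an average score.
    Only defined for scores strictly between 0 and 1."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))

# A 138-point gap, as quoted for Stockfish 10 over Stockfish 8.
print(round(expected_score(138), 3))

# 70% wins and 30% draws (the guess made earlier in the thread) is an
# average score of 0.70 + 0.5 * 0.30 = 0.85.
print(round(rating_diff_from_score(0.85), 1))
```

Under this model a 138-point gap corresponds to an expected score of about 0.69, and an 85% score to a gap of roughly 300 points. It also shows why a draw ceiling would flatten rating curves: if nearly every game between two engines is drawn, the average score approaches 0.5 and the implied rating gap approaches zero, however well both play.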
|
Jan-06-19 | | ChessHigherCat: <keypusher: Re one of your concerns about AlphaZero -- that it seems to reach 3500 Elo rather quickly, then plateaus -- I wonder if it's bumping up against a draw ceiling in chess. One thing that stands out in the Science article is how common draws are in chess compared to shogi or Go.> It's probably premature to talk about the death of chess but I bet what will happen is that the computers will refute (or "bust") certain dubious openings so that nobody will play them anymore. And maybe certain openings will be proven to be inevitable draws given optimal play on both sides, so white would stop playing them too (except against a much stronger player to avoid getting scrunched). |
|
Jan-06-19 | | john barleycorn: <keypusher: Re one of your concerns about AlphaZero -- that it seems to reach 3500 Elo rather quickly, then plateaus -- I wonder if it's bumping up against a draw ceiling in chess. One thing that stands out in the Science article is how common draws are in chess compared to shogi or Go.> Because "draws" in chess can be "agreed" (almost) any time and not solely be so by the position on the board but by other considerations. Then hitting a draw ceiling would mean the strongest engine would lose rating points, won't it? |
|
Jan-06-19
 | | keypusher: <<CHC> it's probably premature to talk about the death of chess but I bet what will happen is that the computers will refute (or "bust") certain dubious openings so that nobody will play them anymore. > Looking at recent top level events it feels like that’s already happening. Some of us are hoping A0 will counteract that, at least temporarily. If it can sacrifice three or four pawns against Stockfish and get away with it, maybe there’s more life in the game than top GMs seem to realize. |
|
Jan-06-19 | | SugarDom: Is King's Gambit a "bust" for White then? |
|
Jan-06-19
 | | WannaBe: King's Gambit is a bust, but Queen's gambit are busts! Ok, I'll move along now.... |
|