The Unreasonable Feasibility Of Playing Chess Under The Influence

https://www.lesswrong.com/posts/kdLupXLAmWBGueCBR/the-unreasonable-feasibility-of-playing-chess-under-the


A proud history of drinking and playing chess

Please enjoy this clip of the reigning chess world champion Magnus Carlsen playing a game of hyperbullet (30 seconds total per player) while inebriated. As the commenters are quick to point out, Magnus winning this game is partially due to his opponent blundering their queen a couple of moves after Magnus takes over. But still. Even after a full night of sleep and while highly caffeinated, I would struggle to complete a game of chess in 30 seconds[1]. Winning while chanting party bangers and complaining about the state of one's own body is kind of impressive.

Magnus is standing on the shoulders of giants in this game. Drinking and chess have gone hand in hand since at least Paul Morphy, whose family started the Murphy brewery. Since Morphy, two former world champions stand out as particularly heavy drinkers: Alexander Alekhine and Mikhail Tal. The story of Mikhail Tal, the Magician from Riga, is interesting, as he drank and smoked heavily while producing more brilliancies[2] than any other player.

The Magician from Riga in his element.

His play under the influence might be the source of the common urban myth that drinking improves one's play. Tal's heavy drinking was likely not a character flaw, but presumably represented a type of self-medication for his congenital chronic illnesses. The same cannot be said for another grandmaster, who triggered one of many[3] episodes of chess drama by drinking heavily and "losing it" live on Twitch during a chess streaming session. On the other hand, he was awarded the infamous title of "best drunk chess player in the world" by world championship challenger Fabiano Caruana.

Beyond the players, there is "alcoholic chess", the "bongcloud opening", and of course the recent Netflix hit show "The Queen's Gambit", about a drug-addicted female Bobby Fischer. Magnus Carlsen winning a game of hyperbullet after having a drink or two might appear impressive to an outsider; the aficionado just calls it "Tuesday".

How do they do it?

What's my point? I don't think we should be surprised that chess players drink alcohol. They are, visibly, human, which means the prior probability of them having had a drink in the last year is (depending on location) somewhere between 60 and 100%. I couldn't find more specific numbers for Twitch streamers in their mid- or late twenties, but you probably have a decent gut feeling for that demographic. Also, given the stigma that chess is a game for four-eyes and dweebs, it's not surprising to find countersignaling in the form of leather jackets and drug consumption. No mysteries there.

Benny Watts, professional chess-playing "bad boy" from The Queen's Gambit.

The thing that boggles my mind is something else: how can chess players get blackout drunk and still play decent chess? Chess was once considered the pinnacle of human intellect. And while that perspective (perhaps rightfully) has fallen out of favor, I'd like to circle back to the fact that the only way I win against a drunk Magnus Carlsen is if he literally passes out during the game[4]. In this post, I want to propose an analogy with the type of AI architecture that currently dominates computer chess, which might give us some ideas for why grandmasters can play decent chess while drunk. Perhaps there is even something we can learn from inebriated humans that is interesting for AI.

How to solve chess

Which tool in our belt is best suited to understand complicated cognitive phenomena? David Marr's three[5] levels of analysis appear like a good match[6]:

  1. The computational level: what problem is the system solving, and why?
  2. The algorithmic level: which representations and procedures does it use to solve that problem?
  3. The implementational level: how are those representations and procedures physically realized?

After [...] seeing the games, I thought, 'well, I always wondered how it would be if a superior species landed on earth and showed us how they play chess. I feel now I know. Such a wild game.'

AlphaZero plays with white.

Why do people even still care about chess if no human has been able to compete with a computer for decades? On the one hand, chess has been called "the drosophila of artificial intelligence", i.e. a well-controlled model organism with, nonetheless, a rich repertoire of interesting behaviors. On the other hand, AlphaZero stood out for being trained exclusively through self-play (i.e. without access to human games), and still blazing past the best chess computer of its time after just a few hours of training. And despite only ever playing against itself, there are striking parallels between the concepts that emerge in the model and those that humans use to evaluate the game. Fittingly, the play of AlphaZero has been described as much more human-like than that of previous chess computers. On Marr's algorithmic level, AlphaZero appears like an interesting candidate for our goal of understanding human chess playing.

So how does AlphaZero work? It is, in essence, a product of a recent revolution in deep learning that allows the training of deeper and more capable neural networks. These networks are then used to bypass the problem of combinatorial explosion: instead of exploring all the moves all the way to the end of the game, the networks give reasonable suggestions for which moves appear particularly interesting, and for which intermediate positions appear particularly favorable for which player. There are excellent explanations of the architecture out there from a machine learning engineering perspective, but for this post I instead want to follow the lead of Paul Christiano.

**AlphaZero as an example of Iterated Distillation and Amplification.** Paul Christiano argues that the training procedure for AlphaZero can be interpreted as an instance of Iterated Distillation and Amplification (IDA)[9]. Conceptually, IDA requires three components:

  1. A fast model p (the "prior") that maps a position directly to move suggestions and a value estimate.
  2. An amplification step A that wraps the fast model in a slow, deliberate procedure (for AlphaZero, Monte Carlo tree search guided by p) to produce stronger play than p alone.
  3. A distillation step D that trains the fast model to imitate the output of A(p), compressing the slow procedure back into the fast model.
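To make the division of labor concrete (a policy prior proposing candidate moves, a value estimate scoring positions short of the end, and a shallow search amplifying both), here is a toy sketch in a trivially small game. Everything below, from the game to the hand-coded "network" heads, is a stand-in I invented for illustration; AlphaZero's networks are learned from self-play, and its search is Monte Carlo tree search rather than plain negamax:

```python
# Toy game: take 1-3 stones from a pile; whoever takes the last stone wins.
# The "network" heads below are hand-coded stand-ins for learned functions.

MOVES = [1, 2, 3]

def value_net(pile):
    # Stand-in value head: score of the position for the player to move.
    # (A real network would have to learn this pattern from self-play.)
    return 1.0 if pile % 4 != 0 else -1.0

def policy_net(pile):
    # Stand-in policy head: prefer moves that leave a multiple of 4.
    return {m: (1.0 if (pile - m) % 4 == 0 else 0.1) for m in MOVES if m <= pile}

def search(pile, depth, top_k=2):
    # Depth-limited negamax that only expands the top_k moves suggested by
    # the policy prior and scores leaf positions with the value head, so it
    # never has to explore every line to the end of the game.
    if pile == 0:
        return -1.0  # the previous player took the last stone and won
    if depth == 0:
        return value_net(pile)
    prior = policy_net(pile)
    candidates = sorted(prior, key=prior.get, reverse=True)[:top_k]
    return max(-search(pile - m, depth - 1, top_k) for m in candidates)

def raw_move(pile):
    # "Raw network" play: trust the prior directly, with no search at all.
    prior = policy_net(pile)
    return max(prior, key=prior.get)

def best_move(pile, depth=3):
    # Full play: the prior plus a little search on top.
    prior = policy_net(pile)
    return max(prior, key=lambda m: -search(pile - m, depth))
```

The pruning and the leaf evaluation are exactly the two jobs AlphaZero delegates to its networks; `raw_move` versus `best_move` corresponds to playing from the prior alone versus playing with the full amplification machinery.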

The intricate neuronal circuitry of the cerebellum is thought to encode internal models that reproduce the dynamic properties of body parts. [...] It is thought that the cerebellum might also encode internal models that reproduce the essential properties of mental representations in the cerebral cortex.

And here is Steinlin (2007), who summarizes insights on the role of the cerebellum in development:

The cerebrum might be like a glider, which needs the cerebellum as the motor aeroplane to bring it up in the air – once there, the glider is able to fly alone. But any troubles during this flight might bring it down again and the cerebellar motor aeroplane has to help once more to regain altitude!

Looking past the flowery language, this matches exactly what we would expect from a converged solution in IDA. Once the amplification and distillation mechanism no longer changes the prior, i.e. p* = D(A(D(A(…D(A(p))…)))), its absence should not matter. (There is even a bit of evidence on the cerebellum being involved in chess playing, but only in novices. However, this bit of evidence does not pass the bar that I put up in the previous section where I talked about fMRI. So let's scratch this last part.)

Getting a neural network drunk

Now we've moved all the pieces (no pun intended) into the right positions to think about how Magnus Carlsen can still win at chess while intoxicated.

  1. **What does (acute) alcohol consumption do to the brain?** Bjork and Gilman (2014) have an answer:

[W]hile there are some discrepancies in specific regional effects of acute alcohol on the resting brain and on the brain at work, the preponderance of evidence indicates that acute alcohol exerts region-specific suppression (e.g. cerebellum) or enhancement (e.g. ventral striatum) of brain metabolic or hemodynamic activity, and by inference, neuronal activity.

This is, of course, no coincidence, since if this weren't the case I wouldn't have led you down this entire chain of logic.

  2. **What happens to AlphaZero when we force it to rely only on its prior p for playing chess?** We don't know. But we know the next best thing, which is what happens to AlphaZero when we force it to rely only on its prior p for playing Go:

Estimated difference in Elo rating between the full AlphaZero model (blue) and the model restricted to the prior p (grey). Adapted from Silver et al., 2017.

About a 40% reduction in rating. Substantial, but still enough to put it neck-and-neck with a human professional.

  3. **How much worse does Magnus Carlsen play under the influence?** Carlsen recently set a new record for the highest bullet chess rating ever on Lichess: 3379. Giving up 40% of his rating points would knock him down to a rating of ~2000. His opponent from the beginning is rated 2500, making a win rather unlikely: only around 1%. But then again, anything is possible in bullet, and this particular game has collected north of half a million views. Also, it's unlikely that alcohol knocks out the entire amplification mechanism of the brain (not all of which will be located in the cerebellum). It would be great to have more solid data on how alcohol affects play. Anecdotally, Carlsen quit drinking a while back for health reasons (and after "sandbagging" the fifth Lichess Titled Arena). This blog post talks at length about the detrimental effect of alcohol on chess play but does not list any sources. Opinions on the Chess Forum are divided.
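For the rating arithmetic above, note that the Elo system only makes rating differences meaningful; the textbook logistic model converts a difference into an expected score. A quick sketch (expected score counts draws as half a point, so the pure win probability of an underdog is lower still):

```python
def expected_score(own_rating, opponent_rating):
    # Textbook logistic Elo model: expected score (win = 1, draw = 0.5)
    # for the player rated `own_rating`. Only the rating DIFFERENCE matters.
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - own_rating) / 400.0))

print(round(expected_score(2000, 2000), 4))  # 0.5    (equal ratings)
print(round(expected_score(2000, 2400), 4))  # 0.0909 (400-point underdog)
print(round(expected_score(2000, 2500), 4))  # 0.0532 (500-point underdog)
```

This is only a rough sanity check: Lichess actually uses Glicko-2 rather than plain Elo, and bullet games produce upsets more often than the model suggests.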

Concluding thoughts

Should we feel less amazed at Carlsen's intoxicated chess play, after all of the above? Quite the contrary: thinking about *how* something works can allow us to appreciate it at an even deeper level. And despite not really drinking much myself, I do have ideas for a couple of small experiments that really shouldn't be too much effort…

Back to the point: is there anything we can say about artificial intelligence from these experiments? First, I find it interesting that performance deteriorates so much when relying only on the prior p. From one perspective this *does* make a lot of intuitive sense; disabling a central component of the model *should* affect performance. And it would be weird to play chess without any consideration of what the opponent will likely do. From another perspective, it is still a bit surprising. Performance deteriorating implies that the prior p is not yet a fixed point of p* = D(A(p*)). Possibly this is because the network that implements p is not able to distill any further amplification? Or is distillation impossible in principle beyond a certain point? Or am I taking the IDA interpretation of AlphaZero too far? Who can tell?

In any case, it is pretty amazing that computer chess continues to provide so much, despite the Deep Blue vs. Kasparov match already being 25 years ago. As the great poet #IAN used to say:

There are two kinds of players: those who can play and those who can't. ("Charles Darwin")

A big thank you to Philipp Hummel for giving me the central insight of this post and for very helpful proofreading.

Comment

https://www.lesswrong.com/posts/kdLupXLAmWBGueCBR/the-unreasonable-feasibility-of-playing-chess-under-the?commentId=thJzBnsHW7BdQKzRA

About a 40% reduction in rating.

Does Elo have a meaningful zero? I thought it was an interval scale.

Comment

https://www.lesswrong.com/posts/kdLupXLAmWBGueCBR/the-unreasonable-feasibility-of-playing-chess-under-the?commentId=L8Nxj9unmeDHc8Hnf

Ohhh, that's a very good point 🤔 I guess that makes the comparison a bit less direct. I'll think about whether it can be fixed or if I'll rewrite that part. Thank you for pointing it out!

Comment

In psychology, these sorts of scales are often standardized relative to the standard deviation instead of the mean. Or you could pick some other range as a metric; for instance if I’m looking it up right (and I very well may not be), the human range of ability appears to span around 1000 Elo points. So the drop is about 2x that between beginner and master humans.

Comment

Wait, derp: that's the range for chess, but Go appears to have a range of around 3000 Elo points.

https://www.lesswrong.com/posts/kdLupXLAmWBGueCBR/the-unreasonable-feasibility-of-playing-chess-under-the?commentId=FbzQocv9nDTZiog5v

Performance deteriorating implies that the prior p is not yet a fixed point of p* = D(A(p*)).

At least in the case of AlphaZero, isn't the performance deterioration from A(p*) to p*? I.e., A(p*) is full AlphaZero, while p* is the "Raw Network" in the figure. We could have converged to the fixed point of the training process (i.e. p* = D(A(p*))) and still see performance deterioration if we use the unamplified model instead of the amplified one. I don't see a fundamental reason why p* = A(p*) should hold after convergence (and I would have been surprised if it held for, e.g., chess or Go and reasonably sized models for p*).
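A tiny invented example makes this concrete: give the distilled "network" too little capacity to represent A(p) exactly, and training still converges to a fixed point p* = D(A(p*)), yet the amplified policy A(p*) keeps outperforming the prior p*. (The whole setup below, values, capacity constraint and all, is made up for illustration; it is not AlphaZero's actual training loop.)

```python
import math

# Four possible moves with true values; the best is move 3.
VALUES = [0.0, 1.0, 2.0, 3.0]

# Capacity limit: the distilled "network" cannot tell moves 2 and 3
# apart (they share a feature group), so it must weight them equally.
GROUPS = [0, 1, 2, 2]

def amplify(p, beta=10.0):
    # Search-like improvement: sharpen the prior toward high-value moves.
    w = [pi * math.exp(beta * v) for pi, v in zip(p, VALUES)]
    z = sum(w)
    return [x / z for x in w]

def distill(target):
    # Best policy the limited network can represent: average the target
    # probability within each indistinguishable group of moves.
    out = []
    for a in range(len(target)):
        group = [b for b in range(len(target)) if GROUPS[b] == GROUPS[a]]
        out.append(sum(target[b] for b in group) / len(group))
    return out

def value(p):
    # Expected value of playing according to policy p.
    return sum(pi * v for pi, v in zip(p, VALUES))

p = [0.25] * 4
for _ in range(50):
    p = distill(amplify(p))  # converges to a fixed point p* = D(A(p*))

print(round(value(p), 3), round(value(amplify(p)), 3))  # prints: 2.5 3.0
```

Training has fully converged (another distill-amplify round leaves p unchanged), but the raw prior is stuck hedging between the two indistinguishable moves while amplification still picks out the better one.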

Comment

https://www.lesswrong.com/posts/kdLupXLAmWBGueCBR/the-unreasonable-feasibility-of-playing-chess-under-the?commentId=EqghEwDhn7qfAr4Gj

That… makes a lot of sense. Yep, that’s probably the answer! Thank you :)

https://www.lesswrong.com/posts/kdLupXLAmWBGueCBR/the-unreasonable-feasibility-of-playing-chess-under-the?commentId=pvfHDW5kafmso4Jjs

If I was Scott Alexander or Zvi I’d comb through those papers and wring out insight.

Huh? What’s stopping you? How are Scott or Zvi relevant here, at all?

Comment

https://www.lesswrong.com/posts/kdLupXLAmWBGueCBR/the-unreasonable-feasibility-of-playing-chess-under-the?commentId=LM33qijj7hau6h4YN

Yeah, that thought was insufficiently explained, thanks for pointing that out! For me, Scott and Zvi are examples of people who are really good at "putting together pieces of evidence while tolerating huge amounts of uncertainty". I think I don't have that talent, and I *know* I don't have the experience (or patience) to pull that off. But there is an interesting meta-point here: epistemic work comes at a cost, and knowing which rabbit holes not to go down is an important skill. When Scott and Zvi are doing one of their "Much More Than You Wanted To Know" posts or "Covid Updates", they are operating at some kind of efficient frontier where a lot of the pieces *need* to be considered to provide a valuable perspective. Everybody and their dog has a perspective on lockdown effectiveness, so adding to that requires a lot of great epistemic work. But the work is worth it, because the question is important and the authors care deeply about the answer. Drunk chess playing, in contrast, is pretty underexplored (in agreement with its striking unimportance in the grand scheme of things). So making some headway is relatively easy (low-hanging fruit!), and the marginal value of an extremely deep dive into the neuroscience literature just isn't worth it.

https://www.lesswrong.com/posts/kdLupXLAmWBGueCBR/the-unreasonable-feasibility-of-playing-chess-under-the?commentId=hiHkheyfi6iGcrAEf

I think you’re saying that alcohol in the body mostly damages players’ ability to read out variations, but not how good their knee-jerk initial impression of "here’s the best move" is? I like that theory! I never thought of it before, but having played a good number of Go games while drunk, it feels right.

Comment

https://www.lesswrong.com/posts/kdLupXLAmWBGueCBR/the-unreasonable-feasibility-of-playing-chess-under-the?commentId=tBsAuc5fMCiKTbM8H

Yes, that’s a good description! And a cool datapoint, I’ve never played (or even watched) Go, but the principle should of course translate.

https://www.lesswrong.com/posts/kdLupXLAmWBGueCBR/the-unreasonable-feasibility-of-playing-chess-under-the?commentId=3JxGicmeoNYjfivJ2

If I’m not mistaken (and I’m not a biologist so I might be), alcohol mainly impacts the brain’s system 2, leaving system 1 relatively intact. That lines up well with this post.

https://www.lesswrong.com/posts/kdLupXLAmWBGueCBR/the-unreasonable-feasibility-of-playing-chess-under-the?commentId=ggCsmDuNapx4Jzj8a

Hm. The actual evaluation of positions in chess is really intensive in pattern-recognition, and seems like a great job for complicated cortical learned-pattern-recognizers. But following down game trees to amplify your own evaluation of positions is also intensive in pattern recognition, and is the sort of learned mode of behavior that also needs to be "understood" by the cortex. So how are you imagining communication with the cerebellum? Is it just keeping the cortex "on track" in executing the learned behavior over time, or is it doing something more complicated like handling short-term memory or something, or is it doing something even more complicated like using a fairly sophisticated understanding of math to tell the cortex what patterns to look for next?

https://www.lesswrong.com/posts/kdLupXLAmWBGueCBR/the-unreasonable-feasibility-of-playing-chess-under-the?commentId=uxWqNvy7Wsw3eE5vN

Is AlphaZero better than other chess playing programs though?

after four hours of training, it beat the current world champion chess-playing program, Stockfish.

I remember there was some controversy around this at the time.

Comment

https://www.lesswrong.com/posts/kdLupXLAmWBGueCBR/the-unreasonable-feasibility-of-playing-chess-under-the?commentId=ovE2AsnRKKkEqBYNP

Yeah, I remember that controversy! I think this section on Wikipedia gives a bit of insight. The graph in this Tweet shows the Elo of Stockfish as a function of time, with the introduction of the NNUE neural network in Stockfish 12 highlighted.

https://www.lesswrong.com/posts/kdLupXLAmWBGueCBR/the-unreasonable-feasibility-of-playing-chess-under-the?commentId=4RupiZ8jS8CFMwLwE

I think the controversy is mostly irrelevant at this point. Leela performed comparably to Stockfish in the latest TCEC season and is based on AlphaZero. It has most of the "romantic" properties mentioned in the post.

Comment

Not just in the latest TCEC season, they’ve been neck-and-neck for quite a bit now