Credence polls for 26 claims from the 2019 Review

https://www.lesswrong.com/posts/LnpGZNwLc7uCDBXfG/credence-polls-for-26-claims-from-the-2019-review


## Book Review: The Secret of Our Success

- Cultural Intelligence Hypothesis: humans evolved big brains in order to be able to maintain [complicated, detailed, arbitrary-seeming cultural knowledge like 20+ step Inuit seal hunting techniques]. Everything that separates us from the apes is part of an evolutionary package designed to help us maintain this kind of culture, exploit this kind of culture, or adjust to the new abilities that this kind of culture gave us.
- Machiavellian Intelligence Hypothesis: humans evolved big brains in order to succeed at social maneuvering and climbing dominance hierarchies.
- For most of history, a human attempting to use reasoning to do things like cooking, crafting, or planning (instead of using inherited cultural heuristics, like omens or folk wisdom) would have been substantially worse off, and faced a major increase in their risk of death (without a commensurate increase in life quality).

Overall, treat the claims in this post more like polls, and less like the full-blown forecasting questions you’d find on Metaculus or PredictIt. (The latter have extremely high bars for crisp definitions.) They point in a direction, but don’t completely pin it down.

Overall, this is an experiment. I’m trying to find interesting ways for people to relate to the Review. Maybe speeding through these questions gets you thinking good thoughts, that you can then turn into full-blown reviews? Maybe others’ answers allow you to find a discussion partner who disagrees on a core question? Maybe the data will be useful in the voting phase? We’ll see! Feel free to leave a comment about how you found the experience, if you want.

If you want to discuss the questions with others over a call, you can do so during the Review forecasting sessions we’re organising this weekend (January 9-10).

If you want to hide other users’ predictions until you’ve made your own, here’s how to do that:

1. Click "Edit Settings"
2. Go to "Site customizations"
3. Press "Hide other users’ Elicit predictions until I have predicted myself"

## Make More Land

**Making more land out of the roughly 50 mi² of shallow water in the San Francisco Bay, South of the Dumbarton Bridge, would…**

- ...be an environmental disaster.
- ...create buildings where the expected yearly damage from earthquakes (both in terms of reduced quality of life and property destroyed) is >1.5x that of nearby buildings on old land.
- ...significantly worsen traffic in San Francisco.
- ...cause a water shortage such that, in 2030, residents of the Bay Area would spend on average 100% more on water, after adjusting for inflation, compared to 2020. (In 2020, the average American spent around $200/year on water.)
- ...substantially improve current housing shortages and rent prices by 2035.
- ...all-things-considered, be good for the world.

## Why Wasn’t Science Invented in China?

- The modern Scientific Revolution occurred in Europe between the 16th and 18th centuries. Why did it not happen in China? Historian Toby Huff claims that China was unable to produce modern science primarily because of a lack of the requisite intellectual freedom. Was he basically correct?

## The Strategy-Stealing Assumption

- The strategy-stealing assumption is "a good enough approximation that we can basically act as if it’s true". That is, for any strategy an unaligned AI could use to influence the long-run future, there is an analogous strategy that a similarly-sized group of humans can use in order to capture a similar amount of flexible influence over the future. "Flexible" here means that humans can decide later what to do with that influence (which is important since humans don’t yet know what we want in the long run).

## Becoming the Pareto-best in the World

- Does the Pareto frontier trick allow people to circumvent the Generalized Efficient Market hypothesis? That is, take people in the 98th percentile of intelligence. Are there a few separate fields such that they could become experts in each, with less than 10 years of total time investment... and then have a realistic shot at a big money/status windfall, with relatively little marginal effort?

## The Hard Work of Translation

- The core cognitive loop that causes progress in accomplished Buddhists is basically cognitive behavioral therapy, supercharged with a mental state more intense than most pharmaceuticals.

## The Forces of Blandness and the Disagreeable Majority

- Decision-makers in media and PR, and corporate and government elites generally, have a lower tolerance for verbal conflict and taboo violations than the typical individual.
- As of 2019, the US was in an era of unusually large amounts of free speech that elites were starting to get spooked by and defend against.

## Bioinfohazards

- Overall, in 2019, biosecurity in the context of catastrophic risks had been underfunded and underdiscussed.
- The EA community has sometimes erred too much on the side of shutting down discussions of biology by turning them into discussions about info-hazards.

## Two explanations for variation in human abilities

- Background knowledge and motivation levels being equal, humans will learn how to perform new tasks at roughly equal rates.
- Another version: roughly, everything that top humans can learn, most humans could too if they actually tried. That is, there is psychological unity of humankind in what we can learn, but not necessarily in what we have learned. By contrast, a mouse really couldn’t learn chess, even if it tried. And in turn, no human can learn to play 90-dimensional chess, unlike the hypothetical superintelligences that can.

## Reframing Impact

These questions are quite technical, and might be hard to answer if you’re unfamiliar with the terminology used in TurnTrout’s sequence on Impact Measures. (A rough sketch of the AUP penalty term is included at the end of this post.)

- Attainable Utility theory describes how people feel impacted.
- Agents trained by powerful RL algorithms on arbitrary reward signals generally try to take over the world.
- The catastrophic convergence conjecture is true. That is, unaligned goals tend to have catastrophe-inducing optimal policies because of power-seeking incentives.
- AUP_conceptual prevents catastrophe, assuming the catastrophic convergence conjecture.
- Some version of Attainable Utility Preservation solves side effect problems for an extremely wide class of real-world tasks and for subhuman agents.
- For the superhuman case, penalizing the agent for increasing its own Attainable Utility (AU) is better than penalizing the agent for increasing other AUs.
- There exists a simple closed-form solution to catastrophe avoidance (in the outer alignment sense).

---

(Note that when you answer questions in this summary post, it will automatically update the prediction questions that I have linked in comments on each individual post. The distributions will later be visible when users are voting to rank the posts.)
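For readers who want a concrete anchor for the AUP claims above, here is a minimal sketch of the kind of penalty term Attainable Utility Preservation adds to the reward, in the spirit of the formulation in "Conservative Agency". The exact auxiliary reward set, baseline action, and scaling differ across versions of AUP, so treat the symbols below (R, R_i, Q*, lambda, Scale) as illustrative rather than as the sequence’s definitive formula.

```latex
% Minimal sketch of an AUP-style penalized reward (illustrative; details differ across AUP variants).
%   R             : the primary (specified) reward function
%   R_1 ... R_N   : auxiliary reward functions whose attainable utility should be preserved
%   Q^*_{R_i}     : optimal action-value function for auxiliary reward R_i
%   \varnothing   : the no-op action, used as a baseline
%   \lambda, Scale: penalty strength and a normalization term
R_{\mathrm{AUP}}(s,a) \;=\; R(s,a)
  \;-\; \frac{\lambda}{\mathrm{Scale}} \sum_{i=1}^{N}
  \left| Q^{*}_{R_i}(s,a) - Q^{*}_{R_i}(s,\varnothing) \right|
```

On this reading, the "own AU vs. other AUs" claim is about which attainable utilities enter the sum (the agent’s own goal versus a set of auxiliary goals), not about changing the absolute-difference structure of the penalty.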

Comment

https://www.lesswrong.com/posts/LnpGZNwLc7uCDBXfG/credence-polls-for-26-claims-from-the-2019-review?commentId=A7jfFwayAvY6QcML7

Speaking of claims made in 2019 review posts: Conclusion to ‘Reframing Impact’ (the final post of my nominated Reframing Impact sequence) contains the following claims and credences:

  • AU theory describes how people feel impacted. I’m darn confident (95%) that this is true.

  • Agents trained by powerful RL algorithms on arbitrary reward signals generally try to take over the world. Confident (75%). The theorems on power-seeking only apply to optimal policies in fully observable environments, which isn’t realistic for real-world agents. However, I think they’re still informative. There are also strong intuitive arguments for power-seeking. (A rough sketch of the POWER quantity these theorems use follows after this list.)

  • The catastrophic convergence conjecture is true. Fairly confident (70%). There seems to be a dichotomy between "catastrophe directly incentivized by goal" and "catastrophe indirectly incentivized by goal through power-seeking", although Vika provides intuitions in the other direction.

  • AUP_conceptual prevents catastrophe, assuming the catastrophic convergence conjecture. Very confident (85%).

  • Some version of AUP solves side effect problems for an extremely wide class of real-world tasks and for subhuman agents. Leaning towards yes (65%).

  • For the superhuman case, penalizing the agent for increasing its own AU is better than penalizing the agent for increasing other AUs. Leaning towards yes (65%).

  • There exists a simple closed-form solution to catastrophe avoidance (in the outer alignment sense). Pessimistic (35%).
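For readers who haven’t seen the power-seeking theorems referenced in the second bullet: they formalize "power" as, roughly, the average optimal value attainable from a state over a distribution of reward functions. A minimal sketch of that quantity (a paraphrase that omits the paper’s discount normalization and current-state reward terms):

```latex
% Rough sketch of the POWER quantity used by the power-seeking theorems
% (omits normalization details; illustrative only, not the paper's exact definition).
%   D          : a distribution over reward functions R
%   V^*_R(s)   : optimal value of state s under reward function R
\mathrm{POWER}_{D}(s) \;\propto\; \mathbb{E}_{R \sim D}\!\left[ V^{*}_{R}(s) \right]
```

Roughly, the theorems say that for a broad class of reward distributions and environment structures, optimal policies tend to navigate toward higher-POWER states, which is the formal counterpart of the take-over-the-world intuition, though, as noted above, only for optimal policies in fully observable environments.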

Comment

https://www.lesswrong.com/posts/LnpGZNwLc7uCDBXfG/credence-polls-for-26-claims-from-the-2019-review?commentId=7uqJ8ymwZ4z37bX7S

Ey, awesome! I’ve updated the post to include them.