AGI Predictions

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions

Contents

Elicit Prediction (Will more than 50 people predict on this post?)

Add questions & operationalizations

This is not intended to be a comprehensive list, so I’d love for people to add their own questions – here are instructions on making your own embedded question. If you have better operationalizations of the questions, you can make your own version in the comments. If there’s general agreement on an alternative operationalization being better, I’ll add it into the post.

Questions

AGI definition

We’ll define AGI in this post as a unified system that, for almost all economically relevant cognitive tasks, at least matches any human’s ability at the task. This is similar to Rohin Shah and Ben Cottier’s definition in this post.

Safety Questions

1. Will AGI cause an existential catastrophe?

2. Will AGI cause an existential catastrophe without additional intervention from the existing AI Alignment research community?

3. Will there be an arms race dynamic in the lead-up to AGI?

4. Will a single AGI or AGI project achieve a decisive strategic advantage?

5. Will > 50% of AGI researchers agree with safety concerns by 2030?

6. Will there be a 4 year interval in which world GDP doubles before the first 1 year interval in which world GDP doubles?

7. Will AGI cause existential catastrophe conditional on there being a 4 year period of doubling of world GDP before a 1 year period of doubling?

8. Will AGI cause existential catastrophe conditional on there being a 1 year period of doubling of world GDP without there first being a 4 year period of doubling?

Timelines Questions

See Forecasting AI timelines, Ajeya Cotra’s OP AI timelines report, and Adam Gleave’s #AN80 comment, for more context on this breakdown. I haven’t tried to operationalize this too much, so feel free to be more specific in the comments.

The first three questions in this section are mutually exclusive; that is, the probabilities you assign to them should not sum to more than 100%.

9. Will we get AGI from deep learning with small variations, without more insights on a similar level to deep learning?

10. Will we get AGI from 1-3 more insights on a similar level to deep learning?

11. Will we need > 3 breakthroughs on a similar level to deep learning to get AGI?

12. Before reaching AGI, will we hit a point where we can no longer improve AI capabilities by scaling?

13. Before reaching AGI, will we hit a point where we can no longer improve AI capabilities by scaling because we are unable to continue scaling?

14. Before reaching AGI, will we hit a point where we can no longer improve AI capabilities by scaling because the increase in AI capabilities from scaling plateaus?

Non-technical factor questions

15. Will we experience an existential catastrophe before we build AGI?

16. Will there be another AI Winter (a period commonly referred to as such) before we develop AGI?

Operationalizations

Safety Questions

1. Will AGI cause an existential catastrophe?

Timelines Questions

9. Will we get AGI from deep learning with small variations, without more insights on a similar level to deep learning?

Non-technical factor questions

15. Will we experience an existential catastrophe before we build AGI?

Additional resources

Comment

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=yB83PXfd3ta2YHyML

Great post! I am very curious about how people are interpreting Q10 and Q11, and what their models are. What are prototypical examples of ‘insights on a similar level to deep learning’? Here’s a break-down of examples of things that come to my mind:

Historical DL-level advances:

  • the development of RL (Q-learning algorithm, etc.)

  • Original formulation of a single neuron, i.e. affine transformation + non-linearity

Future possible DL-level:

  • a successor to back-prop (e.g. how biological neurons learn)

  • a successor to the Q-learning family (e.g. neatly generalizing and extending ‘intrinsic motivation’ hacks)

  • full brain simulation

  • an alternative to the affine+activation recipe

Below DL-level major advances:

  • an elegant solution to learn from cross-modal inputs in a self-supervised fashion (babies somehow do it)

  • a breakthrough in active learning

  • a generalizable solution to learning disentangled and compositional representations

  • a solution to adversarial examples

Grey areas:

  • breakthroughs in neural architecture search

  • a breakthrough in neural Turing machine-type research

I’d also like to know how people’s thinking fits in with my taxonomy: Are people who leaned yes on Q11 basing their reasoning on the inadequacy of the ‘below DL-level advances’ list, or perhaps on the necessity of the ‘DL-level advances’ list? Or perhaps people interpreted those questions completely differently, and don’t agree with my dividing lines?

Comment

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=zHQ8kGEgGCZ2CYm4s

Thank you for asking this question and for giving that break-down. I was wondering something similar. I am not an AI scientist but DL seems like a very big deal to me, and thus I was surprised that so many people seemed to think we need more insights on that level. My charitable interpretation is that they don’t think DL is a big deal.

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=wdtQLBpYXzePZsQbH

At time of writing, I’m assigning the highest probability to "Will AGI cause an existential catastrophe?" at 85%, with the next-highest predictions at 80% and 76%. Why … why is everyone so optimistic?? Did we learn something new about the problem actually being easier, or our civilization more competent, than previously believed?

Should—should I be trying to do more x-risk-reduction-relevant stuff (somehow), or are you guys saying you’ve basically got it covered? (In 2013, I told myself it was OK for dumb little ol’ me to personally not worry about the Singularity and focus on temporal concerns in order to not have nightmares, and it turned out that I have a lot of temporal concerns which could be indirectly very relevant to the main plot, but that’s not my real reason for focusing on them.)

Comment

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=e63NNWNmhToCMkYQH

IMO, we decidedly do not "basically have it covered." That said, IMO it is generally not a good idea for a person to try to force themselves on problems that will make them crazy, desperate need or no.

I am often tempted to downplay how much catastrophe-probability I see, basically to decrease the odds that people decide to make themselves crazy in the direct vicinity of alignment research and alignment researchers. And on the other hand, I am tempted by the HPMOR passage:

"Girls?" whispered Susan. She was slowly pushing herself to her feet, though Hermione could see her limbs swaying and quivering. "Girls, I’m sorry for what I said before. If you’ve got anything clever and heroic to try, you might as well try it."

(To be clear, I have hope. Also, please just don’t go crazy and don’t do stupid things.)

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=ECjBfZDwCmZX8e7rf

For me, it’s because there’s disjunctively many ways that AGI could not happen (global totalitarian regime, AI winter, 55% CFR avian flu escapes a BSL4 lab, unexpected difficulty building AGI & the planning fallacy on timelines which we totally won’t fall victim to this time...), or that alignment could be solved, or that I could be mistaken about AGI risk being a big deal, or…

Granted, I assign small probabilities to several of these events. But my credence for P(AGI extinction | no more AI alignment work from community) is 70%, much higher than my 40% unconditional credence. I guess that means yes, I think AGI risk is huge (remember that I’m saying "40% chance we just die to AGI, unconditionally"), and that’s after incorporating the significant contributions which I expect the current community to make. The current community is far from sufficient, but it’s also probably picking a good amount of low-hanging fruit, and so I expect that its presence makes a significant difference.

EDIT: I’m decreasing the 70% to 60% to better match my 40% unconditional, because only the current alignment community stops working on alignment.

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=vpi6kiDZ4kKwwzzug

I’ve gone from roughly 2⁄3 to 1⁄2 on existential catastrophe (I’ve put 58% here, was feeling pessimistic) based on the big projects having safety teams who I think are doing really good work. That probably falls under our civilization being more competent than previously believed.

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=2DYMkqtc3bMsxNEa6

> why is everyone so optimistic??

Some reasons.

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=aDESFEabYghwsjM4w

There is a huge difference in the responses to Q1 ("Will AGI cause an existential catastrophe?") and Q2 ("...without additional intervention from the existing AI Alignment research community"), to a point that seems almost unjustifiable to me. To pick the first matching example I found (and not to purposefully pick on anybody in particular), Daniel Kokotajlo thinks there’s a 93% chance of existential risk without the AI Alignment community’s involvement, but only 53% with. This implies that there’s a ~43% chance of the AI Alignment community solving the problem, conditional on it being real and unsolved otherwise, but only a ~7% chance of it not occurring for any other reason, including the possibility of it being solved by the researchers building the systems, or the concern being largely incorrect. What makes people so confident in the AI Alignment research community solving this problem, far above that of any other alternative?
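The ~43% and ~7% figures can be reproduced with a short Python sketch. It rests on an assumption that is not stated in the forecasts themselves: the community's work can only lower the risk, so that P(catastrophe) = P(catastrophe without the community) × (1 − P(community averts it)). The function and variable names are hypothetical.

```python
def implied_rescue_probability(p_without: float, p_with: float) -> float:
    """P(community averts the catastrophe | it would otherwise happen).

    Assumes p_with = p_without * (1 - rescue), i.e. the community can only
    reduce the risk. This is a modelling assumption for illustration.
    """
    if not 0 < p_without <= 1:
        raise ValueError("p_without must be in (0, 1]")
    if not 0 <= p_with <= p_without:
        raise ValueError("p_with must be in [0, p_without]")
    return 1 - p_with / p_without


p_without = 0.93  # Q2: catastrophe without the existing alignment community
p_with = 0.53     # Q1: catastrophe, unconditionally

rescue = implied_rescue_probability(p_without, p_with)
other_reasons = 1 - p_without  # no catastrophe even without the community

print(f"community averts it: {rescue:.0%}")         # 43%
print(f"averted for other reasons: {other_reasons:.0%}")  # 7%
```

Under a different model (for example, one where the community's work could also increase risk in some worlds) the implied numbers would differ.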

Comment

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=waFEqP7t38HYTpay7

I also noticed Daniel’s difference in probabilities there, and thought they were substantial. But it doesn’t seem unreasonable to me. The existing AI x-risk community has changed the global conversation on AI and also been responsible for much in the way of funding and direct research on many related technical problems. I could talk about the specific technical work, or the impact that things like the AI FOOM Debate had on Superintelligence had on OpenPhil, or CFAR on FLI on Musk on OpenAI. Or I could go into detail about the research being done on topics like Iterated Amplification and Agent Foundations and so on and ways that this seems to me to be clear progress on subproblems. I’m not sure exactly what alternatives you might have in mind.

Comment

To emphasize, the clash I’m perceiving is not the chance assigned to these problems being tractable, but to the relative probability of ‘AI Alignment researchers’ solving the problems, as compared to everyone else and every other explanation. In particular, people building AI systems intrinsically spend a degree of their effort, even if completely unconvinced about the merits of AI risk, trying to make systems aligned, just because that’s a fundamental part of building a useful AI.

> I could talk about the specific technical work, or the impact that things like the AI FOOM Debate had on Superintelligence had on OpenPhil, or CFAR on FLI on Musk on OpenAI. Or I could go into detail about the research being done on topics like Iterated Amplification and Agent Foundations and so on and ways that this seems to me to be clear progress on subproblems.

I have a sort of Yudkowskian pessimism towards most of these things (policy won’t actually help; Iterated Amplification won’t actually work), but I’ll try to put that aside here for a bit. What I’m curious about is what makes these sorts of ideas only discoverable in this specific network of people, under these specific institutions, and particularly more promising than other sorts of more classical alignment. Isn’t Iterated Amplification in the class of things you’d expect people to try just to get their early systems to work, at least with ≥20% probability? Not, to be clear, exactly that system, but just fundamentally RL systems that take extra steps to preserve the intentionality of the optimization process.

To rephrase a bit, it seems to me that a worldview in which AI alignment is sufficiently tractable that Iterated Amplification is a huge step towards a solution, would also be a worldview in which AI alignment is sufficiently easy (though not necessarily easy) that there should be a much larger prior belief that it gets solved anyway.

Comment

FWIW, I made these judgments quickly and intuitively and thus could easily have just made a silly mistake. Thank you for pointing this out. So, what do I think now, reflecting a bit more?

  • The 7% judgment still seems correct to me. I feel pretty screwed in a world where our entire community stops thinking about this stuff. I think it’s because of Yudkowskian pessimism combined with the heavy-tailed nature of impact and research. A world without this community would still be a world where people put some effort into solving the problem, but there would be less effort, by less capable people, and it would be more half-hearted / not directed at actually solving the problem / not actually taking the problem seriously.

  • The other judgment? Maybe I’m too optimistic about the world where we continue working. But idk, I am rather impressed by our community and I think we’ve been making steady progress on all our goals over the last few years. Moreover, OpenAI and DeepMind seem to be taking safety concerns mildly seriously due to having people in our community working there. This makes me optimistic that if we keep at it, they’ll take it very seriously, and that would be great.

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=rwwCc9vgnA5fZwZT4

I interpreted the question as something like "if nobody cares about safety and there isn’t a community that takes a special interest in it, will we be safe". I don’t think it’s specifically this AI Alignment community solving it, it’s just that if nobody tries to solve the problem, the problem will stay unsolved. Edit: And I do now see that I misinterpreted the question. Updated my second estimate downwards because of that. Thanks for pointing this out!

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=tHGTKgrpPFRa7qTgJ

In the following, an event is "catastrophic" if it endangers several human lives; it need not be an existential catastrophe.

1. Before AGI, will we learn of an example of catastrophic deceptive misalignment?

2. Conditional on the AI community learning of pre-AGI catastrophic deceptive misalignment, will the ($ spent on AI alignment research)/($ spent on AI research) ratio increase by more than 50% over the two years following the catastrophe?

Edit: I meant to say "deceptive alignment", but the meaning should be clear either way.

Comment

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=4E4TJDrPY2bEs9mfj

"Catastrophic" is normally used in the term "global catastrophic risk" and means something like "kills 100,000s of people", so I do think "doesn’t necessarily kill but could’ve killed a couple of people" is a fairly different meaning. In retrospect I realize that I put my answer to the second question far too high — if it just means "a deceptive aligned system nearly gives a few people in hospital a fatal dosage but it’s stopped and we don’t know why the system messed up" then it’s quite plausible nothing this substantial will happen as a result of that.

Comment

"Catastrophic" is normally used in the term "global catastrophic risk" and means something like "kills 100,000s of people", so I do think "doesn’t necessarily kill but could’ve killed a couple of people" is a fairly different meaning. Agreed. In retrospect, I might have opted for "pre-AGI nearly-deadly accident caused by deceptive alignment." In retrospect I realize that I put my answer to the second question far too high — if it just means "a deceptive aligned system nearly gives a few people in hospital a fatal dosage but it’s stopped and we don’t know why the system messed up" then it’s quite plausible nothing this substantial will happen as a result of that. I intended the situation to be more like "we *catch *the AI pretending to be aligned, but actually lying, and it almost or does kill at least a few people as a result of that." With #1, I’m trying to have people predict the "deception is robustly instrumental behavior, but AIs will be bad at it at first and so we’ll catch them." #2 is trying to operationalize whether this would be viewed as a fire alarm. Some ways you might think scenario #1 won’t happen:

  • You don’t think deception will be incentivized

  • Fast takeoff means the AI is never smart enough to deceive but dumb enough to get caught

  • Our transparency tools won’t be good enough for many people to believe it was actually deceptively aligned

Comment

> Some ways you might think scenario #1 won’t happen:

Also: we solve alignment really well on paper, and that’s why deception doesn’t arise. (I assign non-trivial probability to this.)

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=8bLNCdht2mDdaoMJP

I suspect this is intentional, but the set {1,6,7,8} of predictions is redundant, in the sense that probabilities for three of them mathematically imply the probability of the fourth due to the law of total probability.

In particular, if #1 is $A$ and #6 is $B$, then #7 and #8 are $A \mid B$ and $A \mid \neg B$, and we have the equality

$$P(A) = P(A \mid B)\,P(B) + P(A \mid \neg B)\,P(\neg B)$$

The probability I would assign to #8 intuitively is about 0.41. Math based on my other three predictions yields (doing the calculation now) 0.476. I am going to predict the math output rather than my intuition.

Did anyone else calculate their level of inconsistency?
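For anyone who wants to run the same check, here is a minimal Python sketch that solves the law of total probability for P(A|¬B). The three input values are hypothetical placeholders (the actual predictions behind the 0.476 figure are not given in this comment), and the function name is made up for illustration.

```python
def solve_a_given_not_b(p_a: float, p_b: float, p_a_given_b: float) -> float:
    """Solve P(A) = P(A|B) P(B) + P(A|~B) P(~B) for P(A|~B)."""
    if not 0 <= p_b < 1:
        raise ValueError("P(B) must be in [0, 1) so that P(~B) > 0")
    return (p_a - p_a_given_b * p_b) / (1 - p_b)


# Hypothetical predictions for #1, #6 and #7 (not taken from the thread).
p_a, p_b, p_a_given_b = 0.50, 0.60, 0.40

implied = solve_a_given_not_b(p_a, p_b, p_a_given_b)
intuitive = 0.41  # the intuitive figure for #8 quoted above
inconsistency = abs(implied - intuitive)

print(f"implied P(A|~B) = {implied:.3f}, gap to intuition = {inconsistency:.3f}")
```

If the implied value falls outside [0, 1], the three input predictions are already inconsistent with each other, whatever is chosen for #8.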

Comment

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=7WggxztMizirzi3pj

> The probability I would assign to #8 intuitively is about 0.41. Math based on my other three predictions yields (doing the calculation now) 0.476. I am going to predict the math output rather than my intuition.

I think the correct response to this realization is not to revise your final answer so as to make it consistent with the first three. It is to revise all four answers so that they are maximally intuitive, subject to the constraint that they be jointly consistent. Which answer comes last is just an artifact of the order of presentation, so it isn’t a rational basis for privileging some answers over others.
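One way to make "revise all four answers" concrete is sketched below in Python. It reads "maximally intuitive" as "smallest total squared change from the original answers", which is only one possible interpretation and is an assumption of this sketch. The search is a coarse brute-force grid, the function name is hypothetical, and three of the four example inputs are placeholders rather than anyone's real predictions.

```python
from itertools import product


def nearest_consistent(p_a, p_b, p_a_given_b, p_a_given_not_b, step=0.02):
    """Return the consistent (P(A), P(B), P(A|B), P(A|~B)) closest to the inputs.

    Consistency means P(A) = P(A|B) P(B) + P(A|~B) (1 - P(B)).
    Closeness is the sum of squared differences (an assumed choice).
    """
    n = round(1 / step)
    grid = [i / n for i in range(n + 1)]
    best, best_cost = None, float("inf")
    for b, x, y in product(grid, repeat=3):
        a = x * b + y * (1 - b)  # the law of total probability fixes P(A)
        cost = ((a - p_a) ** 2 + (b - p_b) ** 2
                + (x - p_a_given_b) ** 2 + (y - p_a_given_not_b) ** 2)
        if cost < best_cost:
            best, best_cost = (a, b, x, y), cost
    return best


# 0.41 is the intuitive answer to #8 from the quoted comment;
# the other three values are hypothetical.
revised = nearest_consistent(0.50, 0.60, 0.40, 0.41)
print([round(p, 3) for p in revised])
```

Unlike solving for the last answer alone, this spreads the adjustment across all four predictions, so no single answer is privileged by the order in which the questions were asked.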

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=n45eZDKqoc3kv74r3

This is only true if, for example, you think AI would cause GDP growth. My model assigns a lot of probability to ‘AI kills everyone before (human-relevant) GDP goes up that fast’, so questions #7 and #8 are conditional on me being wrong about that. If we can last any small multiples of a year with AI smart enough to double GDP in that timeframe, then things probably aren’t as bad as I thought.

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=4dHTLYyBTeFfyk9Z5

How to add your own questions:

  • Go to elicit.org/binary

  • Type your question into the field at the top

  • Click on the question title, and click the copy URL button

  • Paste the URL into the LessWrong editor

See our launch post for more details!

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=vPnx5TeHBL4Ngo6rd

I suspect this question is misworded:

> Will there be a 4 year interval in which world GDP growth doubles before the first 1 year interval in which world GDP growth doubles?

Do you mean in which world GDP doubles? World GDP growth doubles when it goes from, say, 0.5% yearly growth to 1% yearly growth.

Personally, I suspect world GDP is most likely to next double in a period after a severe war or depression, so you might want to rephrase to avoid that scenario if that isn’t what you’re thinking about.
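To show how far apart the two readings are, here is a small Python sketch. It assumes constant compound annual growth, which is a simplification, and both function names are hypothetical.

```python
import math


def years_to_double(growth_rate: float) -> float:
    """Years for GDP to double at a constant annual growth rate (0.03 = 3%)."""
    return math.log(2) / math.log(1 + growth_rate)


def growth_rate_to_double_in(years: float) -> float:
    """Constant annual growth rate needed for GDP to double in `years` years."""
    return 2 ** (1 / years) - 1


# "GDP growth doubles": 0.5% -> 1% per year. GDP itself still takes decades to double.
print(f"{years_to_double(0.005):.0f} years at 0.5%, {years_to_double(0.01):.0f} years at 1%")

# "GDP doubles": the sustained growth rates the corrected question is about.
print(f"4-year doubling needs about {growth_rate_to_double_in(4):.1%} per year")
print(f"1-year doubling needs {growth_rate_to_double_in(1):.0%} per year")
```

So a doubling of the growth rate can be an unremarkable event, while a doubling of GDP within 4 years or 1 year requires growth far outside the example rates above.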

Comment

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=kidzxLpjAJdeFNrTt

This was a good catch! I did actually mean world GDP, not world GDP growth. Because people have already predicted on this, I added the correct questions above as new questions, and am leaving the previous questions here for reference:

  • Will there be a 4 year interval in which world GDP growth doubles before the first 1 year interval in which world GDP growth doubles?

  • Will AGI cause existential catastrophe conditional on there being a 4 year period of doubling of world GDP growth before a 1 year period of doubling?

  • Will AGI cause existential catastrophe conditional on there being a 1 year period of doubling of world GDP growth without there first being a 4 year period of doubling?

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=DkZobi2asn4uWmrR8

Can’t help but wonder how many people here (if any) have significantly changed their predictions over the course of the past five months since this was posted. Would be super interested to hear, either way.

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=Qkn72zSMaYTyNukex

I really appreciate the effort that went into collecting all of these questions, framing them clearly, and coding the clickable predictions.

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=tXAHojYuGCYYZLbQd

"Will > 50% of AGI researchers agree with safety concerns by 2030?"From my research, I think they mostly already do, they just use different framings, and care about different time frames.

Comment

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=xLq99jovScsLcXtT3

Fwiw, I think the operationalization of the question is stronger than it appears at first glance, and that’s why estimates are low.

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=ggaDuexhCmhZSKCEK

That was fun. This time, I tried not to update too much on other people’s predictions. In particular, I’m at 1% for "Will we experience an existential catastrophe before we build AGI?" and at 70% for "Will there be another AI Winter (a period commonly referred to as such) before we develop AGI?", but would probably defer to a better aggregate on the second one.

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=EtsZurB8oWfFrR7n4

So the following, for example, don’t count as "existential risk caused by AGI", right?

  • many AIs

      • an economy run by advanced AIs amplifying negative externalities, such as pollution, leading to our demise

      • an em world with minds evolving to the point of being non-valuable anymore ("a Disneyland without children")

      • a war by transcending uploads

  • narrow AI

      • a narrow AI killing all humans (ex.: by designing grey goo, a virus, etc.)

      • a narrow AI eroding trust in society until it breaks apart

  • intermediary cause by an AGI, but not ultimate cause

      • a simulation shutdown because our AI didn’t have a decision theory for acausal cooperation

      • an AI convincing a human to destroy the world

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=NC9BfSbnWwifRmBfi

Thanks a lot for the feature and this post! I’ll be really interested in an analysis after a lot of answers are in.

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=ruQJgjmuHBeBfSDXz

Wouldn’t it be better to have the other votes visible only after voting? People could be highly influenced by seeing how many and who voted what.

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=Qgr4mwppwxc5Lcryu

I’ve been seeing an intermittent bug on a few of these where tapping to record an answer causes the question text to disappear. Sometimes scrolling away and back fixes it.

Chrome browser on Android phone.

Comment

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=D3m3CLoT34AaEGSZt

This is intentional. The question text shares space with the list of users and their respective predictions. On mobile, this means when you tap on a section, you see the users who voted in the corresponding range, until you tap away.

Comment

Ah, makes sense. I guess I just need to get used to the interface.

Comment

Yeah, we had to make some tradeoffs because I really wanted them to fit into a small space, and also to never resize when you interact with them, while also not dominating any post they are in. Not sure whether we hit the perfect balance of the tradeoffs.

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=QyDJdyz2T2SEpgjvL

What level of background in AI alignment are you assuming/desiring for respondents? Is it just "all readers" where the assumption is that any cultural osmosis etc. is included in what you’re trying to measure?

Comment

https://www.lesswrong.com/posts/YMokuZdoY9tEDHjzv/agi-predictions?commentId=3Y829HbZ7yxPJLGgh

Yeah, any LWer is welcome to record their predictions :)