If you are interested in AI Safety, come visit the AI Safety Reading Group. The group meets on Skype on Wednesdays at 18:45 UTC to discuss new and old articles on different aspects of AI Safety. We start with a round of introductions, then a summary of the article is presented, followed by discussion of both the article and related topics.

Sometimes we have guests. On Wednesday the 14th, Stuart Armstrong will give a presentation in the reading group on his research agenda: https://www.alignmentforum.org/posts/CSEdLLEkap2pubjof/research-agenda-v0-9-synthesising-a-human-s-preferences-into

Join us on Skype by adding ‘soeren.elverlin’. Previous guests include Eric Drexler, Rohin Shah, Matthijs Maas, Scott Garrabrant, Robin Hanson, Roman Yampolskiy, Vadim Kosoy, Abram Demski, and Paul Christiano. A full list of articles read can be found at https://aisafety.com/reading-group/
Is this reading group still running? I’m wondering whether to point people to it.
Yes, we are still running, though now on a biweekly schedule. We will discuss Paul Christiano’s "Another (Outer) Alignment failure story" on the 8th of July.
I’m sad to have missed Eric Drexler’s recent Q&A session. The slides from that session don’t seem to contain Eric’s answers, and there is no linked recording. Is there any chance someone kept notes, or could write up a summary of Eric’s answers from memory?
Eric Drexler requested that I not upload a recording to YouTube. Before the session, I compiled this document with most of the questions: https://www.dropbox.com/s/i5oqix83wsfv1u5/Comprehensive_AI_Services_Q_A.pptx?dl=0 We did not get to pose the last few questions. Are there any questions from this list you would like me to try to remember the answers to?
Do you have a recording of the session? If so, can you send it to me via PM or email?
I’m interested in answers to pretty much all of the questions. If no recording is available, any chance you could write up as many answers as you can remember? (If not, I’ll try harder to narrow down my interest. :)
I’m also curious why Eric Drexler didn’t want you to upload a recording to YouTube. If the answers contain info hazards, then writing them up publicly would seem bad too. If not, what could outweigh the obvious positive value of releasing the recording? If he’s worried about something like not fully endorsing answers he gave on the spot, maybe someone could prepare a transcript of the session for him to edit before posting?
I’m very interested in his responses to the following questions:
The question addressing Gwern’s post about Tool AIs wanting to be Agent AIs.
The question addressing his optimism about progress without theoretical breakthroughs (related to NNs/DL).
Thanks!