Embedded Agents

https://www.lesswrong.com/posts/p7x32SEt43ZMC9r7r/embedded-agents

(A longer text-based version of this post is also available on MIRI’s blog here, and the bibliography for the whole sequence can be found here)


https://www.lesswrong.com/posts/p7x32SEt43ZMC9r7r/embedded-agents?commentId=eMPBrLqEQXZm5g3M3

I actually have some understanding of what MIRI’s Agent Foundations work is about

https://www.lesswrong.com/posts/p7x32SEt43ZMC9r7r/embedded-agents?commentId=7o44mg7ym8ffbNtWx

This post (and the rest of the sequence) was the first time I had ever read something about AI alignment and thought it was actually asking the right questions. It is not about a sub-problem, nor about marginal improvements. Its goal is a gears-level understanding of agents, and it directly explains why that's hard. It's a list of everything that needs to be figured out in order to remove all the black boxes and Cartesian boundaries, and to understand agents as well as we understand refrigerators.

https://www.lesswrong.com/posts/p7x32SEt43ZMC9r7r/embedded-agents?commentId=F4HAqe2ESrv4NSk5R

I nominate this post for two reasons.

One, it is an excellent example of providing supplemental writing about basic intuitions and thought processes, which is extremely helpful to me because I do not have a good enough command of the formal work to arrive at those intuitions myself.

Two, it is one of the few examples of experimenting with different kinds of presentation. I feel like this is underappreciated and under-utilized; better ways of communicating seem like a strong baseline requirement of the rationality project, and this post pushes in that direction.

https://www.lesswrong.com/posts/p7x32SEt43ZMC9r7r/embedded-agents?commentId=q5D8HQkL9maM3mL37

This post has significantly changed my mental model of how to understand key challenges in AI safety, and it has also given me a clearer understanding of, and language for describing, why complex game-theoretic challenges are poorly specified or understood. The terms and concepts in this series of posts have become a key part of my basic intellectual toolkit.

https://www.lesswrong.com/posts/p7x32SEt43ZMC9r7r/embedded-agents?commentId=PeeQx3QGBc4PQtyyA

This sequence was the first time I felt I understood MIRI's research. (Though I might prefer to nominate the text version that has the whole sequence in one post.)

https://www.lesswrong.com/posts/p7x32SEt43ZMC9r7r/embedded-agents?commentId=ptM3SJTeewSmshptt

I read this sequence as research for my EA/rationality novel. It was really good and also pretty easy to follow despite my not having any technical background.