Question: Can we just tell an AI to do what we want?

Answer: If we could, it would solve a large part of the alignment problem.

The challenge is how to code this. Converting something into formal mathematics that a computer program can understand is much harder than saying it in natural language, and proposed AI goal architectures are no exception. Complicated computer programs are usually the result of months of testing and debugging. But this one would be more complicated than any attempted before, and live tests are impossible: a superintelligence with a buggy goal system will display goal stability, meaning it will try to keep the goals it has and prevent its programmers from discovering or changing the error.