Weak HCH accesses EXP

https://www.lesswrong.com/posts/CtGH3yEoo4mY2taxe/weak-hch-accesses-exp


Previously, I proved that imitative amplification with weak HCH, approval-based amplification, and recursive reward modeling access \text{PSPACE} while AI safety via market making accesses \text{EXP}. At the time, I wasn’t sure whether my market making proof would generalize to the others, so I just published it with the \text{PSPACE} proofs instead. However, I have since become convinced that the proof does generalize—and that it generalizes for all of the proposals I mentioned—such that imitative amplification with weak HCH, approval-based amplification, and recursive reward modeling all actually access \text{EXP}. This post attempts to prove that.

Updated list of proposals by complexity class

\text{P}: Imitation learning (trivial)

\text{PSPACE}: AI safety via debate (proof)

\text{EXP}: AI safety via market making (proof), Imitative amplification with weak HCH (proof below), Approval-based amplification (proof below), Recursive reward modeling (proof below)

\text{NEXP}: Debate with cross-examination (proof)

\text{R}: Imitative amplification with strong HCH (proof), AI safety via market making with pointers (proof)

Proofs

Imitative amplification with weak HCH accesses \text{EXP}

The proof here is similar in structure to my previous proof that weak HCH accesses \text{PSPACE}, so I’ll only explain where this proof differs from that one.

First, since l \in \text{EXP}, we know that for any x \in X, T_l(x) halts in O(2^{\text{poly}(n)}) steps where n = |x|. Thus, we can construct a function f_l(n) = c_1 + c_2 e^{c_3 n^{c_4}} such that for all x \in X, T_l(x) halts in at most f_l(|x|) steps, by picking c_3 and c_4 large enough that c_3 n^{c_4} dominates every term of the polynomial in the exponent for all n \in \mathbb N. Note that f_l is then computable in time polynomial in n.
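
To illustrate the last claim, here is a minimal Python sketch (mine, not from the original post) of why such a bound is cheap to compute: f_l(n) has only polynomially many digits, so big-integer arithmetic on it runs in time polynomial in n. The constants C1 through C4 are hypothetical placeholders, and 3 is used in place of e so that the bound stays an integer.

```python
# Hypothetical constants for some particular T_l; any fixed choice works.
C1, C2, C3, C4 = 7, 5, 2, 3

def f_l(n: int) -> int:
    """Integer upper bound on c_1 + c_2 * e^(c_3 * n^c_4); using 3 >= e keeps it valid."""
    return C1 + C2 * 3 ** (C3 * n ** C4)

# The value is exponential in n, but its representation is only poly(n) digits
# long, so computing and writing it down takes time polynomial in n.
for n in (1, 2, 4, 8):
    print(n, len(str(f_l(n))))
```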

Second, let H’s new strategy be as follows: on inputs of the form p : i or p : i : j, answer exactly as in my proof that AI safety via market making accesses \text{EXP}; on the base input p, ask M(p : f_l(|x|)) and return accept if the resulting state is an accepting state of T_l and reject otherwise.

Note that this strategy precisely replicates the strategy used in my proof that market making accesses \text{EXP} for inputs p : i and p : i : j. Thus, I’ll just defer to that proof for why the strategy works and is polynomial time on those inputs. Note that this is where l \in \text{EXP} becomes essential, as it guarantees that i and j can be represented in polynomial space.

Then, the only difference between this strategy and the market making one is for the base p input. On p, given that M(p : i) works, M(p : f_l(|x|)) will always return a state after T_l has halted, which will always be an accept state if T_l accepts and a reject state if it rejects. Furthermore, since |M(p : f_l(|x|))| \in O(1) and f_l is computable in polynomial time, the strategy for p is polynomial, as desired.
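
To make the strategy concrete, here is a rough, self-contained Python sketch of how I read it. This is my reconstruction, not the post's exact wording: I assume p : i asks for T_l's internal state and head position after i steps on x, and p : i : j asks for the symbol on tape square j after i steps, which is one decomposition compatible with the market making proof being deferred to. M is stood in for by a memoized recursive call to H (modeling the weak HCH fixed point), and the toy machine, step, and run_H are hypothetical scaffolding.

```python
from functools import lru_cache

# Hypothetical toy machine T_l: scan right over the input, accept on the
# first blank square. BLANK/START/ACCEPT and step() are illustrative only.
BLANK, START, ACCEPT = "_", "scan", "acc"

def step(state, symbol):
    """One transition: returns (new_state, symbol_written, head_move)."""
    if state == START and symbol != BLANK:
        return (START, symbol, +1)   # keep scanning right
    return (ACCEPT, symbol, 0)       # halt and accept on a blank

def run_H(x, f_l):
    @lru_cache(maxsize=None)
    def M(q):
        # M imitates H on every subquestion (the weak HCH fixed point).
        return H(q)

    def H(q):
        if len(q) == 1:                          # base question p
            state, _head = M(("p", f_l(len(x))))
            return "accept" if state == ACCEPT else "reject"
        if len(q) == 2:                          # p : i  ->  (state, head position)
            i = q[1]
            if i == 0:
                return (START, 0)                # initial configuration
            state, head = M(("p", i - 1))
            new_state, _written, move = step(state, M(("p", i - 1, head)))
            return (new_state, head + move)
        i, j = q[1], q[2]                        # p : i : j  ->  symbol on square j
        if i == 0:
            return x[j] if j < len(x) else BLANK
        state, head = M(("p", i - 1))
        symbol = M(("p", i - 1, j))
        return step(state, symbol)[1] if head == j else symbol

    return H(("p",))

# The toy machine halts within |x| + 2 steps, so that serves as its f_l here.
print(run_H("ab", lambda n: n + 2))   # prints "accept"
```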

Since this procedure works for all l \in \text{EXP}, we get that imitative amplification with weak HCH accesses \text{EXP}, as desired.

Approval-based amplification accesses \text{EXP}

The proof here is almost precisely the same as my previous proof that approval-based amplification accesses \text{PSPACE}, with the only modification being that we need to verify that |M(q)| + |H(q, M)| is still polynomial in size under the new imitative amplification strategy given above. However, the fact that i and j are bounded above by f_l(|x|) means that they are representable in polynomial space, as desired.
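
Concretely, since i, j \le f_l(|x|), writing either index in binary takes at most \lceil \log_2 f_l(n) \rceil = \lceil \log_2 (c_1 + c_2 e^{c_3 n^{c_4}}) \rceil \in O(n^{c_4}) bits where n = |x|, so carrying these indices in questions and answers keeps them polynomially sized.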

Recursive reward modeling accesses \text{EXP}

Again, the proof here is almost precisely the same as my previous proof that recursive reward modeling accesses \text{PSPACE}, with the only modification being the same as in the above proof that approval-based amplification accesses \text{EXP}.