When reading Vladimir’s recent post on defining UDT through modal logic, I had to think a bit about his actual definition of the modal formula(s) corresponding to UDT, because it was phrased in terms of an algorithm doing things instead of an actual modal formula. Then I remembered having worked through the correspondence in the context of modal agents, and it became clear what was going on. I think Vladimir’s approach is really interesting and I want to refer to it in future posts, so I thought I’d write a tutorial, since it seems likely that others will have the same problem.
Like Vladimir, let’s start with the simplest case of an agent that chooses between two actions, in a world with two different possible outcomes (one of which is preferred to the other). Let’s write the world as a universe program, in the usual style of proof-based UDT; for example:
def U(): if A() == a1: return 10 else: return 5
And, in the style of UDT with a halting oracle, let’s write our agent as a program with access to a halting oracle, which it uses to figure out whether certain sentences are provable (in Peano Arithmetic, let’s say):
def A(): if Provable("A() == a1 --> U() == 10"): return a1 elif Provable("A() == a2 --> U() == 10"): return a2 return a2
(The reference to A() and U() inside the quotes is by quining.) Our agent tries to get utility 10 by trying all possible actions in order; for each action, it sees whether it can prove that this action leads to the desired outcome, and if so, it takes that action; if it doesn’t find a good action, it randomly takes the last one.
Given a programming language with access to a halting oracle, there’s a relation \Run(m,n) such that if m is the Gödel number of a program in this language, then \NN\vDash\Run(m,n) if and only if this program halts and returns n. (The definition of \Run(m,n) is very similar to the one for programming languages without a halting oracle.) We implicitly used this fact in the definition of A(): the statement "A() = a_1\to U() = 10" really means "\Run(A,a_1)\to\Run(U,10)".
Now, by looking at the definition of U(), we can show that \PA ,\vdash, [U() = 10]\leftrightarrow[A() = a_1], and by looking at the definition of A(), we can show that \PA ,\vdash, [A() = a_1]\leftrightarrow\box{[A() = a_1] \to [U() = 10]}. If we ignore the internal structure of [U() = 10] and [A() = a_1], and treat these as propositional variables, both of the statements above look like formulas of modal logic: u\leftrightarrow a and a\leftrightarrow\box{a\to u}.
We can write this as a\leftrightarrow F(a,u) and u\leftrightarrow G(a), where F(a,u) and G(a) are the modal formulas F(a,u) \equiv \box{a\to u} and G(a) \equiv a.
So far, the modal logic thing has been an analogy, but now we can observe that the two displayed statements above imply that [A() = a_1] is a fixed point of the formula F(a,G(a)); that is, \PA ,\vdash, [A() = a_1] \leftrightarrow F\big([A() = a_1],, G\big([A() = a_1]\big)\big). Now, since F(a,G(a)) is a fully modalized formula (meaning that all occurrences of its propositional variable, a, are inside boxes), this means that we can apply results from provability logic: in particular, there is a modal formula that’s a fixed point of—F(a,G(a))a closed formula \varphi of Gödel-Löb provability logic (\GL) such that—\GL\vdash\varphi\leftrightarrow F(\varphi,G(\varphi))and moreover, \PA\vdash[A() = a_1]\leftrightarrow\varphi.
It’s straight-forward to check that \GL\vdash \top\leftrightarrow\box{\top\to\top}. (Recall that \GL proves any closed formula in the language of modal logic if and only if \PA proves its translation to the language of arithmetic, so we prove the above statement by arguing: \PA can clearly show that \PA proves \top\to\top, and hence that \box{\top\to\top} is equivalent to \top.) Hence, it follows that \PA\vdash[A() = a_1]\leftrightarrow\top.
So the general idea for specifying modal formulas through algorithms with halting oracles is as follows:
-
Write a program in a language with access to a halting oracle, which uses that oracle to check the provability of relatively simple statements—which may, however, refer to the output of the program.
-
By inspection of the source code of this program, show in \PA that its output is a fixed point of a certain fully modalized formula of modal logic.
-
Show in modal logic that a certain closed formula \varphi is also a fixed point of this.
-
Conclude that \PA shows \varphi to be equivalent to the complicated formula that describes the output of your program.
Before looking at how this applies to Vladimir’s actual definition of UDT, there’s one complication to add: allowing more than two actions and more than two outcomes. To do so, instead of talking just about the single statement [A() = a_1], we can talk about the m statements [A() = a_1],\dotsc,[A() = a_m], and instead of talking about just [U() = u_1] (with u_1 = 10 in our example), we’ll talk about [U() = u_1],\dotsc,[U() = u_n].
Now we describe the behavior of the agent program by m statements of the form \PA,\vdash,[A() = a_i] \leftrightarrow F_i\big([A() = a_1],\dotsc,[A() = a_m],[U() = u_1],\dotsc,[U() = u_n]\big) and the behavior by the universe program by n statements of the form \PA,\vdash,[U() = u_j]\leftrightarrow G_j\big([A() = a_1],\dotsc,[A() = a_m]\big), where the F_i(a_1,\dotsc,a_m,u_1,\dotsc,u_n) are fully modalized formulas, and the G_j(a_1,\dotsc,a_m) are arbitrary formulas of provability logic. Intuitively, The fact that the F_i are modalized corresponds to the fact that agents can’t run the universe they’re in, even if they have its source code—they have to use abstract reasoning (such as proofs) to figure out what the consequences of different actions are. The universe program, on the other hand, can just run the agent and see what it does, which corresponds to being able to refer to it outside modal operators.
I hope it won’t be too confusing that I’m using a_i both to denote an action and the corresponding propositional variable, and similarly for u_i. From now on, I’ll abbreviate a_1,\dotsc,a_m by \vec a and u_1,\dotsc,u_n by \vec u (deviating from Vladimir’s use of overbars in order to keep overbars available for splicing natural numbers into arithmetic formulas).
Again, the place that the formulas F_i(\vec a,\vec u) and G_j(\vec a) are coming from is just by looking at what the source code of these programs does and writing down the modal formula that expresses what that source does. That said, we’re not just looking for formulas that in some intuitive sense correspond to what the program does; we’re looking for formulas such that the displayed statements above are provable in \PA.
By the fixed point theorem for provability logic, we then know that we can find formulas \varphi_i such that \GL\vdash\varphi_i\leftrightarrow F_i(\vec\varphi,G_1(\vec\varphi),\dotsc,G_n(\vec\varphi)), where \vec\varphi abbreviates \varphi_1,\dotsc,\varphi_m; and moreover, we know that then, \PA\vdash[A() = a_i]\leftrightarrow\varphi_i. (To apply the fixed point theorem, we need to use the fact that F(\vec a,G_1(\vec a),\dotsc,G_n(\vec a)) is fully modalized, which it clearly is.) Thus, we can carry out the same steps as in the two-action, two-outcome case above.
Now let’s revisit Vladimir’s definition of the agent program UDT, suitably adapted to my notation here and to my convention of making the a_i mutually exclusive:
There are finitely many sentences of the form "if such-and such a_i holds, then such-and-such u_j holds". Find all such sentences that are provable. If no such sentences were found, choose an arbitrary action—e.g., make F_1 true and all other F_i false.
From all pairs (a_i,u_j) found in the previous step, choose the a_i whose u_j is highest in our preference ordering. If there are multiple such a_i, choose any one, e.g. the one with the lowest index.
For each k, define F_k to be true if k = i (where i is the index of the a_i we chose in the previous step).
To cash this out, first we think of this definition as an algorithm with a halting oracle:
def UDT(): pairs = [] for i in range(m): for j in range(n): if Provable("UDT() == a_i --> U() = u_j"): pairs.append((i,j))
if len(pairs) == 0:
return a_1
else:
find the pair (i,j) in pairs with the largest j
return the corresponding a_i
Now we try to find F_i such that we can prove in PA that [\UDT() = a_i] is equivalent to F_i([\UDT() = a_1],\dotsc,[\UDT() = a_m],[U() = u_1],\dotsc,[U() = u_n]). This is somewhat tedious, but possible: for example, we can:
-
enumerate every subset X of the set of all pairs (i,j) with i\in{1,\dotsc,m} and j\in{1,\dotsc,n}
-
for every such subset X, write down the modal formula H_X(\vec a,\vec u) which is the conjunction of \box{a_i\to u_j}, for all (i,j)\in X, and \neg\box{a_i\to u_j}, for all (i,j)\notin X;
-
let F_i(\vec a,\vec u) be the disjunction of H_X(\vec a,\vec u) for every X such that the above algorithm returns i if the Provable(...) condition is evaluated according to X. (Since there are only finitely many X, we can just figure this out by running the code, interpreting the call to Provable(...) according to each X.)
You could certainly write the F_i more compactly than this, but the above version shows the idea.
Having done all this, not only do we have a better understanding of how to translate Vladimir’s specification into a formula of modal logic, we also have an idea of what to do with it: Given a universe program U() for which we can show statements of the form \PA,\vdash,[U() = u_j]\leftrightarrow G_j([\UDT() = a_1],\dotsc,[\UDT() = a_n]), we can now find formulas \varphi_i such that \GL\vdash\varphi_i\leftrightarrow F_i(\vec\varphi,G_1(\vec\varphi),\dotsc,G_n(\vec\varphi)), and can then conclude that \varphi_i tells us whether—\UDT()the program with the halting oracle—will output a_i, given the universe U().
This may be rather daunting, given the complicated definition of F_i(\vec a,\vec u) from above, but finding the fixed point formulas \varphi_i and evaluating their truth value are both decidable (unless I’m making a mistake?), so at least in principle (and hopefully in practice), we can write a program that figures out the answer for us.
(I’m not sure whether this post will be helpful or interesting to anybody—feedback appreciated...)
Thanks for writing this! I just added a paragraph to my post to explain how the algorithm corresponds to a modal formula. Also I like your notation with \vec{a} instead of \bar{a}, edited my post to use that instead.
Quite helpful, thanks!
Comment
(Thanks for the feedback!)
This seems good. I’ll think about it more.