An approach to logical counterfactuals inspired by the Demski prior

https://www.lesswrong.com/posts/5bd75cc58225bf0670374fa6/an-approach-to-logical-counterfactuals-inspired-by-the-demski-prior

This is a probabilistic approach to logical counterfactuals inspired by the Demski prior. Given a sentence \phi, we wish to ask what other logical sentences are true in the world C(\phi) where we correctly counterfactually assume \phi. I spoke about this before here and here, but now I am going to approach the problem slightly differently. Instead of insisting that every other sentence \psi is either true or false in the counterfactual world C(\phi), I allow C(\phi) to assign probabilities to each sentence.

When you try to describe counterfactual worlds as sets of true and false sentences, you necessarily have very sharp boundaries. You will have, for example, counterfactual worlds where \phi and \psi are true, but \phi\wedge \psi is false. Probabilities seem to be more realistic by allowing us to smooth the boundaries. As you look at sentences which are more and more logically distinct from \phi, you can gradually change probabilities so that they will represent the truth, rather than representing consequences of \phi.

Let \mu be a measure on logical sentences, for example, \mu(\phi)=2^{-\ell(\phi)}, where \ell(\phi) is the number of bits necessary to encode \phi. Let T_\mathbb{N} be the theory containing all sentences true about the natural numbers. Consider the following procedure which computes P(\psi|\phi). This definition is only for \phi which are consistent by themselves.
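As a concrete illustration of a length-based measure like the one above, here is a minimal sketch. The 15-symbol alphabet, the end-of-sentence marker, and the names `ALPHABET`, `length_bits`, `mu`, and `total_mass` are assumptions made for this example, not part of the proposal. The encoding spends 4 bits per symbol plus 4 bits for a terminator, which makes it prefix-free, so the weights 2^{-\ell} over all strings sum to at most 1.

```python
import itertools
from fractions import Fraction

# Hypothetical 15-symbol alphabet for arithmetic sentences; a 16th code word
# is reserved as an end-of-sentence marker, so 4 bits per symbol suffice.
ALPHABET = "0S+*=()~&|>AEx'"
BITS_PER_SYMBOL = 4

def length_bits(sentence: str) -> int:
    """ell(sentence): bits needed to encode the sentence plus its end marker."""
    for ch in sentence:
        if ch not in ALPHABET:
            raise ValueError(f"symbol {ch!r} is not in the toy alphabet")
    return BITS_PER_SYMBOL * (len(sentence) + 1)

def mu(sentence: str) -> Fraction:
    """mu(sentence) = 2^(-ell(sentence)), kept exact with Fraction."""
    return Fraction(1, 2 ** length_bits(sentence))

def total_mass(max_len: int) -> Fraction:
    """Sum of mu over every string (well-formed or not) of length <= max_len."""
    return sum(mu("".join(chars))
               for n in range(max_len + 1)
               for chars in itertools.product(ALPHABET, repeat=n))
```

Because the sum runs over all strings, the well-formed sentences receive total mass below 1, so a sampler built on this would have to renormalize or resample when it draws an ill-formed string.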

Let T_0 be the theory containing only the sentence \phi. For n\geq 0, compute T_{n+1} from T_n by sampling a sentence t_n according to \mu. If both T_\mathbb{N}+t_n and T_n+t_n are consistent, then T_{n+1}=T_n+t_n. Otherwise, T_{n+1}=T_n. Let T_\infty be the union of all of the T_n. P(\psi|\phi) is the probability that \psi is a consequence of T_\infty.
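The procedure above is not computable for arithmetic, but its shape can be sketched in a toy propositional setting. Everything here is an assumption made for illustration: three variables stand in for the language of arithmetic, a complete theory `BASE` stands in for T_\mathbb{N}, a small finite pool of sentences weighted by 2^{-(token count)} stands in for \mu, consistency is checked by brute force over truth assignments, and a finite number of steps approximates T_\infty. The function names are hypothetical.

```python
import itertools
import random
from functools import lru_cache

VARS = ("a", "b", "c")
WORLDS = [dict(zip(VARS, bits))
          for bits in itertools.product([False, True], repeat=len(VARS))]

@lru_cache(maxsize=None)
def models_of(sentence):
    """Indices of the truth assignments satisfying a sentence.
    Sentences are Python boolean expressions over VARS (toy encoding)."""
    code = compile(sentence, "<sentence>", "eval")
    return frozenset(i for i, w in enumerate(WORLDS)
                     if eval(code, {"__builtins__": {}}, w))

def models(theory):
    """Assignments satisfying every sentence of the theory."""
    live = frozenset(range(len(WORLDS)))
    for s in theory:
        live &= models_of(s)
    return live

def proves(theory, sentence):
    """Semantic consequence: every model of the theory satisfies the sentence."""
    return models(theory) <= models_of(sentence)

# Finite sentence pool: literals and two-literal disjunctions.
LITERALS = list(VARS) + [f"not {v}" for v in VARS]
POOL = LITERALS + [f"({x}) or ({y})"
                   for x, y in itertools.combinations(LITERALS, 2)]
# Stand-in for mu: weight 2^(-number of whitespace-separated tokens).
WEIGHTS = [2.0 ** -len(s.split()) for s in POOL]

# Stand-in for T_N: a complete theory fixing the "true" assignment.
BASE = ["a", "b", "not c"]

def sample_theory(phi, base, steps, rng):
    """Run `steps` rounds of the procedure starting from T_0 = {phi}."""
    base_models = models(base)
    theory, live = [phi], models_of(phi)
    for _ in range(steps):
        t = rng.choices(POOL, weights=WEIGHTS)[0]
        # Accept t only if base + t and T_n + t are both consistent.
        if (base_models & models_of(t)) and (live & models_of(t)):
            theory.append(t)
            live &= models_of(t)
    return theory

def estimate(psi, phi, base, trials=1000, steps=100, seed=0):
    """Monte Carlo estimate of P(psi | phi) in the toy model."""
    rng = random.Random(seed)
    hits = sum(proves(sample_theory(phi, base, steps, rng), psi)
               for _ in range(trials))
    return hits / trials
```

For example, `estimate("a", "c", BASE)` conditions on the false sentence `c` and asks how often the true sentence `a` survives. It can fall below 1 because a true sentence such as `(not a) or (not c)` may be accepted first, after which `a` is inconsistent with the theory. By construction, `estimate("c", "c", BASE)` is 1 and `estimate("not c", "c", BASE)` is 0.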

As it is, this procedure is not approximable, but you can make a similar thing approximable by replacing T_\mathbb{N} with PA, or a complete theory sampled from your favorite approximable distribution.

Claim: P(\cdot|\phi) gives a coherent probability assignment which assigns probability 1 to \phi, and thus represents a probability distribution on complete theories.

Proof: The probability distribution on complete theories is exactly the distribution on T_\infty. T_\infty is consistent, since T_0 is consistent by assumption and each step preserves consistency, and it contains \phi. All that remains is to show that T_\infty is complete (with probability 1). Take a sentence \psi. Either \psi or \neg\psi is consistent with T_\mathbb{N}. WLOG assume \psi is consistent with T_\mathbb{N}. With probability 1, the sentence \psi is eventually sampled at some time n. Either \psi is added to T_n, or it is inconsistent with T_n. Therefore, either T_n proves \neg\psi, or T_{n+1} proves \psi, so T_\infty proves either \psi or \neg \psi. Note that T_\infty does not necessarily contain either \psi or \neg \psi as an axiom.

Now I will give the reasons I am considering this proposal. None of the following is stuff I actually know to be true. I think it is plausible that my intuitions about the result of this procedure are very wrong.

It seems that true sentences will generally have high probabilities. Thus, if \phi and \psi are both complex sentences, and there is a simple proof of \phi\rightarrow\psi, it is likely that many true sentences will be sampled and accepted before \psi is considered. Thus, it seems plausible that sufficiently many simple axioms to complete the simple proof of \phi\rightarrow\psi will be accepted before \psi is considered. If this happens, \psi will automatically be included. It seems then that \psi will have a high probability.

Thus it is plausible that this proposal follows the spirit of the conjecture that simple proofs of \phi\rightarrow\psi correspond to legitimate counterfactuals. Note that this informal argument only goes in one direction. If there is a simple proof of \phi\rightarrow\psi, it seems likely that P(\psi|\phi) will be large, but it does not seem likely that P(\phi|\psi) will be large. This is consistent with my idea of logical counterfactuals.

Proof-length-based definitions of counterfactuals usually have the unfortunate property that they depend on the formalities of our proof system, and equivalent proof systems can give very different proof lengths. Perhaps this proposal can get many of the nice properties of proof-length-based systems, while being independent of the choice among equivalent ways of carrying out proofs.

Comment

https://www.lesswrong.com/posts/5bd75cc58225bf0670374fa6/an-approach-to-logical-counterfactuals-inspired-by-the-demski-prior?commentId=5bd75cc58225bf0670374fc2

I am, as usual, a bit confused. If you require a sentence to be consistent with (e.g.) PA before being added to T_{n+1}, this proposal is unable to assign nonzero probability to the trillionth digit of pi being 2 - and conditional on the trillionth digit of pi counterfactually being 2, it is unable to go on to believe in PA.

It seems like some looser condition for adding to the theory is needed. Not just as a sop to practicality, but to get at some important desiderata of logical counterfactuals.

Here’s the picture of logical counterfactuals that I’m currently working under:

People have some method for generating mental models of math, and these mental models have independencies that the ground truth mathematics doesn’t. E.g., when I imagine the trillionth digit of pi being 2, it doesn’t change (in the mental model) whether the Collatz conjecture is true. In fact, for typical scenarios I consider, I can continue to endorse (in my mental model) the typical properties of the real numbers, even when considering a collection of statements, most of which are inconsistent with those properties (like assigning a distribution over some digit of pi).

This apparent independence produces an apparent partial causal graph (within a certain family of mental models), which leads to the use of causal language like "Even if one set the trillionth digit of pi to 2, it would not change the things I’m taking for granted in my mental models, nor would it change the things that would not change in my mental model when I change the setting of this digit of pi."