Finite Factored Sets: Inferring Time

https://www.lesswrong.com/posts/hePucCfKyiRHECz3e/finite-factored-sets-inferring-time

The fundamental theorem of finite factored sets tells us that (conditional) orthogonality data can be inferred from probabilistic data. Thus, if we can infer temporal data from orthogonality data, we will be able to combine these to infer temporal data purely from probabilistic data. In this section, we will discuss the problem of inferring temporal data from orthogonality data, mostly by going through a couple of examples.

6.1. Factored Set Models

We’ll begin with a sample space, \Omega. Naively, one might except that temporal inference in this paradigm involves inferring a factorization of \Omega. What we’ll actually be doing, however, is inferring a factored set model of \Omega. This will allow for the possibility that some situations are distinct without being distinct in \Omega—that there can be latent structure not represented in \Omega. **Definition 38 **(model). Given a set \Omega, a model of \Omega is a pair M=(F,f), where F is a finite factored set and f:\text{set}(F)\rightarrow \Omega is a function from the set of F to \Omega. **Definition 39. ***Let S and \Omega be sets, and let f:S\rightarrow \Omega be a function from S to \Omega. * *Given a \omega\in \Omega, we let f^{-1}(\omega)={s\in S\mid f(s)=\omega}. * *Given an E\subseteq \Omega, we let f^{-1}(E)={s\in S\mid f(s)\in E}. * Given an X\in \text{Part}(\Omega), we let f^{-1}(X)\in\text{Part}(S) be given by f^{-1}(X)={f^{-1}(x) \ | \ x\in X, f^{-1}(x)\neq {}}. **Definition 40 **(orthogonality database). Given a set \Omega, an orthogonality database on \Omega is a pair D=(O,N), where O and N are both subsets of \text{Part}(\Omega)\times \text{Part}(\Omega)\times\text{Part}(\Omega). **Definition 41. ***Given an orthogonality database D=(O,N) on a set \Omega, and partitions X,Y,Z\in \text{Part}(\Omega), we write X \perp_{D} Y|Z if (X,Y,Z)\in O, and we write X \rightleftharpoons_{D} Y|Z if (X,Y,Z)\in N. * **Definition 42. **Given a set \Omega, a model M=(F,f) of \Omega, and an orthogonality database D=(O,N) on \Omega, we say M models D if for all X,Y,Z\in\text{Part}(\Omega),

6.2. Examples

**Example 1. ***Let \Omega={00,01,10,11} be the set of all bit strings of length 2. For i\in{0,1}, let x_i={i0,i1} be the event that the first bit is i, and let y_i={0i,1i} be the event that the second bit is i. Let X={x_0,x_1} and let Y={y_0,y_1}. * *Let v_0={00,11} be the event that the two bits are equal, let v_1={01,10} be the event that the two bits are unequal, and let V={v_0,v_1}. * Let D=(O,N), where O={(X,V,{\Omega})} and N={(V,V,{\Omega})}. **Proposition 33. **In Example 1, D is consistent. *Proof. *First observe that F=(\Omega,{X,V}) is a factored set, and so M=(F,f) is a model of \Omega, where f is the identity on \Omega. It suffices to show that M models D. Indeed h^F(X)={X}, and h^F(V)={V}, so X \perp^{F} V, so {f^{-1}(X)} \perp^{F} {f^{-1}(V)}|{f^{-1}({\Omega})}. Further, it is not the case that V \perp^{F} V, since V\neq \text{Ind}\Omega. Thus it is not the case that {f^{-1}(V)} \perp^{F} {f^{-1}(V)}|{f^{-1}({\Omega})}. Thus M satisfies all of the conditions to model D, so D is consistent. \square **Proposition 34. **In Example 1, X<_D Y. *Proof. *Let (F,f) be any model of \Omega that models D. Let F=(S,B). For any A\in\text{Part}(\Omega), let H_A=h^F(f^{-1}(A)). Our goal is to show that H_X is a strict subset of H_Y. First observe that X\leq\Omega Y\vee_\Omega V, so for any s,t\in S, if s\sim_{f^{-1}(Y)}t and s\sim_{f^{-1}(V)}t, then f(s)\sim_Y f(t) and f(s)\sim_V f(t), so f(s)\sim_X f(t), so s\sim_{f^{-1}(X)}t. Thus f^{-1}(X)\leq_S f^{-1}(Y) \vee_S f^{-1}(V). It follows that H_X\subseteq h^F(f^{-1}(Y) \vee_S f^{-1}(V))= H_Y\cap H_V. However, since X \perp_{D} V|{{\Omega}}, we have that H_X\cap H_V={}, so H_X\subseteq H_Y. By swapping X and V in the argument above, we also get that H_V\subseteq H_Y. Since V \rightleftharpoons_{D} V|{{\Omega}}, we have that H_V\neq {}. Thus H_V contains some element b. Observe that b\notin H_X, but b\in H_Y. Thus H_X is a strict subset of H_Y, so f^{-1}(X)<^F f^{-1}(Y). Since (F,f) was an arbitrary model of \Omega that models D, this implies that X<D Y. \square **Example 2. ***Let \Omega={000,001,010,011,100,101,110,111} be the set of all bit strings of length 3. For i\in{0,1}, let x_i={i00,i01,i10,i11} be the event that the first bit is i, let y_i={0i0,0i1,1i0,1i1} be the event that the second bit is i, and let z_i={00i,01i,10i,11i} be the event that the third bit is i. Let X={x_0,x_1}, let Y={y_0,y_1}, and let Z={z_0,z_1}. * *Let v_0={000,001,110,111} be the event that the first two bits are equal, let v_1={010,011,100,101} be the event that the first two bits are unequal, and let V={v_0,v_1}. * Let D=(O,N), where O={(X,V,{\Omega}),(X,Z,Y),(V,Z,Y)} and N={(X,Z,{\Omega}),(V,Z,{\Omega}),(Z,Z,Y)}. **Proposition 35. **In Example 2, D is consistent. *Proof. *Let S=\Omega\cup{00,01,10,11} be the set of all bit strings of length either 2 or 3. For i\in{0,1}, let x^\prime_i={i00,i01,i10,i11,i0,i1} be the event that the first bit is i, and let X^\prime={x_0^\prime,x_1^\prime}. For i\in{0,1}, let y^\prime_i={0i0,0i1,1i0,1i1,0i,1i} be the event that the second bit is i, and let Y^\prime={y_0^\prime,y_1^\prime}. Let v^\prime_0={000,001,110,111,00,11} be the event that the first two bits are equal, let v^\prime_1={010,011,100,101,01,10} be the event that the first two bits are unequal, and let V^\prime={v_0^\prime,v_1^\prime}. For i\in{0,1}, let z^\prime_i={00i,01i,10i,11i} be the event that the third bit exists and is i, let z^\prime_2={00,01,10,11} be the event that there are only two bits, and let Z^\prime={z_0^\prime,z_1^\prime,z^\prime_2}. Let B={X^\prime,V^\prime,Z^\prime}. Clearly, (S,B) is a finite factored set. Let f:S\rightarrow \Omega be given by f(s)=s if s\in \Omega, f(00)=000, f(01)=011, f(10)=100, and f(11)=111, so f copies the last bit on inputs of length 2, and otherwise leaves the bit string alone. We will show that (F,f) models D. First, observe that f^{-1}(X)=X^\prime, f^{-1}(Y)=Y^\prime, f^{-1}(V)=V^\prime, and f^{-1}(Z)={{000,010,100,110,00,10},{001,011,101,111,01,11}}. It is easy to verify that h^F(X^\prime)={X^\prime}, h^F(V^\prime)={V^\prime}, h^F(Y^\prime)={X^\prime,V^\prime}, and h^F(f^{-1}(Z))=B. From this, we get that X^\prime \perp^{F} V^\prime holds, but X^\prime \perp^{F} f^{-1}(Z) and V^\prime \perp^{F} f^{-1}(Z) do not hold. Next, observe that for i\in{0,1}, X^\prime|y_i=V^\prime|y_i={{0i0,0i1,0i},{1i0,1i1,1i}}. It is easy to verify that h^F(X^\prime|y_i)=h^F(V^\prime|y_i)={X^\prime,V^\prime}. Also, observe that f^{-1}(Z)|y_0={{000,100,00,10},{001,101}}, and observe that f^{-1}(Z)|y_1={{010,110},{011,111,01,11}}. It is easy to verify that h^F(f^{-1}(Z)|y_0)=h^F(f^{-1}(Z)|y_1)={Z^\prime}. From this, we get that {X^\prime} \perp^{F} {f^{-1}(Z)}|{Y^\prime} and {V^\prime} \perp^{F} {f^{-1}(Z)}|{Y^\prime} hold, and {f^{-1}(Z)} \perp^{F} {f^{-1}(Z)}|{Y^\prime} does not hold. Thus, (F,f) models D, so D is consistent. \square **Proposition 36. **In Example 2, X<_D Y<_D Z. *Proof. *Let (F,f) be any model of \Omega that models D. Let F=(S,B). For any A\in\text{Part}(\Omega), let H_A=h^F(f^{-1}(A)). Our goal is to show that H_X is a strict subset of H_Y and that H_Y is a strict subset of H_Z. First observe that X\leq\Omega Y\vee_\Omega V, so f^{-1}(X)\leq_S f^{-1}(Y)\vee f^{-1}(V), so H_X\subseteq H_Y\cup H_V. Since X \perp_{D} Y|{{\Omega}}, H_X\cap H_V={}, so H_X\subseteq H_Y. Symmetrically, H_V\subseteq H_Y, so H_X\cup H_V\subseteq H_Y. Similarly, Y\leq_\Omega X\vee_\Omega V, so H_Y\subseteq H_X\cup H_V. Thus H_Y=H_X\cup H_V. We also know that H_X and H_V are nonempty, because X \rightleftharpoons_{D} Z|{{\Omega}} and Y \rightleftharpoons_{D} Z|{{\Omega}}. Thus H_X is a strict subset of H_Y, so X<D Y. Let C\subseteq B be arbitrary such that H_X\cap C and H_V\cap (B\setminus C) are both nonempty. Fix some b_X\in H_X\cap C and b_V\in H_V\cap (B\setminus C). Since b_X\in H_X, there must exist s_0,s_1\in S such that s_0\sim_b s_1 for all b\in B\setminus{b_X}, but not s_0\sim{f^{-1}(X)} s_1. Thus it is not the case that f(s_0)\sim_X f(s_1). Without loss of generality, assume that f(s_0)\in x_0 and f(s_1)\in x_1. Similarly, since b_V\in H_V, there must exist t_0,t_1\in S such that t_0\sim_b t_1 for all b\in B\setminus{b_V}, but not t_0\sim_{f^{-1}(V)} t_1. Again, without loss of generality, assume that f(t_0)\in v_0 and f(t_1)\in v_1. For i,j\in {0,1}, let r_{ij}=\chi^F_{H_X}(s_i,t_j). Next, observe that r_{ij}\sim_{f^{-1}(X)} s_i, so f(r_{ij})\sim_X f(s_i)\in x_i, so f(r_{ij})\in x_i. Similarly, f(r_{ij})\in v_j, so f(r_{ij})\in x_i\cap v_j. Thus, if i=j, f(r_{ij})\in y_0, and if i\neq j, f(r_{ij})\in y_1. Further, observe that \chi^F_C(r_{00},r_{11})=r_{01}, since r_{00} and r_{11} agree on all factors other than b_X and b_V. In particular, this means that \chi^F_C(f^{-1}(y_0),f^{-1}(y_0))\neq f^{-1}(y_0). Similarly, since \chi^F_C(r_{01},r_{10})=r_{00}, we have that \chi^F_C(f^{-1}(y_1),f^{-1}(y_1))\neq f^{-1}(y_1). We will use this to show that for any y\in f^{-1}(Y) and A\in \text{Part}(y), either h^F(A)\cap H_Y={}, or H_Y\subseteq h^F(A). This is because h^F(A)\vdash^F A, so \chi^F_{h^F(A)}(y,y)=y, so by the above argument, if h^F(A)\cap H_X is nonempty, then H_V\subseteq h^F(A), which since H_V is nonempty means h^F(A)\cap H_V is nonempty, so H_X\subseteq h^F(A), so H_Y\subseteq h^F(A). Symmetrically, we also have that if h^F(A)\cap H_V is nonempty, then H_Y\subseteq h^F(A). Thus, if h^F(A)\cap H_Y is nonempty, then either h^F(A)\cap H_X or h^F(A)\cap H_V is nonempty, so H_Y\subseteq h^F(A). Note that for any y\in f^{-1}(Y), two of the elements among the four r_{ij} defined above are in y, and those two elements are in different parts in f^{-1}(X), so f^{-1}(X)|y has at least two parts, so h^F(f^{-1}(X)|y) is nonempty. However, h^F(f^{-1}(X)|y)\subseteq h^F(f^{-1}(X)\vee_S f^{-1}(Y))=H_Y. Thus, h^F(f^{-1}(X)|y)\cap H_Y\neq{}, so H_Y\subseteq h^F(f^{-1}(X)|y), so h^F(f^{-1}(X)|y)=H_Y. Symmetrically, h^F(f^{-1}(V)|y)=H_Y. In particular, this means that h^F(f^{-1}(Z)|y)\cap H_Y={}, since X \perp_{D} Z | Y. Since X \rightleftharpoons_{D} Z|{{\Omega}}, there exists some b_Z\in H_X\cap H_Z. Since b_Z\in H_Z, there exist u_0,u_1\in S such that u_0\sim_b u_1 for all b\in B\setminus {b_Z}, but it is not the case that u_0\sim_{f^{-1}(Z)} u_1. Without loss of generality, assume that f(u_0)\in z_0 and f(u_1)\in z_1. Let y=[u_0]{f^{-1}(Y)}. Let b_y be an arbitrary element of H_Y. Since b_Y\in H_Y, there exist q_0,q_1\in S such that q_0\sim_b q_1 for all b\in B\setminus {b_Y}, but it is not the case that q_0\sim{f^{-1}(Y)} q_1. Without loss of generality, assume that q_0\in y and q_1\notin y. Consider p_0=\chi^F_{H_Y}(q_0,u_0)=\chi^F_{H_Y}(q_0,u_1). Since q_0\in y, p_0\in y. Since u_0 is also in y, \chi^F_{h^F(f^{-1}(Z)|y)}(p_0,u_0)\sim_{f^{-1}(Z)}p_0. However, since h^F(f^{-1}(Z)|y)\cap H_Y={}, we have \chi^F_{h^F(f^{-1}(Z)|y)}(p_0,u_0)=u_0, so u_0\sim_{f^{-1}(Z)}p_0. If u_1 were in y, we would similarly have u_1\sim_{f^{-1}(Z)}p_0, which would contradict the fact that it is not the case that u_0\sim_{f^{-1}(Z)} u_1. Thus u_1\notin y. Next, consider p_1=\chi^F_{H_Y}(q_1,u_0)=\chi^F_{H_Y}(q_1,u_1). Since q_1\notin y, p_1\notin y. Since u_1 is also not in y, \chi^F_{h^F(f^{-1}(Z)|(S\setminus y))}(p_1,u_1)\sim_{f^{-1}(Z)}p_1. However, since h^F(f^{-1}(Z)|(S\setminus y))\cap H_Y={}, we have \chi^F_{h^F(f^{-1}(Z)|(S\setminus y))}(p_1,u_1)=u_1, so u_1\sim_{f^{-1}(Z)}p_1. Thus, it is not the case that p_0\sim_{f^{-1}(Z)} p_1. However, we constructed p_0 and p_1 such that p_0\sim_b p_1 for all b\neq b_Y. Thus b_Y\in H_Z. Since b_Y was arbitrary in H_Y, we have that H_Y\subseteq H_Z. Finally, we need to show that this subset relation is strict. Since Z \rightleftharpoons_{D} Z|Y, there is some y such that h^F(f^{-1}(Z)|y)\neq{}. Let b be any element of h^F(f^{-1}(Z)|y). Since h^F(f^{-1}(Z)|y)\cap H_Y={}, b\notin H_Y. However, b\in h^F(f^{-1}(Z)|y)\subseteq h^F(f^{-1}(Z)\vee_S f^{-1}(Y))=h_Z\cup H_Y. Therefore b\in H_Z. Thus H_Y is a strict subset of H_Z, so Y<_D Z. \square In the next post, we’ll discuss applications and future research directions.

Comment

https://www.lesswrong.com/posts/hePucCfKyiRHECz3e/finite-factored-sets-inferring-time?commentId=notZyReMXb9CjzkF7

For Example 2 /​ Prop 35, would this model also work? Define W to be the factor corresponding to the question "are the second and third bits equal or not?" Then ((\Omega, {X, V, W }), \text{id}) is a model of \Omega. I believe this is consistent with D: For O={(X,V,{\Omega}),(X,Z,Y),(V,Z,Y)}: We have h^F(X) = { X } and h^F(V) = { V } for the first condition. We have h^F(X \mid Y) = { X } and h^F(Z \mid Y) = { W } for the second condition. We have h^F(V \mid Y) = { V } and h^F(Z \mid Y) = { W } for the third condition. For* N={(X,Z,{\Omega}),(V,Z,{\Omega}),(Z,Z,Y)}.* We have h^F(Z) = { X, V, W } for the first and second conditions. We have h^F(Z \mid Y) = { W } for the third condition.

Comment

https://www.lesswrong.com/posts/hePucCfKyiRHECz3e/finite-factored-sets-inferring-time?commentId=NXEyEABFiMc57dtd9

I think that works, I didn’t look very hard. Yore histories of X given Y and V given Y are wrong, but it doesn’t change the conclusion.

Comment

Yeah, both of those should be { X, V }, if I’m not mistaken (a second time).

Comment

Yeah, also note that the history of X given Y is not actually a well defined concept. There is only the history of X given y for y\in Y. You could define it to be the union of all of those, but that would not actually be used in the definition of orthogonality. In this case h^F(X|y), h^F(V|y), and h^F(Z|y) are all independent of choice of y\in Y, but in general, you should be careful about that.

Comment

Yeah, fair point. (I did get this right in the summary; turns out if you try to explain things from first principles it becomes blindingly obvious what you should and shouldn’t be doing.)