Analyzing Multiplayer Games using IMPACT

Contents

Overview
Motivating Examples—Understanding Multiplayer Games
Counting heads
Illusory Impact
Coordinated Impact
Defining IMPACT using Bayesian Networks
Definition—General Bayesian Networks
Application—Representing a Multiplayer Game as a Bayesian Network
Game Theory in terms of IMPACT
POWER- and IMPACT-scarcity
IEU in terms of IMPACT
Considering other players / IMPACT-trading

This post is an informal writeup of an idea discovered while investigating POWER. It offers an additional tool for framing multi-agent games in terms of attainable utility, with the hope that data from both IMPACT and POWER can offer a more complete view of POWER-scarcity dynamics in multi-agent games.

Note on writing style: I use the royal "we" when explaining math; the results are essentially my own unreviewed work.

Overview

The purpose of this post is to define and give intuition for a measure of a player’s impact on some real-valued outcome variable in an arbitrary multiplayer game; we call this measure "IMPACT". We present motivating examples of multiplayer games, then define IMPACT in the more general context of arbitrary Bayesian networks. We then explore some basic connections between IMPACT and standard multiplayer game theory and present some conjectures to motivate further research.

Motivating Examples—Understanding Multiplayer Games

One of the difficulties in understanding multiplayer games is that, in general, a player’s optimal action depends on the actions of other players. One way around this problem is to restrict consideration to Nash Equilibria, which fully specify optimal actions for us. However, this loses the power to describe sub-optimal actions, which are relevant in real-world examples involving imperfect humans and intractably large action spaces.

While Nash Equilibria "work" by first conditioning on optimal actions and then considering probabilistic strategies, we go the other way around: fix some *arbitrary* mixed strategy profile, then analyze expected utility. In the single-player setting, this is analogous to a value function and can be constructed straightforwardly. In the multiplayer setting, we’re forced to deal with dependencies on other players’ strategies that complicate the notion of "expected utility". To handle this dependency, our framework makes the following assumptions motivated by Bayesian games:

Each player’s posterior distribution over other players’ actions is optimal given that player’s information ("common prior assumption").
We consider interim expected utility (IEU), which takes an expectation across the player’s posterior beliefs of other players’ actions.

Borrowing notation from Bayesian games, we now have enough machinery to consider various games in terms of IEU u_i: A_i \to \mathbb{R}. In each example, we fix a strategy profile \mathbf{a} \sim \sigma and assume some common payoff R (which we’ll later describe as the "outcome variable" in the formal definition of IMPACT).

Counting heads

Let a_i \sim \text{Ber}(\frac 1 2), R = \sum_{i = 1}^n a_i. This game can be thought of as follows: "everyone flips a coin, with reward given by the number of heads". Intuitively, the strategy for this game should be simple: flipping heads is better than flipping tails. Calculating IEU, we see that the results match our prediction:

u_i(1) = 1 + \mathbb{E}_{a_{-i}}\left[\sum_{j \neq i} a_j\right] = 1 + \sum_{j \neq i} \frac 1 2 = \boxed{\frac {n - 1} 2 + 1}

u_i(0) = \mathbb{E}_{a_{-i}}\left[\sum_{j \neq i} a_j\right] = \sum_{j \neq i} \frac 1 2 = \boxed{\frac {n - 1} 2}

In fact, the difference in IEU is exactly 1, contributed by the added heads coming from coin i.
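The two IEU values can be checked by brute-force enumeration. The following is a minimal sketch, not part of the original analysis; the function name `ieu_counting_heads` is a hypothetical helper, and exact fractions are used so the result matches the closed form without rounding.

```python
from fractions import Fraction
from itertools import product


def ieu_counting_heads(n, a_i):
    """Exact IEU of one player who plays a_i while the other n - 1 coins are fair.

    Hypothetical helper for illustration: averages the reward (number of heads)
    over every equally likely profile of the other players' coins.
    """
    others = n - 1
    weight = Fraction(1, 2 ** others)  # probability of each profile of the others
    return sum(weight * (a_i + sum(rest)) for rest in product((0, 1), repeat=others))


n = 4
heads, tails = ieu_counting_heads(n, 1), ieu_counting_heads(n, 0)
print(heads, tails, heads - tails)
```

For any n the difference between the two values should be exactly 1, in line with the boxed expressions.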

Illusory Impact

Let a_i \sim \text{Ber}(\frac 1 2), R = \oplus_{i = 1}^n a_i. This game can be thought of as follows: "everyone flips a coin; you win iff there are an odd number of heads". This game has nice correlated equilibria: if players can coordinate and determine their coin’s side, then they can easily win. However, we assume each player just blindly flips their coin, regardless of reward. Even though player i’s strategy is random, we can compute IEU for each choice of action:

u_i(a_i) = \mathbb{P}\left[\oplus_{j = 1}^n a_j = 1\right] = \mathbb{P}\left[\oplus_{j \neq i} a_j \neq a_i\right] = \frac 1 2

Importantly, IEU is constant over choice of a_i, which means that player i has no "good" strategy (even assuming they can choose what side their coin lands on).
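The same kind of enumeration confirms that the parity game gives each action the same IEU. This is an illustrative sketch only; `ieu_parity` is a hypothetical name, and the check assumes at least two players (with a single player the parity is just that player's own coin).

```python
from fractions import Fraction
from functools import reduce
from itertools import product
from operator import xor


def ieu_parity(n, a_i):
    """Exact IEU of playing a_i in the parity game with n - 1 fair opposing coins."""
    others = n - 1
    weight = Fraction(1, 2 ** others)
    # Reward is the XOR of all coins, i.e. 1 iff the number of heads is odd.
    return sum(weight * reduce(xor, rest, a_i) for rest in product((0, 1), repeat=others))


print(ieu_parity(3, 0), ieu_parity(3, 1))
```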

Coordinated Impact

Let \omega \sim \text{Ber}(\frac 1 2). Now, let a_i = \omega for every player i, and let R = \omega. This game can be thought of as follows: "a referee flips a coin, everyone shouts the result of the coin flip, and everyone wins iff it’s heads". Intuitively, this is a ridiculous game: no player produces a meaningful action; the entire game is determined by the referee’s coin flip. However, if we shut our eyes and blindly compute IEU, we find that u_i(1) = 1, u_i(0) = 0, thus player i benefits from shouting "heads"?!

Well, in a sense, they do. The universes where player i shouts "heads" are exactly the universes in which everyone wins. The problem is that of agency: player i doesn’t choose their action, the coin (\omega) does. If we condition on the value of \omega, then each player’s action becomes deterministic, thus IEU is constant across each player’s (trivial) action space.

Interestingly, we have a clear notion of the "IEU" of values of \omega, even though it’s an external variable rather than an action in the game. This suggests a limitation of conceptualizing IMPACT strictly in terms of games: variables have impact, not just actions. In the Bayesian network formalization to come, we’ll see that the node \omega impacts the outcome variable, while no nodes a_i do.
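The contrast between "blindly" conditioning on the action and first conditioning on \omega can be made concrete with a two-outcome joint distribution. The sketch below is an illustration under the game's stated rules (a_i = \omega, R = \omega); the helper names are hypothetical.

```python
from fractions import Fraction

# Joint distribution over (omega, a_i, R); by construction a_i = omega and R = omega.
JOINT = {(w, w, w): Fraction(1, 2) for w in (0, 1)}


def expected_reward(event):
    """E[R | event], where event is a predicate on (omega, a_i, R) outcomes."""
    mass = sum(p for outcome, p in JOINT.items() if event(outcome))
    return sum(p * outcome[2] for outcome, p in JOINT.items() if event(outcome)) / mass


def naive_ieu(a):
    """Condition only on the shouted action, ignoring where it came from."""
    return expected_reward(lambda o: o[1] == a)


def action_effect_given_omega(w):
    """E[R | omega, a_i] - E[R | omega]: what the action adds once omega is known."""
    both = expected_reward(lambda o: o[0] == w and o[1] == w)
    omega_only = expected_reward(lambda o: o[0] == w)
    return both - omega_only


print(naive_ieu(1), naive_ieu(0))                                  # the "blind" IEU values
print(action_effect_given_omega(0), action_effect_given_omega(1))  # both vanish
```

Once \omega is held fixed, the action adds nothing to the expected reward, which is the sense in which the apparent benefit of shouting "heads" belongs to the coin rather than the player.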

Defining IMPACT using Bayesian Networks

As suggested in the "Coordinated impact" example, the most principled approach is to define IMPACT as a property of dependent variables, then consider game theory as a useful application. Again motivated by the "Coordinated impact" example, we can show that IMPACT must explicitly consider variable dependencies to avoid issues with double-counting. We formalize with Bayesian networks to provide the desired dependency structure.

Definition—General Bayesian Networks

Borrowing notation from Wikipedia, consider an arbitrary Bayesian network G = (V, E) with variables \{X_v\}_{v \in V}. Additionally, choose an "outcome node" v_O \in V (we can assume that v_O is a descendant of each v \in V, but the assumption isn’t required); we use X_{v_O} as our outcome variable to measure IMPACT against. We now define some notation:

Given arbitrary R.V.s A, B, we define the *conditional expectation* of A given B = b as e(A, B) := \mathbb{E}_{B = b}[A]. Note that e(A, B) is itself a random variable in the value of B.
Given R.V.s A, B, we call B a marginal variable of A if the R.V. identity B = e(A, B) holds. Intuitively, we can think of B as an estimate of A given limited information.
Consider nodes v_1, v_2 \in V. We say v_1 is an ancestor of v_2 (equivalently, v_2 is a descendant of v_1) iff v_1 \neq v_2 and there exists a directed path v_1 \to v_2. This relationship is direct iff such a path consists of a single edge.
Let A(v) be the set of ancestors of node v \in V. Let A_d(v) be the set of direct ancestors of node v \in V.
Given node v \in V, define the IMPACT of X_v on X_{v_O} to be the following R.V.:

I(v) := e(X_{v_O}, \{X_u \mid u \in A_d(v) \lor u = v\}) - e(X_{v_O}, \{X_u \mid u \in A_d(v)\})

We now work toward a notion of IMPACT-scarcity: the idea that the "magnitude" of IMPACT of each node is bounded above. We will eventually demonstrate this claim in terms of the sum of variances of I(v). First, we prove some necessary lemmas.

Lemma 1: Given an arbitrary topological ordering V = \{v_i\}_{i = 1}^n, we can construct the following collection of R.V.s:

\Delta_i := e(X_{v_O}, \{X_{v_j} \mid j \leq i\}) - e(X_{v_O}, \{X_{v_j} \mid j < i\})

We now claim the following identity on R.V.s:

\sum_{i = 1}^n \Delta_i \equiv X_{v_O} - e(X_{v_O}, \emptyset)

Proof: The identity follows from a telescoping-sum argument, together with the observation that for v_i = v_O, we have e(X_{v_O}, \{X_{v_j} \mid j \leq i\}) \equiv X_{v_O}.

Lemma 2: Consider R.V.s A, B s.t. B is a marginal variable of A. Then \text{Var}(A) \geq \text{Var}(B).

Proof: Consider an arbitrary vector space of R.V.s containing A, B. The map f: v \mapsto e(v, B) is an orthogonal projection, while g: v \mapsto \text{Var}(v) is the squared norm of the centered variable under the inner product \text{Cov}(\cdot, \cdot). Thus, the claim is equivalent to g(v) \geq g(f(v)), which holds because orthogonal projections never increase norm. Equivalently, by the law of total variance, \text{Var}(A) = \mathbb{E}[\text{Var}(A \mid B)] + \text{Var}(e(A, B)) \geq \text{Var}(e(A, B)) = \text{Var}(B).

Lemma 3: Consider arbitrary R.V.s A, B, C. If e(A, B) \equiv A (if B fully determines A), then e(C, A) is a marginal variable of e(C, B).

Proof: We observe the following:

e(C, A) = \mathbb{E}_{B \mid A}[e(C, \{A, B\})] = \mathbb{E}_{B \mid A}[e(C, B)] = e(e(C, B), A)

Now, consider the quantity e(e(C, B), e(C, A)). First, we see that A fully determines e(C, A). Thus, viewing e(X, Y) as a least-squares estimate of X given Y, we find that the estimate e(e(C, B), e(C, A)) is at most as accurate as e(e(C, B), A) = e(C, A). However, e(e(C, B), e(C, A)) "knows" e(C, A) by the definition of e, thus the optimal estimate is

e(e(C, B), e(C, A)) = e(C, A)

The result follows by the definition of a marginal variable.

Note: We could also prove the result by expressing "A is a marginal variable of B" as "some vector space projection maps B to A" (equivalently, A = e(B, C) for some C); the result then follows from e(C, A) = e(e(C, B), A).

Lemma 4: Consider arbitrary 1 \leq i < j \leq n. Then the R.V. e(\Delta_j, \Delta_i) = 0.

Proof: By "pausing" evaluation of the Bayesian network before X_{v_j} is determined, we can argue that the R.V. e(\Delta_j, \{X_{v_k} \mid k < j\}) \equiv 0. Since \Delta_i is fully determined by \{X_{v_k} \mid k < j\}, we conclude by Lemma 3 that e(\Delta_j, \Delta_i) is a marginal variable of e(\Delta_j, \{X_{v_k} \mid k < j\}). By Lemma 2 (and noting that all vector space projections map 0 to 0), we conclude e(\Delta_j, \Delta_i) \equiv 0.

Lemma 5: For each 1 \leq i \leq n, I(v_i) is a marginal variable of \Delta_i.

Proof: We invoke Lemma 3, choosing A = \{X_u \mid u \in A(v_i) \lor u = v_i\} and B = \{X_{v_j} \mid j \leq i\}.

We can now proceed with our claim of IMPACT-scarcity.

Theorem 1 (IMPACT-scarcity):

\sum_{i = 1}^n \text{Var}(I(v_i)) \leq \text{Var}(X_{v_O})

Proof: By Lemma 1, we have the following R.V. identity:

\sum_{i = 1}^n \Delta_i \equiv X_{v_O} - e(X_{v_O}, \emptyset)

We now compute the variance of both sides. By Lemma 4, the \text{Cov}(\Delta_i, \Delta_j) terms are 0 for i \neq j, and e(X_{v_O}, \emptyset) is a constant. Thus, we’re left with

\sum_{i = 1}^n \text{Var}(\Delta_i) = \text{Var}(X_{v_O})

We finish by applying Lemmas 5 and 2, which prove \text{Var}(I(v_i)) \leq \text{Var}(\Delta_i).

Note: The only inequality in the above proof is \text{Var}(I(v_i)) \leq \text{Var}(\Delta_i). Thus, equality is achieved when this inequality is an equality for each 1 \leq i \leq n (for example, in chain-shaped Bayesian networks).
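To make the definition of I(v) and the IMPACT-scarcity bound concrete, here is a minimal sketch that evaluates both by exact enumeration on the three motivating games. It is not the author's implementation: the joint distribution is represented as a list of (assignment, probability) pairs, `parents` plays the role of A_d(v), and every function and node name (`cond_exp`, `scarcity`, `"w"` for \omega, `"O"` for the outcome) is a hypothetical choice made for illustration.

```python
from fractions import Fraction
from itertools import product


def cond_exp(joint, outcome, given):
    """Return f with f(assignment) = E[X_outcome | X_g = assignment[g] for g in given]."""
    num, den = {}, {}
    for assign, p in joint:
        key = tuple(assign[g] for g in given)
        num[key] = num.get(key, 0) + p * assign[outcome]
        den[key] = den.get(key, 0) + p

    def f(assign):
        key = tuple(assign[g] for g in given)
        return num[key] / den[key]

    return f


def variance(joint, f):
    mean = sum(p * f(a) for a, p in joint)
    return sum(p * (f(a) - mean) ** 2 for a, p in joint)


def impact_variance(joint, parents, v, outcome):
    """Var(I(v)), with I(v) = e(X_O, parents(v) + {v}) - e(X_O, parents(v))."""
    with_v = cond_exp(joint, outcome, parents[v] + [v])
    without_v = cond_exp(joint, outcome, parents[v])
    return variance(joint, lambda a: with_v(a) - without_v(a))


def scarcity(joint, parents, outcome="O"):
    """Return (sum over nodes of Var(I(v)), Var(X_O))."""
    total = sum(impact_variance(joint, parents, v, outcome) for v in parents)
    return total, variance(joint, lambda a: a[outcome])


def coin_game(n, reward):
    """n independent fair coins a0..a{n-1}; the outcome O is reward(flips)."""
    names = [f"a{i}" for i in range(n)]
    joint = []
    for flips in product((0, 1), repeat=n):
        assign = dict(zip(names, flips))
        assign["O"] = reward(flips)
        joint.append((assign, Fraction(1, 2 ** n)))
    parents = {name: [] for name in names}
    parents["O"] = names
    return joint, parents


def coordinated_game(n):
    """A referee coin w; every action copies w, and the outcome O equals w."""
    names = [f"a{i}" for i in range(n)]
    joint = []
    for w in (0, 1):
        assign = {"w": w, "O": w, **{name: w for name in names}}
        joint.append((assign, Fraction(1, 2)))
    parents = {"w": [], **{name: ["w"] for name in names}, "O": ["w"] + names}
    return joint, parents


print(scarcity(*coin_game(3, sum)))                    # counting heads
print(scarcity(*coin_game(3, lambda f: sum(f) % 2)))   # parity ("illusory impact")
print(scarcity(*coordinated_game(3)))                  # coordinated impact
```

In each printed pair the first entry should not exceed the second. In this sketch, counting heads and the coordinated game attain the bound, while the parity game leaves all of the outcome's variance unattributed to any single node.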

Application—Representing a Multiplayer Game as a Bayesian Network

As promised, we now apply our framework for IMPACT to the game theory framework from earlier. To begin, consider an arbitrary multiplayer (Bayesian) game with fixed strategy profile \sigma. We represent the mechanics of the game and the players’ strategies as a Bayesian network whose nodes are \omega, each player’s type t_i, each player’s action a_i, and an outcome node O.

Note: In an abuse of notation, we let a_i refer both to the R.V. representing player i’s action and to the node a_i in the Bayesian network (thus, X_{a_i} (Bayesian network variable) \sim a_i (action)). Thus, statements like I(a_i) can be parsed as "the impact of player i’s action a_i" without the need for cumbersome notation (and similarly for other nodes in the Bayesian network).

For now, we define an arbitrary outcome node O as a direct descendant of every other node. We will later set X_O to represent game-theoretically meaningful quantities; in particular, player i’s reward R_i.

Additionally, call a node v deterministic if X_v is fully determined by \{X_u \mid u \in A_d(v)\} (equivalently, if e(X_v, \{X_u \mid u \in A_d(v)\}) = X_v). Observe that for any deterministic node v, we have I(v) \equiv 0 by the definition of IMPACT. This has two important implications for our model:

We see that the t_i are deterministic functions of \omega and the outcome O is a deterministic function of all variables in the Bayesian network. Thus, the only non-deterministic variables are \omega and a_i, which by the above must contribute all IMPACT.
Since O contributes zero IMPACT (because it’s deterministic), its dependencies A_d(O) won’t matter for our analysis. Thus, we can safely let O depend on the entire Bayesian network, despite the fact that for certain relevant cases, O depends only on certain variables (example: O = R_i).

We’re left with the IMPACT terms from \omega and each a_i, which we interpret in game-theoretic language:

The IMPACT I(\omega) can be thought of as "how good a random draw is the chosen value of \omega?" This doesn’t translate precisely (as far as I know), but intuition can be gained from viewing the coordinated impact example as a function \omega \to O.
The IMPACT I(a_i) "looks like" player i’s IEU under the reward function given by X_O. More specifically: letting O = R_i, we have

u_i(a_i) \equiv e(R_i, \{t_i, a_i\}) \equiv e(R_i, \{t_i\}) + I(a_i)

The residual term e(X_O, \{t_i\}) is best understood as analogous to I(\omega), but considering t_i as the fundamental random quantity (instead of \omega, its source of randomness). Since player i only acts based on t_i, this corresponds to player i’s logic of "how good a random draw do I think \omega is, given only knowledge of t_i?"
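As a sanity check, the decomposition can be worked through for the "Counting heads" example. This worked example assumes each player's type t_i is constant (so conditioning on t_i is vacuous) and takes O = R:

```latex
\begin{align*}
e(R, \{t_i\}) &= \mathbb{E}[R] = \frac{n}{2}, \\
I(a_i) &= e(R, \{t_i, a_i\}) - e(R, \{t_i\})
        = \left(a_i + \frac{n - 1}{2}\right) - \frac{n}{2}
        = a_i - \frac{1}{2}, \\
u_i(a_i) &= e(R, \{t_i\}) + I(a_i) = \frac{n - 1}{2} + a_i .
\end{align*}
```

Setting a_i = 1 and a_i = 0 recovers the two IEU values computed in that example, and \text{Var}(I(a_i)) = \frac 1 4 for each player.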

Game Theory in terms of IMPACT

While research on game-theoretic results from the perspective of IMPACT is extremely limited (I only know of the preliminary work I’ve already done), the best litmus test for a proposed framework is to see if it readily produces meaningful results. In this section, I’ll outline the immediate results from defining IMPACT in the setting of multiplayer games and suggest some avenues for further exploration.

POWER- and IMPACT-scarcity

The crux of these results is the fundamental notion of IMPACT-scarcity and the connection between I(a_i) and player i’s IEU. We begin by stating our IMPACT-scarcity result in terms of outcome variable O. Since the deterministic nodes contribute zero IMPACT, only \omega and the actions a_i remain:

\sum_{v \in V} \text{Var}(I(v)) = \text{Var}(I(\omega)) + \sum_{i = 1}^n \text{Var}(I(a_i)) \leq \text{Var}(O)

One natural vein of results comes from plugging in variables for O and seeing what comes out. We give some basic examples:

Letting O be constant, we find \text{Var}(I(a_i)) = 0, so I(a_i) is constant. This makes sense: you can impact a constant variable, but you can’t do anything to change its value.
Letting O = R_i, we find a competitive dynamic between player i’s interests and "noise" generated by other players’ actions. This can be understood by arguing that as \text{Var}(I(a_i)) increases, player i’s optimal strategy becomes increasingly robust to other players’ choices of action.

As mentioned in the intro, one goal of IMPACT research is to unify the ideas of POWER- and IMPACT-scarcity. I suspect that the intuitive understanding is "IMPACT = change in POWER", motivated by the simplification of "POWER = \mathbb{E}, IMPACT = \text{Var}". While the notion remains far from precise, I conjecture that IMPACT on a player’s POWER is a marginal variable of IMPACT on that player’s reward, from which an upper bound on "\Delta POWER" follows.

IEU in terms of IMPACT

We now start from our other main result: u_i(a_i) = e(R_i, t_i) + I(a_i). Since we don’t assume any information about IEU by default, a natural starting point is the case of a Nash Equilibrium. By definition of (Bayesian) Nash Equilibrium, each player’s strategy must be a best response. Thus, for each a_i in the support of \sigma_i(t_i), we have u_i(a_i) = \max_{a_i'} u_i(a_i') = M. Since e(R_i, t_i) is the average of u_i(a_i) over that support, this implies e(R_i, t_i) = M and I(a_i) = 0, which can be understood by arguing that in a Nash Equilibrium, no player can unilaterally increase their expected reward.

Sensing a deeper connection between IMPACT and Nash Equilibria, we define the self-IMPACT of action a_i to be I(a_i) given O = R_i. Above, we showed that in a Nash Equilibrium, each player’s actions have zero self-IMPACT. Generalizing to suboptimal actions, we find that all actions have self-IMPACT \leq 0, with equality when the action is a best response.

Unfortunately, the converse doesn’t hold: each player only taking zero self-IMPACT actions doesn’t imply a Nash Equilibrium. It implies a Nash Equilibrium of the game where the action spaces A_i are restricted to only actions played with nonzero probability, but these notions aren’t equivalent if strong actions remain unplayed (consider the mixed-strategy equilibrium for rock-paper-scissors when generalized to rock-paper-scissors-[insta-win action]).

We can also explore the fact that in a Nash Equilibrium, POWER equals IEU. Thus, we can write \text{POWER}(i, \sigma) = e(R_i, t_i), which is strictly a function of \omega. Intuitively, IMPACT accounts for variation in a_i while POWER takes a max over it, thus they become equivalent in limiting cases for \sigma_i.
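The rock-paper-scissors remark can be illustrated numerically. The sketch below is an assumption-laden illustration rather than part of the original argument: it assumes constant types, a hypothetical payoff convention (win = 1, loss = -1, tie = 0), and a hypothetical insta-win action labeled "W" that beats rock, paper, and scissors. Both players mix uniformly over R, P, S and never play W.

```python
from fractions import Fraction

# Ordered pairs (winner, loser). "W" is the hypothetical insta-win action.
BEATS = {("R", "S"), ("S", "P"), ("P", "R"), ("W", "R"), ("W", "P"), ("W", "S")}


def payoff(a, b):
    """Assumed payoff to the player choosing a against b: win 1, loss -1, tie 0."""
    if (a, b) in BEATS:
        return 1
    if (b, a) in BEATS:
        return -1
    return 0


def ieu(a, opponent):
    """Expected payoff of action a against the opponent's mixed strategy."""
    return sum(p * payoff(a, b) for b, p in opponent.items())


def gaps_from_baseline(strategy, opponent, actions):
    """u_i(a) - e(R_i) for each action, where e(R_i) averages IEU over the played strategy."""
    baseline = sum(p * ieu(a, opponent) for a, p in strategy.items())
    return {a: ieu(a, opponent) - baseline for a in actions}


uniform = {a: Fraction(1, 3) for a in "RPS"}
print(gaps_from_baseline(uniform, uniform, "RPSW"))
```

Every played action has a gap of zero, matching zero self-IMPACT, yet the unplayed action W has a strictly positive gap, so the profile is an equilibrium only of the game restricted to R, P, S.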

Considering other players / IMPACT-trading

As a final and unexplored angle on IMPACT, consider the case where player 1 impacts R_2 and player 2 impacts R_1. Assuming it wouldn’t adversely affect their own utilities, the players can "trade" by modifying their strategies to mutually grant each other increased reward. The premise itself immediately raises red flags; I’ll attempt to briefly address them:

"this requires communication between players!"—yep. Barring non-causal decision theory, agents need some way to coordinate strategies.
"what if the trade accidentally hurts one player?"—the simplest answer is "they only trade if it’s mutually beneficial", but that’s equivalent to existing solutions to coordination problems like the Prisoners’ dilemma. Ideally, a notion of "reward exchange rate" could be computed using IMPACT, especially if allowing for generalizations of reward like quasilinear utility.
"IMPACT is essentially a measure of variance, while utility is a measure of expectation. How do you convert between them?"—I don’t know, but have ideas:
Trade off values of independent "sub-variables" of your action space. Example: if we’re playing 10 simultaneous Prisoners’ Dilemma-s, then trade off "I cooperate if you do" for each individual PD instance.
Find some linear measure of IMPACT and trade with that instead. This looks much more like POWER-trading, which offers a similar mechanism for mutually increased reward.

Temporarily putting the issues aside, I intend to explore IMPACT-trading as an attempt to understand coordination-centric games like the Prisoners’ Dilemma. More generally, I hope to apply the IMPACT framework to a broad range of multiplayer game-theoretic phenomena and see if new insight can be gained.

id

3vbAPo6Z9aitNXY28
authors

TurnTrout
score

3
omega_karma
votes

2
date_published

2021-04-08T23:04

https://www.lesswrong.com/posts/ypySHD723fMu9YywE/analyzing-multiplayer-games-using-impact?commentId=3vbAPo6Z9aitNXY28

(midco developed this separately from our project last term, so this is actually my first read) I have a lot of small questions.

What is your formal definition of the IEU u_i? What kinds of goals is it conditioning on (because IEU is what you compute after you view your type in a Bayesian game)?

Multi-agent "impact" seems like it should deal with the Shapley value. Do you have opinions on how this should fit in?

You note that your formalism has some EDT-like properties with respect to impact:

Well, in a sense, they do. The universes where player i shouts "heads" are exactly the universes in which everyone wins. The problem is that of agency: player i doesn’t choose their action, the coin (\omega) does. If we condition on the value of \omega, then each player’s action becomes deterministic, thus IEU is constant across each player’s (trivial) action space.

This seems weird and not entailed by the definition of IEU, so I’m pretty surprised that IEU would tell you to shout ‘heads.’

Given arbitrary R.V.s A, B, we define the estimate of A given B=b as e(A, B):=\mathbb{E}_{B=b}[A]

Is this supposed to be e(A,B=b)? If so, this is more traditionally called the conditional expectation of A given B=b.

id

ZaModHE4AAtyQGgFb
authors

midco
score

1
omega_karma
votes

1
date_published

2021-04-10T01:17

https://www.lesswrong.com/posts/ypySHD723fMu9YywE/analyzing-multiplayer-games-using-impact?commentId=ZaModHE4AAtyQGgFb

Answering questions one-by-one:

I played fast and loose with IEU in the intro section. I think it can be consistently defined in the Bayesian game sense of "expected utility given your type", where the games in the intro section are interpreted as each player having constant type. In the Bayesian Network section, this is explicitly the definition (in particular, player i’s IEU varies as a function of their type).
Upon reading the Wiki page, it seems like Shapley value and Impact share a lot of common properties? I’m not sure of any exact relationship, but I’ll look into connections in the future.
I think what’s going on is that the "causal order" of \omega and a_i is switched, which makes a_i "look as though" it controls the value of \omega. In terms of game theory the distinction is (I think) definitional; I include it because Impact has to explicitly consider this dynamic.
In retrospect: yep, that’s conditional expectation! My fault for the unnecessary notation. I introduced it to capture the idea of a vector space projection on random variables and didn’t see the connection to pre-existing notation.

id

FgasXRjRAfbg93Pqt
authors

midco
score

1
omega_karma
votes

1
date_published

2021-05-31T13:50

https://www.lesswrong.com/posts/ypySHD723fMu9YywE/analyzing-multiplayer-games-using-impact?commentId=FgasXRjRAfbg93Pqt

Upon reflection, I now suspect that the Impact I(a_i) is analogous to Shapley Value. In particular, the post could be reformulated using Shapley values and would attain similar results. I’m not sure whether Impact-scarcity of Shapley values holds, but the examples from the post suggest that it does. (thanks to TurnTrout for pointing this out!)
