Analyzing Multiplayer Games using IMPACT

https://www.lesswrong.com/posts/ypySHD723fMu9YywE/analyzing-multiplayer-games-using-impact

Contents

Overview

The purpose of this post is to define and give intuition for a measure of a player’s impact on some real-valued outcome variable in an arbitrary multiplayer game; we call this measure "IMPACT". We present motivating examples of multiplayer games, then define IMPACT in the more general context of arbitrary Bayesian networks. We then explore some basic connections between IMPACT and standard multiplayer game theory and present some conjectures to motivate further research.

Motivating Examples—Understanding Multiplayer Games

One of the difficulties in understanding multiplayer games is the principle that in general, a player’s optimal action is dependent on the actions of other players. One way around this problem is to restrict consideration to Nash Equilibria, which fully specify optimal actions for us. However, this loses the power to describe sub-optimal actions, which are relevant in real-world examples involving imperfect humans and intractably large action spaces. While Nash Equilibria "work" by first conditioning on optimal actions and then considering probabilistic strategies, we go the other way around: fix some *arbitrary *mixed strategy profile, then analyze expected utility. In the single-player setting, this is analogous to a value function and can be constructed straightforwardly. In the multiplayer setting, we’re forced to deal with dependencies on other players’ strategies that complicate the notion of "expected utility". To handle this dependency, our framework makes the following assumptions motivated by Bayesian games:

Counting heads

Let a_i \sim \text{Ber}(\frac 1 2), R = \sum_{i = 1}^n a_i. This game can be thought of as follows: "everyone flips a coin, with reward given by the number of heads". Intuitively, the strategy for this game should be simple: flipping heads is better than flipping tails. Calculating IEU, we see that the results match our prediction: u_i(1) = 1 + \mathbb{E}{a{-i}}[\sum_{j \neq i} a_i] = 1 + \sum_{j \neq i} (\frac 1 2) = \boxed{\frac {n - 1} 2 + 1}u_i(0) = \mathbb{E}{a{-i}}[\sum_{j \neq i} a_i] = \sum_{j \neq i} (\frac 1 2) = \boxed{\frac {n - 1} 2}In fact, the difference in IEU is exactly 1, contributed by the added heads coming from coin i.

Illusory Impact

Let a_i \sim \text{Ber}(\frac 1 2), R = \oplus_{i = 1}^n a_i. This game can be thought of as follows: "everyone flips a coin; you win iff there are an odd number of heads". This game has nice correlated equilibria: if players can coordinate and determine their coin’s side, then they can easily win. However, we assume each player just blindly flips their coin, regardless of reward. Even though player i’s strategy is random, we can compute IEU for each choice of action: u_i(a_i) = \mathbb{P}[\oplus_{j = 1}^n a_j = 1] = \mathbb{P}[\oplus_{j \neq i} a_j \neq a_i] = \frac 1 2Importantly, IEU is constant over choice of a_i, which means that player i has no "good" strategy (even assuming they can choose what side their coin lands on).

Coordinated Impact

Let \omega \sim \text{Ber}(\frac 1 2). Now, let a_i, R = \omega. This game can be thought of as follows: "a referee flips a coin, everyone shouts the result of the coin flip, and everyone wins iff it’s heads". Intuitively, this is a ridiculous game: no player produces a meaningful action; the entire game is determined by the referee’s coin flip. However, if we shut our eyes and blindly compute IEU, we find that u_i(1) = 1, u_i(0) = 0, thus player i benefits from shouting "heads"?! Well, in a sense, they do. The universes where player i shouts "heads" are exactly the universes in which everyone wins. The problem is that of agency: player i doesn’t choose their action, the coin (\omega) does. If we condition on the value of \omega, then each player’s action becomes deterministic, thus IEU is constant across each player’s (trivial) action space. Interestingly, we have a clear notion of the "IEU" of values of \omega, even though it’s an external variable rather than an action in the game. This suggests a limitation of conceptualizing IMPACT strictly in terms of games: variables have impact, not just actions. In the Bayesian network formalization to come, we’ll see that the node \omega impacts the outcome variable, while no nodes a_i do.

Defining IMPACT using Bayesian Networks

As suggested in the "Coordinated impact" example, the most principled approach is to define IMPACT as a property of dependent variables, then consider game theory as a useful application. Again motivated by the "Coordinated impact" example, we can show that IMPACT must explicitly consider variable dependencies to avoid issues with double-counting. We formalize with Bayesian networks to provide the desired dependency structure.

Definition—General Bayesian Networks

Borrowing notation from Wikipedia, consider an arbitrary Bayesian network G = (V, E) with variables {X_v}{v \in V}. Additionally, choose an "outcome node" v_O \in V (we can assume that v_O is a descendant of each v \in V, but the assumption isn’t required); we use X{v_O} as our outcome variable to measure IMPACT against. We now define some notation:

Application—Representing a Multiplayer Game as a Bayesian Network

As promised, we now apply our framework for IMPACT to the game theory framework from earlier. To begin, consider an arbitrary multiplayer (Bayesian) game with fixed strategy profile \sigma . We represent the mechanics of the game and the players’ strategies as a Bayesian network: Note: In an abuse of notation, we let a_i refer both to the R.V. representing player i‘s action and to the node a_i in the Bayesian network (thus, X_{a_i} (Bayesian network variable) \sim a_i (action)). Thus, statements like I(a_i) can be parsed as "the impact of player i’s action a_i" without the need for cumbersome notation (and similarly for other nodes in the Bayesian network). For now, we define an arbitrary outcome node O as a direct descendant of every other node. We will later set X_O to represent game-theoretically meaningful quantities; in particular player i’s reward R_i. Additionally, call a node v deterministic if X_v is fully determined by {X_u \mid u \in A_d(v)} (equivalently, if e(X_v, {X_u \mid u \in A_d(v)}) = X_v). Observe that for any deterministic node v, we have I(v) \equiv 0 by the definition of IMPACT. This has two important implications for our model:

Game Theory in terms of IMPACT

While research on game-theoretic results from the perspective of IMPACT is extremely limited (I only know of the preliminary work I’ve already done), the best litmus test for a proposed framework is to see if it readily produces meaningful results. In this section, I’ll outline the immediate results from defining Impact in the setting of multiplayer games and suggest some avenues for further exploration.

POWER- and IMPACT-scarcity

The crux of these results is the fundamental notion of IMPACT-scarcity and connection between I(a_i) and player i’s IEU. We begin with stating our IMPACT-scarcity result in terms of outcome variable O: \sum \text{Var}(I(v_i)) = I(\omega) + \sum_{i = 1}^n I(a_i) \leq \text{Var}(O)One natural vein of results comes from plugging in variables for O and seeing what comes out. We give some basic examples:

IEU in terms of IMPACT

We now start from our other main result: u_i(a_i) = e(R_i, t_i) + I(a_i). Since we don’t assume any information about IEU by default, a natural starting point is in the case of a Nash Equilibrium. By definition of (Bayesian) Nash Equilibrium, each player’s strategy must be a best response. Thus, for each a_i in the support of \sigma_i(t_i), we have u_i(a) = \max_{a_i}(u_i(a_i)) = M. This implies e(R_i, t_i) = I(a_i) = 0, which can be understood by arguing that in a Nash Equilibrium, no player can unilaterally increase their expected reward. Sensing a deeper connection between IMPACT and Nash Equilibria, we define the self-IMPACT of action a_i to be I(a_i) given O = R_i. Above, we showed that in a Nash Equilibrium, each player’s actions have zero self-IMPACT. Generalizing to suboptimal actions, we find that all actions have self-IMPACT \leq 0, with equality when the action is a best response. Unfortunately, the converse doesn’t hold: each player only taking zero self-IMPACT actions doesn’t imply a Nash equilibrium. It implies a Nash Equilibrium of the game where the action spaces A_i are restricted to only actions played with nonzero probability, but these notions aren’t equivalent if strong actions remain unplayed (consider the mixed-strategy equilibrium for rock-paper-scissors when generalized to rock-paper-scissors-[insta-win action]). We can also explore the fact that in a Nash Equilibrium, POWER equals IEU. Thus, we can write \text{POWER}(i, \sigma) = e(R_i, t_i), which is strictly a function of \omega. Intuitively, IMPACT accounts for variation in a_i while POWER takes a max over it, thus they become equivalent in limiting cases for \sigma_i.

Considering other players /​ IMPACT-trading

As a final and unexplored angle on IMPACT, consider the case where player 1 impacts R_2 and player 2 impacts R_1. Assuming it wouldn’t adversely affect their own utilities, the players can "trade" by modifying their strategies to mutually grant each other increased reward. The premise itself immediately raises red flags; I’ll attempt to briefly address them:

Temporarily putting the issues aside, I intend to explore IMPACT-trading as an attempt to understand coordination-centric games like the Prisoners’ Dilemma. More generally, I hope to apply the IMPACT framework to a broad range of multiplayer game-theoretic phenomena and see if new insight can be gained.

Comment

https://www.lesswrong.com/posts/ypySHD723fMu9YywE/analyzing-multiplayer-games-using-impact?commentId=3vbAPo6Z9aitNXY28

(midco developed this separately from our project last term, so this is actually my first read) I have a lot of small questions. What is your formal definition of the IEU u_i? What kinds of goals is it conditioning on (because IEU is what you compute after you view your type in a Bayesian game)? Multi-agent "impact" seems like it should deal with the Shapley value. Do you have opinions on how this should fit in? You note that your formalism has some EDT-like properties with respect to impact:

Well, in a sense, they do. The universes where player i shouts "heads" are exactly the universes in which everyone wins. The problem is that of agency: player i doesn’t choose their action, the coin (\omega) does. If we condition on the value of \omega, then each player’s action becomes deterministic, thus IEU is constant across each player’s (trivial) action space. This seems weird and not entailed by the definition of IEU, so I’m pretty surprised that IEU would tell you to shout ‘heads.’ Given arbitrary R.V.s A, B, we define the estimate of A given B=b as e(A, B):=\mathbb{E}_{B=b}[A]Is this supposed to be e(A,B=b)? If so, this is more traditionally called the conditional expectation of A given B=b.

Comment

https://www.lesswrong.com/posts/ypySHD723fMu9YywE/analyzing-multiplayer-games-using-impact?commentId=ZaModHE4AAtyQGgFb

Answering questions one-by-one:

  • I played fast and loose with IEU in the intro section. I think it can be consistently defined in the Bayesian game sense of "expected utility given your type", where the games in the intro section are interpreted as each player having constant type. In the Bayesian Network section, this is explicitly the definition (in particular, player i’s IEU varies as a function of their type).

  • Upon reading the Wiki page, it seems like Shapley value and Impact share a lot of common properties? I’m not sure of any exact relationship, but I’ll look into connections in the future.

  • I think what’s going on is that the "causal order" of \omega and a_i is switched, which makes a_i "look as though" it controls the value of \omega. In terms of game theory the distinction is (I think) definitional; I include it because Impact has to explicitly consider this dynamic.

  • In retrospect: yep, that’s conditional expectation! My fault for the unnecessary notation. I introduced it to capture the idea of a vector space projection on random variables and didn’t see the connection to pre-existing notation.

https://www.lesswrong.com/posts/ypySHD723fMu9YywE/analyzing-multiplayer-games-using-impact?commentId=FgasXRjRAfbg93Pqt

Upon reflection, I now suspect that the Impact I(a_i) is analogous to Shapley Value. In particular, the post could be reformulated using Shapley values and would attain similar results. I’m not sure whether Impact-scarcity of Shapley values holds, but the examples from the post suggest that it does. (thanks to TurnTrout for pointing this out!)