Bilinear Games: Polynomial Time Algorithms for Rank Based Subclasses Ruta Mehta Indian Institute of Technology, Bombay Joint work with Jugal Garg and Albert X. Jiang A Game: Rock-Paper-Scissor Rock-Paper-Scissor: A Play Winner $1 Rock-Paper-Scissor: A Play Winner $1 Rock-Paper-Scissor: A Play Winner $1

Rock-Paper-Scissor Payoffs 0,0 -1,1 1,-1 1,-1 0,0 -1,1 -1,1 1,-1 0,0 Bimatrix Game S1 = { R, P,

C} S2 = { R, P, C} R P C R P C R 0 -1 1

R 0 1 -1 P 1 0 -1 P -1 0 1

C -1 1 0 C 1 -1 0 A B Steady State: No player gains by unilateral deviatio

Bimatrix Game S1 = { R, P, C} S2 = { R, P, C} R P C R P C R 0 -1

1 R 0 1 -1 P 1 0 -1 P -1 0

1 C -1 1 0 C 1 -1 0 A No Steady State

B Mixed Play S1 = { R, P, } C 1={r1, p1, c10; S1 = { R, P, } 2 , p2 , C2={r c20; R 1/3 P C r1+p11/3 +c1=1} 1/3 rR2+p2+cP2=1} C

R 0 -1 1 R 1/3 0 1 -1 P 1 0 -1

P 1/3 -1 0 1 C -1 1 0 C 1/3 1 -1

0 A Steady State B John Nash (1951) Finite Game: Finitely many players, each with finitely many strategies. Nash: Every finite game has a steady state in mixed strategy. Hence forth called Nash equilibrium (NE) Proved using Kakutani fixed point theorem:

Highly non-constructive. Nash Equilibrium Computation Papadimitriou (JCSS94): PPAD-class Problems where existence is guaranteed like fixed point, Sperners Lemma, Nash equilibrium. Chen and Deng CDT (FOCS06): Even approximation is PPADhard. (FOCS06): It is PPAD-hard. Rank and Computation

Kannan and Theobald (SODA07): Define rank of (A,B) as rank(A+B). FPTAS for fixed rank games. Polynomial time algorithms for exact Nash. Dantzig (1963): Zero-sum (rank-0) is equiv. to LP. AGMS (STOC11): Rank-1 games. Bilinear Games Bimatrix Game with polyhedral strategy sets.

Two players: 1 and 2 Polyhedral strategy sets: X={x Rm | Ex = e; x 0}, Y={y Rn | Fy=f; y 0} Payoff matrices: A, B Rmn Bilinear Payoff: (x, y) fetches xTAy to player 1, and xTBy to player 2. Motivation: Koller et al. (STOC94) for two-player extensive form game with perfect recall. Nash Equilibrium in Bilinear NE: No player gains by unilateral deviation.

Existence: Corollary of Glicksbergs result. Symmetric Game: B=AT and Y=X. (x, y) is a symmetric profile if y=x. Existence of symmetric NE: An adaptation of Nashs proof for symmetric bimatrix games. Bilinear Contains: Bimatrix, Polymatrix, Bayesian, etc. Bimatrix: X = 1, Y = 2

Polymatrix: N players. Each pair plays a bimatrix game. Player i: Si finite strategy set, i Mixed strategy set. Goal of i: Choose xi from i to maximize total payoff. i Aij j Polymatrix to Bilinear M= |S1|+ + |Sn|. X = {(x1,,xn) | xi in i}, Y=X. M M

j R A , B=AT 0 0 Aij i A= 0 0 Symmetric NE of (A,B) maps to a NE of the polymatrix game Best Response (Koller et al.) Fix a strategy y of player 2. Player 1 solves

max: xT(Ay) Ex = e x0 min: eTp p TE (Ay)T At optimal: p s.t. Aiy pTEi & xi > 0 => Aiy = pTEi Given x X, for player 2 we get At optimal: q s.t. Bjx qTFj & yj > 0 => qTFj = Bjx Best Response Polytopes (BRPs) (x,y) is a NE iff p: Ay ETp; xi > 0 => Aiy = pTEi q: xTB qTF; yj > 0 => qTFj = Bjx P {( y, p) | Ai y pT E i ,

Q {( x, q ) | xi 0, y j 0, Fy f } xT B j qT F j , Ex e} xT(Ay - ETp) 0 and (xTB - qTF)y 0 xT(A+B)y eTp fTy 0 Nash Equilibrium in BRPs P {( y, p) | Ai y pT E i , Q {( x, q ) | xi 0, y j 0, Fy f }

xT B j qT F j , Ex e} NE iff xT(Ay - ETp)=0 and (xTB - qTF)y=0 xT(A+B)y eTp fTy=0 Assumption: P and Q are non-degnerate. (u, v) of P x Q gives a NE => (u, v) is a vertex. QP Formulation max: xT(A+B)y eTp fTy s.t. (y, p) P (x, q) Q Optimal value 0. Only vertex solutions. Our Results

Rank-1 games: rank(A+B)=1 Fixed rank games: rank(A+B)=k Extend Adsul et al. algorithm for exact NE. Extend FPTAS of Kannan et al. Rank of A or B is constant Enumerate all NE in polynomial time. Rank-1 Case

Zero-sum ~ rank(A+B)=0: LP formulation (Charnes53) rank(A+B)=1 then A+B = a.bT The QP formulation: max: (xTa)(bTy) eTp fTy s.t. (y, p) P (x, q) Q Rank-1 Case Replace (xTa) by z. Recall B = -A + a.bT Q {( x, q) | xi 0, xT B j qT F j , Ex e}

Q ' {( x, z , q ) | xi 0; xT ( A) zb j q T F j ; Ex e} xT(A+B)y eTp fTy=0 z(bTy) eTp fTy=0 N = Points of P x Q with z(bTy) eTp fTy=0 Forms paths and cycles, since z gives one degree of freedom. NE of (A,B): Points in intersection of N and z xTa =0. Parameterized LP LP(z) = max: z(bTy) eTp fTy s.t. (y, p) P (x, z, q) Q

Given any c, Optimal value of LP(c) is 0. OPT(c) lies on N, and Let N(c)={Points of N with z=c}, then OPT(c)=N(c). N is a single path on which z is monotonic. Rank-1: The Algorithm NE: Intersection of N and H: z xTa =0. amin min xT a; amax max xT a. c =a , c =a 1 min 2 max xX

xX N(c1) H N H NE N(c2) H+ Rank-1: Binary Search Algorithm NE of (A,B): Points in intersection of N and H. c=c1+c2/2.

N(c1) H N H N(c) NE N(c2) H+ Rank-1: Binary Search Algorithm NE of (A,B): Points in intersection of N and H. c=c1+c2/2. If N(c) in H,then c1=c else c2=c. H

N H N(c1) NE N(c2) H+ Analysis Terminates because, z is monotonic on N. Increase in z on each edge is lower bounded by 1/

d where d is polynomial sized in the input. Time complexity: Solve LP(c) to get N(c) in each pivot. log(d) * log(amax amin) pivots. Conclusions Bilinear games: Bimatrix with polytopal strategy sets. Fairly general. Contains polymatrix, bayesian, etc. Polynomial time algorithm for rank based subclasses.

Open problems: Designing a Lemke-Howson type algorithm. Degree, index, stability concepts. Computation of approximate equilibrium. Thank You