Bilinear Games: Polynomial Time Algorithms for Rank Based

Bilinear Games: Polynomial Time Algorithms for Rank Based

Bilinear Games: Polynomial Time Algorithms for Rank Based Subclasses Ruta Mehta Indian Institute of Technology, Bombay Joint work with Jugal Garg and Albert X. Jiang A Game: Rock-Paper-Scissor Rock-Paper-Scissor: A Play Winner $1 Rock-Paper-Scissor: A Play Winner $1 Rock-Paper-Scissor: A Play Winner $1

Rock-Paper-Scissor Payoffs 0,0 -1,1 1,-1 1,-1 0,0 -1,1 -1,1 1,-1 0,0 Bimatrix Game S1 = { R, P,

C} S2 = { R, P, C} R P C R P C R 0 -1 1

R 0 1 -1 P 1 0 -1 P -1 0 1

C -1 1 0 C 1 -1 0 A B Steady State: No player gains by unilateral deviatio

Bimatrix Game S1 = { R, P, C} S2 = { R, P, C} R P C R P C R 0 -1

1 R 0 1 -1 P 1 0 -1 P -1 0

1 C -1 1 0 C 1 -1 0 A No Steady State

B Mixed Play S1 = { R, P, } C 1={r1, p1, c10; S1 = { R, P, } 2 , p2 , C2={r c20; R 1/3 P C r1+p11/3 +c1=1} 1/3 rR2+p2+cP2=1} C

R 0 -1 1 R 1/3 0 1 -1 P 1 0 -1

P 1/3 -1 0 1 C -1 1 0 C 1/3 1 -1

0 A Steady State B John Nash (1951) Finite Game: Finitely many players, each with finitely many strategies. Nash: Every finite game has a steady state in mixed strategy. Hence forth called Nash equilibrium (NE) Proved using Kakutani fixed point theorem:

Highly non-constructive. Nash Equilibrium Computation Papadimitriou (JCSS94): PPAD-class Problems where existence is guaranteed like fixed point, Sperners Lemma, Nash equilibrium. Chen and Deng CDT (FOCS06): Even approximation is PPADhard. (FOCS06): It is PPAD-hard. Rank and Computation

Kannan and Theobald (SODA07): Define rank of (A,B) as rank(A+B). FPTAS for fixed rank games. Polynomial time algorithms for exact Nash. Dantzig (1963): Zero-sum (rank-0) is equiv. to LP. AGMS (STOC11): Rank-1 games. Bilinear Games Bimatrix Game with polyhedral strategy sets.

Two players: 1 and 2 Polyhedral strategy sets: X={x Rm | Ex = e; x 0}, Y={y Rn | Fy=f; y 0} Payoff matrices: A, B Rmn Bilinear Payoff: (x, y) fetches xTAy to player 1, and xTBy to player 2. Motivation: Koller et al. (STOC94) for two-player extensive form game with perfect recall. Nash Equilibrium in Bilinear NE: No player gains by unilateral deviation.

Existence: Corollary of Glicksbergs result. Symmetric Game: B=AT and Y=X. (x, y) is a symmetric profile if y=x. Existence of symmetric NE: An adaptation of Nashs proof for symmetric bimatrix games. Bilinear Contains: Bimatrix, Polymatrix, Bayesian, etc. Bimatrix: X = 1, Y = 2

Polymatrix: N players. Each pair plays a bimatrix game. Player i: Si finite strategy set, i Mixed strategy set. Goal of i: Choose xi from i to maximize total payoff. i Aij j Polymatrix to Bilinear M= |S1|+ + |Sn|. X = {(x1,,xn) | xi in i}, Y=X. M M

j R A , B=AT 0 0 Aij i A= 0 0 Symmetric NE of (A,B) maps to a NE of the polymatrix game Best Response (Koller et al.) Fix a strategy y of player 2. Player 1 solves

max: xT(Ay) Ex = e x0 min: eTp p TE (Ay)T At optimal: p s.t. Aiy pTEi & xi > 0 => Aiy = pTEi Given x X, for player 2 we get At optimal: q s.t. Bjx qTFj & yj > 0 => qTFj = Bjx Best Response Polytopes (BRPs) (x,y) is a NE iff p: Ay ETp; xi > 0 => Aiy = pTEi q: xTB qTF; yj > 0 => qTFj = Bjx P {( y, p) | Ai y pT E i ,

Q {( x, q ) | xi 0, y j 0, Fy f } xT B j qT F j , Ex e} xT(Ay - ETp) 0 and (xTB - qTF)y 0 xT(A+B)y eTp fTy 0 Nash Equilibrium in BRPs P {( y, p) | Ai y pT E i , Q {( x, q ) | xi 0, y j 0, Fy f }

xT B j qT F j , Ex e} NE iff xT(Ay - ETp)=0 and (xTB - qTF)y=0 xT(A+B)y eTp fTy=0 Assumption: P and Q are non-degnerate. (u, v) of P x Q gives a NE => (u, v) is a vertex. QP Formulation max: xT(A+B)y eTp fTy s.t. (y, p) P (x, q) Q Optimal value 0. Only vertex solutions. Our Results

Rank-1 games: rank(A+B)=1 Fixed rank games: rank(A+B)=k Extend Adsul et al. algorithm for exact NE. Extend FPTAS of Kannan et al. Rank of A or B is constant Enumerate all NE in polynomial time. Rank-1 Case

Zero-sum ~ rank(A+B)=0: LP formulation (Charnes53) rank(A+B)=1 then A+B = a.bT The QP formulation: max: (xTa)(bTy) eTp fTy s.t. (y, p) P (x, q) Q Rank-1 Case Replace (xTa) by z. Recall B = -A + a.bT Q {( x, q) | xi 0, xT B j qT F j , Ex e}

Q ' {( x, z , q ) | xi 0; xT ( A) zb j q T F j ; Ex e} xT(A+B)y eTp fTy=0 z(bTy) eTp fTy=0 N = Points of P x Q with z(bTy) eTp fTy=0 Forms paths and cycles, since z gives one degree of freedom. NE of (A,B): Points in intersection of N and z xTa =0. Parameterized LP LP(z) = max: z(bTy) eTp fTy s.t. (y, p) P (x, z, q) Q

Given any c, Optimal value of LP(c) is 0. OPT(c) lies on N, and Let N(c)={Points of N with z=c}, then OPT(c)=N(c). N is a single path on which z is monotonic. Rank-1: The Algorithm NE: Intersection of N and H: z xTa =0. amin min xT a; amax max xT a. c =a , c =a 1 min 2 max xX

xX N(c1) H N H NE N(c2) H+ Rank-1: Binary Search Algorithm NE of (A,B): Points in intersection of N and H. c=c1+c2/2.

N(c1) H N H N(c) NE N(c2) H+ Rank-1: Binary Search Algorithm NE of (A,B): Points in intersection of N and H. c=c1+c2/2. If N(c) in H,then c1=c else c2=c. H

N H N(c1) NE N(c2) H+ Analysis Terminates because, z is monotonic on N. Increase in z on each edge is lower bounded by 1/

d where d is polynomial sized in the input. Time complexity: Solve LP(c) to get N(c) in each pivot. log(d) * log(amax amin) pivots. Conclusions Bilinear games: Bimatrix with polytopal strategy sets. Fairly general. Contains polymatrix, bayesian, etc. Polynomial time algorithm for rank based subclasses.

Open problems: Designing a Lemke-Howson type algorithm. Degree, index, stability concepts. Computation of approximate equilibrium. Thank You

Recently Viewed Presentations

  • Tariff Hike Rejection Presentation Eskom's application for a

    Tariff Hike Rejection Presentation Eskom's application for a

    In 2013 already Optimum had triggered a hardship clause of the contract, claiming that it was losing R166 per ton of coal, which would cost the company R881-million in losses in that year alone. 2013 and subsequent years saw massive...
  • An Integrated Approach to Measuring Semantic Similarity ...

    An Integrated Approach to Measuring Semantic Similarity ...

    phr (verb particle) Ex) She looked up the word in the dictionary. subj (subject) obj (object) comp (subject complement) Ex) John remains a boy. dat (indirect object) Ex) John gave her an apple. oc (object complement) Ex) John called him...
  • Security Framework For Cloud Computing

    Security Framework For Cloud Computing

    Cloud computing is a technology of rapid development. Security is the main obstacle which must be solved. security is not just a technical problem it also involves standardization, Supervising mode, laws and regulations and many other aspects. Future research should...
  • Design Quality - Kennesaw State University

    Design Quality - Kennesaw State University

    Times New Roman Default Design Design for Quality Design for Quality and Safety Fault Processing and Safety Fault Correction Fault Tolerance A Design Technique for Fault Processing A Fault-Tree Example Fault Tree Example (cont.) Improving Design Assess Design Quality Notes...
  • Azure Pack Express Installation

    Azure Pack Express Installation

    Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES...
  • Compliance and Enforcement- demo CoR course

    Compliance and Enforcement- demo CoR course

    Driving & Driver fatigue. ... people is that they will be held accountable in the event of a serious safety incident or when they are visited by WorkSafe "While most in the industry are striving to meet their legal and...
  • VERTICAL boundaries of the firm - CA Sri Lanka

    VERTICAL boundaries of the firm - CA Sri Lanka

    VERTICAL boundaries of the firm. Class 3. Make vs Buy. A firms decision to perform an activity itself or to purchase it from an independent firm is called a make or buy decision. All services can be done in house...
  • Benchmarking - Interdisciplinary | Innovative

    Benchmarking - Interdisciplinary | Innovative

    The term "benchmark" also commonly applies to specially-designed programs used in benchmarking A benchmark should: be domain specific (the more general the benchmark, the less useful it is for anything in particular) be a distillation of the essential attributes of...