Tuesday, February 7, 2012

Mixed Strategy Game

Two players, A and B, play a game in which each has a red and a blue marble. They simultaneously present one marble to each other. If both present red, A wins $3. If both present blue, A wins $1. Otherwise, if the colors don't match, B wins $2. Who is in the better position here?



                 A: Red     A: Blue
B: Red           A: $3      B: $2
B: Blue          B: $2      A: $1

If both players choose red or blue with equal probability, the expected payoff for each player is $1. Player A's payoff has the greater variance (1.5, against 1 for Player B), so a risk-averse individual would rather be Player B. However, neither player has to choose with equal probability. Let p be the probability that the other player chooses red. A player is indifferent between red and blue when both colors give the same expected payoff:
  • 3p = 1(1-p) --> p = 1/4, so A is indifferent between red and blue if B chooses red with probability 1/4
  • 2p = 2(1-p) --> p = 1/2, so B is indifferent between red and blue if A chooses red with probability 1/2
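
The two indifference conditions above can be checked with a few lines of MATLAB. This is only a sketch: the payoff matrices payA and payB and the names pA and pB are introduced here for illustration and are not part of the simulation script in this post. Rows index A's choice and columns index B's choice, with red first.

```matlab
% Payoff matrices (illustrative names, not from the simulation script).
% Rows: A's choice, columns: B's choice; index 1 = red, 2 = blue.
payA = [3 0; 0 1];   % A wins $3 on red/red and $1 on blue/blue
payB = [0 2; 2 0];   % B wins $2 whenever the colors differ

% A is indifferent when 3*pB = 1*(1 - pB), where pB = P(B plays red).
pB = payA(2,2) / (payA(1,1) + payA(2,2));

% B is indifferent when 2*pA = 2*(1 - pA), where pA = P(A plays red).
pA = payB(2,1) / (payB(1,2) + payB(2,1));

fprintf('B plays red with probability %.2f\n', pB);
fprintf('A plays red with probability %.2f\n', pA);
```

Note that each player's equilibrium probability is determined by the other player's payoffs, which is why B's mix depends on the $3 and $1 prizes that only A can win.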
What does this mean? Although there is no dominant strategy, Player A clearly prefers red if Player B chooses at random (expected payoff of 1.5 for red, against 0.5 for blue). Knowing this, Player B would be more likely to choose blue. Once Player B chooses blue with probability 3/4, Player A is indifferent between the two colors. What are the expected payoffs under this mixed strategy?
  • Player A: (1/4)(1/2)(3) + (3/4)(1/2)(1) = 3/4
  • Player B: (1/2)(3/4)(2) + (1/2)(1/4)(2) = 1
So in this mixed strategy, Player B recognizes that Player A prefers red and therefore plays blue more frequently. Player A preferred red when B chose at random, but now that B plays blue 3/4 of the time, red and blue each earn Player A 0.75 on average, so that preference disappears.
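
The expected payoffs above, and the variances in the equal-probability case, can be reproduced with a short matrix calculation. This is a sketch under stated assumptions: payA, payB, meanOf and varOf are illustrative names that do not appear in the simulation script, and the matrices put A's choice on the rows and B's choice on the columns, with red first.

```matlab
% Illustrative payoff matrices: rows = A's choice, columns = B's choice,
% index 1 = red, 2 = blue.
payA = [3 0; 0 1];
payB = [0 2; 2 0];

% Mean and variance of a payoff P when A mixes with x and B mixes with y.
meanOf = @(P, x, y) x' * P * y;
varOf  = @(P, x, y) x' * (P.^2) * y - (x' * P * y)^2;

x        = [0.5; 0.5];     % A: red 1/2, blue 1/2
yUniform = [0.5; 0.5];     % B: red 1/2, blue 1/2
yMixed   = [0.25; 0.75];   % B: red 1/4, blue 3/4

fprintf('B uniform:    E[A] = %.4f, E[B] = %.4f\n', ...
    meanOf(payA, x, yUniform), meanOf(payB, x, yUniform));
fprintf('B uniform:    Var[A] = %.4f, Var[B] = %.4f\n', ...
    varOf(payA, x, yUniform), varOf(payB, x, yUniform));
fprintf('B at 1/4 red: E[A] = %.4f, E[B] = %.4f\n', ...
    meanOf(payA, x, yMixed), meanOf(payB, x, yMixed));
```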

Finally, the following MATLAB script simulates the game. As written, it simulates the mixed strategy in which B has a 25% chance of choosing red. To simulate the original case of equal-probability selection, set the variable RedprobB to 0.5.

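% Simulate the marble game and total each player's winnings.
numTrials = 1000000;
payoffA = zeros(1,numTrials);   % A's winnings in each trial
payoffB = zeros(1,numTrials);   % B's winnings in each trial
RedprobA = 0.5;                 % probability that A presents red
RedprobB = 0.25;                % probability that B presents red
for trial = 1:numTrials
    testA = rand;               % uniform random draw for A's choice
    testB = rand;               % uniform random draw for B's choice
    if testA < RedprobA && testB < RedprobB
        payoffA(trial) = 3;     % both red: A wins $3
    elseif testA >= RedprobA && testB >= RedprobB
        payoffA(trial) = 1;     % both blue: A wins $1
    else
        payoffB(trial) = 2;     % colors differ: B wins $2
    end
end

sumA = sum(payoffA);
sumB = sum(payoffB);
disp(sumA)
disp(sumB)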
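
For comparison, the same experiment can be run without a loop by drawing all the random numbers at once. This is an alternative sketch, not the script that produced the results reported in this post; the variables redA and redB are introduced here for illustration.

```matlab
numTrials = 1000000;
RedprobA = 0.5;    % probability that A presents red
RedprobB = 0.25;   % probability that B presents red

% One logical entry per trial: true where the player presents red.
redA = rand(1, numTrials) < RedprobA;
redB = rand(1, numTrials) < RedprobB;

% A wins $3 on red/red and $1 on blue/blue; B wins $2 on a mismatch.
payoffA = 3 * (redA & redB) + 1 * (~redA & ~redB);
payoffB = 2 * xor(redA, redB);

fprintf('A: %.4f per trial, B: %.4f per trial\n', ...
    mean(payoffA), mean(payoffB));
```

Reporting the mean per trial rather than the sum makes the output directly comparable with the expected payoffs calculated earlier.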

With both players choosing red or blue with a 50% chance, here are some of the total payoffs over 1,000,000 trials:
  • A: 999,425; B: 1,000,670
  • A: 999,662; B: 1,000,664
  • A: 1,001,261; B: 999,018
These results closely align with the expected payoff of 1 per trial. Now change the code back so that Player B has a 25% chance of choosing red. Here are some of the results:
  • A: 749,312; B: 1,001,864
  • A: 751,635; B: 997,710
  • A: 749,364; B: 999,236
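
To judge how close these totals are to their expected values, it helps to know how much random scatter a sum over 1,000,000 trials should show. The following is a rough sketch that assumes independent trials; payA, payB and sdOfSum are illustrative names that are not part of the script above, and the matrices put A's choice on the rows and B's choice on the columns, with red first.

```matlab
% Illustrative payoff matrices: rows = A's choice, columns = B's choice,
% index 1 = red, 2 = blue.
payA = [3 0; 0 1];
payB = [0 2; 2 0];
numTrials = 1000000;
x = [0.5; 0.5];            % A: red with probability 1/2

% Standard deviation of a summed payoff over independent trials:
% sqrt(numTrials * per-trial variance), with B mixing according to y.
sdOfSum = @(P, y) sqrt(numTrials * (x' * (P.^2) * y - (x' * P * y)^2));

yUniform = [0.5; 0.5];     % B: red 1/2
yMixed   = [0.25; 0.75];   % B: red 1/4

fprintf('B uniform:    sd of A total = %.0f, sd of B total = %.0f\n', ...
    sdOfSum(payA, yUniform), sdOfSum(payB, yUniform));
fprintf('B at 1/4 red: sd of A total = %.0f, sd of B total = %.0f\n', ...
    sdOfSum(payA, yMixed), sdOfSum(payB, yMixed));
```

The standard deviations all come out at roughly 1,000, so totals that miss the expectation by a few hundred to a couple of thousand are ordinary sampling noise.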
Again, the results closely align with the revised expected payoffs per trial of 0.75 for Player A and 1 for Player B. Taking this a step further, if Player A knows that Player B will play red 1/4 of the time, what can Player A do? Changing the value of the variable RedprobA gives these observations:
  • RedprobA = 0.01: A earns about 0.75 per trial, B about 0.50
  • RedprobA = 0.25: both A and B earn about 0.75
  • RedprobA = 0.75: A earns about 0.75 per trial, B about 1.25
  • RedprobA = 0.99: A earns about 0.75 per trial, B about 1.50
So it looks like if Player B is fixed at a 1/4 probability of picking red, Player A can do nothing to improve its own expected payoff. This shouldn't come as a surprise: the probabilities calculated earlier, 1/2 red for A and 1/4 red for B, form the Nash equilibrium, in which neither player has an incentive to switch strategies given that the other won't. Player A is, however, able to affect B's payoff, but this problem assumes that the other player's payoff doesn't factor into any decision.
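
The observations above can also be checked exactly, without simulation. With B fixed at 1/4 red and A playing red with probability q, A's expected payoff is 3q(1/4) + (1-q)(3/4) = 3/4 for every q, while B's is 2[q(3/4) + (1-q)(1/4)] = 1/2 + q. A minimal sketch, assuming illustrative payoff matrices payA and payB (rows index A's choice, columns index B's choice, red first) and a hypothetical vector qs of values to try for RedprobA:

```matlab
% Illustrative payoff matrices: rows = A's choice, columns = B's choice,
% index 1 = red, 2 = blue.
payA = [3 0; 0 1];
payB = [0 2; 2 0];
y = [0.25; 0.75];                 % B fixed at red 1/4, blue 3/4

qs = [0.01 0.25 0.5 0.75 0.99];   % candidate values of RedprobA
EA = zeros(size(qs));             % A's expected payoff for each q
EB = zeros(size(qs));             % B's expected payoff for each q
for k = 1:numel(qs)
    x = [qs(k); 1 - qs(k)];       % A's mix for this value of q
    EA(k) = x' * payA * y;
    EB(k) = x' * payB * y;
    fprintf('RedprobA = %.2f: A earns %.4f, B earns %.4f\n', ...
        qs(k), EA(k), EB(k));
end
```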