Showing posts with label Computer Science. Show all posts
Showing posts with label Computer Science. Show all posts

Tuesday, March 19, 2013

Simulation of Simplified Version of Showcase Showdown Using MATLAB

See previous related post: Showcase Showdown Analysis: Circular Reasoning and Oscillating Nash Equilibrium

In the show The Price Is Right, contestants spin the Big Wheel, which has 20 sections randomly distributed, from 5 cents to $1.00 in 5-cent increments. The objective is to get the highest amount without going over $1, with one initial spin and an optional second spin. Clearly the first contestant is at a disadvantage, as the later contestants simply opt for the second optional spin if the first spin is not enough. For the first contestant, there needs to be a strategy consisting of a critical value for the first spin, at or above which the second spin is not preferred, as the probability of going over is too high. It's also a discrete case, as the critical value can only be from 5 to 100, in increments of 5.

In the actual show, there are three contestants. For the sake for simplicity, only two contestants are assumed in this simulation. In the case of tie, rerun the trial. For each possible critical value from 5 to 100 cents, 1 million trials are run in this Monte Carlo simulation to determine the winning percentage of player 1 under that strategy.

Here's the graph with the critical point on the x-axis and the winning percentage on the y-axis:


The maximum occurs at critical value of 55, corresponding with a winning percentage of 45.55%. At a value of 50, the winning percentage is up there at 45.22%. Even closer is 45.52% at a value of 60. The winning percentage gradually tapers off. Even at a value of 70, the winning percentage is still at 44.25%. Here's the MATLAB script to generate the data and graph:
x = zeros(1,20);
y = zeros(1,20);

for crit = 1:20
    x(crit) = 5*crit;
    n = 1000000;
    win = zeros(1,n);

    for i = 1:n
        done = false;
        while ~done
            firstSpin = ceil(20*rand)*5;

            if firstSpin >= x(crit)
                playerOne = firstSpin;
            else
                playerOne = firstSpin + ceil(20*rand)*5;
            end

            if playerOne > 100
                playerOne = 0;
            end

            playerTwo = ceil(20*rand)*5;
            if playerTwo < playerOne
                playerTwo = playerTwo + ceil(20*rand)*5;
            end
            if playerTwo > 100
                playerTwo = 0;
            end

            if playerOne > playerTwo
                win(i) = 1;
                done = true;
            elseif playerOne < playerTwo
                done = true;
            end
        end
    end
    
    y(crit) = sum(win)/n;
end
plot(x,y); grid on; xlabel('player 1 critical point'); ylabel('player 1 winning percentage')

In the code above, simply vary the value of n to change the number of trials at each strategy. Of course, this optimal strategy will be different if the third contestant is taken into consideration in the actual game.

Thursday, March 7, 2013

Acceptance Rejection Method for Simulation

In simulation, use the acceptance rejection method to simulate a random variable X with density f(x), for which there is no direct method to use the inverse transformation to generate it. However, suppose that there is a known algorithm to generate random variable Y with density g(x) such that f(x) / g(x) is bounded, that is f(x) / g(x) ≤ c. To find c, take the derivative of f(x) / g(x) to find the maximum value. Here's the algorithm:
  1. Generate Y according to the known algorithm
  2. Generate U, the uniform distribution from 0 to 1
  3. If U ≤ f(Y) / c*g(Y), then set X = Y
  4. Otherwise, start all over
Note that to produce each copy of X, the loop runs until condition 3 is satisfied. In the end, c become the average number of runs to produce a copy of X. In the following, a random variable with f(x) = 2x*e^(-x^2) is generated using MATLAB. The chosen g(x) = e^(-x), so f(x) / g(x) = 2x*e(x-x^2), and the maximum value c turns out to be exactly 2. 

There's no way to verify that this method does indeed generate a random variable with the given f(x). Instead, to test the accuracy of the algorithm here, two values (a and b) are inputted at the end. The program first calculates the integral of f(x) from [a, b], or the probability that X falls between the range of [a, b]. Finally, the program uses the simulated copies of X to count the proportion of the copies of X that fell in that interval of [a, b]. The two values should be close to each other.
func = @(x) 2*x.*exp(-x.^2);
n = 100000;
results = zeros(1,n);
numTimes = zeros(1,n);

for i=1:n
    trialDone = false;
    count = 0;
    while ~trialDone
        count = count + 1;
        y = -log(rand);
        u = rand;
        if u <= y*exp(y-y^2)
            results(i) = y;
            trialDone = true;
        end
    end
    numTimes(i) = count;
end

a = input('Lower bound for x: ');
b = input('Upper bound for x: ');
disp 'Evaluated integral using f(x): ';
quad(func, a, b)
disp 'Simulated integral using acceptance / rejection method: ';
length(results(results>a & results
disp 'Average trial runs per simulated copy: '
mean(numTimes)

Choose values between 0 and 2.5 for a and b, for over 99.8% of the data are in that range. Upon running the script using different values, the calculated integral closely matches the frequency proportion obtained from the simulation technique using the acceptance rejection method. While this does not verify the accuracy of the method, it at least illustrates that the resultant output resembles what the expected output should be. Finally, the resultant average trial runs per simulated copy hovers closely to 2 as expected.

Sunday, March 3, 2013

Simulating Correlated Pairs of Normally Distributed Random Variables

See previous related article: Simulating Normal Distribution with Polar Method

The previous article walked through how to generate independent copies of normally distributed random variables using the polar method. Here's an extension of that on how to generate pairs of normally distributed variables with a certain correlation between them. Polar method is still used to generate the variables. Recall that correlation ρ = cov(x,y) / (σ_x*σ_y). Both x and y can have its own σ and μ values.

n = input('Input the number of copies: ');
u_x = input('Input the mean of x: ');
s_x = input('Input the standard deviation of x: ');
u_y = input('Input the mean of y: ');
s_y = input('Input the standard deviation of y: ');
p = input('Input the correlation between x and y: ');
x = zeros(1,n);
y = zeros(1,n);

for i = 1:n
    z1 = sqrt(-2*log(rand))*cos(2*pi*rand);
    z2 = sqrt(-2*log(rand))*sin(2*pi*rand);
    x(i) = s_x*z1+u_x;
    y(i) = s_y*p*z1+s_y*sqrt(1-p^2)*z2+u_y;
end

disp 'Calculated mean and standard deviation for x:';
[mean(x), std(x)]
disp 'Calculated mean and standard deviation for y:';
[mean(y), std(y)]
disp 'Calculated correlation between x and y:';
corr(x',y')

The middle block of the codes here is the most essential. Variables z1 and z2 denote two independent copies of N(0,1). One copy is then used to generate x1 as σ_x*z1 + μ_x. Then the correlated pair y1 is generated as σ_y*ρ*z1 + σ_y*sqrt(1-ρ^2)*z2 + μ_y. Note that the z1 term is used in generating x1 as well as y1. Furthermore, note that the middle term σ_y*sqrt(1-ρ^2)*z2 disappears when ρ = ±1.

Simulating Normal Distribution with Polar Method

The polar method allows for quick generation of normally distributed random variables. For each trial, generate r^2 as exponentially distributed with λ = 1/2, and θ uniformly distributed between 0 and 2π. Within each trial, in fact 2 independent copies of the normally distributed random variables are generated: Z1 = sqrt(r^2)*cos(θ) and Z2 = sqrt(r^2)*sin(θ), both with μ = 0 and σ = 1. For normally distributed random variables with different parameters of μ and σ, simply use the transformation σ*Z+μ.
n = input('Input the number of copies: ');
u = input('Input the mean: ');
s = input('Input the standard deviation: ');
results = zeros(1,n);
for i = 1:2:n
    rsq = -2*log(rand);
    theta = 2*pi*rand;
    results(i) = s*sqrt(rsq)*cos(theta)+u;
    results(i+1) = s*sqrt(rsq)*sin(theta)+u;
end
disp 'Calculated mean and standard deviation using the polar method:';
[mean(results), std(results)]

The program at the end calculates the μ and σ values of the n copies of the random variables created as a result of this simulation, stored in the array named "results."

Tuesday, January 29, 2013

Inverse Transform Demonstration with Excel VBA

Given F(x) = 1-e^(-λ*x) as the cumulative distribution function, the inverse transform gives -1/λ*ln(U) as the function that has F(x) as its cdf, where U is the uniform distribution from [0, 1]. Here, the following VBA codes allow users to visualize this transformation in Microsoft Excel.

Upon the execution of this procedure, the user inputs a value for lambda. Then 10,000 simulations are run, initially generating a random number from [0, 1] and then inputting that random number into -1/λ*ln(U), and outputting the result in column A. At the end of the execution, column C contains the different x values from 0 to the maximum, in increments of 0.001. Column D reflects the cdf by counting entries of A that are smaller than the corresponding x value. Finally, column E calculates the true value of 1-e^(-λ*x). The idea is that the outputs in columns D and E are similar.

Sub InverseTransform()
'Demonstrates inverse transform of the cdf F(x) = 1-e^(-lambda*x)
Columns("A:E").Clear
Range("B1").Value = 1# * InputBox("Enter a positive value for lambda: ", "Input", 2)

'10,000 simulation trials to be performed
'Transform of F(x) gives -1/lambda*ln(U), where U is uniform distribution [0,1]
For i = 1 To 10000
Cells(i, 1) = -1 / Range("B1").Value * Log(Rnd)
Next i

'Determine the maximum for the range of numbers to work with
Range("B2").FormulaR1C1 = "=MAX(C[-1])"

'To determine the cumulative probability density, use 0.001 gradient from 0 to the maximum value as the counter
i = 1
While i / 1000 <= Range("B2").Value
Cells(i, 3).Value = i / 1000
'In column D, count entries in column A that are smaller than the counter, then divide by the number of trials
Cells(i, 4).FormulaR1C1 = "=COUNTIF(C[-3],""<""&RC[-1])/10000"
'In column E, calculate the true value of 1-e^(-lambda*x)
Cells(i, 5).FormulaR1C1 = "=(1-EXP(0-R1C2*RC[-2]))"
i = i + 1
Wend
Range("B2").Clear
End Sub

After the execution of this procedure, the user can perform further analysis. Graphing columns C through E does reveal that values in columns D and E are similar, as the points almost completely overlap. Error calculations from those two columns illustrate a similar result that the inverse transform method takes a function F(x), exponential function in this case, to produce a function whose cdf is F(x).

Wednesday, January 23, 2013

Monte Carlo Simulation of Pi with MATLAB

Using Monte Carlo simulation, the value of π can be approximated as (1/n) * Σ(4*sqrt(1-U_i^2)), where n is a large number representing the number of simulation trials, and U_i represents the i-th trial result from a uniform distribution in [0,1], for 1 ≤ i ≤ n. The MATLAB codes to run this simulation is quite straightforward:

numTrials = 1000000000;    %number of simulation trials
count = 0;
for trial = 1:numTrials
    roll = rand;    %uniform distribution [0,1]
    count = count + 4*sqrt(1-roll^2);
end
sum(count)/numTrials

In this simulation, n = 1,000,000,000. The output of 3.1416 matches the value of pi in all 4 displayed decimal places.

Saturday, September 8, 2012

Toss Coin Until Head Appears Twice In a Row

Take a fair coin and toss it until a head appears twice in a row. There is no finite sample space for this experience, for it theoretically can go on infinitely. That said, the first few elements are: {H,H}, {T,H,H}, {T,T,H,H}, {H,T,H,H}. To determine the probability of tossing exactly 4 times, we can take the elements in the sample space that corresponds to only 4 tosses. In that case, they are {T,T,H,H}, {H,T,H,H}. The probability of tossing either combination is 1/16, and two such combination gives a total probability of 1/8.

To see the result simulated, the following codes on MATLAB can help to illustrate that with 1 million trials. Documentations are added to clarify the algorithm.
probability = 0.5;      %prob of tossing head
numTrials = 1000000;    %number of simulation trials
trialResults = zeros(1,numTrials);

for trial = 1:numTrials
    done = false;
    lastHead = false;
    toss = 0;
    while ~done         %this loop controls each trial
        roll = rand;
        toss = toss + 1;
        if roll < probability && lastHead
            trialResults(trial) = toss;
            done = true;    %each trial over when last trial was H, and this trial is also
        elseif roll < probability && ~lastHead
            lastHead = true;    %last trial was not H, but this trial is
        else
            lastHead = false;   %did not toss a H
        end
    end
end

sum(trialResults == 4) / numTrials

The results vary very little from 0.1250.

Monday, June 25, 2012

MATLAB Code for 1-D Random Walk

See related post: 1-D Random Walk with Equal Probability

The previous post ran a simulation of one-dimensional random walk. Here is the MATLAB code that was run to achieve the the results, along with explanatory documentation:
numRuns = 100000;       %# of resets to position 0
numTrials = 1000000;    %# of trials per run
probLeft = 0.5;         %Here, equal probability going to left and right
zero = 0;               %# of times on position 0
negative = zeros(1);    %Array to keep track of # of times on negative numbers
positive = zeros(1);    %Similar array for positive numbers
for run = 1:numRuns     %Loop over the # of runs
    position = 0;       %Reset position for each run
    for count = 1:numTrials     %Loop iver the # of trials
        roll = rand;            %Simulated roll
        if roll < probLeft      %Going to the left
            position = position - 1;
        else                    %Going to the right
            position = position + 1;
        end
        if position < 0
            if abs(position) > length(negative) %If the position has never been visited before
                negative(abs(position)) = 0;    %Initialize the value to avoid error
            end
                negative(abs(position)) = negative(abs(position)) + 1;  %Increase counter
        elseif position > 0
            if position > length(positive)      %Similar tasks as above, but for positive positions
                positive(position) = 0;
            end
                positive(position) = positive(position) + 1;
        else
            zero = zero + 1;        %If landed on zero
        end
    end
end
y = [negative(end:-1:1) zero positive]; %# of times landed on each position
x = -length(negative):length(positive); %Position vector
plot(x,y)
xlabel('Position')
ylabel('Number of Times on the Position')
title('1 Dimensional Random Walk')

In this code, the variable 'probLeft' had a constant value of 0.5, reflecting on equal probability of going to the left and right. Further adjustments can tweak to define this variable to allow for the probability to reflect on the current position.

Friday, April 13, 2012

Displaying a Pyramid on MATLAB

This script of codes demonstrates the usage of the fprintf function to properly insert spacings within MATLAB. It also demonstrates the use of nested for loops to make a two-dimensional representation. Given a positive odd integer n as the input, this code displays a pyramid of base length n and height of n/2.
%The input integer 'n' denotes the width of the base of the pyramid
%If 'n' is even, it'll be the same as making a (n-1) sized pyramid
function pyramid = input(n)
%The height of the pyramid will be (n+1)/2
for i=1:2:n
    %Spacings on the left: 'c' is used for a character here, in this case the empty space ' '
    for j=1:(n-i)/2
        fprintf('%c',' ')
    end
   
    %Actual pyramid marks
    for k=1:i
        fprintf('%c','*')
    end
   
    %Spacings on the right: same chunk of code as the code for the left spacings
    for j=1:(n-i)/2
        fprintf('%c',' ')
    end
   
    %Carriage return after each line
    fprintf('\n')
end
If n = 5, for example, the output will be like this:
    * 
  ***
*****
While using the fprintf function, care needs to be taken to ensure the use of the correct conversion letter. Refer to the link below for further references.

Source:

Wednesday, April 4, 2012

Random Walk: Foundational Codes

Imagine a square grid, and start at one corner of it. On each step, if permissible within the boundaries of the grid, move one unit in a randomly chosen cardinal direction. Repeat until the other extreme corner is reached. How many steps does it take? In this post, we'll deal with the programming codes that simulate this scenario. Java and the equivalent MATLAB codes are presented, with commenting on the Java code.

Java version:
public class RandomWalk
{
 public static void main(String[] args)
 {
     int[] coor = new int[2];    //this houses the (x,y) coordinate
     coor[0] = coor[1] = 1;        //begin at (1,1)
     int move = 0;
     final int size = 10;        //this creates 10x10 grid
   
     while(!(coor[0]==size && coor[1]==size))    //game continues while not at the extreme corner
     {
         boolean valid = false;    //indicates whether or not a valid location has been found to move to
         while(!valid)            //while there isn't a valid location, search for one
         {
             double rand = Math.random();
             int[] test = new int[2];    //use to house the next potential location
             test[0] = coor[0];
             test[1] = coor[1];
           
             if(rand < 0.25)    //corresponds to moving north
                 test[1]++;
             else if(rand < 0.5)    //moving east
                 test[0]++;
             else if(rand < 0.75)    //moving south
                 test[1]--;
             else                    //moving west
                 test[0]--;
           
            //next location is valid if it's within the boundaries of the grid
             if(test[0] >= 1 && test[0] <= size && test[1] >= 1 && test[1] <= size)
             {//if a valid location is found, transfer the coordinate points
                 coor[0] = test[0];
                 coor[1] = test[1];
                 valid = true;
             }
         }//otherwise, repeat the loop to try to find another valid location
         move++;    //increment move if a valid move is made
     }
     //after the end of the game, indicate the total number of moves
     System.out.println("Number of moves: " + move);
 }
}

MATLAB version:
coor = [1,1];
numMoves = 0;
size = 10;

while ~(coor(1) == size && coor(2) == size)
    valid = false;
    while ~valid
        r = rand;
        test = zeros(1,2);
        test(1) = coor(1);
        test(2) = coor(2);
       
        if r < 0.25
            test(2) = test(2) + 1;
        elseif r < 0.5
            test(1) = test(1) + 1;
        elseif r < 0.75
            test(2) = test(2) - 1;
        else
            test(1) = test(1) - 1;
        end
       
        if(test(1) >= 1 && test(1) <= size && test(2) >= 1 && test(2) <= size)
            coor(1) = test(1);
            coor(2) = test(2);
            valid = true;
        end
    end
   
    numMoves = numMoves + 1;
end
numMoves

In subsequent posts, we'll work with this code and add variations.

Monday, January 30, 2012

Tower of Hanoi Method

The Tower of Hanoi game relies on recursion to successfully simulate the directions. While the code isn't difficult to write, picturing and understanding the recursive calls may be more challenging. The recursive method written in Java is as follows:
  public static void hanoi(int disks, String source, String destination, String temp)
  {
      if(disks!=0)
      {
          hanoi(disks-1, source, temp, destination);
          System.out.println("Move disk " + disks + " from " + source + " to " + destination);
          hanoi(disks-1, temp, destination, source);
      }
  }
To start out with the simplest case of 3 discs, picture the following memory stacks, illustrated horizontally with the indentation of bullet points. Call the smallest piece at the top disk 1, and so forth. Currently, the discs are on the left tower. The ultimate destination is the right tower.
  • hanoi(3, left, right, middle) //from an outside main method
    • hanoi(2, left, middle, right)
      • hanoi(1, left, right, middle)
        • "Move disk 1 from left to right"
      • "Move disk 2 from left to middle"
      • hanoi(1, right,  middle, left)
        • "Move 1 disk from right to middle"
    • "Move disk 3 from left to right"
    • hanoi(2, middle, right, left)
      • hanoi(1, middle, left, right)
        • "Move disk 1 from middle to left"
      • "Move disk 2 from middle to right"
      • hanoi(1, left, right, middle)
        • "Move disk 1 from left to right"
Collecting the direction displays:
  • Move disk 1 from left to right
  • Move disk 2 from left to middle
  • Move 1 disk from right to middle
  • Move disk 3 from left to right
  • Move disk 1 from middle to left
  • Move disk 2 from middle to right
  • Move disk 1 from left to right
In general, while moving n disks from tower A to tower B (using tower C as temporary holder):
  • If n is even, begin by moving disk 1 from A first to C
  • If n is odd, begin by moving disk 1 from A first to B
Source:

Sunday, January 29, 2012

Power Function Method

To program the mathematical function b^e, where b and e are non-negative integers, a recursive method can be written and used. Assume an existing method sqr(int a) that returns an integer, that is the square of int a.
  public static int exp(int b, int e)
  {
      if(e == 0)
          return 1;
      else if (e%2 == 0)
          return sqr(exp(b,e/2));
      else
          return b*exp(b,e-1);
  }
This program takes advantage of the property that (a^b)^c = a^(b*c). For example, 3^16 = (3^8)^2. Therefore, if the exponent e is even, the next call uses e/2 while the entire result is simply squared. The advantage then is that this program runs in O(log e) efficiency, rather than the O(e) efficiency, had the boolean test of e being even not been tested.

To illustrate the running of the program using memory stacks, illustrated up-side-down, here's a case of an even exponential. For example, to calculate 3^4, or running of exp(3,4):
  • return exp(3,4)
  • return sqr(exp(3,2))
  • return sqr(exp(3,1))
  • return 3*exp(3,0)
To see what each memory stack returns to the previous, follow each from the bottom of the list back up:
  • 3*exp(3,0) --> 3 --> exp(3,1)
  • sqr(exp(3,1)) --> 9 --> exp(3,2)
  • sqr(exp(3,2)) --> 81 --> exp(3,4)
  • return exp(3,4) //equals 81
Indeed,  3^4 = 81 as returned. Note that exp(3,3) was never called, since during exp(3,4), as 4 is an even integer, it directly halved the exponential called next, and moved onto exp(3,2). While a linear efficiency only would've meant an extra step in this scenario, the difference will be more pronounced when the exponential gets substantially bigger. Consider exp(3,80): instead of 81 calls, this logarithmically efficient method would only have to call exp(3,80), exp(3,40), exp(3,20), exp(3,10), exp(3,5), exp(3,4), exp(3,2), exp(3,1), exp(3,0) for a total of 9 calls.