Monday, December 26, 2011

Variation of St. Petersburg Paradox Simulation

Related previous posts: Simulation of St. Petersburg Paradox

In the traditional St. Petersburg Paradox, there is a 1/(2^n) chance of an outcome of 2^n for every positive integer n. The expected value sums to infinity, yet realistic outcomes are far lower than that. What if the base is changed to an integer other than two? In this exercise, bases of 2, 3, 4, 10, and 100 were each used to run a similar simulation of 100,000 trials. To perform this variation, only two lines of code needed to be changed from the original Java code. For base 3, for example:
  • "while(flip < 0.5)" becomes "while(flip < 1.0/3)"
  • "double result =  Math.pow(2,times)" becomes "double result =  Math.pow(3,times)"
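To make the two changes concrete, here is a minimal sketch of what the parameterized simulation might look like. The original Java code is not reproduced in this post, so the class name, method names, and loop structure are assumptions; only the `while` condition and the `Math.pow` call mirror the two lines quoted above. The sketch also assumes the counter starts at 1, so the smallest payout equals the base.

```java
import java.util.Random;

class StPetersburgVariation {

    // Plays one game. The payout starts at base^1 and is multiplied by the
    // base each time a uniform draw falls below 1/base (assumed structure).
    static double playOnce(int base, Random rng) {
        int times = 1;
        double flip = rng.nextDouble();
        while (flip < 1.0 / base) {
            times++;
            flip = rng.nextDouble();
        }
        return Math.pow(base, times);
    }

    // Averages the payout over the given number of trials.
    static double averagePayout(int base, int trials, Random rng) {
        double sum = 0;
        for (int i = 0; i < trials; i++) {
            sum += playOnce(base, rng);
        }
        return sum / trials;
    }

    public static void main(String[] args) {
        Random rng = new Random(); // unseeded, so every run gives different numbers
        int trials = 100_000;
        for (int base : new int[] {2, 3, 4, 10, 100}) {
            double avg = averagePayout(base, trials, rng);
            System.out.printf("base %d: average/base = %.2f%n", base, avg / base);
        }
    }
}
```

Because the averages are dominated by a handful of rare long games, repeated runs of a sketch like this can differ noticeably from one another.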
After running the simulation, the results and their analysis are as follows:

Base | Average/Base | Max/Base | St. Dev  | Log(max)/Log(base)
2    | 9.75         | 65536    | 784.25   | 17
3    | 7.49         | 59049    | 724.34   | 11
4    | 6.47         | 65536    | 956.40   | 9
10   | 5.48         | 100000   | 3315.75  | 6
100  | 3.46         | 10000    | 12282.46 | 3

With different bases, the average result divided by the base is the most meaningful output to compare. As the base increases, this normalized average decreases, because the probability of getting the lowest payout increases sharply. For a base of 100, the result is simply 100 about 99% of the time. While the rare large payouts are also far greater for larger bases, the clustering in the high-probability, low-value region outweighs the contribution of the low-probability, high-value region.
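The probability of landing on the lowest payout can be checked directly. The sketch below assumes the game continues with probability 1/base each round, as in the modified `while` condition; the class and method names are made up for illustration.

```java
class RoundProbability {

    // Probability that a game ends after exactly n rounds (n >= 1),
    // assuming it continues with probability 1/base after each round.
    static double probabilityOfRounds(int base, int n) {
        double p = 1.0 / base;
        return (1 - p) * Math.pow(p, n - 1);
    }

    public static void main(String[] args) {
        for (int base : new int[] {2, 3, 4, 10, 100}) {
            // n = 1 is the minimum payout, equal to the base itself
            System.out.printf("base %d: P(minimum payout) = %.4f%n",
                    base, probabilityOfRounds(base, 1));
        }
    }
}
```

Under that assumption the minimum payout has probability 0.5 for base 2 and 0.99 for base 100, which matches the 99% figure above.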

The max result / base value is a bit harder to interpret. It is the maximum payout that would have been recorded had the game started with a payout of 1 rather than the value of the base, which gives a normalized maximum. There doesn't seem to be a clear-cut trend in these values as the base changes. The lack of a trend is reasonable, given that it only takes one rare event to set the maximum. The standard deviation increases drastically as the base grows (apart from a small dip at base 3), a more obvious reflection of the greater variance in the output. The logarithm of the maximum value divided by the logarithm of the base equals the largest number of rounds that any single game ran for. The decreasing trend is reasonable, given that the probability of extending the game drops sharply as the base increases (50% for base 2, 1% for base 100).
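As a small illustration of the last column, the round count can be recovered from a maximum payout. The sketch below uses repeated integer division instead of `Math.log(max) / Math.log(base)`, since the floating-point ratio can land just under a whole number; the class and method names are hypothetical, and the inputs are the Max/Base values from the table multiplied back by the base.

```java
class RoundsFromPayout {

    // Counts how many times the payout can be divided by the base before
    // reaching 1, i.e. log(payout)/log(base) for an exact power of the base.
    static int rounds(long payout, int base) {
        int n = 0;
        while (payout > 1 && payout % base == 0) {
            payout /= base;
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(rounds(65536L * 2, 2));     // prints 17
        System.out.println(rounds(10000L * 100, 100)); // prints 3
    }
}
```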

The last three trends (or two trends plus the lack of a trend for max/base) are reasonable. But the average result / base value is still the most intriguing, since the expected value is infinite regardless of the base. The simulated averages nevertheless stay finite and, as expected, the greater the variance (the bigger the base), the further the realistic output falls from the expected value (the smaller the normalized result).
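For reference, here is a short derivation of why the expected value is infinite for every base. It assumes, in line with the modified code, that a game with base b continues with probability 1/b and pays b^n when it ends after n rounds; the symbols b, n, m, and X are introduced here for illustration.

```latex
\begin{align*}
P(\text{game ends after } n \text{ rounds}) &= \left(1 - \frac{1}{b}\right)\left(\frac{1}{b}\right)^{n-1}, \qquad n = 1, 2, 3, \ldots \\
E[X] &= \sum_{n=1}^{\infty} \left(1 - \frac{1}{b}\right)\left(\frac{1}{b}\right)^{n-1} b^{n}
      = \sum_{n=1}^{\infty} (b - 1) = \infty \\
\sum_{n=1}^{m} \left(1 - \frac{1}{b}\right)\left(\frac{1}{b}\right)^{n-1} b^{n} &= m\,(b - 1)
\end{align*}
```

Every round contributes the same amount, b - 1, so the full sum diverges, while a sum cut off at m rounds stays finite. As a rough heuristic only, taking m = 17 for base 2 gives 17, or 8.5 after dividing by the base, which is the same order as the simulated 9.75.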