Thursday, March 21, 2013

Interest Rate Swap and Swaption

An interest rate swap, also known as a plain vanilla swap or a fixed-for-floating swap, is the exchange of cash flows on a notional value (say $1 million) given a fixed interest rate (say 3%), for the cash flows on the same notional value given a floating interest rate (say LIBOR + 0.5%). The notable aspect is that the notional value itself never changes hands, and thus is rather arbitrary. Over the life of the swap, payments are exchanged between the counterparties at intervals, usually every 6 months. The swap benefits entities that want either to gain exposure to interest rate risk or to mitigate it.
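
As a rough illustration (the rates and the LIBOR fixing below are assumed numbers for this example, not market data), here is a minimal MATLAB sketch of the net cash flow for one semiannual period, from the fixed-rate payer's perspective:
% Net cash flow for one period of the example swap above (illustrative only)
notional  = 1e6;       % $1 million notional, never actually exchanged
fixedRate = 0.03;      % fixed leg: 3%
libor     = 0.028;     % assumed LIBOR fixing for this period
spread    = 0.005;     % floating leg: LIBOR + 0.5%
tau       = 0.5;       % semiannual period, in years
fixedLeg = notional*fixedRate*tau;         % $15,000 paid
floatLeg = notional*(libor+spread)*tau;    % $16,500 received
netToFixedPayer = floatLeg - fixedLeg      % +$1,500 net this period
In practice, only this net amount changes hands each period.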

A swaption is an option on a swap, whereby the holder has the right but not the obligation to enter into a swap agreement by a certain date. Furthermore, the contract specifies whether the buyer will be the recipient of the fixed or the floating rate cash flows.


Tuesday, March 19, 2013

Simulation of Simplified Version of Showcase Showdown Using MATLAB

See previous related post: Showcase Showdown Analysis: Circular Reasoning and Oscillating Nash Equilibrium

In the show The Price Is Right, contestants spin the Big Wheel, which has 20 randomly arranged sections valued from 5 cents to $1.00 in 5-cent increments. The objective is to get the highest total without going over $1, with one initial spin and an optional second spin. Clearly the first contestant is at a disadvantage, as the later contestants simply opt for the second spin if the first is not enough. The first contestant therefore needs a strategy built around a critical value for the first spin, at or above which the second spin is declined because the probability of going over is too high. The problem is also discrete, as the critical value can only range from 5 to 100 cents, in increments of 5.

In the actual show, there are three contestants. For the sake of simplicity, only two contestants are assumed in this simulation. In the case of a tie, the trial is rerun. For each possible critical value from 5 to 100 cents, 1 million trials are run in this Monte Carlo simulation to determine player 1's winning percentage under that strategy.

Here's the graph with the critical point on the x-axis and the winning percentage on the y-axis:


The maximum occurs at a critical value of 55, corresponding to a winning percentage of 45.55%. At a value of 50, the winning percentage is close behind at 45.22%, and closer still is 45.52% at a value of 60. The winning percentage then tapers off gradually; even at a value of 70, it is still 44.25%. Here's the MATLAB script to generate the data and graph:
% Winning percentage of player 1 for each critical value from 5 to 100
x = zeros(1,20);    % critical values
y = zeros(1,20);    % winning percentages

for crit = 1:20
    x(crit) = 5*crit;
    n = 1000000;    % number of trials per strategy
    win = zeros(1,n);

    for i = 1:n
        done = false;
        while ~done     % a tie reruns the trial
            % player 1 spins again only below the critical value
            firstSpin = ceil(20*rand)*5;

            if firstSpin >= x(crit)
                playerOne = firstSpin;
            else
                playerOne = firstSpin + ceil(20*rand)*5;
            end

            if playerOne > 100      % going over $1 loses
                playerOne = 0;
            end

            % player 2 spins again only when trailing player 1
            playerTwo = ceil(20*rand)*5;
            if playerTwo < playerOne
                playerTwo = playerTwo + ceil(20*rand)*5;
            end
            if playerTwo > 100
                playerTwo = 0;
            end

            if playerOne > playerTwo
                win(i) = 1;
                done = true;
            elseif playerOne < playerTwo
                done = true;
            end
        end
    end

    y(crit) = sum(win)/n;
end
plot(x,y); grid on; xlabel('player 1 critical point'); ylabel('player 1 winning percentage')

In the code above, simply vary the value of n to change the number of trials per strategy. Of course, this optimal strategy will be different once the third contestant in the actual game is taken into consideration; a rough sketch of that extension follows.
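
Here is a heuristic three-player sketch, not the full game-theoretic solution: players 2 and 3 simply spin again whenever they are not strictly in the lead, and ties are scored as losses for player 1 rather than rerun, so the resulting curve is only an approximation.
% Rough three-player extension of the simulation above (heuristic sketch)
spin = @() ceil(20*rand)*5;       % one spin of the wheel: 5 to 100 cents
n = 100000;
winPct = zeros(1,20);
for crit = 1:20
    wins = 0;
    for i = 1:n
        p1 = spin();
        if p1 < 5*crit, p1 = p1 + spin(); end   % player 1 follows the critical value
        if p1 > 100, p1 = 0; end
        p2 = spin();
        if p2 <= p1, p2 = p2 + spin(); end      % heuristic: beat the leader or bust
        if p2 > 100, p2 = 0; end
        p3 = spin();
        if p3 <= max(p1,p2), p3 = p3 + spin(); end
        if p3 > 100, p3 = 0; end
        wins = wins + (p1 > max(p2,p3));        % ties counted as losses here
    end
    winPct(crit) = wins/n;
end
plot(5*(1:20), winPct); grid on
xlabel('player 1 critical point'); ylabel('player 1 winning percentage')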

Time Difference: New York & Sydney

Calculating the time difference between New York and Sydney is tricky, not only because both cities observe daylight saving time, but also because they are in opposite hemispheres, so one city's summer is the other's winter. Sydney is GMT+1000 and New York is GMT-0500, yet the time difference between the two cities is rarely 15 hours.

For New York, DST is observed from the second Sunday in March to the first Sunday in November. For Sydney, DST is observed from the first Sunday in October to the first Sunday in April. For convenience, the four relevant Sundays for 2013 are as follows: March 10, November 3, October 6, and April 7, respectively.

Sorting the dates into groups:
  • Jan 1 to March 10: Sydney on DST, New York off DST --> 16-hour difference
  • March 10 to April 7: both on DST --> 15-hour difference
  • April 7 to Oct 6: Sydney off DST, New York on DST --> 14-hour difference
  • Oct 6 to Nov 3: both on DST --> 15-hour difference
  • Nov 3 to Dec 31: Sydney on DST, New York off DST --> 16-hour difference
In all, for only 56 days of the year is the time difference between New York and Sydney actually 15 hours. The sketch below turns those date ranges into a simple lookup.
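
Here's a minimal MATLAB sketch of that lookup, hard-coded to the 2013 boundary dates above (the function name nySydneyDiff2013 is just for this illustration, saved as nySydneyDiff2013.m):
function d = nySydneyDiff2013(dateStr)
% Hour difference between Sydney and New York on a given 2013 date
t = datenum(dateStr);
nyDST  = t >= datenum('10-Mar-2013') && t < datenum('3-Nov-2013');
sydDST = t < datenum('7-Apr-2013') || t >= datenum('6-Oct-2013');
d = 15 - nyDST + sydDST;   % base 15 hours; NY DST narrows it, Sydney DST widens it
end
For example, nySydneyDiff2013('1-Feb-2013') returns 16, matching the first group above.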


Time Difference: New York & London

The time difference between New York and London is crucial, especially for the financial services sector. For most of the year, London is 5 hours ahead of New York. However, while both cities observe daylight saving time in the summer, the start and end dates aren't exactly aligned, causing the difference to shift briefly in the spring and fall.

In the United States, daylight saving begins on the second Sunday in March and lasts until the first Sunday in November. In Britain, it begins on the last Sunday in March and lasts until the last Sunday in October. So in all, daylight saving in London begins later and ends earlier. During those gap periods, New York is only 4 hours behind London, as opposed to the usual 5-hour difference that holds when both cities are on daylight saving or both are off. Because there is never a moment at which London is on daylight saving while New York is off, the London - New York time difference is never 6 hours.

In conclusion, the time difference between New York and London is as follows (a quick programmatic check appears after the list):
  • First Sunday in November - second Sunday in March: 5 hours (both off DST)
  • Second Sunday - last Sunday in March: 4 hours
  • Last Sunday in March - last Sunday in October: 5 hours (both on DST)
  • Last Sunday in October - first Sunday in November: 4 hours
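
As a sanity check, MATLAB's datetime time zones (available in releases newer than this post, R2014b onward) can recompute the gap for every day of a year; this is a sketch, assuming the IANA zone names below:
% London-minus-New York offset for each day of 2013, sampled at noon London time
lon = datetime(2013,1,1,12,0,0,'TimeZone','Europe/London') + caldays(0:364);
ny  = lon;
ny.TimeZone = 'America/New_York';
gap = hours(tzoffset(lon) - tzoffset(ny));   % offset difference in hours
unique(gap)          % only 4 and 5 ever appear
sum(gap == 4)        % number of days with the 4-hour difference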

Quick Glance into Ryanair

Ryanair, the Dublin-based airline known for its low-cost passenger service, is in the news for signing an order with The Boeing Company worth $15.6 billion. Ryanair is one of the few airlines that orders exclusively from Boeing, and has now expanded its fleet by 30% to over 400 units in order to grow its capacity to over 100 million passengers across Europe by 2018. Boeing (NYSE: BA), which has seen its stock price soar by over 10% in the past month to close at around $85, welcomed the news, as its Commercial Airplanes CEO Ray Conner described the Next-Generation 737 as "the most efficient, most reliable large single-aisle airplane flying today," one that "has been and will continue to be the cornerstone of the Ryanair fleet."

Ryanair Holdings (NASDAQ: RYAAY), the holding company for Ryanair, has a market cap of around $12 billion. Its net profit margin fell to 1.87% in Q4 2012 after posting 12.76% for Q3. A visit to Ryanair's website indeed illustrates the low-cost business that it operates: a round trip from London to Dublin one month in advance is available for only 44 GBP. On flight search websites that offer comparisons, it becomes even more evident how far the roughly $56 fare undercuts those of its competitors, including $194 from British Airways. The low fares do come at some cost, though; as users search for flights on the website, they are prompted to enter a "security code," which is obtained by viewing a short advertisement. Nevertheless, the colossal deal was signed today in New York.


Thursday, March 7, 2013

Acceptance Rejection Method for Simulation

In simulation, the acceptance-rejection method is used to generate a random variable X with density f(x) for which there is no direct way to apply the inverse transformation. Suppose, however, that there is a known algorithm to generate a random variable Y with density g(x) such that f(x) / g(x) is bounded, that is, f(x) / g(x) ≤ c. To find c, take the derivative of f(x) / g(x) and find its maximum value. Here's the algorithm:
  1. Generate Y according to the known algorithm
  2. Generate U, uniformly distributed from 0 to 1
  3. If U ≤ f(Y) / (c*g(Y)), then set X = Y
  4. Otherwise, start over
Note that to produce each copy of X, the loop runs until the condition in step 3 is satisfied; c turns out to be the average number of runs needed to produce one copy of X. In the following, a random variable with f(x) = 2x*e^(-x^2) is generated using MATLAB. The chosen g(x) = e^(-x), so f(x) / g(x) = 2x*e^(x-x^2). Setting the derivative of 2x*e^(x-x^2) to zero gives 2x^2 - x - 1 = 0, whose positive root is x = 1, so the maximum value c = 2*1*e^0 turns out to be exactly 2.

There's no simple way to directly verify that this method generates a random variable with the given f(x). Instead, to test the accuracy of the algorithm here, two values (a and b) are entered at the end. The program first calculates the integral of f(x) over [a, b], i.e., the probability that X falls in the range [a, b]. Then the program counts the proportion of the simulated copies of X that fall in the interval [a, b]. The two values should be close to each other.
func = @(x) 2*x.*exp(-x.^2);    % target density f(x)
n = 100000;
results = zeros(1,n);
numTimes = zeros(1,n);          % number of trial runs needed for each copy

for i=1:n
    trialDone = false;
    count = 0;
    while ~trialDone
        count = count + 1;
        y = -log(rand);         % Y ~ exponential(1), density g(x) = e^(-x)
        u = rand;
        if u <= y*exp(y-y^2)    % accept when u <= f(y)/(c*g(y)), with c = 2
            results(i) = y;
            trialDone = true;
        end
    end
    numTimes(i) = count;
end

a = input('Lower bound for x: ');
b = input('Upper bound for x: ');
disp 'Evaluated integral using f(x): ';
quad(func, a, b)
disp 'Simulated integral using acceptance / rejection method: ';
length(results(results>a & results<b))/n
disp 'Average trial runs per simulated copy: '
mean(numTimes)

Choose values between 0 and 2.5 for a and b, as over 99.8% of the data fall in that range. Upon running the script with different values, the calculated integral closely matches the frequency proportion obtained from the acceptance-rejection simulation. While this does not prove the method correct, it at least illustrates that the output resembles what it should be. Finally, the average number of trial runs per simulated copy hovers close to 2, as expected.

Sunday, March 3, 2013

Simulating Correlated Pairs of Normally Distributed Random Variables

See previous related article: Simulating Normal Distribution with Polar Method

The previous article walked through how to generate independent copies of normally distributed random variables using the polar method. Here's an extension: how to generate pairs of normally distributed variables with a given correlation between them. The polar method is still used to generate the underlying variables. Recall that the correlation ρ = cov(x,y) / (σ_x*σ_y). Both x and y can have their own σ and μ values.

n = input('Input the number of copies: ');
u_x = input('Input the mean of x: ');
s_x = input('Input the standard deviation of x: ');
u_y = input('Input the mean of y: ');
s_y = input('Input the standard deviation of y: ');
p = input('Input the correlation between x and y: ');
x = zeros(1,n);
y = zeros(1,n);

for i = 1:n
    % one (r^2, theta) pair yields two independent N(0,1) copies
    rsq = -2*log(rand);
    theta = 2*pi*rand;
    z1 = sqrt(rsq)*cos(theta);
    z2 = sqrt(rsq)*sin(theta);
    x(i) = s_x*z1+u_x;
    y(i) = s_y*p*z1+s_y*sqrt(1-p^2)*z2+u_y;
end

disp 'Calculated mean and standard deviation for x:';
[mean(x), std(x)]
disp 'Calculated mean and standard deviation for y:';
[mean(y), std(y)]
disp 'Calculated correlation between x and y:';
corr(x',y')

The middle block of the code here is the most essential. Variables z1 and z2 denote two independent copies of N(0,1). One copy is then used to generate x1 as σ_x*z1 + μ_x. Then the correlated pair y1 is generated as σ_y*ρ*z1 + σ_y*sqrt(1-ρ^2)*z2 + μ_y. Note that the z1 term is used in generating both x1 and y1; that shared term is what produces cov(x,y) = σ_x*σ_y*ρ, and hence correlation ρ, while the coefficients ρ and sqrt(1-ρ^2) keep the variance of y at σ_y^2. Furthermore, note that the middle term σ_y*sqrt(1-ρ^2)*z2 disappears when ρ = ±1.

Simulating Normal Distribution with Polar Method

The polar method allows for quick generation of normally distributed random variables. For each trial, generate r^2 as exponentially distributed with λ = 1/2, and θ uniformly distributed between 0 and 2π. Each trial in fact yields 2 independent copies of a normally distributed random variable: Z1 = sqrt(r^2)*cos(θ) and Z2 = sqrt(r^2)*sin(θ), both with μ = 0 and σ = 1. For normally distributed random variables with different parameters μ and σ, simply use the transformation σ*Z + μ.
n = input('Input the number of copies (assumed even): ');
u = input('Input the mean: ');
s = input('Input the standard deviation: ');
results = zeros(1,n);
for i = 1:2:n-1
    % each (r^2, theta) pair yields two independent normal copies
    rsq = -2*log(rand);     % r^2 ~ exponential with lambda = 1/2
    theta = 2*pi*rand;      % theta ~ uniform on [0, 2*pi]
    results(i) = s*sqrt(rsq)*cos(theta)+u;
    results(i+1) = s*sqrt(rsq)*sin(theta)+u;
end
disp 'Calculated mean and standard deviation using the polar method:';
[mean(results), std(results)]

The program at the end calculates the μ and σ values of the n copies of the random variables created as a result of this simulation, stored in the array named "results."

Friday, February 1, 2013

“Math Will Rock Your World” Digest

The article “Math Will Rock Your World”, a cover story featured in BusinessWeek and published on January 23, 2006, discusses the increasingly important role that mathematics and data analysis play in industry and daily life. In particular, subjects seemingly incongruous with analytics, such as linguistics, have become intertwined with it. This is illustrated by the startup Inform Technologies LLC, whose algorithm “combs through thousands of press articles and blog posts” and analyzes “each article by its language and context.” At the foundation of this data analysis are mathematical algorithms. Subjects and the relationships between them combine to construct a polytope, “an object floating in space that has an edge for every known scrap of information.” This development is today’s informational revolution.

Technology companies, from Google to Facebook, are increasingly trying to make use of the gigabytes of information they hold. The challenge is to turn that information, much of which is stored as qualitative ideas, into quantitative form that algorithms can act on. These developments can be observed today in efforts such as personally targeted advertisements on Google searches or Facebook profiles. The article stresses the importance of data analysis in today’s business when it notes that Ford Motor “could have sold an additional $625 million worth of trucks if it had lifted its online ad budget from 2.5% to 6% of the total.” Online advertising allows companies to “profile customers,” as the companies “know where their prospective customers are browsing, what they click on, and often, what they buy.” Together, these ideas illustrate that access to information, and efficient mathematical analysis of it, can lead to great business solutions.

While this development fosters efficiency, it also raises some concerns that the article addresses. The foremost concern is privacy, which companies from Google to Facebook have grappled with in recent years. The growing “power of mathematicians to make sense of personal data and to model the behavior of individuals” will inevitably compromise privacy, and this is a concern not just for the individuals whose data are being utilized. If individuals fear that their data will be manipulated beyond their comfort, they may lock the information up and prevent it from being utilized, hampering the efforts of mathematicians to develop algorithms and find business or practical solutions. Another concern is the complexity of the new development. Managers must “understand enough about math to question the assumptions behind the numbers,” given that it becomes much easier to deceive someone with analysis backed by lots of data and graphs. This, as the article mentions, is the challenge for the United States: the country “must breed more top-notch mathematicians at home” by revamping education, and simultaneously “cultivate greater math savvy” as the subject becomes more prevalent in the business profession.

For students studying mathematics and related fields, now is a great time to foster these interests. Computer scientists and quantitative analysts are in high demand, and there is much room for development in this inchoate field. Even for those not working directly in the field, an understanding of the subject is increasingly important, as a solid knowledge foundation allows for critical analysis of these technological developments. As data mining continues to revolutionize business and the way society progresses, it is in individuals’ best interest not only to know how to best utilize these developments, but also to protect their own information so that privacy is not greatly compromised in the reach for progress.


Tuesday, January 29, 2013

Inverse Transform Demonstration with Excel VBA

Given F(x) = 1-e^(-λ*x) as the cumulative distribution function, the inverse transform gives -1/λ*ln(U) as a random variable that has F(x) as its cdf, where U is uniformly distributed on [0, 1]. (Strictly, inverting F gives -1/λ*ln(1-U), but since 1-U is also uniform on [0, 1], -1/λ*ln(U) works just as well.) The following VBA code lets users visualize this transformation in Microsoft Excel.

Upon the execution of this procedure, the user inputs a value for lambda. Then 10,000 simulations are run: each generates a random number U from [0, 1], plugs it into -1/λ*ln(U), and outputs the result in column A. At the end of the execution, column C contains the different x values from 0 to the maximum, in increments of 0.001. Column D reflects the empirical cdf by counting the entries of column A that are smaller than the corresponding x value. Finally, column E calculates the true value of 1-e^(-λ*x). The idea is that the outputs in columns D and E should be similar.

Sub InverseTransform()
'Demonstrates inverse transform of the cdf F(x) = 1-e^(-lambda*x)
Dim i As Long
Randomize
Columns("A:E").Clear
Range("B1").Value = 1# * InputBox("Enter a positive value for lambda: ", "Input", 2)

'10,000 simulation trials to be performed
'Transform of F(x) gives -1/lambda*ln(U), where U is uniform distribution [0,1]
'Use 1 - Rnd as the argument, since Rnd returns [0, 1) and can be exactly 0
For i = 1 To 10000
Cells(i, 1) = -1 / Range("B1").Value * Log(1 - Rnd)
Next i

'Determine the maximum for the range of numbers to work with
Range("B2").FormulaR1C1 = "=MAX(C[-1])"

'To determine the cumulative probability density, use 0.001 gradient from 0 to the maximum value as the counter
i = 1
While i / 1000 <= Range("B2").Value
Cells(i, 3).Value = i / 1000
'In column D, count entries in column A that are smaller than the counter, then divide by the number of trials
Cells(i, 4).FormulaR1C1 = "=COUNTIF(C[-3],""<""&RC[-1])/10000"
'In column E, calculate the true value of 1-e^(-lambda*x)
Cells(i, 5).FormulaR1C1 = "=(1-EXP(0-R1C2*RC[-2]))"
i = i + 1
Wend
Range("B2").Clear
End Sub

After the execution of this procedure, the user can perform further analysis. Graphing columns C through E reveals that the values in columns D and E are similar, as the points almost completely overlap. Error calculations on those two columns illustrate the same result: the inverse transform method takes a cdf F(x), the exponential cdf in this case, and produces a random variable whose cdf is F(x).
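
For comparison, here's a minimal MATLAB sketch of the same check (λ = 2 is just an assumed sample value):
lambda = 2;                                  % assumed sample value
x = -log(rand(1,10000))/lambda;              % inverse-transform samples, like column A
t = 0:0.001:max(x);                          % grid of x values, like column C
empirical = arrayfun(@(v) mean(x < v), t);   % empirical cdf, like column D
theoretical = 1 - exp(-lambda*t);            % true cdf 1-e^(-lambda*x), like column E
plot(t, empirical, t, theoretical); grid on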