Sunday, March 3, 2013

Simulating Correlated Pairs of Normally Distributed Random Variables

See previous related article: Simulating Normal Distribution with Polar Method

The previous article walked through how to generate independent copies of normally distributed random variables using the polar method. Here's an extension of that on how to generate pairs of normally distributed variables with a certain correlation between them. Polar method is still used to generate the variables. Recall that correlation ρ = cov(x,y) / (σ_x*σ_y). Both x and y can have its own σ and μ values.

n = input('Input the number of copies: ');
u_x = input('Input the mean of x: ');
s_x = input('Input the standard deviation of x: ');
u_y = input('Input the mean of y: ');
s_y = input('Input the standard deviation of y: ');
p = input('Input the correlation between x and y: ');
x = zeros(1,n);
y = zeros(1,n);

for i = 1:n
    z1 = sqrt(-2*log(rand))*cos(2*pi*rand);
    z2 = sqrt(-2*log(rand))*sin(2*pi*rand);
    x(i) = s_x*z1+u_x;
    y(i) = s_y*p*z1+s_y*sqrt(1-p^2)*z2+u_y;
end

disp 'Calculated mean and standard deviation for x:';
[mean(x), std(x)]
disp 'Calculated mean and standard deviation for y:';
[mean(y), std(y)]
disp 'Calculated correlation between x and y:';
corr(x',y')

The middle block of the codes here is the most essential. Variables z1 and z2 denote two independent copies of N(0,1). One copy is then used to generate x1 as σ_x*z1 + μ_x. Then the correlated pair y1 is generated as σ_y*ρ*z1 + σ_y*sqrt(1-ρ^2)*z2 + μ_y. Note that the z1 term is used in generating x1 as well as y1. Furthermore, note that the middle term σ_y*sqrt(1-ρ^2)*z2 disappears when ρ = ±1.