Sampling From the Normal Distribution Using the Box-Muller Transform

The Normal Distribution is the workhorse of many common statistical analyses and being able to draw samples from this distribution lies at the heart of many statistical/machine learning algorithms. There have been a number of methods developed to sample from the Normal distribution including Inverse Transform Sampling, the Ziggurat Algorithm, and the Ratio Method (a rejection sampling technique). In this post we will focus on an elegant method called the Box-Muller transform.

A quick review of Cartesian and polar coordinates.

Before we can talk about using the Box-Muller transform, let’s refresh our understanding of the relationship between Cartesian and polar coordinates. You may remember from geometry that if x and y are two points in the Cartesian plane they can be represented in polar coordinates with a radius r and an angle \theta using the following relationships:

r^2 = x^2 + y^2

\tan(\theta) = \frac{y}{x}, and therefore

x = r\cos(\theta)

y = r\sin(\theta)

Notice that if r \leq 1 and \theta \in [0 , 2\pi], then we map out values contained in the unit circle, as shown in the figure below. Also note that random variables in such a circle can be generated by transforming values sampled from the uniform distribution. Specifically, radii can be sampled from r \sim Unif(0,1) and angle can be sampled from \theta \sim 2\pi \times Unif(0,1). A similar mechanism (i.e. drawing points in a circle using uniform variables) is at the heart of the Box-Muller transform for sampling Normal random variables.

Example of the relationship between Cartesian and polar coordinates

Drawing Normally-distributed samples with the Box-Muller transform

Ok, now that we’ve discussed how Cartesian coordinates are represented in polar coordinates, let’s move on to how we can use this relationship to generate random variables. Box-Muller sampling is based on representing the joint distribution of two independent standard Normal random Cartesian variables X and Y

X \sim N(0,1)

Y \sim N(0,1)

in polar coordinates. The joint distribution p(x,y) (which is circular-symmetric) is:

p(x,y) = p(x)p(y) = \frac{1}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}\frac{1}{\sqrt{2\pi}}e^{-\frac{y^2}{2}}

= \frac{1}{2\pi}e^{-\frac{x^2 + y^2}{2}}

If we notice that the x^2 + y^2 term in the numerator of the exponent is equal to r^2 (as above) we can make the connection between the the Cartesian representation of the joint Normal distribution and its polar representation:

p(x,y) = \left ( \frac{1}{2\pi} \right ) \left ( e^{\frac{-r^2}{2}} \right )

which is the product of two density functions, an exponential distribution over squared radii:

r^2 \sim Exp(\frac{1}{2})

and a uniform distribution over angles:

\theta \sim Unif(0,2\pi)

just like those mentioned above when generating points on the unit circle. Now, if we make another connection between the exponential distribution and the uniform distribution, namely that:

Exp(\lambda) = \frac{-\log (Unif(0,1))}{\lambda}

then r \sim \sqrt{-2\log (Unif(0,1))}

This gives us a way to generate points from the joint Gaussian distribution by sampling from two independent uniform distributions, one for r and another for \theta, and transforming them into Cartesian coordinates via the relationships above. In detail, the procedure goes as follows:

  1. Draw, u_1,u_2 \sim Unif(0,1)
  2. Transform the variables into radius and angle representation r = \sqrt{-2\log(u_1)} , and \theta = 2\pi u_2
  3. Transform radius and angle into Cartesian coordinates: x = r \cos(\theta), y = r \sin(\theta)

What results are two independent Normal random variables, X and Y. A MATLAB implementation of the Box-Muller algorithm is shown below:

u = rand(2,100000);
r = sqrt(-2*log(u(1,:)));
theta = 2*pi*u(2,:);
x = r.*cos(theta);
y = r.*sin(theta);

colormap hot;axis square
title(sprintf('Box-Muller Samples Y\n Mean = %1.2f\n Variance = %1.2f\n Kurtosis = %1.2f',mean(x),var(x),3-kurtosis(x)))
xlim([-6 6])

colormap hot;axis square
title(sprintf('Box-Muller Samples X\n Mean = %1.2f\n Variance = %1.2f\n Kurtosis = %1.2f',mean(y),var(y),3-kurtosis(y)))
xlim([-6 6])

Box-Muller Samples for Normal Distribution

Wrapping Up

The output of the MATLAB code is shown above. Notice the first, second, and fourth central moments (mean, variance, and kurtosis)  of the generated samples are consistent with the standard normal. The Box-Muller transform is another example of of how uniform variables on the interval (0,1) and can be  transformed in order to sample from a more complicated distribution.

About dustinstansbury

I recently received my PhD from UC Berkeley where I studied computational neuroscience and machine learning.

Posted on September 11, 2012, in Sampling Methods, Simulations, Statistics and tagged , , . Bookmark the permalink. 4 Comments.

  1. Thanks! Great article!😀

  2. Thank you. Very clear and concise.
    Just one typo I noticed: p(x,y) = p(x)p(y) = \frac{1}{2\pi}e^{-\frac{x^2}{2}}\frac{1}{2\pi}e^{-\frac{y^2}{2}}
    should be p(x,y) = p(x)p(y) = \frac{1}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}\frac{1}{\sqrt{2\pi}}e^{-\frac{y^2}{2}}
    The normal distributions should have radicals over the 2pi terms.

  3. Thanks. This helps me understand better.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: