Inverse transform sampling

Inverse transform sampling or inversion sampling is a method to sample numbers at random from any probability distribution, given its cumulative distribution function.
Lets understand this with an example: You are given an random variable X which is exponential distributed. More formally,
\begin{align} f(x;\lambda) &= { \lambda * exp(-\lambda * x) \space \space x >= 0} \\ &= { 0 \space when \space x \space < \space 0} \end{align} And your goal is to sample from X.
Given this PDF, we can easily come up with a CDF (probability that X will take a value less than or equal to x).
Because it is a probability density, the sum of the probabilities of all possible values for $x$ must be $1$. The CDF gives us the area under the PDF curve at and to the left of a given $x$ value .
What if we map this CDF to a point x in X? It is very easy to sample a point in (0, 1) and if we have the inverse CDF mapping, then we can easily sample points. For the above example, we have

\begin{align} CDF(x) &= { 1-e^{-\lambda x}}\\ x &= \frac{-1}{\lambda}\log(1 - CDF(x))\\ \end{align}

We can now use this inverse CDF function for sampling points in X as shown in the code below.

Consider another Problem: Sample points uniformly from a unit circle.
This problem is tricky. The reason is the non-uniform distribution of points within a unit circle. Let's start with simple solution and check if it works.

Sample a point uniformly at random $p ~ (0, 1)$
$r = p \space and \space \theta = \space2 * \space\pi *\space p$
Then $x = rcos\theta\space,\space rsin\theta$

Here is a Matlab/Octave code which performs this.

As it is evident from this plot, there are more points towards the center. Why is this case?
This is because as we move further from the center, we need more points to maintain the same density. In other words, points closer to the center have smaller distances between them.
How do we rectify this?

We need to sample points proportional to the area. Lets start with the $CDF(x) = \frac{\pi * x * x}{ \pi * 1 * 1 }$. In the other words, given a radius x, ratio of all the points lying within the circle with x to the area of unit circle.
Therefore, $CDF(x) = x \space * \space x $
In other words, $x = \sqrt{CDF(x)}$
With this correction, our steps become:

Sample a point uniformly at $random \space p \sim (0, 1) $
$r = \sqrt{p} \space and \space \theta = 2 \space* \space\pi \space*\space p $
Then $x = rcos\theta \space,\space rsin\theta$

The below code with this small change now looks like:

And now the plot shows a uniform sample of points. Below, we will solve this very problem using an entirely different technique called the Rejection Sampling.

Essentially, inverse transform sampling consists of following steps:

Given a CDF of random variable X, compute $CDF^{-1}(X)$
Sample a point in $[0, 1]$
Using $CDF^{-1}(X)$, compute $x_i$ and include that in the sample.

Pros
Each point is accepted as a sample unlike rejection sampling discussed below.

Cons
The method assumes the existence of an inverse CDF function.

Rejection Sampling

As we saw above, inverse transform sampling requires a CDF. What is we do not have a CDF or the underlying distribution is very complex? Lets start with the same example as above.

Problem: Sample points uniformly from a unit circle.

Imagine a unit square encompassing the circle.
Pick a point P in this unit square uniformly at random. This is easy, you can pick an $x \sim Rand(0, 1)$ and then $y = x.$
Now check if P lies within the circle or outside it. This is not hard, we can simply check if $x_i^2 + y_i^2 <= 1$
If the point lies outside the circle, reject the point. Else include it in our sample.

Below Matlab/Octave code shows this in action.

Rejection Sampling can be described as follows:

Let say you have a complex distribution $f$ which is difficult to sample from.
Propose a distribution q such that it encompasses $f$ completely and is easy to sample from
Sample x uniformly at random from the distribution q.
Retain this point if it lies in $f$ or else reject it.

Cons of Rejection Sampling:

It might be difficult to come up with a good proposal distribution.
Lots of points get rejected unlike inversion sampling, so it might a long time to get enough number of samples.

References:

I will be updating this blog with other sampling methods like Importance sampling, MCMC in a short while. Stay tuned!

SystemAntics: DNNs, ML, Systems and Code...

Tuesday, December 25, 2018

Sampling methods

Inverse transform sampling

Rejection Sampling

No comments:

Post a Comment