Basics about statistical distributions for events random in time

Poisson Process

We will study the case of the input events being described by a Poisson Process, the most common case for events random in time.

Without going into mathematical formalism, roughly described, events adhering to the following conditions can be labeled to be generated by a Poission Process:

  • The time of an event is independent of the time of the previous event
  • The events have a fixed average rate

Some examples of phenomena well-modeled by a Poisson Process are:

  • Photons reaching a telescope
  • Goals scored in a soccer match
  • Customers entering a drug store
  • Deaths due to horse kick in the Prussian cavalry in the 1890's 1

The Exponential Distribution

We are approaching the usefulness for us as hardware developers of all this.

The most central fact of this post is:

The times between the events of a Poisson Process are described by the Exponential Distribution.

Denoting such times by a stochastic variable X, we write this as

 X \sim \textrm{Exp}(\lambda)

where \lambda is the rate, measured e.g. in events per second, the one parameter of the Exponential Distribution.

The probability density function of the distribution is

 f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0, \\ 0, & x < 0. \end{cases}

It is plotted below for some different values of \lambda.

Probability Density Function of the Exponential Distribution (Author: Skbkekas)

Furthermore, the cumulative distribution function is

 F(x) = \begin{cases} 1-e^{-\lambda x}, & x \ge 0, \\ 0, & x < 0. \end{cases}

The mean is given by

 \mathrm{E}[X] = \frac{1}{\lambda}

which is actually also the expression for the standard deviation:

 \sigma_{X} = \frac{1}{\lambda} \text{.}

The expression for the mean makes sense: if we have 10 events per second, the average time between the events should be 0.1 seconds.

Although the Exponential Distribution will be the one of greatest value for us, the Poisson Distribution is the most commonly encountered distribution for Poisson Processes, so let me introduce it for the sake of completeness.

The Poisson Distribution

The probability of a certain number of events during some given interval of time for a Poisson Process will be described by the Poisson Distribution. Letting Y denote the stochastic variable we write this as

 Y \sim \textrm{Pois}(\lambda)

where \lambda is the parameter. It will be equal to the mean of the distribution:

 \mathrm{E}[Y] = \lambda

and if the time interval used is one unit of time, \lambda will represent the rate, just like for the Exponential Distribution.

As you might already have realized, the Poisson Distribution is a discrete distribution.

Its probability mass function 2 is

 f(k)= P(X=k)= \frac{\lambda^k e^{-\lambda}}{k!}

where k is obviously a discrete variable representing the number of events during the interval of time. The above function is plotted below for some different values of \lambda.

Probability Mass Function of the Poisson Distribution (Author: Skbkekas)

A very convenient and useful feature of the Poisson Distribution is that it for large \lambda's can be well approximated by a Normal Distribution. You can see that this approximation makes sense already for the \lambda=10 case in the plot above. For the approximation, set both the mean and the variance parameters of the Normal Distribution to \lambda and you're done - however you should also strip the negative left-side tail of the resulting Normal Distribution to avoid negative event counts. 3

Generating Poisson Process events

Deriving from a uniform distribution

There is a very straight-forward method for generating exponentially distributed values X given uniformly distributed values U:

 X = \frac{-\ln U}{\lambda}

This is the method I have used. You will need to integrate them to get absolute times rather than the times separating the events, which is what the Exponential Distribution gives. If you need a discrete measure of time such as number of clock cycles, perform a simple round-to-nearest on the decimal values.

A Matlab function for using this method looks like this:

function X = rande(n, gamma)
  % Generates a col of n exponentially distr. values X with rate gamma.
  U = rand(1, n); % return n values ~ unif[0 1]
  X = -log(U)./gamma;
end

(Integration is not performed in this function and the returned values are continuous, not discrete.)

More computing-efficient methods

There are methods developed to get rid of the summation (integration) needed above for generating Poisson Process times.

Some methods are suggested in "Non-Uniform Random Variate Generation" (Luc Devroye, 1986) on pages 392-400. The text is available online:
http://luc.devroye.org/chapter_nine.pdf

Supposedly, Donlad Knuth also gives some suggestions in his legendary book "The Art of Computer Programming, volume 2: Seminumerical Algorithms" (3rd edn., Section 3.4.1, p. 133).

Go on to Part III ->

Notes:

  1. This is the classical example. It was described in a report by the mathematician Ladislaus Bortkiewicz in 1898.
  2. the discrete counterpart of the probability density function
  3. for details, see http://en.wikipedia.org/wiki/Poisson_distribution#Related_distributions