Poisson Distribution: Meaning, Mean, Variance & Shape | AI

Explore the Poisson Distribution for AI & ML: understand its meaning, characteristics, shape, mean, variance, and probability mass function (PMF).

16. Poisson Distribution: Meaning, Characteristics, Shape, Mean, and Variance

This document provides a comprehensive overview of the Poisson distribution, covering its definition, properties, graphical representation, and applications.

16.1 Probability Mass Function (PMF) of the Poisson Distribution

The Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant mean rate and independently of the time since the last event.

The probability mass function (PMF) of the Poisson distribution is given by:

$$ P(X=k) = \frac{e^{-\lambda} \lambda^k}{k!} $$

Where:

  • $P(X=k)$ is the probability of observing exactly $k$ events.
  • $k$ is the number of occurrences (a non-negative integer, i.e., $k = 0, 1, 2, ...$).
  • $\lambda$ (lambda) is a positive real number representing the average number of events in the given interval (the rate parameter).
  • $e$ is the base of the natural logarithm (approximately 2.71828).
  • $k!$ is the factorial of $k$.

Key Characteristics of the PMF:

  • The probability $P(X=k)$ is always non-negative.
  • The sum of probabilities for all possible values of $k$ equals 1: $\sum_{k=0}^{\infty} \frac{e^{-\lambda} \lambda^k}{k!} = 1$.
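The formula and its two properties can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper name `poisson_pmf` and the choice $\lambda = 2.5$ are illustrative assumptions, not part of the original text.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson distribution with rate lam > 0."""
    if k < 0:
        return 0.0
    # Evaluate in log space so that k! and lam**k cannot overflow for large k.
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

lam = 2.5  # illustrative rate, chosen arbitrarily
probs = [poisson_pmf(k, lam) for k in range(60)]

# The infinite sum is truncated at k = 59; the remaining tail is negligible here.
total = sum(probs)

print(f"P(X=0) = {probs[0]:.4f}")
print(f"all probabilities non-negative: {all(p >= 0 for p in probs)}")
print(f"partial sum over k = 0..59: {total:.12f}")
```

Working with logarithms (`math.lgamma(k + 1)` equals $\ln k!$) is one common way to keep the computation stable; for small $k$ and $\lambda$ the direct formula works just as well.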

16.2 Characteristics of the Poisson Distribution

The Poisson distribution possesses several key characteristics that define its behavior and applicability:

  • Discrete: The random variable can only take on a countable number of values (non-negative integers).
  • Independent Events: The occurrence of one event does not affect the probability of another event occurring.
  • Constant Average Rate: The average rate of events ($\lambda$) is constant over the specified interval.
  • Non-negative Events: The number of events must be zero or a positive integer.
  • No Simultaneous Events: The probability of two or more events occurring in a very small interval is negligible, so events effectively occur one at a time.

16.3 Shape of the Poisson Distribution

The shape of the Poisson distribution is determined by the parameter $\lambda$ (the mean).

  • For small $\lambda$ (e.g., $\lambda < 1$): The distribution is highly skewed to the right. The probability of zero events is highest, and probabilities decrease rapidly as $k$ increases.
  • As $\lambda$ increases: The distribution becomes more symmetric.
  • For large $\lambda$ (e.g., $\lambda > 10$): The distribution closely approximates a normal distribution with mean $\lambda$ and variance $\lambda$.

Visual Representation (Conceptual):

Imagine plotting $P(X=k)$ on the y-axis against $k$ on the x-axis.

  • Low $\lambda$: A steep curve starting high at $k=0$ and quickly dropping off.
  • Medium $\lambda$: A bell-like shape, but still slightly right-skewed.
  • High $\lambda$: A shape very similar to a bell curve (normal distribution).
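The change in shape described above can be made concrete by computing the most probable count (the mode) and the skewness for a few rates. This is a rough sketch: the function `describe_shape` and the three rates are hypothetical choices for illustration, and the skewness uses the known closed form $1/\sqrt{\lambda}$ for the Poisson distribution.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson distribution with rate lam > 0."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def describe_shape(lam: float, k_max: int = 200):
    """Return (mode, skewness) for Poisson(lam), searching k = 0..k_max for the mode."""
    pmf = [poisson_pmf(k, lam) for k in range(k_max + 1)]
    mode = max(range(k_max + 1), key=pmf.__getitem__)
    skewness = 1 / math.sqrt(lam)  # closed-form skewness of the Poisson distribution
    return mode, skewness

# Non-integer rates are used so that the mode is unique.
for lam in (0.5, 4.5, 20.5):
    mode, skew = describe_shape(lam)
    print(f"lambda = {lam:5.1f}: mode = {mode:2d}, skewness = {skew:.3f}")
```

The skewness shrinks toward zero as $\lambda$ grows, which is the numerical counterpart of the distribution becoming more symmetric and bell-shaped.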

16.4 Mean and Variance of the Poisson Distribution

A fundamental property of the Poisson distribution is that its mean and variance are equal to the rate parameter $\lambda$.

  • Mean ($\mathbb{E}[X]$): The average number of events in the interval. $$ \mathbb{E}[X] = \lambda $$

  • Variance ($\text{Var}(X)$): A measure of the spread or variability of the number of events. $$ \text{Var}(X) = \lambda $$

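This equality of mean and variance is a distinguishing feature of the Poisson distribution and is often used as an informal check when assessing whether a dataset follows this distribution: a sample variance much larger than the sample mean (overdispersion) suggests that a Poisson model fits poorly.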
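For readers who want to see where these results come from, both follow from the PMF and the series $e^{\lambda} = \sum_{j=0}^{\infty} \lambda^j / j!$. A short derivation sketch:

$$ \mathbb{E}[X] = \sum_{k=0}^{\infty} k \, \frac{e^{-\lambda} \lambda^k}{k!} = \lambda e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^{k-1}}{(k-1)!} = \lambda e^{-\lambda} e^{\lambda} = \lambda $$

$$ \mathbb{E}[X(X-1)] = \sum_{k=2}^{\infty} k(k-1) \, \frac{e^{-\lambda} \lambda^k}{k!} = \lambda^2 e^{-\lambda} \sum_{k=2}^{\infty} \frac{\lambda^{k-2}}{(k-2)!} = \lambda^2 $$

$$ \text{Var}(X) = \mathbb{E}[X(X-1)] + \mathbb{E}[X] - (\mathbb{E}[X])^2 = \lambda^2 + \lambda - \lambda^2 = \lambda $$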

16.5 Fitting a Poisson Distribution

Fitting a Poisson distribution to data involves estimating the rate parameter $\lambda$ from observed counts. The most common method is using the sample mean as an estimate for $\lambda$.

Given a set of observed counts $x_1, x_2, \dots, x_n$ (representing the number of events in $n$ intervals), the maximum likelihood estimate for $\lambda$ is the sample mean:

$$ \hat{\lambda} = \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i $$

Once $\hat{\lambda}$ is estimated, the Poisson PMF can be used to calculate probabilities for different numbers of events or to generate expected counts for comparison with observed data.
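The fitting procedure can be sketched in a few lines of Python. The counts below are hypothetical data invented purely for illustration (events observed in each of 20 intervals), and `poisson_pmf` is an assumed helper name.

```python
import math
from collections import Counter

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson distribution with rate lam > 0."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

# Hypothetical observations: number of events in each of n = 20 intervals.
counts = [2, 3, 1, 0, 4, 2, 2, 5, 1, 3, 2, 0, 3, 1, 2, 4, 2, 1, 3, 2]
n = len(counts)

# Maximum likelihood estimate of lambda: the sample mean.
lam_hat = sum(counts) / n
print(f"estimated lambda = {lam_hat:.2f}")

# Compare how often each count was observed with the expected frequency n * P(X = k).
observed = Counter(counts)
for k in range(max(counts) + 1):
    expected = n * poisson_pmf(k, lam_hat)
    print(f"k = {k}: observed = {observed[k]:2d}, expected = {expected:5.2f}")
```

Large gaps between the observed and expected columns would suggest that a Poisson model is a poor fit; a formal goodness-of-fit test would be needed to quantify this.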

16.6 Poisson Distribution as an Approximation to the Binomial Distribution

The Poisson distribution can serve as a good approximation to the binomial distribution under certain conditions. This approximation is particularly useful when the number of trials ($n$) in a binomial experiment is very large, and the probability of success ($p$) in each trial is very small.

Specifically, if $X \sim \text{Binomial}(n, p)$ and we let $n \to \infty$ and $p \to 0$ while the product $np = \lambda$ is held constant, then the distribution of $X$ converges to a Poisson distribution with parameter $\lambda$:

$$ \text{Binomial}(n, p) \approx \text{Poisson}(\lambda = np) $$

This approximation simplifies calculations when dealing with rare events over many trials.
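The quality of the approximation can be inspected directly by comparing the two PMFs. In the sketch below, the values $n = 1000$ and $p = 0.003$ are illustrative assumptions (a large number of trials with a small success probability), and the function names are hypothetical.

```python
import math

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

n, p = 1000, 0.003  # illustrative: many trials, rare success
lam = n * p         # matching Poisson rate

for k in range(8):
    b = binomial_pmf(k, n, p)
    q = poisson_pmf(k, lam)
    print(f"k = {k}: binomial = {b:.5f}, poisson = {q:.5f}")

# Largest pointwise gap between the two PMFs over k = 0..20.
max_diff = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(21))
print(f"max absolute difference: {max_diff:.2e}")
```

Rerunning with a larger $p$ (for example 0.3, with $n$ reduced to keep $np$ moderate) shows the gap widening, which illustrates why the approximation is reserved for rare events.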

16.7 Examples of Poisson Distribution

The Poisson distribution is widely applicable in various fields for modeling the number of events occurring within a specific time or space. Here are some common examples:

  • Telecommunications: The number of phone calls received by a call center per hour.
  • Quality Control: The number of defects found in a manufactured product.
  • Biology: The number of mutations in a DNA strand of a certain length.
  • Insurance: The number of claims filed by policyholders in a given month.
  • Traffic Engineering: The number of vehicles passing a point on a highway in a minute.
  • Healthcare: The number of patients admitted to an emergency room per day.
  • Astronomy: The number of stars observed in a specific area of the sky.

Example Scenario:

Suppose a customer service center receives an average of 5 calls per hour. We want to find the probability of receiving exactly 3 calls in a given hour.

Here, $\lambda = 5$ (average rate) and we want to find $P(X=3)$. Using the Poisson PMF:

$$ P(X=3) = \frac{e^{-5} 5^3}{3!} = \frac{e^{-5} \times 125}{6} $$

Calculating this value, with $e^{-5} \approx 0.006738$:

$$ P(X=3) \approx \frac{0.006738 \times 125}{6} \approx \frac{0.84225}{6} \approx 0.140375 $$

So, the probability of receiving exactly 3 calls in an hour is approximately 0.1404 or 14.04%.
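The same calculation can be reproduced in Python as a quick check. The variable names are illustrative; the cumulative probability at the end is an extra quantity computed from the same PMF, not part of the scenario above.

```python
import math

lam = 5  # average number of calls per hour
k = 3    # number of calls of interest

# Poisson PMF evaluated directly from the formula.
p_exactly_3 = math.exp(-lam) * lam**k / math.factorial(k)
print(f"P(X = 3) = {p_exactly_3:.4f}")  # prints P(X = 3) = 0.1404

# Probability of at most 3 calls: sum the PMF over k = 0, 1, 2, 3.
p_at_most_3 = sum(math.exp(-lam) * lam**j / math.factorial(j) for j in range(k + 1))
print(f"P(X <= 3) = {p_at_most_3:.4f}")
```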
