Negative Binomial: Mean & Variance in ML


14.3 Mean and Variance of the Negative Binomial Distribution

The Negative Binomial Distribution models the number of independent trials required to achieve a fixed number of successes ($k$), where each trial has the same probability of success ($\theta$). In this section the random variable counts all trials, including the $k$ successful ones; some texts and software libraries instead count only the failures, which lowers the mean by $k$ and leaves the variance unchanged. This section details the mean and variance of the distribution, which give the expected number of trials and the variability of that number.

Understanding the Parameters

  • $k$: The fixed number of successes required.
  • $\theta$: The probability of success in a single, independent trial. This probability remains constant across all trials.
  • $1 - \theta$: The probability of failure in a single, independent trial.
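To make the roles of $k$ and $\theta$ concrete, the following minimal sketch simulates a single draw from the distribution by running trials until the $k$-th success. The function name `trials_until_k_successes` and the seed are illustrative choices, not part of the original material.

```python
import random


def trials_until_k_successes(k, theta, rng):
    """Run independent trials until k successes occur; return the total trial count."""
    if k < 1 or not 0 < theta <= 1:
        raise ValueError("need k >= 1 and 0 < theta <= 1")
    trials = 0
    successes = 0
    while successes < k:
        trials += 1
        # rng.random() is uniform on [0, 1), so this branch is taken with probability theta
        if rng.random() < theta:
            successes += 1
    return trials


rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
sample = trials_until_k_successes(k=5, theta=0.6, rng=rng)
print(sample)  # an integer >= 5; the exact value depends on the seed
```

The returned value can never be smaller than $k$, because at least $k$ trials are needed to collect $k$ successes.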

I. Mean of the Negative Binomial Distribution

The mean ($\mu$) of the Negative Binomial Distribution represents the expected number of trials needed to achieve the specified number of successes ($k$).

Formula

The mean is calculated using the following formula:

$$ \mu = \frac{k}{\theta} $$
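One way to see where this formula comes from is a standard argument, sketched here as an aid rather than taken from the original text. Split the total number of trials $X$ into $k$ waiting times $G_1, \dots, G_k$ (notation introduced only for this sketch), where $G_i$ counts the trials after the $(i-1)$-th success up to and including the $i$-th success. Each $G_i$ follows a geometric distribution with success probability $\theta$ and therefore has expectation $1/\theta$. Linearity of expectation then gives:

$$ X = G_1 + G_2 + \dots + G_k \quad\Longrightarrow\quad \mu = E[X] = \sum_{i=1}^{k} E[G_i] = \sum_{i=1}^{k} \frac{1}{\theta} = \frac{k}{\theta} $$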

Interpretation

The mean ($\mu$) is a long-run average: over many repetitions of the experiment, it is the average number of trials needed to reach $k$ successes, counting the trial on which the $k$-th success occurs. A higher probability of success ($\theta$) gives a lower expected number of trials, which makes intuitive sense: if each trial is more likely to succeed, you reach your target sooner. The mean also scales linearly with $k$, so doubling the number of required successes doubles the expected number of trials.

Example

Suppose a biased coin lands heads with probability $\theta = 0.6$, and you keep flipping until you have seen $k = 5$ heads. The expected number of flips is:

$$ \mu = \frac{5}{0.6} \approx 8.33 $$

Any single run takes a whole number of flips (at least 5), but averaged over many runs you would need about 8.33 flips to reach 5 heads.
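The arithmetic above can be checked numerically. The sketch below compares the closed-form mean with a Monte Carlo estimate; the helper names `nb_mean` and `simulate_trials`, the seed, and the sample size of 20,000 are assumptions made for illustration only.

```python
import random
import statistics


def nb_mean(k, theta):
    """Expected number of trials to reach k successes: k / theta."""
    return k / theta


def simulate_trials(k, theta, rng):
    """Count independent trials (success probability theta) until the k-th success."""
    trials = successes = 0
    while successes < k:
        trials += 1
        if rng.random() < theta:
            successes += 1
    return trials


rng = random.Random(42)  # arbitrary fixed seed for repeatability
draws = [simulate_trials(5, 0.6, rng) for _ in range(20_000)]

exact = nb_mean(5, 0.6)
estimate = statistics.mean(draws)
print(f"exact mean:     {exact:.2f}")     # 8.33
print(f"simulated mean: {estimate:.2f}")  # close to 8.33; exact digits depend on the seed
```

With this many draws the simulated average should land within a few hundredths of the closed-form value, although the exact figure varies with the seed.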

II. Variance of the Negative Binomial Distribution

The variance ($\sigma^2$) measures the spread or dispersion of the number of trials around the mean. It quantifies how much the actual number of trials might deviate from the expected number.

Formula

The variance is calculated using the following formula:

$$ \sigma^2 = \frac{k(1 - \theta)}{\theta^2} $$
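A short justification, again a standard argument sketched here rather than something stated in the original text: write the total number of trials as a sum of $k$ independent waiting times $G_1, \dots, G_k$ (notation introduced only for this sketch), where $G_i$ is the number of trials needed to obtain the $i$-th success after the previous one. Each $G_i$ is geometric with success probability $\theta$ and has variance $(1 - \theta)/\theta^2$. Because the $G_i$ are independent, their variances add:

$$ \sigma^2 = \operatorname{Var}\left(\sum_{i=1}^{k} G_i\right) = \sum_{i=1}^{k} \operatorname{Var}(G_i) = \sum_{i=1}^{k} \frac{1 - \theta}{\theta^2} = \frac{k(1 - \theta)}{\theta^2} $$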

Interpretation

Because $\theta^2$ appears in the denominator, the variance is very sensitive to the success probability. For a fixed $k$:

  • A lower probability of success ($\theta$) leads to a higher variance. This means that when success is less likely, the number of trials needed to reach $k$ successes can vary considerably. You might get lucky and achieve success quickly, or you might face a long string of failures before reaching your goal.
  • Conversely, a higher probability of success ($\theta$) leads to a lower variance, indicating that the number of trials needed is more consistently close to the mean.

Example

For the coin-flip example with $\theta = 0.6$ and $k = 5$, the variance is:

$$ \sigma^2 = \frac{5(1 - 0.6)}{(0.6)^2} = \frac{5(0.4)}{0.36} = \frac{2}{0.36} \approx 5.56 $$

If the probability of success were lower, say $\theta = 0.2$ for $k=5$ successes:

$$ \sigma^2 = \frac{5(1 - 0.2)}{(0.2)^2} = \frac{5(0.8)}{0.04} = \frac{4}{0.04} = 100 $$

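Lowering $\theta$ from 0.6 to 0.2 raises the expected number of trials from about 8.33 to 25, a threefold increase, but raises the variance from about 5.56 to 100, an 18-fold increase. The number of trials therefore becomes much less predictable as success becomes rarer.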
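The two variance calculations can be reproduced with a few lines of code. This is a minimal sketch; the function name `nb_variance` is an illustrative choice, and the standard deviation is added only because it is expressed in the same units as the trial count.

```python
import math


def nb_variance(k, theta):
    """Variance of the number of trials to reach k successes: k(1 - theta) / theta^2."""
    if k < 1 or not 0 < theta <= 1:
        raise ValueError("need k >= 1 and 0 < theta <= 1")
    return k * (1 - theta) / theta**2


for theta in (0.6, 0.2):
    var = nb_variance(5, theta)
    sd = math.sqrt(var)  # standard deviation, in units of trials
    print(f"theta={theta}: variance={var:.2f}, sd={sd:.2f}")
# theta=0.6: variance=5.56, sd=2.36
# theta=0.2: variance=100.00, sd=10.00
```

At $\theta = 1$ the function returns 0, matching the intuition that exactly $k$ trials are always needed when every trial succeeds.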

Summary Table

| Statistic | Formula | Description |
|---|---|---|
| Mean | $\mu = \frac{k}{\theta}$ | Average number of trials to achieve $k$ successes |
| Variance | $\sigma^2 = \frac{k(1 - \theta)}{\theta^2}$ | Variability in the number of trials needed to achieve $k$ successes |