
12.3 Negative Binomial Distribution

The Negative Binomial Distribution is a discrete probability distribution. It models the number of independent trials required to achieve a fixed number of successes, where every trial has the same probability of success.

This distribution answers the question:

How many trials are needed until the r-th success occurs, given each trial has a constant probability of success?

Probability Mass Function (PMF)

Because the distribution is discrete, its probability function is properly called the Probability Mass Function (PMF), although it is sometimes loosely referred to as a PDF. It gives the probability that the r-th success occurs on exactly the x-th trial.

Parameters:

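x: The total number of trials, counted up to and including the trial on which the r-th success occurs ($x \ge r$).
r: The target number of successes (a positive integer).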
p: The probability of success in each trial ($0 < p < 1$).
q: The probability of failure in each trial ($q = 1 - p$).

Formula:

The probability of achieving the r-th success on the x-th trial is given by:

$P(X = x) = \binom{x - 1}{r - 1} p^r q^{x-r}$

This can also be written using factorials:

$P(X = x) = \frac{(x - 1)!}{(r - 1)! (x - r)!} p^r (1 - p)^{x - r}$

Where:

$x = r, r+1, r+2, \dots$
$0 < p < 1$
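As a minimal sketch, the formula can be evaluated directly with Python's standard library. The function name `neg_binom_pmf` and the sample values are hypothetical, introduced here only for illustration.

```python
from math import comb


def neg_binom_pmf(x: int, r: int, p: float) -> float:
    """Probability that the r-th success occurs on exactly the x-th trial."""
    if r < 1:
        raise ValueError("r must be a positive integer")
    if not 0 < p < 1:
        raise ValueError("p must satisfy 0 < p < 1")
    if x < r:
        # Fewer than r trials cannot contain r successes.
        return 0.0
    q = 1 - p
    # C(x-1, r-1) placements of the first r-1 successes, times p^r * q^(x-r).
    return comb(x - 1, r - 1) * p**r * q ** (x - r)


# Illustrative values (assumed, not from the text): r = 3, p = 0.5.
print(neg_binom_pmf(5, 3, 0.5))  # 0.1875

# Summing over the support x = r, r+1, ... should approach 1.
total = sum(neg_binom_pmf(x, 3, 0.5) for x in range(3, 200))
```

Here `total` is a truncated sum over the support, so it only approximates 1; the cutoff of 200 is an arbitrary choice that is ample for these parameter values.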

Explanation of the Formula:

$\binom{x - 1}{r - 1}$: This is the binomial coefficient. It represents the number of ways to choose the positions of the first r-1 successes within the first x-1 trials. The x-th trial must be the r-th success.
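$p^r$: This is the probability of the r successes, each of which occurs with probability p.
$q^{x-r}$ or $(1-p)^{x-r}$: This is the probability of the $x-r$ failures, each of which occurs with probability q. Because the trials are independent, these probabilities multiply, and the binomial coefficient counts how many distinct orderings share that same probability.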
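The counting argument behind the binomial coefficient can be checked by brute force. The sketch below enumerates every success/failure sequence of length x and keeps those with exactly r successes, the last of which falls on the final trial. The helper `count_sequences` and the values x = 5, r = 3 are assumptions chosen for illustration.

```python
from itertools import product
from math import comb


def count_sequences(x: int, r: int) -> int:
    """Count sequences of length x whose r-th success falls on trial x."""
    count = 0
    for seq in product((0, 1), repeat=x):  # 1 = success, 0 = failure
        # The last trial must be a success and the total must be exactly r.
        if seq[-1] == 1 and sum(seq) == r:
            count += 1
    return count


x, r = 5, 3  # small illustrative values
print(count_sequences(x, r), comb(x - 1, r - 1))  # 6 6
```

Enumeration grows as $2^x$, so this is only practical as a sanity check for small x.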

Real-Life Examples

Example 1: Basketball Player

Imagine a basketball player shooting hoops. Each successful shot is a "success," and each missed shot is a "failure." If the player has a constant probability p of making a shot, the negative binomial distribution can be used to calculate the probability that they will make their 3rd basket on their 5th shot.
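To make this concrete, the sketch below works the calculation through for an assumed shooting accuracy of p = 0.6. The text does not specify p, so this value is purely illustrative.

```python
from math import comb

p = 0.6        # assumed shooting accuracy; the text leaves p unspecified
r, x = 3, 5    # 3rd basket on the 5th shot

# Placements of the first 2 made shots among shots 1 to 4.
ways = comb(x - 1, r - 1)
prob = ways * p**r * (1 - p) ** (x - r)

print(ways)            # 6
print(round(prob, 4))  # 0.2074
```

Under this assumption, roughly one time in five the player's 3rd basket would come on exactly the 5th shot.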

Example 2: Pop Quiz

Suppose a student is taking a pop quiz. Each correct answer is a "success," and each incorrect answer is a "failure." If the student answers each question correctly with the same probability p, independently of the other questions, the negative binomial distribution gives the probability that the 2nd correct answer comes on a specific question (e.g., the 10th).
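One way to sanity-check the formula for a scenario like this is a small Monte Carlo simulation. The sketch below assumes p = 0.25 (a value not given in the text) and compares the simulated frequency of "2nd correct answer on question 10" with the formula. The helper name `trials_until_rth_success`, the seed, and the run count are arbitrary choices.

```python
import random
from math import comb


def trials_until_rth_success(r: int, p: float, rng: random.Random) -> int:
    """Simulate independent trials; return the trial number of the r-th success."""
    trials = successes = 0
    while successes < r:
        trials += 1
        if rng.random() < p:
            successes += 1
    return trials


p, r, x = 0.25, 2, 10     # p is an assumed value; r and x follow the example
rng = random.Random(0)    # fixed seed so the run is repeatable
n_runs = 100_000

hits = sum(trials_until_rth_success(r, p, rng) == x for _ in range(n_runs))
estimate = hits / n_runs
exact = comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)

print(f"simulated: {estimate:.4f}  formula: {exact:.4f}")
```

With these assumed values the formula gives about 0.0563, and the simulated estimate should land close to it, with the gap shrinking as `n_runs` grows.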

Key Characteristics

Independent Trials: Each trial is independent of the others.
Two Outcomes: Each trial results in either a success or a failure.
Constant Probability of Success: The probability of success (p) remains the same for every trial.
Focus on Number of Trials: The distribution is concerned with the number of trials needed to reach a specific number of successes, not the number of successes within a fixed number of trials (which is the domain of the binomial distribution).

Related Distributions

Binomial Distribution: The binomial distribution counts the number of successes in a fixed number of trials, whereas the negative binomial distribution counts the number of trials needed for a fixed number of successes.
Geometric Distribution: The geometric distribution is a special case of the negative binomial distribution where $r=1$. It specifically models the number of trials needed to achieve the first success.
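Both relationships can be verified numerically. The sketch below uses hypothetical helper functions and an assumed p = 0.3 to check that the r = 1 case matches the geometric PMF, and that "r-th success on trial x" equals "r-1 successes in the first x-1 trials (a binomial event), followed by a success."

```python
from math import comb, isclose


def neg_binom_pmf(x: int, r: int, p: float) -> float:
    """P(r-th success occurs on trial x)."""
    if x < r:
        return 0.0
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k successes in n trials)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def geom_pmf(x: int, p: float) -> float:
    """P(first success occurs on trial x)."""
    return (1 - p) ** (x - 1) * p


p = 0.3  # assumed value for illustration

# Geometric distribution is the r = 1 special case.
geometric_ok = all(
    isclose(neg_binom_pmf(x, 1, p), geom_pmf(x, p)) for x in range(1, 20)
)
print(geometric_ok)  # True

# r-1 successes in the first x-1 trials, then a success on trial x.
x, r = 7, 3
binomial_ok = isclose(neg_binom_pmf(x, r, p), binom_pmf(r - 1, x - 1, p) * p)
print(binomial_ok)  # True
```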

Interview Questions

What is the negative binomial distribution, and when is it typically applied?
Explain the role of the parameters r and p in the negative binomial distribution.
Can you walk through the derivation or interpretation of the negative binomial probability mass function?
What distinguishes the negative binomial distribution from the binomial distribution?
What are the fundamental assumptions that must hold for the negative binomial distribution to be applicable?
How would you calculate the probability of requiring exactly x trials to achieve r successes?
Provide a practical, real-world scenario where the negative binomial distribution would be an appropriate model.
How does the negative binomial distribution account for the inherent variability in the number of trials?
What is the significance of the binomial coefficient $\binom{x - 1}{r - 1}$ in the context of the negative binomial formula?
Describe how you would use the negative binomial distribution to model a process involving repeated attempts with a fixed probability of success.
