Discrete Probability Distributions: PMFs Explained

Explore discrete probability distributions, their definitions, and probability mass functions (PMFs). Essential for AI & ML modeling of random variables.

9.7 Probability Distribution Functions of Discrete Distributions

Discrete probability distributions describe the likelihood of outcomes for random variables that can take on a finite or countable number of values. These distributions are fundamental in statistics and probability theory, allowing us to model and understand random phenomena.

Below are key types of discrete distributions, their definitions, and their probability mass functions (PMFs).


I. Discrete Uniform Distribution

The Discrete Uniform Distribution is used when all possible outcomes of a random variable are equally likely.

Example: Rolling a fair six-sided die. Each face (1, 2, 3, 4, 5, 6) has an equal probability of appearing.

Probability Formula (PMF):

$$ P(X = x) = \frac{1}{n}, \quad x \in \{x_1, x_2, \dots, x_n\} $$

Where:

  • $n$: The total number of possible, equally likely outcomes.
  • $x$: Any specific outcome from the set of possible values.
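
As a minimal sketch of the formula above, the die example can be checked in Python. The function name `discrete_uniform_pmf` and its `outcomes` argument are illustrative choices, not part of any library; `Fraction` is used so the result stays exact.

```python
from fractions import Fraction


def discrete_uniform_pmf(x, outcomes):
    """Return P(X = x) when every value in `outcomes` is equally likely."""
    outcomes = list(outcomes)
    if x not in outcomes:
        # Values outside the support have probability zero.
        return Fraction(0)
    return Fraction(1, len(outcomes))


die = range(1, 7)  # faces of a fair six-sided die
print(discrete_uniform_pmf(3, die))                    # 1/6
print(sum(discrete_uniform_pmf(x, die) for x in die))  # 1
```

The second line of output confirms that the probabilities over all six faces add up to 1.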

II. Bernoulli Distribution

The Bernoulli Distribution models a single trial of an experiment that has only two possible outcomes: "success" (usually denoted as 1) and "failure" (usually denoted as 0).

Example: Flipping a fair coin. The outcome can be heads (success) or tails (failure).

Probability Formula (PMF):

$$ P(X = x) = p^x (1 - p)^{1 - x}, \quad x \in \{0, 1\}; \quad 0 < p < 1 $$

Where:

  • $p$: The probability of success on a single trial.
  • $1 - p$: The probability of failure on a single trial.
  • $x$: The outcome of the trial (0 for failure, 1 for success).
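
The formula can be sketched directly in Python. The helper name `bernoulli_pmf` is hypothetical, chosen only for this illustration.

```python
def bernoulli_pmf(x, p):
    """Return P(X = x) for a single trial with success probability p."""
    if x not in (0, 1):
        # Only 0 (failure) and 1 (success) are possible outcomes.
        return 0.0
    return p**x * (1 - p)**(1 - x)


print(bernoulli_pmf(1, 0.5))   # fair coin, heads: 0.5
print(bernoulli_pmf(0, 0.25))  # biased trial, failure: 0.75
```

Note how the exponents act as a switch: when $x = 1$ only the factor $p$ survives, and when $x = 0$ only $1 - p$ does.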

III. Binomial Distribution

The Binomial Distribution is used to model the number of successes in a fixed number of independent Bernoulli trials, where each trial has the same probability of success.

Example: The probability of getting exactly 3 heads in 5 flips of a fair coin.

Probability Formula (PMF):

$$ P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}, \quad x = 0, 1, \dots, n $$

Where:

  • $\binom{n}{x}$ (or $C(n, x)$): The binomial coefficient, representing the number of ways to choose $x$ successes from $n$ trials. It is calculated as $\frac{n!}{x!(n-x)!}$.
  • $n$: The total number of independent trials.
  • $x$: The number of successful outcomes.
  • $p$: The probability of success on a single trial.
  • $1 - p$: The probability of failure on a single trial.
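
A short sketch of this PMF, applied to the coin example above (exactly 3 heads in 5 flips of a fair coin). The function name `binomial_pmf` is an illustrative assumption; `math.comb` from the standard library supplies the binomial coefficient.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Return P(X = x): exactly x successes in n independent trials."""
    if not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p**x * (1 - p)**(n - x)


# Exactly 3 heads in 5 flips of a fair coin: C(5, 3) * 0.5^5 = 10 / 32
print(binomial_pmf(3, 5, 0.5))  # 0.3125
```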

IV. Geometric Distribution

The Geometric Distribution models the number of independent Bernoulli trials needed to achieve the first success.

Example: The number of times you need to roll a die until you get a 6 for the first time.

Probability Formula (PMF):

$$ P(X = x) = (1 - p)^{x - 1} p, \quad x = 1, 2, 3, \dots $$

Where:

  • $p$: The probability of success on each trial.
  • $1 - p$: The probability of failure on each trial.
  • $x$: The number of trials required to get the first success.
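
As a rough sketch, the die example can be evaluated as follows. The name `geometric_pmf` is hypothetical, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def geometric_pmf(x, p):
    """Return P(X = x): the first success occurs on trial x (x = 1, 2, ...)."""
    if x < 1:
        return 0
    # x - 1 failures in a row, followed by one success.
    return (1 - p)**(x - 1) * p


p_six = Fraction(1, 6)  # probability of rolling a 6 on a fair die
print(geometric_pmf(1, p_six))  # 1/6
print(geometric_pmf(3, p_six))  # 25/216
```

The second value is $(5/6)^2 \cdot (1/6)$: two rolls that are not a 6, then a 6 on the third roll.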

V. Negative Binomial Distribution

The Negative Binomial Distribution is a generalization of the Geometric Distribution. It models the total number of independent Bernoulli trials ($x$) required to achieve a fixed number of successes ($k$).

Example: The number of times you need to flip a coin until you get 5 heads.

Probability Formula (PMF):

$$ P(X = x) = \binom{x - 1}{k - 1} \theta^k (1 - \theta)^{x - k}, \quad x = k, k+1, \dots $$

Where:

  • $\binom{x - 1}{k - 1}$: The number of ways to arrange the first $k-1$ successes within the first $x-1$ trials.
  • $\theta$: The probability of success on any single trial.
  • $1 - \theta$: The probability of failure on any single trial.
  • $k$: The target number of successes.
  • $x$: The total number of trials needed to achieve $k$ successes.
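
A minimal sketch of this PMF for the coin example (trials needed to reach 5 heads). The function name `negative_binomial_pmf` is an illustrative assumption; the parameter `theta` mirrors the symbol $\theta$ used above.

```python
from math import comb


def negative_binomial_pmf(x, k, theta):
    """Return P(X = x): the k-th success occurs exactly on trial x."""
    if x < k:
        # At least k trials are needed to see k successes.
        return 0.0
    return comb(x - 1, k - 1) * theta**k * (1 - theta)**(x - k)


# Fair coin, target of 5 heads.
print(negative_binomial_pmf(5, 5, 0.5))   # five heads in a row: 0.03125
print(negative_binomial_pmf(10, 5, 0.5))  # fifth head on flip 10: 0.123046875
```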

VI. Poisson Distribution

The Poisson Distribution is used to model the number of times an event occurs within a fixed interval of time or space, given a known constant average rate of occurrence. Events must occur independently of one another, so the chance of an occurrence does not depend on the time since the last one.

Example: The number of customers arriving at a store per hour, or the number of defects in a square meter of fabric.

Probability Formula (PMF):

$$ P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \dots $$

Where:

  • $\lambda$ (lambda): The average rate of occurrence of the event within the given interval.
  • $e$: Euler's number, the base of the natural logarithm (approximately 2.71828).
  • $x$: The actual number of occurrences of the event.
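
The formula can be sketched with the standard library alone. The function name `poisson_pmf` and the rate of 4 customers per hour are assumptions made for this illustration, not values from the text.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """Return P(X = x) for a Poisson variable with average rate lam."""
    if x < 0:
        return 0.0
    return lam**x * exp(-lam) / factorial(x)


# Assumed rate: 4 customers per hour on average.
print(round(poisson_pmf(2, 4.0), 4))  # probability of exactly 2 arrivals: 0.1465

# The support is infinite, but the terms shrink fast enough that a
# truncated sum is already very close to 1.
print(round(sum(poisson_pmf(x, 4.0) for x in range(50)), 6))  # 1.0
```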

Key Concepts

  • Discrete vs. Continuous Distributions: Discrete distributions deal with countable outcomes, while continuous distributions deal with outcomes that can take any value within a range.
  • Probability Mass Function (PMF): The function that gives the probability that a discrete random variable is exactly equal to some value.
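
Any PMF must assign non-negative probabilities that sum to 1. One way to sketch that check, assuming the PMF is stored as a dictionary from values to probabilities (the helper `is_valid_pmf` and its tolerance are illustrative choices):

```python
from math import isclose


def is_valid_pmf(pmf, tol=1e-9):
    """Check that all probabilities are non-negative and sum to 1."""
    probs = list(pmf.values())
    return all(p >= 0 for p in probs) and isclose(sum(probs), 1.0, abs_tol=tol)


fair_die = {face: 1 / 6 for face in range(1, 7)}
print(is_valid_pmf(fair_die))            # True
print(is_valid_pmf({0: 0.5, 1: 0.6}))    # False: probabilities sum to 1.1
```

The tolerance is there because floating-point sums such as six copies of `1 / 6` may differ from 1 by a tiny rounding error.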

Further Exploration

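  • Binomial vs. Bernoulli: A Binomial random variable with parameters $n$ and $p$ is the sum of $n$ independent Bernoulli random variables, each with success probability $p$. A Bernoulli distribution is therefore a Binomial distribution with $n = 1$.
  • Geometric vs. Negative Binomial: The Geometric distribution is the special case of the Negative Binomial distribution with $k=1$, so it counts the trials needed to get the first success.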
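
Both relationships can be verified numerically. The sketch below builds the distribution of a sum of Bernoulli variables by repeated convolution and compares it with the Binomial formula, then compares the Negative Binomial PMF at $k = 1$ with the Geometric PMF. The helper `convolve` and the values $p = 1/3$, $n = 4$ are assumptions chosen for illustration; `Fraction` keeps the comparison exact.

```python
from fractions import Fraction
from math import comb


def convolve(pmf_a, pmf_b):
    """PMF of A + B for independent A and B, given as {value: probability}."""
    out = {}
    for a, pa in pmf_a.items():
        for b, pb in pmf_b.items():
            out[a + b] = out.get(a + b, 0) + pa * pb
    return out


p = Fraction(1, 3)
n = 4

# Sum of n independent Bernoulli(p) variables, built one trial at a time.
bernoulli = {0: 1 - p, 1: p}
total = {0: Fraction(1)}
for _ in range(n):
    total = convolve(total, bernoulli)

binomial = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}
print(total == binomial)  # True

# Negative Binomial with k = 1 versus Geometric, for x = 1..7.
neg_binomial_k1 = [comb(x - 1, 0) * p**1 * (1 - p)**(x - 1) for x in range(1, 8)]
geometric = [(1 - p)**(x - 1) * p for x in range(1, 8)]
print(neg_binomial_k1 == geometric)  # True
```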

Frequently Asked Questions

Q1: What is a discrete probability distribution? A: A discrete probability distribution describes the probability of each possible outcome for a random variable that can only take on a finite or countably infinite number of distinct values.

Q2: Explain the probability formula for a Discrete Uniform Distribution. A: For a Discrete Uniform Distribution with $n$ equally likely outcomes, the probability of any specific outcome ($x$) is simply $1/n$, meaning each outcome has an equal chance of occurring.

Q3: What kind of events does a Bernoulli Distribution model? A: The Bernoulli distribution models a single trial with only two possible outcomes, typically labeled as "success" (1) and "failure" (0).

Q4: How is the Binomial Distribution different from the Bernoulli Distribution? A: A Bernoulli distribution models a single trial, while the Binomial distribution models the total number of successes over a fixed number of independent Bernoulli trials.

Q5: Give the formula for the probability of exactly $x$ successes in a Binomial Distribution. A: The probability of observing exactly $x$ successes in $n$ trials is $P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$, for $x = 0, 1, \dots, n$.

Q6: What does the Geometric Distribution represent? A: The Geometric distribution represents the probability of needing a certain number of trials ($x$) to achieve the very first success in a series of independent trials, each with the same probability of success ($p$).

Q7: How do you interpret the Negative Binomial Distribution? A: The Negative Binomial distribution gives the probability that it takes a total of $x$ trials to achieve a specified number of successes ($k$).

Q8: What is the role of $\lambda$ in the Poisson Distribution? A: In the Poisson distribution, $\lambda$ (lambda) represents the average rate or expected number of events occurring in a fixed interval of time or space.

Q9: In what scenarios is the Poisson Distribution used? A: The Poisson distribution is used for modeling count data, such as the number of events in a fixed time period, spatial unit, or volume, assuming a constant average rate and independence between events.

Q10: Can a Poisson Distribution be used for time-based event modeling? A: Yes, the Poisson distribution is very commonly used for time-based event modeling, for example, to predict the number of calls received by a call center per hour, or the number of cars passing a certain point on a highway per minute.