Probability Theorems: Rules & Examples for AI

Master fundamental probability theorems essential for AI & Machine Learning. Explore key concepts, rules, and practical examples for calculating likelihoods.

7. Probability Theorems

This chapter covers the fundamental probability theorems: the rules for complements, unions, intersections, and total probability. Each rule is stated as a formula and illustrated with a worked example.

7.1 What is Probability?

Probability is a measure of the likelihood that an event will occur. It is expressed as a number between 0 and 1, inclusive, where:

  • 0 indicates impossibility (the event cannot happen).
  • 1 indicates certainty (the event is guaranteed to happen).
  • Values between 0 and 1 represent varying degrees of likelihood.

When every possible outcome is equally likely, the probability of an event is:

$P(E) = \frac{\text{Number of favorable outcomes}}{\text{Total number of possible outcomes}}$

Where:

  • $P(E)$ is the probability of event $E$ occurring.
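As a minimal sketch of the counting formula, assuming every outcome is equally likely, the Python snippet below computes the probability of rolling an even number on a fair six-sided die. The function name `classical_probability` and the die example are illustrative choices, not part of the definition above.

```python
from fractions import Fraction


def classical_probability(favorable, sample_space):
    """Return |favorable| / |sample_space|, assuming equally likely outcomes."""
    favorable = set(favorable)
    sample_space = set(sample_space)
    if not sample_space:
        raise ValueError("the sample space must be non-empty")
    if not favorable <= sample_space:
        raise ValueError("favorable outcomes must belong to the sample space")
    return Fraction(len(favorable), len(sample_space))


# Hypothetical example: rolling an even number on a fair six-sided die.
p_even = classical_probability({2, 4, 6}, range(1, 7))
print(p_even)  # 1/2
```

Using `Fraction` keeps the result exact, which makes it easy to compare against hand calculations.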

7.2 Probability Theorems

Probability theorems are a set of rules that govern how probabilities combine and interact. They are essential for solving more complex probability problems and for understanding statistical inference.

7.3 Theorem of Complementary Events

The theorem of complementary events states that the probability of an event not occurring is equal to 1 minus the probability of the event occurring.

Formula: $P(A') = 1 - P(A)$

Where:

  • $A'$ represents the complement of event $A$ (i.e., event $A$ does not occur).
  • $P(A')$ is the probability of event $A'$ occurring.
  • $P(A)$ is the probability of event $A$ occurring.

Explanation: An event and its complement are mutually exclusive and together they cover all possible outcomes. Therefore, their probabilities must sum to 1.

Example: If the probability of rain tomorrow is $0.6$ ($P(\text{Rain}) = 0.6$), then the probability of it not raining is: $P(\text{No Rain}) = 1 - P(\text{Rain}) = 1 - 0.6 = 0.4$
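The rain example can be checked with a few lines of Python. This is only a sketch: the helper name `complement` is an assumption made for illustration, and the probability is stored as an exact fraction to avoid floating-point rounding.

```python
from fractions import Fraction


def complement(p):
    """Return P(A') = 1 - P(A) for a probability p between 0 and 1."""
    if not 0 <= p <= 1:
        raise ValueError("a probability must lie between 0 and 1")
    return 1 - p


p_rain = Fraction(6, 10)          # P(Rain) = 0.6, stored exactly
p_no_rain = complement(p_rain)    # P(No Rain)

print(float(p_no_rain))           # 0.4
print(p_rain + p_no_rain == 1)    # True: an event and its complement sum to 1
```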

7.4 Theorem of Addition

The theorem of addition deals with the probability of either one event or another event occurring. It has two main forms:

7.4.1 For Mutually Exclusive Events

Mutually exclusive events are events that cannot occur at the same time.

Formula: $P(A \cup B) = P(A) + P(B)$

Where:

  • $P(A \cup B)$ is the probability of event $A$ or event $B$ occurring.
  • $P(A)$ is the probability of event $A$ occurring.
  • $P(B)$ is the probability of event $B$ occurring.

Explanation: Mutually exclusive events share no outcomes, so $P(A \cap B) = 0$. No outcome is counted twice, and the probabilities can simply be added.

Example: Consider a single roll of a fair six-sided die. What is the probability of rolling a 1 or a 6?

  • The event of rolling a 1 and the event of rolling a 6 are mutually exclusive.
  • $P(\text{Roll a 1}) = \frac{1}{6}$
  • $P(\text{Roll a 6}) = \frac{1}{6}$
  • $P(\text{Roll a 1 or a 6}) = P(\text{Roll a 1}) + P(\text{Roll a 6}) = \frac{1}{6} + \frac{1}{6} = \frac{2}{6} = \frac{1}{3}$
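The die example can be verified by enumerating the sample space. The sketch below assumes a fair die (equally likely faces); the helper `prob` and the set names are illustrative.

```python
from fractions import Fraction

sample_space = set(range(1, 7))   # faces of a fair six-sided die
roll_1 = {1}
roll_6 = {6}


def prob(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))


# The addition rule in this form only applies to mutually exclusive events.
assert roll_1.isdisjoint(roll_6)

p_union = prob(roll_1) + prob(roll_6)
print(p_union)                            # 1/3
print(p_union == prob(roll_1 | roll_6))   # True: matches counting the union directly
```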

7.4.2 For Non-Mutually Exclusive Events

Non-mutually exclusive events are events that can occur at the same time.

Formula: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Where:

  • $P(A \cup B)$ is the probability of event $A$ or event $B$ (or both) occurring.
  • $P(A)$ is the probability of event $A$ occurring.
  • $P(B)$ is the probability of event $B$ occurring.
  • $P(A \cap B)$ is the probability of both event $A$ and event $B$ occurring.

Explanation: We subtract $P(A \cap B)$ because the outcomes where both $A$ and $B$ occur are counted twice when we sum $P(A)$ and $P(B)$ (once in $P(A)$ and once in $P(B)$).

Example: In a class of 30 students, 15 like mathematics, 10 like science, and 5 like both mathematics and science. What is the probability that a randomly selected student likes mathematics or science?

  • Let $M$ be the event that a student likes mathematics. $P(M) = \frac{15}{30}$
  • Let $S$ be the event that a student likes science. $P(S) = \frac{10}{30}$
  • The event that a student likes both mathematics and science is $M \cap S$. $P(M \cap S) = \frac{5}{30}$
  • The probability that a student likes mathematics or science is: $P(M \cup S) = P(M) + P(S) - P(M \cap S) = \frac{15}{30} + \frac{10}{30} - \frac{5}{30} = \frac{25}{30} - \frac{5}{30} = \frac{20}{30} = \frac{2}{3}$
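To see the double counting concretely, the classroom example can be modelled with Python sets. The student IDs below are hypothetical; they are chosen only so that 15 students like mathematics, 10 like science, and 5 like both, as in the example.

```python
from fractions import Fraction

students = set(range(30))             # hypothetical IDs 0..29
likes_math = set(range(0, 15))        # 15 students
likes_science = set(range(10, 20))    # 10 students; IDs 10..14 also like mathematics


def prob(event):
    """Probability that a randomly selected student is in the event."""
    return Fraction(len(event), len(students))


# Addition rule: subtract the overlap that would otherwise be counted twice.
p_formula = prob(likes_math) + prob(likes_science) - prob(likes_math & likes_science)
# Direct count of students in the union, for comparison.
p_direct = prob(likes_math | likes_science)

print(p_formula)               # 2/3
print(p_formula == p_direct)   # True
```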

7.5 Theorem of Multiplication (Statistical Independence)

The theorem of multiplication is used to calculate the probability that two or more independent events all occur.

Definition of Statistical Independence: Two events $A$ and $B$ are statistically independent if the occurrence of one event does not affect the probability of the other event occurring.

Formula: $P(A \cap B) = P(A) \times P(B)$

Where:

  • $P(A \cap B)$ is the probability of both event $A$ and event $B$ occurring.
  • $P(A)$ is the probability of event $A$ occurring.
  • $P(B)$ is the probability of event $B$ occurring.

Explanation: For independent events, the probability of both happening is simply the product of their individual probabilities.

Example: What is the probability of flipping a fair coin and getting heads twice in a row?

  • Let $H_1$ be the event of getting heads on the first flip. $P(H_1) = 0.5$
  • Let $H_2$ be the event of getting heads on the second flip. $P(H_2) = 0.5$
  • The two coin flips are independent events.
  • $P(H_1 \cap H_2) = P(H_1) \times P(H_2) = 0.5 \times 0.5 = 0.25$
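A short enumeration confirms the product rule for the two coin flips. The sketch assumes a fair coin, so the four ordered outcomes are equally likely; the helper `prob` is an illustrative name.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes of two flips: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))


def prob(predicate):
    """Fraction of equally likely outcomes that satisfy the predicate."""
    favorable = sum(1 for outcome in outcomes if predicate(outcome))
    return Fraction(favorable, len(outcomes))


p_h1 = prob(lambda o: o[0] == "H")       # heads on the first flip
p_h2 = prob(lambda o: o[1] == "H")       # heads on the second flip
p_both = prob(lambda o: o == ("H", "H"))  # heads on both flips

print(p_both)                  # 1/4
print(p_both == p_h1 * p_h2)   # True: the joint probability equals the product
```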

7.5.1 For Dependent Events (Conditional Probability)

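When events are dependent, the occurrence of one changes the probability of the other, so the second factor in the product must be a conditional probability. This general multiplication rule holds for any two events; for independent events $P(B|A) = P(B)$, and it reduces to the formula in 7.5.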

Formula: $P(A \cap B) = P(A) \times P(B|A)$ or $P(A \cap B) = P(B) \times P(A|B)$

Where:

  • $P(B|A)$ is the conditional probability of event $B$ given that event $A$ has occurred (defined when $P(A) > 0$).
  • $P(A|B)$ is the conditional probability of event $A$ given that event $B$ has occurred (defined when $P(B) > 0$).

Example: Suppose you have a bag with 5 red marbles and 3 blue marbles. You draw two marbles without replacement. What is the probability that both marbles are red?

  • Let $R_1$ be the event of drawing a red marble on the first draw. $P(R_1) = \frac{5}{8}$
  • Let $R_2$ be the event of drawing a red marble on the second draw. Because the marbles are drawn without replacement, its probability depends on the first draw:
    • After drawing one red marble, there are 4 red marbles left and a total of 7 marbles.
    • $P(R_2|R_1) = \frac{4}{7}$
  • The probability of both marbles being red is: $P(R_1 \cap R_2) = P(R_1) \times P(R_2|R_1) = \frac{5}{8} \times \frac{4}{7} = \frac{20}{56} = \frac{5}{14}$
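The marble example can be checked by listing every ordered pair of draws. This is a sketch under the assumption that each ordered pair of distinct marbles is equally likely; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

marbles = ["R"] * 5 + ["B"] * 3   # 5 red and 3 blue marbles

# Ordered pairs of distinct marble positions model drawing without replacement.
draws = list(permutations(range(len(marbles)), 2))
first_red = [d for d in draws if marbles[d[0]] == "R"]
both_red = [d for d in first_red if marbles[d[1]] == "R"]

p_r1 = Fraction(len(first_red), len(draws))               # P(R1)
p_r2_given_r1 = Fraction(len(both_red), len(first_red))   # P(R2 | R1)
p_both = Fraction(len(both_red), len(draws))              # P(R1 and R2)

print(p_r1, p_r2_given_r1, p_both)       # 5/8 4/7 5/14
print(p_both == p_r1 * p_r2_given_r1)    # True
```

Here the conditional probability is computed by restricting attention to the draws where the first marble is red, which is exactly what "given $R_1$" means.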

7.6 Theorem of Total Probability

The theorem of total probability allows us to calculate the probability of an event by considering all possible mutually exclusive and exhaustive ways it can occur.

Formula: If $B_1, B_2, \dots, B_n$ are mutually exclusive and exhaustive events (meaning exactly one of them must occur), each with $P(B_i) > 0$, then the probability of an event $A$ is:

$P(A) = P(A|B_1)P(B_1) + P(A|B_2)P(B_2) + \dots + P(A|B_n)P(B_n)$

This can be written using summation notation as: $P(A) = \sum_{i=1}^{n} P(A|B_i)P(B_i)$

Explanation: This theorem breaks down the probability of event $A$ into conditional probabilities over a partition of the sample space. It's like asking, "What's the probability of A happening through scenario 1, plus the probability of A happening through scenario 2, and so on?"
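One way to see why the formula holds, sketched here under the assumption that each $P(B_i) > 0$: because the events $B_1, \dots, B_n$ cover the sample space without overlapping, $A$ splits into the mutually exclusive pieces $A \cap B_1, \dots, A \cap B_n$. Applying the addition rule for mutually exclusive events and then the multiplication rule to each piece gives:

$P(A) = \sum_{i=1}^{n} P(A \cap B_i) = \sum_{i=1}^{n} P(A|B_i)P(B_i)$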

Example: A company has two factories that produce the same type of product. Factory A produces 60% of the products, and Factory B produces 40%. Factory A has a defect rate of 5%, while Factory B has a defect rate of 10%. If a product is chosen at random, what is the probability that it is defective?

  • Let $D$ be the event that a product is defective.
  • Let $A$ be the event that the product was made in Factory A. $P(A) = 0.60$
  • Let $B$ be the event that the product was made in Factory B. $P(B) = 0.40$
  • The events $A$ and $B$ are mutually exclusive and exhaustive (a product must come from either A or B).
  • The probability of a defect given it came from Factory A is $P(D|A) = 0.05$.
  • The probability of a defect given it came from Factory B is $P(D|B) = 0.10$.

Using the theorem of total probability:

$P(D) = P(D|A)P(A) + P(D|B)P(B)$

$P(D) = (0.05)(0.60) + (0.10)(0.40)$

$P(D) = 0.030 + 0.040 = 0.070$

So, the probability that a randomly chosen product is defective is 0.07 or 7%.
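The factory calculation generalizes to any number of scenarios. The sketch below is one possible way to code it; the function name `total_probability` and the dictionary layout are assumptions made for illustration.

```python
import math


def total_probability(priors, likelihoods):
    """Return sum of P(A|B_i) * P(B_i) over a partition B_1, ..., B_n.

    priors maps each scenario to P(B_i); likelihoods maps it to P(A|B_i).
    """
    if priors.keys() != likelihoods.keys():
        raise ValueError("priors and likelihoods must cover the same scenarios")
    if not math.isclose(sum(priors.values()), 1.0):
        raise ValueError("priors must sum to 1 (exhaustive, mutually exclusive)")
    return sum(likelihoods[name] * priors[name] for name in priors)


priors = {"A": 0.60, "B": 0.40}        # share of production per factory
likelihoods = {"A": 0.05, "B": 0.10}   # defect rate per factory

p_defective = total_probability(priors, likelihoods)
print(round(p_defective, 4))  # 0.07
```

The check that the priors sum to 1 guards against passing scenarios that do not form a partition, which is the condition the theorem relies on.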