Standard Normal Distribution in Machine Learning
Learn how to convert any normal distribution into a standard normal distribution (mean=0, std dev=1) for simplified analysis in machine learning.
18.2 Standard Normal Distribution
A normal distribution is defined by its mean ($\mu$) and standard deviation ($\sigma$), which determine the shape and position of its characteristic bell curve. To simplify the analysis of different normal distributions, we can convert any normal distribution into a standard normal distribution.
A standard normal distribution is a special case of the normal distribution with the following parameters:
- Mean ($\mu$) = 0
- Standard Deviation ($\sigma$) = 1
This conversion process is known as the Z-transformation, and it allows us to use a single, standardized set of probabilities (typically found in Z-tables) to analyze data from any normal distribution.
Z-Transformation Formula
The Z-transformation converts a raw score ($X$) from an original normal distribution into a Z-score, which represents the number of standard deviations that $X$ is away from the mean.
The formula is:
$$Z = \frac{X - \mu}{\sigma}$$
Where:
- $Z$: The standard normal variable (the Z-score).
- $X$: A value from the original distribution.
- $\mu$: The mean of the original distribution.
- $\sigma$: The standard deviation of the original distribution.
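The formula above can be sketched as a small Python function. The name `z_score` and the check for a positive `sigma` are illustrative assumptions, not part of the original text.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Return how many standard deviations x lies from the mean mu."""
    if sigma <= 0:
        # Assumption: a standard deviation must be strictly positive.
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# A raw score of 52 from a distribution with mean 49 and std dev 8
print(z_score(52, 49, 8))
```

A positive result means the value lies above the mean, a negative result means it lies below.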
Properties of the Standard Normal Distribution
The standard normal distribution possesses several key characteristics:
- Symmetry: It is perfectly symmetrical about its mean, which is 0. The area under the curve to the left of the mean therefore equals the area to the right, and each is 0.5.
- Total Area: The total area under the entire curve is equal to 1. This represents 100% of the probability for the distribution.
- Probability Lookup: Probabilities associated with specific Z-scores (i.e., the area under the curve up to a certain point) can be found using pre-calculated Z-tables or statistical software.
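As one possible stand-in for a printed Z-table, the sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8+). The variable names are illustrative assumptions.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
std_normal = NormalDist(mu=0.0, sigma=1.0)

# Cumulative probability P(Z < z), the quantity a Z-table lists
p_below_mean = std_normal.cdf(0.0)  # half the area lies below the mean
p_below_one = std_normal.cdf(1.0)

# Inverse lookup: the z whose cumulative probability is 0.975
z_for_975 = std_normal.inv_cdf(0.975)

print(p_below_mean, round(p_below_one, 4), round(z_for_975, 2))
```

Unlike a table, `cdf` accepts any Z-score directly, so no rounding to two decimal places is needed.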
Important Relationships
The symmetry of the standard normal distribution leads to several useful relationships:
- Symmetry of Tails: $$P(Z < -z) = P(Z > z)$$ This means the probability of a Z-score being less than a negative value is the same as the probability of a Z-score being greater than the corresponding positive value.
- Complementary Probability: $$P(Z > z) = 1 - P(Z < z)$$ The probability of a Z-score being greater than a value is 1 minus the probability of it being less than that value.
- Combined Symmetry and Complement: $$P(Z > -z) = P(Z < z)$$ The probability of a Z-score being greater than a negative value is the same as the probability of a Z-score being less than the corresponding positive value.
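These relationships can be checked numerically. The sketch below is a sanity check under floating-point tolerance, not a proof. The helper name `tail_identities_hold` and the tolerance are assumptions made for illustration.

```python
from math import isclose
from statistics import NormalDist

Z = NormalDist()  # standard normal: mu=0, sigma=1


def tail_identities_hold(z: float, tol: float = 1e-12) -> bool:
    """Check the symmetry-based relationships for one value of z."""
    lower = Z.cdf(z)              # P(Z < z)
    lower_neg = Z.cdf(-z)         # P(Z < -z)
    upper = 1.0 - lower           # P(Z > z), via the complement rule
    upper_neg = 1.0 - lower_neg   # P(Z > -z), via the complement rule

    tails_match = isclose(lower_neg, upper, abs_tol=tol)     # P(Z < -z) = P(Z > z)
    combined_match = isclose(upper_neg, lower, abs_tol=tol)  # P(Z > -z) = P(Z < z)
    return tails_match and combined_match


print(all(tail_identities_hold(z) for z in (0.0, 0.5, 1.375, 2.5)))
```

The complement rule is used to compute the upper tails, so only the two symmetry-based identities are compared explicitly.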
Example: Calculating Probabilities
Let's consider a normal distribution with a mean ($\mu$) of 49 and a variance ($\sigma^2$) of 64. Given $X \sim N(49, 64)$, where the second parameter is the variance, we have:
- Mean ($\mu$) = 49
- Standard Deviation ($\sigma$) = $\sqrt{64}$ = 8
We will calculate probabilities for various values of $X$.
(i) $P(X < 52)$
First, convert $X = 52$ to a Z-score: $$Z = \frac{52 - 49}{8} = \frac{3}{8} = 0.375$$
Now, find the probability $P(Z < 0.375)$ using a calculator, or by interpolating between the Z-table entries for 0.37 and 0.38: $$P(X < 52) = P(Z < 0.375) \approx 0.6462$$
(ii) $P(X > 60)$
Convert $X = 60$ to a Z-score: $$Z = \frac{60 - 49}{8} = \frac{11}{8} = 1.375$$
To find $P(X > 60)$, we use the complementary probability rule: $$P(X > 60) = P(Z > 1.375) = 1 - P(Z < 1.375)$$
Evaluating $P(Z < 1.375)$ with a calculator, or by interpolating between the Z-table entries for 1.37 and 1.38: $$P(Z < 1.375) \approx 0.91543$$
Therefore: $$P(X > 60) = 1 - 0.91543 = 0.08457$$
(iii) $P(X < 45)$
Convert $X = 45$ to a Z-score: $$Z = \frac{45 - 49}{8} = \frac{-4}{8} = -0.5$$
To find $P(X < 45)$, we need $P(Z < -0.5)$. Due to symmetry, we can use the relationship $P(Z < -z) = 1 - P(Z < z)$: $$P(X < 45) = P(Z < -0.5) = 1 - P(Z < 0.5)$$
Using a Z-table for $P(Z < 0.5)$: $$P(Z < 0.5) \approx 0.69146$$
Therefore: $$P(X < 45) = 1 - 0.69146 = 0.30854$$
Alternatively, using the direct symmetry property $P(Z < -0.5) = P(Z > 0.5)$: $$P(Z < -0.5) \approx 0.30854$$
(iv) $P(|X - 49| < 5)$
First, rewrite the absolute value inequality: $$|X - 49| < 5 \quad \text{is equivalent to} \quad -5 < X - 49 < 5$$
Add 49 to all parts of the inequality: $$49 - 5 < X < 49 + 5$$ $$44 < X < 54$$
Now, convert both limits to Z-scores: For $X = 44$: $$Z_1 = \frac{44 - 49}{8} = \frac{-5}{8} = -0.625$$
For $X = 54$: $$Z_2 = \frac{54 - 49}{8} = \frac{5}{8} = 0.625$$
We need to find $P(44 < X < 54)$, which is equivalent to $P(-0.625 < Z < 0.625)$: $$P(-0.625 < Z < 0.625) = P(Z < 0.625) - P(Z < -0.625)$$
Using a Z-table for $P(Z < 0.625)$: $$P(Z < 0.625) \approx 0.73401$$
Using the symmetry property $P(Z < -0.625) = 1 - P(Z < 0.625)$: $$P(Z < -0.625) = 1 - 0.73401 = 0.26599$$
Therefore: $$P(44 < X < 54) = 0.73401 - 0.26599 = 0.46802$$
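The interval calculation above generalizes to any pair of limits: standardize both, then subtract the two cumulative probabilities. The following is a minimal sketch using the standard library. The function name `prob_between` is a hypothetical helper, not something defined in the original text.

```python
from statistics import NormalDist


def prob_between(a: float, b: float, mu: float, sigma: float) -> float:
    """Return P(a < X < b) for X ~ N(mu, sigma^2) via Z-scores."""
    z_low = (a - mu) / sigma
    z_high = (b - mu) / sigma
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Area between the two Z-scores under the standard normal curve
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)


p = prob_between(44, 54, mu=49, sigma=8)
print(f"{p:.4f}")
```

Because the interval here is symmetric about the mean, the same result also follows from $2 \cdot P(Z < 0.625) - 1$.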
Final Probabilities Summary:
- $P(X < 52) \approx 0.6462$
- $P(X > 60) \approx 0.08457$
- $P(X < 45) \approx 0.30854$
- $P(|X - 49| < 5) \approx 0.46802$
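As a cross-check, statistical software can evaluate these probabilities on the original distribution directly, without an explicit Z-transformation. The sketch below assumes Python's `statistics.NormalDist`. The dictionary layout is purely illustrative. Values computed at the exact Z-score can differ in the third or fourth decimal place from table lookups that round Z to two decimals.

```python
from statistics import NormalDist

# X ~ N(49, 64): NormalDist takes the standard deviation, sqrt(64) = 8
X = NormalDist(mu=49, sigma=8)

results = {
    "P(X < 52)": X.cdf(52),
    "P(X > 60)": 1 - X.cdf(60),               # complement rule
    "P(X < 45)": X.cdf(45),
    "P(|X - 49| < 5)": X.cdf(54) - X.cdf(44),  # P(44 < X < 54)
}

for label, p in results.items():
    print(f"{label} = {p:.4f}")
```

Working on `X` directly and working on the standardized `Z` give the same probabilities, since the Z-transformation only rescales the horizontal axis.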
Interview Questions
- What is the primary purpose of converting a normal distribution to a standard normal distribution?
- How do you calculate the Z-score for a given value from a normal distribution?
- What are the fundamental properties that define the standard normal distribution?
- Can you explain the Z-transformation formula and the meaning of each component?
- Describe the process of using Z-tables to determine probabilities for a standard normal distribution.
- Why is the standard normal distribution described as being symmetric about zero?
- How would you compute the probability $P(X < a)$ given the mean and standard deviation of a normal distribution $X$?
- Explain the method for finding the probability that a value from a normal distribution falls between two specific points.
- What is the mathematical relationship between $P(Z < -z)$ and $P(Z > z)$ in the context of the standard normal distribution?
- Provide an example of how the Z-transformation can be applied in real-world data analysis scenarios.