Normal Distribution Examples & PDF Explained | ML Statistics

Explore practical examples of the Normal Distribution PDF (Gaussian) in Machine Learning. Understand its formula and applications for data analysis.

18.7 Examples of the Normal Distribution Probability Density Function (PDF)

The Normal Distribution, also known as the Gaussian Distribution, is a fundamental concept in statistics. Its probability density function (PDF) describes how densely probability is concentrated around each value of a continuous random variable, that is, the relative likelihood of the variable falling near a given point.

The Normal Distribution PDF Formula

The formula for the probability density function (PDF) of a normal distribution is:

$f(x) = \frac{1}{\sigma \sqrt{2 \pi}} e^{-\frac{1}{2} \left(\frac{x - \mu}{\sigma}\right)^2}$

Where:

  • $f(x)$: The value of the probability density function at point $x$. This represents the relative likelihood for a given value of $x$.
  • $\mu$ (mu): The mean of the distribution. It represents the center of the distribution.
  • $\sigma$ (sigma): The standard deviation of the distribution. It measures the spread or dispersion of the data from the mean. A larger standard deviation indicates a wider, flatter curve, while a smaller standard deviation indicates a narrower, taller curve.
  • $\pi$ (pi): A mathematical constant approximately equal to 3.14159.
  • $e$: The base of the natural logarithm, approximately equal to 2.71828.
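The formula translates almost term for term into code. The following is a minimal sketch using only Python's standard library; the function name `normal_pdf` and the check that $\sigma$ is positive are illustrative choices, not part of the formula itself.

```python
import math


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Evaluate the normal PDF with mean mu and standard deviation sigma at x."""
    if sigma <= 0:
        # The formula divides by sigma, so it is only defined for sigma > 0.
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma                              # standardized distance from the mean
    normalizer = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return normalizer * math.exp(-0.5 * z * z)


# Peak of the standard normal (mu = 0, sigma = 1): 1 / sqrt(2*pi), about 0.3989.
print(normal_pdf(0.0, 0.0, 1.0))
```

Each line mirrors one piece of the formula: `z` is the term in parentheses, `normalizer` is the fraction in front, and `math.exp` supplies the power of $e$.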

Understanding the PDF Values

It's important to note that the PDF value itself is not a probability. For continuous distributions, the probability of a random variable taking on any exact single value is zero. Instead, the PDF tells us about the density of probability around a specific point. The area under the PDF curve between two points represents the probability that the random variable falls within that range.
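A small numerical check can make this distinction concrete. The sketch below assumes an illustrative distribution with $\mu = 0$ and $\sigma = 0.1$ (values chosen only for demonstration) and uses the standard identity that expresses the normal cumulative distribution function through the error function `math.erf`. The helper names are hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sigma):
    # P(X <= x) for a normal variable, written in terms of the error function.
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def interval_probability(a, b, mu, sigma):
    # Area under the PDF between a and b.
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)


mu, sigma = 0.0, 0.1  # illustrative values

density_at_mean = normal_pdf(mu, mu, sigma)                    # about 3.989
p_one_sigma = interval_probability(-0.1, 0.1, mu, sigma)       # about 0.6827
p_single_point = interval_probability(0.05, 0.05, mu, sigma)   # exactly 0.0

print(density_at_mean, p_one_sigma, p_single_point)
```

The density at the mean is greater than 1, which would be impossible for a probability, while the area over an interval stays between 0 and 1 and the area over a single point is zero.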

Examples of PDF Calculation

Let's illustrate the calculation of the PDF with a couple of examples.

Example 1: Find the PDF at $x = 18$

Given:

  • $x = 18$ (the value at which we want to find the PDF)
  • $\mu = 20$ (the mean of the distribution)
  • $\sigma = 4$ (the standard deviation of the distribution)

Step-by-step Solution:

  1. Substitute the given values into the PDF formula: $f(18) = \frac{1}{4 \sqrt{2 \pi}} e^{-\frac{1}{2} \left(\frac{18 - 20}{4}\right)^2}$

  2. Calculate the term inside the parentheses: $\frac{18 - 20}{4} = \frac{-2}{4} = -0.5$

  3. Square the result: $(-0.5)^2 = 0.25$

  4. Calculate the exponent: $-\frac{1}{2} (0.25) = -0.125$

  5. Calculate the terms in the denominator: $\sqrt{2 \pi} \approx \sqrt{2 \times 3.14159} \approx \sqrt{6.28318} \approx 2.5066$, so $4 \sqrt{2 \pi} \approx 4 \times 2.5066 \approx 10.0264$

  6. Calculate $e$ raised to the power of the exponent: $e^{-0.125} \approx 0.8825$

  7. Combine the terms: $f(18) = \frac{1}{10.0264} \times 0.8825 \approx 0.09974 \times 0.8825 \approx 0.08802$

Final Answer:

The PDF value at $x = 18$ is approximately $0.08802$.
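The steps of Example 1 can be replayed in code so that each intermediate value is visible and the hand calculation can be compared against it. This is a sketch using Python's standard library; `statistics.NormalDist` (Python 3.8+) serves as an independent cross-check, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

x, mu, sigma = 18.0, 20.0, 4.0

z = (x - mu) / sigma                              # step 2: -0.5
exponent = -0.5 * z ** 2                          # steps 3-4: -0.125
denominator = sigma * math.sqrt(2.0 * math.pi)    # step 5: about 10.0265
exp_term = math.exp(exponent)                     # step 6: about 0.8825
f_manual = exp_term / denominator                 # step 7

# Cross-check against the standard library's normal distribution.
f_library = NormalDist(mu=mu, sigma=sigma).pdf(x)

print(f"manual:  {f_manual:.5f}")
print(f"library: {f_library:.5f}")
```

Carrying full floating-point precision through every step avoids the small rounding drift that can accumulate when intermediate values are truncated to four or five digits by hand.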

Example 2: Find the PDF at $x = 30$

Given:

  • $x = 30$
  • $\mu = 25$
  • $\sigma = 5$

Step-by-step Solution:

  1. Substitute the given values into the PDF formula: $f(30) = \frac{1}{5 \sqrt{2 \pi}} e^{-\frac{1}{2} \left(\frac{30 - 25}{5}\right)^2}$

  2. Calculate the term inside the parentheses: $\frac{30 - 25}{5} = \frac{5}{5} = 1$

  3. Square the result: $(1)^2 = 1$

  4. Calculate the exponent: $-\frac{1}{2} (1) = -0.5$

  5. Calculate the terms in the denominator: $\sqrt{2 \pi} \approx 2.5066$, so $5 \sqrt{2 \pi} \approx 5 \times 2.5066 \approx 12.533$

  6. Calculate $e$ raised to the power of the exponent: $e^{-0.5} \approx 0.6065$

  7. Combine the terms: $f(30) = \frac{1}{12.533} \times 0.6065 \approx 0.07979 \times 0.6065 \approx 0.04839$

Final Answer:

The PDF value at $x = 30$ is approximately $0.04839$.
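Example 2 also illustrates a useful shortcut: since $x = 30$ lies exactly one standard deviation above the mean, the density equals the standard normal density at $z = 1$ divided by $\sigma$. The sketch below, which assumes Python 3.8+ for `statistics.NormalDist`, checks this relationship; the variable names are illustrative.

```python
from statistics import NormalDist

x, mu, sigma = 30.0, 25.0, 5.0

z = (x - mu) / sigma             # 1.0: x is one standard deviation above the mean
phi_z = NormalDist().pdf(z)      # standard normal density at z, about 0.24197
f_scaled = phi_z / sigma         # rescale by sigma to get the density at x

# Direct evaluation with the example's own mean and standard deviation.
f_direct = NormalDist(mu=mu, sigma=sigma).pdf(x)

print(f"z = {z}, f({x}) = {f_scaled:.5f}")   # about 0.04839, matching the hand calculation
```

The division by $\sigma$ is the same $\frac{1}{\sigma}$ that appears in front of the PDF formula: stretching the curve horizontally by a factor of $\sigma$ must lower it by the same factor to keep the total area at 1.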

Key Takeaways

  • The Normal Distribution PDF is characterized by its mean ($\mu$) and standard deviation ($\sigma$).
  • The mean ($\mu$) dictates the center of the distribution.
  • The standard deviation ($\sigma$) dictates the spread. A larger $\sigma$ leads to a wider, flatter curve, while a smaller $\sigma$ leads to a narrower, taller curve.
  • The PDF value itself represents probability density, not direct probability. The area under the curve over a range is the probability for that range.
  • As $x$ moves further away from the mean ($\mu$), the PDF value decreases, indicating a lower likelihood of observing values far from the mean.
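The takeaways about spread and distance from the mean can be checked numerically. The sketch below uses an illustrative mean of 20 and a few arbitrary values of $\sigma$; it prints the peak height for each $\sigma$ and the density, relative to the peak, at 0 to 3 standard deviations from the mean.

```python
import math


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


mu = 20.0  # illustrative mean

# Peak height at x = mu is 1 / (sigma * sqrt(2*pi)), so it shrinks as sigma grows.
peaks = {s: normal_pdf(mu, mu, s) for s in (1.0, 2.0, 4.0)}

# Density relative to the peak, k standard deviations from the mean: exp(-k**2 / 2).
sigma = 4.0
relative = [
    normal_pdf(mu + k * sigma, mu, sigma) / normal_pdf(mu, mu, sigma)
    for k in range(4)
]

for s, peak in peaks.items():
    print(f"sigma = {s}: peak density {peak:.4f}")
for k, r in enumerate(relative):
    print(f"{k} sigma from the mean: {r:.4f} of the peak")
```

Doubling $\sigma$ halves the peak, and the relative density falls off as $e^{-k^2/2}$, so it drops to roughly 61%, 14%, and 1% of the peak at one, two, and three standard deviations.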

Frequently Asked Questions

Q1: What is the formula for the probability density function (PDF) of a normal distribution? A1: The formula is $f(x) = \frac{1}{\sigma \sqrt{2 \pi}} e^{-\frac{1}{2} \left(\frac{x - \mu}{\sigma}\right)^2}$.

Q2: How do you calculate the PDF value for a given $x$ in a normal distribution? A2: You substitute the values of $x$, $\mu$, and $\sigma$ into the PDF formula and perform the necessary arithmetic operations.

Q3: What roles do mean ($\mu$) and standard deviation ($\sigma$) play in the normal distribution? A3: The mean ($\mu$) sets the center of the distribution, and the standard deviation ($\sigma$) determines the spread or width of the distribution.

Q4: Explain the significance of $e$ and $\pi$ in the normal distribution formula. A4: $\pi$ is a fundamental mathematical constant related to circles, and $e$ is the base of the natural logarithm. The exponential term $e^{-\frac{1}{2} \left(\frac{x - \mu}{\sigma}\right)^2}$ gives the curve its bell shape, while the factor $\frac{1}{\sigma \sqrt{2 \pi}}$ is the normalizing constant that makes the total area under the curve equal to 1 (representing 100% probability).

Q5: How does the PDF of a normal distribution change as $x$ moves away from the mean? A5: As $x$ moves further from the mean ($\mu$) in either direction, the PDF value $f(x)$ decreases, meaning the probability density is lower for values that are further from the center of the distribution.

Q6: Can you walk through the step-by-step process to calculate PDF at a specific $x$ value? A6: Yes, it involves substituting the given $x$, $\mu$, and $\sigma$ into the formula and calculating the terms step-by-step, paying attention to the exponent and the denominator. Examples 1 and 2 above demonstrate this process.

Q7: Why is the normal distribution also called the Gaussian distribution? A7: It is named after the German mathematician Carl Friedrich Gauss, who extensively studied and applied it.

Q8: How would you interpret the PDF value in terms of probability? A8: A higher PDF value at a specific point indicates that values around that point are more likely to occur than values around a point with a lower PDF value. The area under the curve between two points gives the actual probability of the random variable falling within that interval.

Q9: What are some practical applications of the normal distribution PDF? A9: It's used in fields like quality control, finance (for example, as an approximate model for asset returns), genetics, psychology (IQ scores are scaled to follow a normal distribution), and many areas of scientific research to model natural phenomena.

Q10: How can understanding the normal distribution help in real-world data analysis? A10: It allows analysts to make predictions, understand variability, test hypotheses, and build models that approximate many real-world processes, enabling better decision-making and insights from data.