Lognormal Distribution Mean & Variance | ML Concepts
Understand the mean and variance of the lognormal distribution. Learn the formulas based on its underlying normal distribution, crucial for ML and data analysis.
19.3 Mean and Variance of the Lognormal Distribution
The lognormal distribution is a continuous probability distribution of a random variable whose natural logarithm is normally distributed. This documentation outlines the formulas for calculating the mean (expected value) and variance of a lognormal distribution, based on the parameters of its underlying normal distribution.
Mean of a Lognormal Distribution
The mean, or expected value, of a lognormal distribution represents the arithmetic average of the lognormal variable itself, not the average of its natural logarithm. It is calculated using the mean ($\mu$) and standard deviation ($\sigma$) of the natural logarithm of the data.
Formula for the Mean
The mean ($\mu_x$) of a lognormal distribution is given by:
$$ \mu_x = e^{(\mu + \frac{\sigma^2}{2})} $$
Where:
- $\mu$: The mean of the natural logarithm of the data, $\ln(x)$.
- $\sigma$: The standard deviation of the natural logarithm of the data, $\ln(x)$.
- $e$: Euler's number, approximately $2.71828$.
Explanation: The mean of the lognormal variable depends on both the location ($\mu$) and the spread ($\sigma$) of the underlying normal distribution of its logarithm. Because the exponential function is convex, Jensen's inequality gives $E[e^{Y}] \ge e^{E[Y]}$ for $Y = \ln(x)$. For a normally distributed $Y$ the gap is exactly the factor $e^{\sigma^2/2}$, which is why $\frac{\sigma^2}{2}$ is added to $\mu$ in the exponent. The mean therefore exceeds $e^{\mu}$ whenever $\sigma > 0$.
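The effect of the $\frac{\sigma^2}{2}$ term can be seen in a minimal Python sketch. The function name `lognormal_mean` is illustrative and not part of any library; the values $\mu = 2$ and $\sigma = 0.5$ are the ones used in the Examples section.

```python
import math


def lognormal_mean(mu: float, sigma: float) -> float:
    """Mean of X when ln(X) is Normal with mean mu and std. dev. sigma."""
    return math.exp(mu + sigma**2 / 2)


mu, sigma = 2.0, 0.5  # parameters of ln(x), taken from the Examples section

naive = math.exp(mu)                 # exponentiating the log-mean alone
mean_x = lognormal_mean(mu, sigma)   # includes the sigma**2 / 2 correction

print(f"exp(mu)        = {naive:.4f}")
print(f"lognormal mean = {mean_x:.4f}")
print(f"ratio          = {mean_x / naive:.4f}  (equals exp(sigma**2 / 2))")
```

Exponentiating $\mu$ alone understates the mean; the ratio between the two values depends only on $\sigma$.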
Variance of a Lognormal Distribution
The variance of a lognormal distribution quantifies the spread or dispersion of the lognormal variable around its mean. It is also derived from the mean ($\mu$) and standard deviation ($\sigma$) of the natural logarithm of the data.
Formula for the Variance
The variance ($\sigma_x^2$) of a lognormal distribution is calculated as:
$$ \sigma_x^2 = (e^{\sigma^2} - 1) \times e^{(2\mu + \sigma^2)} $$
Where:
- $\sigma$: The standard deviation of the natural logarithm of the data, $\ln(x)$.
- $\mu$: The mean of the natural logarithm of the data, $\ln(x)$.
- $e$: Euler's number, approximately $2.71828$.
Explanation: The variance depends strongly on the standard deviation of the log-transformed data. The second factor, $e^{(2\mu + \sigma^2)}$, is the square of the lognormal mean $\mu_x$, so the formula can be read as $\sigma_x^2 = (e^{\sigma^2} - 1)\,\mu_x^2$. The multiplier $(e^{\sigma^2} - 1)$ depends only on $\sigma$ and grows exponentially in $\sigma^2$, so the spread of the lognormal variable widens rapidly as $\sigma$ increases. Changing $\mu$ rescales the variance by a factor of $e^{2\mu}$ but leaves this multiplier unchanged.
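To illustrate how quickly the variance grows with $\sigma$, here is a small Python sketch. The function name `lognormal_variance` and the grid of $\sigma$ values are assumptions chosen for illustration; `math.expm1` is used because it evaluates $e^{s} - 1$ accurately when $s$ is close to zero.

```python
import math


def lognormal_variance(mu: float, sigma: float) -> float:
    """Variance of X when ln(X) is Normal with mean mu and std. dev. sigma."""
    # expm1(s) computes e**s - 1 without losing precision for tiny s
    return math.expm1(sigma**2) * math.exp(2 * mu + sigma**2)


mu = 2.0  # illustrative log-mean, held fixed while sigma varies
for sigma in (0.25, 0.5, 1.0, 1.5):
    print(f"sigma = {sigma:<4}  variance = {lognormal_variance(mu, sigma):.3f}")
```

With $\mu$ held fixed, each step up in $\sigma$ increases the variance by a larger factor than the previous one.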
Summary of Formulas
| Quantity | Formula |
|---|---|
| Mean ($\mu_x$) | $e^{(\mu + \frac{\sigma^2}{2})}$ |
| Variance ($\sigma_x^2$) | $(e^{\sigma^2} - 1) \times e^{(2\mu + \sigma^2)}$ |
Key Properties and Notes
- Underlying Normal Distribution: These formulas are valid only when the natural logarithm of the data, $\ln(x)$, is normally distributed.
- Distinction from Logarithm's Statistics: The mean and variance of a lognormal variable are not equal to the arithmetic mean and variance of its log-transformed data, nor to their exponentials.
- Skewness: The lognormal distribution is positively skewed: its right tail is longer and heavier than its left. Consequently, for any $\sigma > 0$:
  - Mean > Median > Mode, where the median is $e^{\mu}$ and the mode is $e^{(\mu - \sigma^2)}$.
- Data Transformation: Logarithmic transformations are crucial when dealing with data that exhibits lognormal characteristics, allowing us to utilize the properties of the normal distribution for analysis.
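The ordering of mean, median, and mode noted above can be checked numerically. This sketch relies on the standard closed forms for the lognormal median, $e^{\mu}$, and mode, $e^{(\mu - \sigma^2)}$; the function name `lognormal_center_measures` and the parameter values are illustrative assumptions.

```python
import math


def lognormal_center_measures(mu: float, sigma: float):
    """Return (mean, median, mode) of X when ln(X) ~ Normal(mu, sigma**2)."""
    mean = math.exp(mu + sigma**2 / 2)
    median = math.exp(mu)            # exp of the log-median, which equals mu
    mode = math.exp(mu - sigma**2)   # peak of the lognormal density
    return mean, median, mode


mean, median, mode = lognormal_center_measures(2.0, 0.5)
print(f"mean = {mean:.4f}, median = {median:.4f}, mode = {mode:.4f}")
print("mean > median > mode:", mean > median > mode)
```

The three measures move further apart as $\sigma$ grows and coincide only in the degenerate case $\sigma = 0$.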
Examples
Consider a dataset where the natural logarithm of the values, $\ln(x)$, has a mean ($\mu$) of 2 and a standard deviation ($\sigma$) of 0.5.
- Calculate the Mean: $$ \mu_x = e^{(2 + \frac{0.5^2}{2})} = e^{(2 + \frac{0.25}{2})} = e^{(2 + 0.125)} = e^{2.125} \approx 8.373 $$ The mean of the lognormal variable is approximately 8.373.
- Calculate the Variance: $$ \sigma_x^2 = (e^{0.5^2} - 1) \times e^{(2 \times 2 + 0.5^2)} $$ $$ \sigma_x^2 = (e^{0.25} - 1) \times e^{(4 + 0.25)} $$ $$ \sigma_x^2 = (e^{0.25} - 1) \times e^{4.25} $$ $$ \sigma_x^2 \approx (1.28403 - 1) \times 70.1054 $$ $$ \sigma_x^2 \approx 0.28403 \times 70.1054 \approx 19.912 $$ The variance of the lognormal variable is approximately 19.912, which corresponds to a standard deviation of about 4.462.
These calculations demonstrate how to apply the formulas to find the central tendency and spread of a variable that follows a lognormal distribution, given the parameters of its logarithm.
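As a sanity check on the worked example, the closed-form values can be compared against a simulation. The sketch below uses `random.lognormvariate` from Python's standard library; the helper name `simulate_lognormal_moments`, the sample size, and the seed are arbitrary choices made for illustration.

```python
import math
import random


def simulate_lognormal_moments(mu, sigma, n=100_000, seed=0):
    """Draw n lognormal samples and return (sample mean, sample variance)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    samples = [rng.lognormvariate(mu, sigma) for _ in range(n)]
    sample_mean = sum(samples) / n
    # unbiased sample variance (divides by n - 1)
    sample_var = sum((x - sample_mean) ** 2 for x in samples) / (n - 1)
    return sample_mean, sample_var


mu, sigma = 2.0, 0.5  # parameters of ln(x) from the example above

theory_mean = math.exp(mu + sigma**2 / 2)
theory_var = math.expm1(sigma**2) * math.exp(2 * mu + sigma**2)
sample_mean, sample_var = simulate_lognormal_moments(mu, sigma)

print(f"mean:     formula = {theory_mean:.3f}, simulated = {sample_mean:.3f}")
print(f"variance: formula = {theory_var:.3f}, simulated = {sample_var:.3f}")
```

The simulated values should land close to the formula values but will not match exactly; the sample variance in particular converges slowly because of the heavy right tail.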
Potential Interview Questions
- How do you calculate the mean of a lognormal distribution?
- What is the formula for the variance of a lognormal distribution?
- Why is the mean of a lognormal variable different from the mean of its logarithm?
- Explain the significance of Euler’s number ($e$) in lognormal distribution formulas.
- How does the standard deviation of $\ln(x)$ affect the variance of the lognormal distribution?
- What does it mean for a distribution to be positively skewed, specifically in the context of the lognormal distribution?
- Can you describe the relationship between mean, median, and mode in a lognormal distribution?
- Why do we use logarithmic transformation when working with lognormal data?
- Why do the mean and variance formulas apply only when $\ln(x)$ is normally distributed?
- What are common real-world examples where these lognormal mean and variance calculations are applied (e.g., income, stock prices, sizes of biological populations)?