
19.3 Mean and Variance of the Lognormal Distribution

The lognormal distribution is the continuous probability distribution of a positive random variable whose natural logarithm is normally distributed. This section gives the formulas for the mean (expected value) and variance of a lognormal variable in terms of the parameters of that underlying normal distribution.

Mean of a Lognormal Distribution

The mean, or expected value, of a lognormal distribution represents the arithmetic average of the lognormal variable itself, not the average of its natural logarithm. It is calculated using the mean ($\mu$) and standard deviation ($\sigma$) of the natural logarithm of the data.

Formula for the Mean

The mean ($\mu_x$) of a lognormal distribution is given by:

$$ \mu_x = e^{(\mu + \frac{\sigma^2}{2})} $$

Where:

$\mu$: The mean of the natural logarithm of the data, $\ln(x)$.
$\sigma$: The standard deviation of the natural logarithm of the data, $\ln(x)$.
$e$: Euler's number, approximately $2.71828$.

Explanation: The mean of the lognormal variable depends on both the central tendency ($\mu$) and the spread ($\sigma$) of its logarithm. Because the exponential function is convex, Jensen's inequality implies that the mean of $x = e^{\ln(x)}$ is at least $e^{\mu}$; the $\frac{\sigma^2}{2}$ term in the exponent is exactly that gap, so $\mu_x > e^{\mu}$ whenever $\sigma > 0$.
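As a rough illustration of the gap between $e^{\mu}$ and the true mean, the sketch below compares the two and checks the formula against a seeded simulation. The function name `lognormal_mean` and the parameter values are assumptions chosen for illustration; the sampler is the standard library's `random.lognormvariate`.

```python
import math
import random

def lognormal_mean(mu: float, sigma: float) -> float:
    """Mean of x when ln(x) ~ Normal(mu, sigma^2)."""
    return math.exp(mu + sigma**2 / 2)

mu, sigma = 2.0, 0.5  # illustrative parameters of ln(x)

naive = math.exp(mu)               # exponentiating the mean of ln(x) only
exact = lognormal_mean(mu, sigma)  # includes the sigma^2 / 2 correction

# Monte Carlo check with a fixed seed so the run is repeatable
rng = random.Random(0)
samples = [rng.lognormvariate(mu, sigma) for _ in range(200_000)]
sample_mean = sum(samples) / len(samples)

print(f"exp(mu)             = {naive:.4f}")
print(f"exp(mu + sigma^2/2) = {exact:.4f}")
print(f"simulated mean      = {sample_mean:.4f}")
```

The simulated mean should land close to the corrected formula and noticeably above $e^{\mu}$, which is what the Jensen's inequality argument predicts.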

Variance of a Lognormal Distribution

The variance of a lognormal distribution quantifies the spread or dispersion of the lognormal variable around its mean. It is also derived from the mean ($\mu$) and standard deviation ($\sigma$) of the natural logarithm of the data.

Formula for the Variance

The variance ($\sigma_x^2$) of a lognormal distribution is calculated as:

$$ \sigma_x^2 = (e^{\sigma^2} - 1) \times e^{(2\mu + \sigma^2)} $$

Where:

$\sigma$: The standard deviation of the natural logarithm of the data, $\ln(x)$.
$\mu$: The mean of the natural logarithm of the data, $\ln(x)$.
$e$: Euler's number, approximately $2.71828$.

Explanation: The variance of the lognormal variable depends strongly on the standard deviation of its logarithm. As $\sigma$ increases, the variance grows rapidly, reflecting a wider spread of data. Since $e^{(2\mu + \sigma^2)} = \mu_x^2$, the formula can also be read as $\sigma_x^2 = (e^{\sigma^2} - 1) \times \mu_x^2$: the factor $(e^{\sigma^2} - 1)$ depends only on $\sigma$ and scales the squared mean.
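To make the growth in $\sigma$ concrete, here is a minimal sketch that evaluates the variance formula for a few values of $\sigma$ with $\mu$ held fixed. The function name `lognormal_variance` and the chosen values are illustrative assumptions, not part of the original text.

```python
import math

def lognormal_variance(mu: float, sigma: float) -> float:
    """Variance of x when ln(x) ~ Normal(mu, sigma^2)."""
    return (math.exp(sigma**2) - 1) * math.exp(2 * mu + sigma**2)

mu = 0.0  # held fixed so that only sigma changes
for sigma in (0.25, 0.5, 1.0, 1.5, 2.0):
    mean = math.exp(mu + sigma**2 / 2)
    var = lognormal_variance(mu, sigma)
    # Equivalent form: variance = (exp(sigma^2) - 1) * mean^2
    assert math.isclose(var, (math.exp(sigma**2) - 1) * mean**2)
    print(f"sigma = {sigma:4.2f}  variance = {var:12.4f}")
```

Each step up in $\sigma$ increases the variance by far more than a constant factor, because $\sigma^2$ appears in both exponentials.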

Summary of Formulas

Quantity	Formula
Mean ($\mu_x$)	$e^{(\mu + \frac{\sigma^2}{2})}$
Variance ($\sigma_x^2$)	$(e^{\sigma^2} - 1) \times e^{(2\mu + \sigma^2)}$
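
Both formulas in the summary can be recovered from one standard result. Writing $X$ for the lognormal variable and $Y = \ln(X)$ for its normally distributed logarithm (notation introduced here for the derivation only), the moment generating function of the normal distribution gives, for any real $k$:

$$ E[X^k] = E[e^{kY}] = e^{(k\mu + \frac{k^2\sigma^2}{2})} $$

Setting $k = 1$ gives the mean, and combining $k = 1$ with $k = 2$ gives the variance:

$$ \mu_x = E[X] = e^{(\mu + \frac{\sigma^2}{2})} $$

$$ \sigma_x^2 = E[X^2] - (E[X])^2 = e^{(2\mu + 2\sigma^2)} - e^{(2\mu + \sigma^2)} = (e^{\sigma^2} - 1) \times e^{(2\mu + \sigma^2)} $$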

Key Properties and Notes

Underlying Normal Distribution: These formulas are valid only when the natural logarithm of the data ($\ln(x)$) is normally distributed.
Distinction from Logarithm's Statistics: The mean and variance of a lognormal variable are not equal to the mean and variance of its log-transformed data, nor are they obtained by simply exponentiating those values.
Skewness: The lognormal distribution is positively skewed. This means that the tail on the right side of the distribution is longer or fatter than the left side. Consequently, for a lognormal distribution:
- Mean > Median > Mode
Data Transformation: Logarithmic transformations are crucial when dealing with data that exhibits lognormal characteristics, allowing us to utilize the properties of the normal distribution for analysis.
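
The notes on data transformation and skewness can be sketched together: take logs of a positive sample, estimate $\mu$ and $\sigma$ from the logs, then plug the estimates into closed forms. The data here is simulated purely for illustration, the variable names are hypothetical, and the median ($e^{\mu}$) and mode ($e^{\mu - \sigma^2}$) expressions are standard results that are not stated in the text above.

```python
import math
import random
import statistics

# Simulated positive data; in practice this would be the observed sample.
rng = random.Random(42)
data = [rng.lognormvariate(2.0, 0.5) for _ in range(50_000)]

# Log-transform, then estimate the parameters of the underlying normal.
logs = [math.log(x) for x in data]
mu_hat = statistics.fmean(logs)
sigma_hat = statistics.stdev(logs)

# Plug the estimates into the closed forms.
mean_hat = math.exp(mu_hat + sigma_hat**2 / 2)
median_hat = math.exp(mu_hat)               # standard result, assumed here
mode_hat = math.exp(mu_hat - sigma_hat**2)  # standard result, assumed here

print(f"mu_hat = {mu_hat:.3f}, sigma_hat = {sigma_hat:.3f}")
print(f"mean = {mean_hat:.3f}, median = {median_hat:.3f}, mode = {mode_hat:.3f}")
```

With any $\sigma > 0$ the three exponents are strictly ordered, so the printed values follow Mean > Median > Mode.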

Examples

Consider a dataset where the natural logarithm of the values ($\ln(x)$) has a mean ($\mu$) of 2 and a standard deviation ($\sigma$) of 0.5.

Calculate the Mean:

$$ \mu_x = e^{(2 + \frac{0.5^2}{2})} = e^{(2 + \frac{0.25}{2})} = e^{(2 + 0.125)} = e^{2.125} \approx 8.373 $$

The mean of the lognormal variable is approximately 8.373.

Calculate the Variance:

$$ \sigma_x^2 = (e^{0.5^2} - 1) \times e^{(2 \times 2 + 0.5^2)} = (e^{0.25} - 1) \times e^{(4 + 0.25)} = (e^{0.25} - 1) \times e^{4.25} $$

$$ \sigma_x^2 \approx (1.28403 - 1) \times 70.1054 = 0.28403 \times 70.1054 \approx 19.912 $$

The variance of the lognormal variable is approximately 19.912.

These calculations demonstrate how to apply the formulas to find the central tendency and spread of a variable that follows a lognormal distribution, given the parameters of its logarithm.
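
The same arithmetic can be checked in a few lines of Python. This is only a sketch using the example's parameters; the variable names are illustrative, and the standard deviation is added as a convenience that the worked example does not compute.

```python
import math

mu, sigma = 2.0, 0.5  # mean and standard deviation of ln(x) from the example

mean_x = math.exp(mu + sigma**2 / 2)
var_x = (math.exp(sigma**2) - 1) * math.exp(2 * mu + sigma**2)
std_x = math.sqrt(var_x)  # same units as x, often easier to interpret

print(f"mean     = {mean_x:.3f}")
print(f"variance = {var_x:.3f}")
print(f"std dev  = {std_x:.3f}")
```

Computing at full floating-point precision avoids the rounding drift that creeps in when intermediate values such as $e^{4.25}$ are rounded by hand.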

Potential Interview Questions

How do you calculate the mean of a lognormal distribution?
What is the formula for the variance of a lognormal distribution?
Why is the mean of a lognormal variable different from the mean of its logarithm?
Explain the significance of Euler’s number ($e$) in lognormal distribution formulas.
How does the standard deviation of $\ln(x)$ affect the variance of the lognormal distribution?
What does it mean for a distribution to be positively skewed, specifically in the context of the lognormal distribution?
Can you describe the relationship between mean, median, and mode in a lognormal distribution?
Why do we use logarithmic transformation when working with lognormal data?
Why do the mean and variance formulas apply only when $\ln(x)$ is normally distributed?
What are common real-world examples where these lognormal mean and variance calculations are applied (e.g., income, stock prices, sizes of biological populations)?
