Lognormal Distribution in Business Statistics & AI

Explore the Lognormal Distribution in business statistics & AI. Understand its PDF, applications, and differences from the Normal Distribution for data analysis.

19. Lognormal Distribution in Business Statistics

This document provides a comprehensive overview of the Lognormal Distribution, a crucial concept in business statistics. We will explore its mathematical properties, visual representation, applications, and distinctions from the Normal Distribution.

19.1 Probability Density Function (PDF) of Lognormal Distribution

The Lognormal Distribution is a continuous probability distribution of a random variable whose logarithm is normally distributed. If a random variable $X$ follows a lognormal distribution, then $Y = \ln(X)$ follows a normal distribution with mean $\mu$ and standard deviation $\sigma$.
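The definition can be checked by simulation. The sketch below is illustrative only: it draws normal values, exponentiates them, and then recovers $\mu$ and $\sigma$ from the logarithms of the result. The function names and the parameter values ($\mu = 1.0$, $\sigma = 0.5$) are assumptions chosen for the demonstration, not part of the text above.

```python
import math
import random

def sample_lognormal(mu, sigma, n, seed=0):
    """Draw n lognormal values by exponentiating normal draws Y ~ N(mu, sigma^2)."""
    rng = random.Random(seed)
    return [math.exp(rng.gauss(mu, sigma)) for _ in range(n)]

def log_scale_summary(xs):
    """Return the sample mean and sample standard deviation of ln(x)."""
    logs = [math.log(x) for x in xs]
    mean = sum(logs) / len(logs)
    var = sum((v - mean) ** 2 for v in logs) / (len(logs) - 1)
    return mean, math.sqrt(var)

# Hypothetical parameters, chosen only for illustration.
xs = sample_lognormal(mu=1.0, sigma=0.5, n=20_000)
mu_hat, sigma_hat = log_scale_summary(xs)

print(f"smallest draw = {min(xs):.4f} (every draw is positive)")
print(f"mean of ln(x) = {mu_hat:.3f}, sd of ln(x) = {sigma_hat:.3f}")
```

Taking logs of the simulated values should give estimates close to the $\mu$ and $\sigma$ that generated them, which is also the usual way to estimate the two parameters from positive-valued data.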

The probability density function (PDF) of a lognormal distribution is given by:

$$ f(x; \mu, \sigma) = \frac{1}{x\sigma\sqrt{2\pi}} e^{-\frac{(\ln(x) - \mu)^2}{2\sigma^2}} $$

where:

  • $x > 0$ is the value of the random variable.
  • $\mu$ is the mean of the natural logarithm of $X$.
  • $\sigma$ is the standard deviation of the natural logarithm of $X$.

Key characteristics of the PDF:

  • The distribution is defined for positive values only ($x > 0$).
  • The shape of the distribution is governed by $\sigma$, while $\mu$ sets the scale: the median of $X$ is $e^{\mu}$.
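As a minimal sketch, the PDF can be written directly from the formula and sanity-checked by numerical integration. The helper names `lognormal_pdf` and `integrate_pdf`, the midpoint rule, and the cutoff at 50 are choices made for this illustration.

```python
import math

def lognormal_pdf(x, mu, sigma):
    """Lognormal density at x; returns 0 for x <= 0, where the density is not defined."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def integrate_pdf(mu, sigma, upper, steps=200_000):
    """Midpoint-rule approximation of the area under the PDF on (0, upper]."""
    h = upper / steps
    return sum(lognormal_pdf((i + 0.5) * h, mu, sigma) for i in range(steps)) * h

# Illustrative parameters: mu = 0, sigma = 0.5.
area = integrate_pdf(mu=0.0, sigma=0.5, upper=50.0)
print(f"pdf(1.0) = {lognormal_pdf(1.0, 0.0, 0.5):.4f}")
print(f"area under the curve on (0, 50] = {area:.6f}")
```

The area should come out very close to 1, as required of any probability density. The upper limit is far enough into the tail for these parameters that the truncated mass is negligible.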

19.2 Lognormal Distribution Curve

The shape of the lognormal distribution curve is right-skewed. This means that the tail of the distribution extends towards higher values, with a concentration of probability mass at lower values.

The degree of skewness depends only on the standard deviation ($\sigma$) of the underlying normal distribution:

  • As $\sigma$ increases, the distribution becomes more skewed to the right, and the peak (the mode, $e^{\mu - \sigma^2}$) shifts towards zero.
  • As $\sigma$ decreases, the distribution becomes less skewed and approaches a nearly symmetric, bell-shaped curve concentrated around $e^{\mu}$.

The mean ($\mu$) of the underlying normal distribution acts as a scale parameter: increasing $\mu$ by $c$ multiplies every value of $X$ by $e^{c}$, which stretches the distribution along the x-axis without changing its skewness.
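The effect of $\sigma$ on the shape can be checked numerically. The sketch below uses standard closed-form results for the lognormal distribution that are not derived in this document: mode $e^{\mu - \sigma^2}$, median $e^{\mu}$, and skewness $(e^{\sigma^2} + 2)\sqrt{e^{\sigma^2} - 1}$. The function name `shape_summary` and the chosen $\sigma$ values are illustrative.

```python
import math

def shape_summary(mu, sigma):
    """Mode, median, mean and skewness of a lognormal with log-scale parameters mu, sigma."""
    s2 = sigma ** 2
    return {
        "mode": math.exp(mu - s2),
        "median": math.exp(mu),
        "mean": math.exp(mu + s2 / 2),
        # Skewness depends on sigma only, not on mu.
        "skewness": (math.exp(s2) + 2) * math.sqrt(math.expm1(s2)),
    }

for sigma in (0.1, 0.5, 1.0):
    s = shape_summary(mu=0.0, sigma=sigma)
    print(f"sigma={sigma}: mode={s['mode']:.3f} median={s['median']:.3f} "
          f"mean={s['mean']:.3f} skewness={s['skewness']:.3f}")
```

For every $\sigma > 0$ the ordering mode < median < mean holds, and the gap between them widens as $\sigma$ grows. For small $\sigma$ the three values nearly coincide and the skewness is close to zero.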

Visual Representation:

A typical lognormal distribution curve will have:

  • A peak (the mode) at a value below both the median and the mean.
  • A long tail extending to the right.
[Figure: a right-skewed lognormal density curve. X-axis: the variable $X$ (positive values). Y-axis: probability density. The curve starts at 0, rises to a peak, and then decreases gradually, extending far to the right.]

19.3 Mean and Variance of Lognormal Distribution

For a lognormal distribution with parameters $\mu$ (mean of $\ln(X)$) and $\sigma$ (standard deviation of $\ln(X)$):

  • Mean (Expected Value): $$ E[X] = e^{\mu + \frac{\sigma^2}{2}} $$

  • Variance: $$ Var(X) = (e^{\sigma^2} - 1) e^{2\mu + \sigma^2} $$

  • Standard Deviation: $$ SD(X) = \sqrt{Var(X)} = \sqrt{(e^{\sigma^2} - 1) e^{2\mu + \sigma^2}} $$

Interpretation: Both $\mu$ and $\sigma$ affect the mean and the variance. Because $\sigma^2$ appears in an exponent, the mean always exceeds the median $e^{\mu}$, and even small changes in $\sigma$ can lead to substantial changes in the variance and the shape of the distribution.
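The formulas above can be compared against a simulation. This is a sketch under stated assumptions: the function names, the parameter values ($\mu = 0$, $\sigma = 0.5$), the sample size, and the seed are all chosen for illustration. It uses `random.lognormvariate` from the Python standard library, which takes the log-scale $\mu$ and $\sigma$ as arguments.

```python
import math
import random

def lognormal_mean(mu, sigma):
    """E[X] = exp(mu + sigma^2 / 2)."""
    return math.exp(mu + sigma ** 2 / 2)

def lognormal_variance(mu, sigma):
    """Var(X) = (exp(sigma^2) - 1) * exp(2*mu + sigma^2)."""
    return math.expm1(sigma ** 2) * math.exp(2 * mu + sigma ** 2)

def simulate_moments(mu, sigma, n=200_000, seed=42):
    """Sample mean and sample variance of n simulated lognormal draws."""
    rng = random.Random(seed)
    xs = [rng.lognormvariate(mu, sigma) for _ in range(n)]
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return mean, var

mu, sigma = 0.0, 0.5  # illustrative log-scale parameters
exact_mean = lognormal_mean(mu, sigma)
exact_var = lognormal_variance(mu, sigma)
sim_mean, sim_var = simulate_moments(mu, sigma)

print(f"mean:     formula {exact_mean:.4f}, simulated {sim_mean:.4f}")
print(f"variance: formula {exact_var:.4f}, simulated {sim_var:.4f}")
```

The simulated values should agree with the formulas up to sampling noise. For larger $\sigma$ the tail becomes heavier and the sample variance converges more slowly, so a larger sample may be needed for a comparable match.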

19.4 Applications of Lognormal Distribution

The lognormal distribution is frequently used to model phenomena that are inherently positive and exhibit positive skewness. Some common applications in business statistics include:

  • Income Distribution: Individual incomes are often positively skewed, with a few high earners and many individuals with lower incomes.
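  • Stock Prices: Under the common assumption that log returns are normally distributed, the price of a stock at a future date follows a lognormal distribution. Prices cannot fall below zero, which matches the distribution's positive support.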
  • Size of Businesses: The sizes of companies within an industry can vary greatly, with many small businesses and a few very large corporations.
  • Waiting Times: In some queuing systems, especially those with infrequent but long waiting times, the lognormal distribution can be applicable.
  • Sales Data: Sales figures for products or services can also exhibit lognormal characteristics.
  • Reliability Engineering: The lifetime of components or systems that degrade over time can sometimes be modeled using the lognormal distribution.

19.5 Examples of Lognormal Distribution

Example 1: Modeling Income

Suppose the natural logarithm of annual incomes in a particular industry is normally distributed with a mean ($\mu$) of $10.5$ and a standard deviation ($\sigma$) of $0.8$.

  • Mean Income: $E[X] = e^{10.5 + \frac{0.8^2}{2}} = e^{10.5 + 0.32} = e^{10.82} \approx 50{,}011$ (in dollars)

  • Variance of Income: $Var(X) = (e^{0.8^2} - 1) e^{2 \times 10.5 + 0.8^2} = (e^{0.64} - 1) e^{21 + 0.64} = (1.896 - 1) e^{21.64} \approx 0.896 \times 2.501 \times 10^9 \approx 2.24 \times 10^9$, which corresponds to a standard deviation of about $47{,}350$ dollars.

This example illustrates how to calculate the expected income and its variability when the underlying logarithm of income follows a normal distribution.
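The calculation in Example 1 can be reproduced in a few lines. This is a minimal sketch using only the parameters given in the example. The median $e^{\mu}$ is a standard lognormal result added here for comparison and is not part of the example itself.

```python
import math

mu, sigma = 10.5, 0.8  # log-scale parameters from Example 1

mean_income = math.exp(mu + sigma ** 2 / 2)
var_income = math.expm1(sigma ** 2) * math.exp(2 * mu + sigma ** 2)
sd_income = math.sqrt(var_income)
median_income = math.exp(mu)  # added for comparison; not computed in the example

print(f"mean     = {mean_income:,.0f}")
print(f"variance = {var_income:.3e}")
print(f"sd       = {sd_income:,.0f}")
print(f"median   = {median_income:,.0f}")
```

Evaluating the formulas at full floating-point precision is a useful check on hand-rounded intermediate values such as $e^{21.64}$. The mean lies well above the median, which is the right skew of the income distribution showing up in the summary statistics.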

Example 2: Analyzing Stock Price Changes

Consider the daily gross return of a stock, $X = P_t / P_{t-1}$ (today's price divided by yesterday's), and suppose its logarithm is normally distributed with $\mu = 0.001$ and $\sigma = 0.02$. A percentage change can be negative, so it is the price ratio, not the percentage change itself, that can follow a lognormal distribution.

  • Expected Daily Gross Return: $E[X] = e^{0.001 + \frac{0.02^2}{2}} = e^{0.001 + 0.0002} = e^{0.0012} \approx 1.0012$, indicating an expected daily increase of approximately $0.12\%$.

  • Standard Deviation of Daily Gross Return: $SD(X) = \sqrt{(e^{0.02^2} - 1) e^{2 \times 0.001 + 0.02^2}} = \sqrt{(e^{0.0004} - 1) e^{0.002 + 0.0004}} \approx \sqrt{(1.0004 - 1) e^{0.0024}} \approx \sqrt{0.0004 \times 1.0024} \approx \sqrt{0.000401} \approx 0.0200$

This suggests that the expected daily percentage change is around $0.12\%$, with a standard deviation of about $2\%$.
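The same numbers can be checked in code. The sketch below assumes that $X$ is read as the day-over-day price ratio (gross return), since only a positive quantity can be lognormal. The variable names are illustrative.

```python
import math

mu, sigma = 0.001, 0.02  # log-scale parameters from Example 2

# Assumption: X is the price ratio P_t / P_{t-1}, so X - 1 is the fractional change.
expected_ratio = math.exp(mu + sigma ** 2 / 2)
# expm1 keeps precision when sigma^2 is tiny and exp(sigma^2) is very close to 1.
sd_ratio = math.sqrt(math.expm1(sigma ** 2) * math.exp(2 * mu + sigma ** 2))

print(f"expected ratio = {expected_ratio:.6f}")
print(f"expected daily change = {100 * (expected_ratio - 1):.3f}%")
print(f"sd of ratio = {sd_ratio:.6f} (about {100 * sd_ratio:.2f}%)")
```

With $\sigma$ this small, the standard deviation of $X$ is almost identical to $\sigma$ itself, and the distribution of the ratio is nearly symmetric around 1.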

19.6 Difference Between Normal Distribution and Lognormal Distribution

While related, the Normal and Lognormal distributions have fundamental differences, primarily in their support and shape.

| Feature | Normal Distribution | Lognormal Distribution |
|---|---|---|
| Support | Defined for all real numbers ($-\infty$ to $+\infty$). | Defined only for positive real numbers ($0$ to $+\infty$). |
| Shape | Symmetric and bell-shaped. | Right-skewed; has a long tail towards higher values. |
| Parameters | Mean ($\mu$) and Standard Deviation ($\sigma$). | $\mu$ and $\sigma$ of the logarithm of the variable. |
| PDF Formula | $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ | $f(x) = \frac{1}{x\sigma\sqrt{2\pi}} e^{-\frac{(\ln(x)-\mu)^2}{2\sigma^2}}$ |
| Interpretation | Suitable for phenomena that can take any value, positive or negative, and are symmetrically distributed around the mean. | Suitable for phenomena that must be positive and tend to have a disproportionately large number of smaller values compared to a few very large values. |
| Relationship | If $X \sim N(\mu, \sigma^2)$, then $e^X$ follows a lognormal distribution. | If $X$ follows a lognormal distribution, then $\ln(X)$ follows a normal distribution. |
| Example Use Cases | Heights, IQ scores, measurement errors. | Incomes, stock prices, sizes of entities. |

Key Takeaway: The lognormal distribution is derived from the normal distribution by exponentiating a normally distributed variable. This transformation restricts the domain to positive values and introduces positive skewness, making it suitable for modeling a different class of real-world phenomena than the normal distribution.