Measures of Central Tendency: Mean, Median, Mode Explained

Understand measures of central tendency like mean, median, and mode. Learn how these statistical values summarize data for AI & machine learning.

Measures of Central Tendency

Measures of central tendency are statistical values that represent the center or typical value of a dataset. They provide a single value that summarizes the data, indicating where most of the data points tend to cluster. The most common measures of central tendency are the mean, median, and mode.

Mean

The mean, often referred to as the average, is calculated by summing all the values in a dataset and then dividing by the total number of values.

Formula

$$ \text{Mean} (\bar{x}) = \frac{\sum_{i=1}^{n} x_i}{n} $$

Where:

  • $x_i$ is the $i$-th value in the dataset.
  • $\sum_{i=1}^{n} x_i$ is the sum of all values in the dataset.
  • $n$ is the total number of values in the dataset.

Example

Consider the following dataset: $\{10, 12, 15, 18, 20\}$

The sum of the values is $10 + 12 + 15 + 18 + 20 = 75$. The number of values is $n=5$.

Therefore, the mean is: $$ \bar{x} = \frac{75}{5} = 15 $$
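The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function name `mean` and the choice to raise an error on empty input are assumptions made for this example.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    if not values:
        # Assumption for this sketch: an empty dataset has no mean.
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


data = [10, 12, 15, 18, 20]
print(mean(data))  # 15.0
```

In practice, Python's standard `statistics.mean` performs the same computation.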

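The mean is sensitive to outliers (extreme values): because every value contributes to the sum, a single unusually large or small value can pull the mean away from where most of the data lies.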

Median

The median is the middle value in a dataset that has been ordered from least to greatest. If the dataset has an even number of values, the median is the average of the two middle values.

Calculation

  1. Order the data: Arrange the dataset in ascending or descending order.
  2. Find the middle value:
    • If the number of data points ($n$) is odd, the median is the value at the $\frac{n+1}{2}$ position.
    • If the number of data points ($n$) is even, the median is the average of the values at the $\frac{n}{2}$ and $\frac{n}{2} + 1$ positions.
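The two steps above might be written as follows. This is a sketch under stated assumptions: the function name `median` is chosen for this example, and the 1-based positions in the text are translated to Python's 0-based list indices.

```python
def median(values):
    """Median following the two steps above: order, then pick the middle."""
    if not values:
        # Assumption for this sketch: an empty dataset has no median.
        raise ValueError("median requires at least one value")
    ordered = sorted(values)   # Step 1: order the data
    n = len(ordered)
    mid = n // 2               # 0-based index of position n/2 + 1
    if n % 2 == 1:
        # Odd n: position (n + 1) / 2 is index n // 2
        return ordered[mid]
    # Even n: average the values at positions n/2 and n/2 + 1
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([20, 10, 18, 12, 15]))      # 15
print(median([22, 10, 18, 12, 20, 15]))  # 16.5
```

The inputs are deliberately unordered to show that the sorting step is part of the calculation.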

Example

Consider the following dataset: $\{10, 12, 15, 18, 20\}$

  1. The data is already ordered.
  2. The number of values is $n=5$ (odd). The middle position is $\frac{5+1}{2} = 3$.
  3. The value at the 3rd position is 15.

Therefore, the median is 15.

Consider another dataset: $\{10, 12, 15, 18, 20, 22\}$

  1. The data is already ordered.
  2. The number of values is $n=6$ (even). The middle positions are $\frac{6}{2} = 3$ and $\frac{6}{2} + 1 = 4$.
  3. The values at the 3rd and 4th positions are 15 and 18.
  4. The median is the average of these two values: $\frac{15 + 18}{2} = \frac{33}{2} = 16.5$.

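The median is less affected by outliers than the mean, because it depends only on the order of the values and not on how large or small the extreme values are.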
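To make the contrast with the mean concrete, the sketch below compares both measures on the five-value dataset used earlier and on a modified copy. The modified dataset, in which 20 is replaced by 200, is a hypothetical example introduced here and does not come from the text above. It uses Python's standard `statistics` module.

```python
import statistics

original = [10, 12, 15, 18, 20]
with_outlier = [10, 12, 15, 18, 200]  # hypothetical: 20 replaced by 200

for name, data in [("original", original), ("with outlier", with_outlier)]:
    m = statistics.mean(data)
    med = statistics.median(data)
    print(f"{name}: mean={m:.1f}, median={med:.1f}")

# original: mean=15.0, median=15.0
# with outlier: mean=51.0, median=15.0
```

The single extreme value more than triples the mean, while the median does not move.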

Mode

The mode is the value that appears most frequently in a dataset. A dataset can have one mode (unimodal), more than one mode (multimodal; bimodal when there are exactly two), or no mode if all values appear with the same frequency.

Example

Consider the following dataset: $\{10, 12, 15, 15, 18, 20, 20, 20, 22\}$

The value 20 appears three times, which is more frequent than any other value.

Therefore, the mode is 20.

Consider the dataset: $\{10, 12, 15, 18, 20\}$

All values appear only once, so there is no mode.

Consider the dataset: $\{10, 12, 12, 15, 15, 18, 20\}$

Both 12 and 15 appear twice, more often than any other value.

Therefore, this dataset is bimodal with modes 12 and 15.
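The three cases above (one mode, no mode, several modes) can be sketched with a frequency count. The function name `modes` is hypothetical, and returning an empty list when every value is equally frequent is a choice made here to match the "no mode" convention used in this article.

```python
from collections import Counter


def modes(values):
    """Return the most frequent values, or [] if there is no mode."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    # Convention from the text: if several distinct values all share
    # the same frequency, the dataset has no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([10, 12, 15, 15, 18, 20, 20, 20, 22]))  # [20]
print(modes([10, 12, 15, 18, 20]))                  # []
print(modes([10, 12, 12, 15, 15, 18, 20]))          # [12, 15]
```

Note that Python's standard `statistics.multimode` follows a different convention: when all values are equally frequent it returns every value rather than an empty list.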

The mode is useful for identifying the most common outcome in a categorical or discrete dataset.
