Mode: Central Tendency in Data - AI & ML Explained
Understand the mode, a key central tendency measure in AI & ML. Learn about unimodal, bimodal, multimodal, and no-mode datasets with calculation methods.
Mode
The mode is a measure of central tendency that represents the value occurring with the highest frequency within a dataset.
A dataset can exhibit different types of modes:
- Unimodal: Contains a single mode.
- Bimodal: Contains two modes.
- Multimodal: Contains more than two modes.
- No Mode: All values appear with the same frequency.
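The four cases above can be told apart by counting how many distinct values share the highest frequency. The following is a minimal sketch, not a standard library routine: the function name `classify_modality` and its return shape are illustrative assumptions, and treating a dataset with only one distinct value as unimodal is a convention chosen here.

```python
from collections import Counter


def classify_modality(values):
    """Return (label, modes) for a non-empty list of sortable, hashable values."""
    if not values:
        raise ValueError("values must not be empty")
    counts = Counter(values)
    top = max(counts.values())
    # All values that reach the highest frequency are candidate modes.
    modes = sorted(v for v, c in counts.items() if c == top)
    # If every distinct value is tied at the top frequency, there is no mode.
    # Assumption: a dataset with a single distinct value counts as unimodal.
    if len(modes) == len(counts) and len(counts) > 1:
        return "no mode", []
    labels = {1: "unimodal", 2: "bimodal"}
    return labels.get(len(modes), "multimodal"), modes


print(classify_modality([1, 1, 2, 2, 3]))  # two values tie for the top count
print(classify_modality([1, 2, 3]))        # every value appears once
```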
Calculating the Mode
The method for calculating the mode depends on whether the data is ungrouped or grouped.
For Ungrouped Data
For ungrouped data (a list of individual values), the mode is found by simply identifying the value that appears most frequently. There is no specific formula; it's an observation-based process.
Example:
Consider the following dataset:
3, 7, 7, 9, 10, 10, 10
In this dataset, the number 10 appears three times, which is more than any other number. Therefore, the mode is 10.
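In Python, the same observation can be reproduced with the standard library's `statistics` module. This is a short sketch using the dataset from the example; `statistics.multimode` requires Python 3.8 or later and returns every value tied for the highest frequency.

```python
import statistics

data = [3, 7, 7, 9, 10, 10, 10]

# mode() returns a single most frequent value.
print(statistics.mode(data))       # 10

# multimode() returns a list, which is useful for bimodal or multimodal data.
print(statistics.multimode(data))  # [10]
```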
For Grouped Data
For grouped data (data organized into classes or intervals), the mode is estimated using the following formula:
$$ \text{Mode} = L + \left[ \frac{f_1 - f_0}{(2f_1 - f_0 - f_2)} \right] \times h $$
Where:
- $L$: The lower boundary of the modal class. The modal class is the class interval with the highest frequency.
- $f_1$: The frequency of the modal class.
- $f_0$: The frequency of the class preceding the modal class.
- $f_2$: The frequency of the class succeeding the modal class.
- $h$: The class width (the difference between the upper and lower boundaries of a class interval).
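The formula translates directly into a small function. The sketch below is illustrative only: the name `grouped_mode` is an assumption, and so is the handling of the edge case where the modal class and both neighbours have equal frequencies, which makes the denominator zero and leaves the formula undefined.

```python
def grouped_mode(L, f1, f0, f2, h):
    """Estimate the mode of grouped data.

    L  : lower boundary of the modal class
    f1 : frequency of the modal class
    f0 : frequency of the preceding class
    f2 : frequency of the succeeding class
    h  : class width
    """
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        # Happens when f0 == f1 == f2; the formula gives no estimate here.
        raise ValueError("formula undefined: modal class and neighbours are tied")
    return L + (f1 - f0) / denominator * h


print(grouped_mode(L=29.5, f1=25, f0=12, f2=18, h=10))
```

When the modal class is the first or last class in the table, a common convention (assumed here, not stated in the text) is to pass 0 for the missing neighbour's frequency.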
Example for Grouped Data:
Let's assume we have the following frequency distribution table:
Class Interval | Frequency ($f$) |
---|---|
10 - 19 | 5 |
20 - 29 | 12 |
30 - 39 | 25 |
40 - 49 | 18 |
50 - 59 | 9 |
Steps to calculate the mode:
- Identify the Modal Class: The class with the highest frequency is "30 - 39" with a frequency of 25.
- Determine the Values for the Formula:
  - $L$ (Lower boundary of the modal class): 29.5. The class limits are inclusive and leave a gap of 1 between consecutive classes (29 and 30), so the boundary lies halfway between the upper limit of the previous class and the lower limit of the modal class: $(29 + 30) / 2 = 29.5$. The stated lower limit, 30, would serve as $L$ only if the classes were continuous (for example 30 - 40, 40 - 50), where limit and boundary coincide.
  - $f_1$ (Frequency of the modal class): 25
  - $f_0$ (Frequency of the class before): 12
  - $f_2$ (Frequency of the class after): 18
  - $h$ (Class width): 10 (upper boundary minus lower boundary, $39.5 - 29.5 = 10$; equivalently, each class covers 10 integer values, $39 - 30 + 1 = 10$).
- Apply the Formula:

$$ \text{Mode} = 29.5 + \left[ \frac{25 - 12}{(2 \times 25 - 12 - 18)} \right] \times 10 $$

$$ \text{Mode} = 29.5 + \left[ \frac{13}{(50 - 12 - 18)} \right] \times 10 $$

$$ \text{Mode} = 29.5 + \left[ \frac{13}{20} \right] \times 10 $$

$$ \text{Mode} = 29.5 + (0.65) \times 10 $$

$$ \text{Mode} = 29.5 + 6.5 $$

$$ \text{Mode} = 36 $$
Therefore, the estimated mode for this grouped data is 36.
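The whole calculation, including the step from inclusive class limits to class boundaries, can be sketched in Python. The function name `mode_from_table` and the row layout `(lower limit, upper limit, frequency)` are assumptions made for illustration. The sketch also assumes equal-width classes, a constant gap between consecutive classes, and a frequency of 0 for a missing neighbour when the modal class is first or last.

```python
# (lower limit, upper limit, frequency) rows from the frequency table
table = [(10, 19, 5), (20, 29, 12), (30, 39, 25), (40, 49, 18), (50, 59, 9)]


def mode_from_table(rows):
    freqs = [f for _, _, f in rows]
    i = freqs.index(max(freqs))  # first class with the highest frequency
    lower, upper, f1 = rows[i]

    # Gap between consecutive classes (1 for limits such as 29 and 30).
    gap = rows[1][0] - rows[0][1] if len(rows) > 1 else 0

    L = lower - gap / 2        # lower boundary of the modal class
    h = (upper + gap / 2) - L  # upper boundary minus lower boundary

    # Assumption: treat a missing neighbour as having frequency 0.
    f0 = rows[i - 1][2] if i > 0 else 0
    f2 = rows[i + 1][2] if i < len(rows) - 1 else 0

    return L + (f1 - f0) / (2 * f1 - f0 - f2) * h


print(mode_from_table(table))
```

For the table in this example, the function derives $L = 29.5$ and $h = 10$ and should reproduce the estimate of 36 obtained by hand.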