Understanding Central Tendency and Measures of Spread with a Simple Dataset

Statistics may seem difficult at first, but the core concepts become much clearer when you work through them with a concrete example. In this beginner-friendly tutorial, we will calculate the mean, median, mode, range, variance, standard deviation, and mean absolute deviation (MAD) of a small dataset.

Whether you are a student, a data science beginner, or someone preparing for exams, these concepts form the foundation of descriptive statistics.


Dataset Used in This Example

We will use the following dataset throughout the article:

2, 4, 4, 5, 10

This small dataset is perfect for understanding how statistical calculations work step by step.


1. Measures of Central Tendency

Measures of central tendency help identify the “center” of the data.

Mean (Average)

The mean is the average of all values.

Formula

$$\text{Mean} = \frac{\text{Sum of all values}}{\text{Number of values}}$$

Calculation

$$(2 + 4 + 4 + 5 + 10) / 5$$

$$25 / 5 = 5$$

Mean = 5

The mean gives an overall average value for the dataset.
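
The calculation above can be reproduced in a few lines of Python. This is a minimal sketch: the variable names (`data`, `manual_mean`, `library_mean`) are illustrative choices, not part of the article, and `statistics.mean` from the standard library is used only as a cross-check.

```python
from statistics import mean

data = [2, 4, 4, 5, 10]

# Manual calculation: sum of all values divided by the number of values.
manual_mean = sum(data) / len(data)

# Cross-check with the standard library.
library_mean = mean(data)

print(manual_mean)  # 5.0
print(manual_mean == library_mean)  # True
```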


Median (Middle Value)

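The median is the middle value after arranging the data in ascending order. With an odd number of values it is the single middle value; with an even number of values it is the average of the two middle values.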

Ordered Dataset

2, 4, 4, 5, 10

The middle value is:

Median = 4

The median is useful because it is less affected by extreme values. In this dataset, the value 10 pulls the mean up to 5, while the median stays at 4.
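
As a rough sketch, the median can be found by sorting the values and picking the middle position. The helper name `manual_median` is hypothetical. The even-length branch (averaging the two middle values) goes beyond the five-value example used here and is included only so the function works on any non-empty list.

```python
from statistics import median

def manual_median(values):
    """Return the middle value of a non-empty list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

data = [2, 4, 4, 5, 10]

print(manual_median(data))           # 4
print(median(data))                  # 4
print(manual_median([2, 4, 5, 10]))  # 4.5
```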


Mode (Most Frequent Value)

The mode is the number that appears most frequently.

In our dataset:

  • 4 appears twice
  • All other numbers appear once

Mode = 4

A dataset can have one mode, multiple modes, or no mode at all.
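
One way to check this in Python is to count how often each value occurs. The sketch below uses `collections.Counter` for the counts and `statistics.multimode` (available since Python 3.8) to list every most-frequent value. The extra lists are made-up illustrations, not data from this article.

```python
from collections import Counter
from statistics import multimode

data = [2, 4, 4, 5, 10]

# Count how many times each value appears.
counts = Counter(data)
print(counts[4])  # 2

# multimode returns a list containing every most-frequent value.
modes = multimode(data)
print(modes)  # [4]

# Two values tied for most frequent: the dataset has two modes.
print(multimode([1, 1, 2, 2, 3]))  # [1, 2]

# Every value appears once: multimode returns all of them.
# Many textbooks describe this case as having no mode.
print(multimode([1, 2, 3]))  # [1, 2, 3]
```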


2. Understanding Deviations from the Mean

A deviation shows how far each value is from the mean.

Since the mean is 5, we calculate:

| Value | Deviation |
|-------|-----------|
| 2     | -3        |
| 4     | -1        |
| 4     | -1        |
| 5     | 0         |
| 10    | +5        |

Negative values are below the mean, while positive values are above it.

These deviations help us understand data spread and variability.
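
The deviations can be computed by subtracting the mean from each value, as in this minimal sketch (variable names are illustrative). It also prints their sum, which is always zero for deviations taken from the mean; that is why later measures square them or take absolute values.

```python
data = [2, 4, 4, 5, 10]
mean_value = sum(data) / len(data)

# Deviation = value minus mean; negative means below the mean.
deviations = [x - mean_value for x in data]

print(deviations)       # [-3.0, -1.0, -1.0, 0.0, 5.0]
print(sum(deviations))  # 0.0
```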


3. Measures of Spread in Statistics

Measures of spread show how dispersed the data is.


Range

The range is the difference between the maximum and minimum values.

Formula

$$\text{Range} = \text{Highest Value} - \text{Lowest Value}$$

Calculation

$$10 - 2 = 8$$

Range = 8

A larger range indicates more spread in the data.
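
In Python, the range is a one-line calculation with the built-in `max` and `min` functions. The name `data_range` is an illustrative choice that avoids shadowing Python's built-in `range`.

```python
data = [2, 4, 4, 5, 10]

# Range = highest value minus lowest value.
data_range = max(data) - min(data)

print(data_range)  # 8
```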


Variance

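Variance measures the average squared deviation from the mean. Here we calculate the population variance, which divides by the number of values; the sample variance divides by one less than the number of values instead.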

Step 1: Square Each Deviation

$$(-3)^2 + (-1)^2 + (-1)^2 + 0^2 + 5^2$$

$$9 + 1 + 1 + 0 + 25 = 36$$

Step 2: Divide by Number of Values

$$36 / 5 = 7.2$$

Variance = 7.2

Variance tells us how much the data varies from the average. Because the deviations are squared, variance is expressed in squared units of the original data.
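
The two steps can be sketched in Python as follows. Dividing by the number of values, as done in this article, gives the population variance, which matches `statistics.pvariance`. The sample variance shown for comparison divides by one less than the number of values; it is not calculated in this article and is included only to make the difference visible.

```python
from statistics import pvariance, variance

data = [2, 4, 4, 5, 10]
mean_value = sum(data) / len(data)

# Step 1: square each deviation from the mean.
squared_deviations = [(x - mean_value) ** 2 for x in data]
print(sum(squared_deviations))  # 36.0

# Step 2: divide by the number of values (population variance).
population_variance = sum(squared_deviations) / len(data)
print(population_variance)  # 7.2

# For comparison only: the sample variance divides by n - 1.
sample_variance = sum(squared_deviations) / (len(data) - 1)
print(sample_variance)  # 9.0

# Cross-check both results against the standard library.
print(abs(pvariance(data) - population_variance) < 1e-12)  # True
print(abs(variance(data) - sample_variance) < 1e-12)       # True
```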


Standard Deviation

Standard deviation is the square root of variance.

Formula

$$\text{Standard Deviation} = \sqrt{\text{Variance}}$$

Calculation

$$\sqrt{7.2} \approx 2.68$$

Standard Deviation ≈ 2.68

Standard deviation is expressed in the same units as the data, which makes it easier to interpret than variance. It is one of the most widely used statistical measures in data analysis and machine learning.
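
A short Python sketch of this step, assuming the population variance (dividing by the number of values) as used in this article. `statistics.pstdev` is the standard-library function for the population standard deviation and serves as a cross-check; the variable names are illustrative.

```python
import math
from statistics import pstdev

data = [2, 4, 4, 5, 10]
mean_value = sum(data) / len(data)

# Population variance: average of the squared deviations.
population_variance = sum((x - mean_value) ** 2 for x in data) / len(data)

# Standard deviation: square root of the variance.
std_dev = math.sqrt(population_variance)

print(round(std_dev, 2))       # 2.68
print(round(pstdev(data), 2))  # 2.68
```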


4. Mean Absolute Deviation (MAD)

MAD calculates the average absolute distance from the mean.

Unlike variance, it does not square the values.

Calculation

$$(3 + 1 + 1 + 0 + 5) / 5$$

$$10 / 5 = 2$$

MAD = 2

MAD is easier to interpret than variance because it averages the actual distances from the mean and stays in the same units as the data.
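
The MAD calculation can be sketched in Python by taking the absolute value of each deviation before averaging. The variable names are illustrative. Note that "MAD" here means the mean absolute deviation around the mean, as defined in this article, not the median absolute deviation that some libraries abbreviate the same way.

```python
data = [2, 4, 4, 5, 10]
mean_value = sum(data) / len(data)

# Absolute distance of each value from the mean.
absolute_deviations = [abs(x - mean_value) for x in data]
print(absolute_deviations)  # [3.0, 1.0, 1.0, 0.0, 5.0]

# MAD: the average of those distances.
mad = sum(absolute_deviations) / len(data)
print(mad)  # 2.0
```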


5. What Are Outliers?

Outliers are values that differ significantly from the rest of the dataset.

In this example:

  • Most values are between 2 and 5
  • But 10 is much higher

The value 10 may therefore be considered a potential outlier.

Outliers can strongly influence the mean and variance.
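
To see this influence directly, the sketch below recomputes the mean, median, and population variance with and without the value 10. Dropping the value is done here purely for illustration; it is not a recommendation to delete outliers from real data. The names `without_10` and `summarize` are hypothetical.

```python
from statistics import mean, median, pvariance

data = [2, 4, 4, 5, 10]

# Same dataset with the suspected outlier left out (illustration only).
without_10 = [x for x in data if x != 10]

def summarize(values):
    """Return the mean, median, and population variance of values."""
    return {
        "mean": mean(values),
        "median": median(values),
        "variance": pvariance(values),
    }

with_outlier = summarize(data)
without_outlier = summarize(without_10)

print("with 10:   ", with_outlier)
print("without 10:", without_outlier)
```

Without the value 10, the mean drops from 5 to 3.75 and the population variance drops from 7.2 to 1.1875, while the median stays at 4.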


Summary Table

| Metric             | Result |
|--------------------|--------|
| Mean               | 5      |
| Median             | 4      |
| Mode               | 4      |
| Range              | 8      |
| Variance           | 7.2    |
| Standard Deviation | 2.68   |
| MAD                | 2      |

Why These Statistical Concepts Matter

These concepts are widely used in:

  • Data Science
  • Machine Learning
  • Business Analytics
  • Scientific Research
  • Economics
  • Academic Studies

They help answer important questions such as:

  • What is the average value?
  • How consistent is the data?
  • How spread out are the values?
  • Are there any unusual observations?

Final Thoughts

Learning statistics becomes much easier when you practice using simple datasets. Concepts like mean, median, mode, variance, and standard deviation are the building blocks of data analysis and research.

Once you understand these basics, you can move toward advanced topics like probability, hypothesis testing, regression, and machine learning.

You can now try these calculations using your own data such as:

  • Exam marks
  • Heights and weights
  • Sales figures
  • Daily temperatures
  • Survey responses

The more examples you practice, the stronger your understanding of statistics will become.
