Measures of Dispersion – Let’s Understand Spread! 😊
Hello again, smart learners! 👋
So far, we’ve learned how to find a “typical” value—like the mean, median, or mode. But here’s the thing 😮: two groups can have the same average, yet be totally different in how their values are spread out!
Example: Imagine two classrooms both have an average score of 70. In Classroom A, everyone scored between 65 and 75. In Classroom B, some scored 20 and others 100! 🤯 Which class is more “consistent”? That’s where measures of dispersion come in!
In this chapter, we’ll explore four main ways to measure spread:
- Range
- Quartile Deviation
- Mean Deviation
- Standard Deviation & Variance
And yes—we’ll include all examples and all exercises with full answers from your textbook, explained in simple English with a warm, friendly tone. Let’s go! 💪
1. Why Do We Need Measures of Dispersion? 🤔
Without dispersion, we’d be like a chef who only knows the average temperature of an oven—but not whether it swings wildly between freezing and boiling! 😠
Objectives of studying dispersion:
- To judge the reliability of averages (low dispersion = more reliable)
- To compare variability between two groups
- To control quality (e.g., in factories)
- To prepare for advanced topics like probability and inference
A farmer measures yields from two plots. Both plots average 41 kg, yet Plot B’s yields are far more scattered:
- Plot A: 40, 42, 41, 39, 43 kg
- Plot B: 20, 60, 30, 50, 45 kg
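To put numbers on that, here is a minimal Python sketch that computes the mean and the range of each plot. The helper name `value_range` is our own choice for illustration and does not come from the textbook.

```python
from statistics import mean

# Yields in kg, copied from the two plots above
plot_a = [40, 42, 41, 39, 43]
plot_b = [20, 60, 30, 50, 45]

def value_range(values):
    """Largest value minus smallest value (hypothetical helper name)."""
    return max(values) - min(values)

for name, yields in (("Plot A", plot_a), ("Plot B", plot_b)):
    print(f"{name}: mean = {mean(yields)} kg, range = {value_range(yields)} kg")
```

The means agree, but the range is 4 kg for Plot A and 40 kg for Plot B, so an average alone hides a tenfold difference in spread.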
2. Absolute vs Relative Measures of Dispersion
Some measures (like range or standard deviation) are in the same units as the data (e.g., kg, Birr, cm). These are absolute.
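But what if you want to compare the spread of heights (in cm) with the spread of weights (in kg)? You can’t do it directly, because the units differ. So we use relative measures, which are unit-free ratios or percentages called coefficients. One example is the coefficient of variation (CV), the SD expressed as a percentage of the mean: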
– Class A: mean = 50 Birr, SD = 10 Birr → CV = (10/50)×100 = 20%
– Class B: mean = 200 Birr, SD = 30 Birr → CV = (30/200)×100 = 15%
→ Class B is more consistent, even though its SD is larger!
3. The Range – Simple but Limited 📏
Range = Largest – Smallest.
It’s quick to calculate, but only uses two values—so it’s very sensitive to outliers!
Distribution 1: 32, 35, 36, 36, 37, 38, 40, 42, 42, 43, 43, 45 → Range = 45 – 32 = 13
Distribution 2: 32, 32, 33, 33, 33, 34, 34, 34, 34, 34, 35, 45 → Range = 45 – 32 = 13
Same range! But clearly, Distribution 2 is far less spread out: eleven of its twelve values sit between 32 and 35, and the single 45 stretches the range. Range doesn’t see that! 😮
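One way to see what the range misses is to drop the single largest value (45) from each distribution and recompute. The sketch below does that. `range_without_max` is a hypothetical helper used only for this illustration, not a standard statistic.

```python
dist1 = [32, 35, 36, 36, 37, 38, 40, 42, 42, 43, 43, 45]
dist2 = [32, 32, 33, 33, 33, 34, 34, 34, 34, 34, 35, 45]

def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

def range_without_max(values):
    """Range after removing one copy of the largest value."""
    trimmed = sorted(values)[:-1]
    return value_range(trimmed)

for name, dist in (("Distribution 1", dist1), ("Distribution 2", dist2)):
    print(name, "range:", value_range(dist),
          "| range without the maximum:", range_without_max(dist))
```

With the 45 removed, the range is 11 for Distribution 1 but only 3 for Distribution 2, which shows how much one extreme value can drive this measure.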
Range for Grouped Data
Use the upper class limit of the last class and lower class limit of the first class:
\[ \text{Range} = \text{UCL}_k - \text{LCL}_1 \]
Relative Range (Coefficient of Range)
\[ \text{RR} = \frac{L - S}{L + S} \], where L = largest, S = smallest.
If Range = 4 and RR = 0.25, find L and S.
Solution:
\( L - S = 4 \)
\( \frac{L - S}{L + S} = 0.25 \Rightarrow \frac{4}{L + S} = 0.25 \Rightarrow L + S = 16 \)
Add the two equations: \( 2L = 20 \), so \( L = 10 \) and \( S = 6 \)
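The same two-equation trick works for any range and relative range. Here is a small sketch of it. The function name `largest_and_smallest` is made up for this example, and it assumes the relative range is not zero.

```python
def largest_and_smallest(value_range, relative_range):
    """Recover L and S from Range = L - S and RR = (L - S) / (L + S).

    Assumes relative_range is non-zero (hypothetical helper).
    """
    total = value_range / relative_range   # this is L + S
    largest = (total + value_range) / 2    # add the two equations
    smallest = (total - value_range) / 2   # subtract the two equations
    return largest, smallest

largest, smallest = largest_and_smallest(4, 0.25)
print(largest, smallest)
```

For Range = 4 and RR = 0.25 this gives L = 10 and S = 6, matching the solution above.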
4. Quartile Deviation – Focus on the Middle 50% 🎯
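The quartile deviation is half the distance between the first and third quartiles, so it describes only the middle 50% of the data:
\[ \text{QD} = \frac{Q_3 - Q_1}{2} \]
This is great for skewed data (like income) where outliers distort the range.
For the frequency table from Chapter 3, you previously found: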
\(Q_1 = 174.90\), \(Q_3 = 203.83\)
So:
\[ \text{QD} = \frac{203.83 - 174.90}{2} = \frac{28.93}{2} \approx 14.47 \]
Coefficient of QD = \(\frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{28.93}{378.73} \approx 0.076\)
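If you would like to check this arithmetic, the sketch below wraps both formulas in small functions. The function names are our own, and the quartile values are simply the ones quoted above (the code does not recompute them from the Chapter 3 table).

```python
def quartile_deviation(q1, q3):
    """Half the distance between the first and third quartiles."""
    return (q3 - q1) / 2

def coefficient_of_qd(q1, q3):
    """Unit-free version: (Q3 - Q1) / (Q3 + Q1)."""
    return (q3 - q1) / (q3 + q1)

q1, q3 = 174.90, 203.83   # quartiles quoted in the example above
qd = quartile_deviation(q1, q3)
cqd = coefficient_of_qd(q1, q3)
print(f"QD = {qd:.3f}, coefficient of QD = {cqd:.4f}")
```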
5. Mean Deviation – The “Average Distance” from Center 🧮
We use absolute values because positive and negative deviations would cancel out otherwise.
Formula (about the mean):
\[ \text{MD}(\bar{X}) = \frac{\sum |X_i - \bar{X}|}{n} \]
Visits by 10 mothers: 8, 6, 5, 5, 7, 4, 5, 9, 7, 4
Mean = 6, Median = 5.5, Mode = 5
Total absolute deviations = 14 (for all three centers!)
So:
MD(mean) = MD(median) = MD(mode) = 14/10 = 1.4
(This is a coincidence—usually they differ!)
Coefficient of Mean Deviation:
\[ \text{CMD} = \frac{\text{MD}}{\text{Average used}} \]
For the above:
– CMD about mean = 1.4 / 6 ≈ 0.233
– CMD about median = 1.4 / 5.5 ≈ 0.255
– CMD about mode = 1.4 / 5 = 0.28
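The mean deviation and its coefficient can be checked with a few lines of Python. This is a minimal sketch using the standard `statistics` module. The helper `mean_deviation` is a name we chose, since the module has no built-in function for it.

```python
from statistics import mean, median, mode

visits = [8, 6, 5, 5, 7, 4, 5, 9, 7, 4]

def mean_deviation(values, center):
    """Average absolute distance of the values from the chosen center."""
    return sum(abs(x - center) for x in values) / len(values)

centers = {"mean": mean(visits), "median": median(visits), "mode": mode(visits)}
for label, center in centers.items():
    md = mean_deviation(visits, center)
    cmd = md / center   # coefficient of mean deviation
    print(f"about the {label} ({center}): MD = {md}, CMD = {cmd:.3f}")
```

All three centers give MD = 1.4 here, but the coefficients differ because each one divides by a different average.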
6. Variance and Standard Deviation – The Gold Standard! 🥇
Why square? To make all deviations positive and give more weight to larger deviations (which is useful for detecting outliers).
Sample Variance Formula:
\[ s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1} \]
Note: We divide by n–1 (not n) to get an unbiased estimate of the population variance.
Data: 5, 17, 12, 10 → Mean = 11
Squared deviations: (5–11)²=36, (10–11)²=1, (12–11)²=1, (17–11)²=36 → Total = 74
Variance = 74 / (4–1) = 74/3 ≈ 24.67
SD = √24.67 ≈ 4.97
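Here is the same calculation as a short sketch, done step by step and then compared with Python's built-in `statistics.variance` and `statistics.stdev`, which also divide by n - 1. The variable names are our own.

```python
from statistics import mean, stdev, variance

data = [5, 17, 12, 10]

x_bar = mean(data)                                  # 11
squared_devs = [(x - x_bar) ** 2 for x in data]     # 36, 36, 1, 1
sum_of_squares = sum(squared_devs)                  # 74
sample_variance = sum_of_squares / (len(data) - 1)  # divide by n - 1
sample_sd = sample_variance ** 0.5

print(f"variance = {sample_variance:.2f}, SD = {sample_sd:.2f}")
print(f"library check: {variance(data):.2f}, {stdev(data):.2f}")
```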
Age distribution (same as Chapter 3)
Mean = 55
Sum of f(X – X̄)² = 4400, n = 75 → Variance = 4400 / (75 – 1) = 4400 / 74 ≈ 59.46
SD = √59.46 ≈ 7.71
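For grouped data, each squared deviation is weighted by its class frequency. The sketch below shows the idea. It assumes the age distribution is the frequency table used in the practice section (classes 40–44 up to 70–74), because those class marks and frequencies reproduce the figures quoted here. If your Chapter 3 table differs, swap in its values.

```python
from math import sqrt

# ASSUMPTION: class marks and frequencies taken from the practice-section table
class_marks = [42, 47, 52, 57, 62, 67, 72]
frequencies = [7, 10, 22, 15, 12, 6, 3]

n = sum(frequencies)
grouped_mean = sum(f * x for f, x in zip(frequencies, class_marks)) / n
weighted_squares = sum(
    f * (x - grouped_mean) ** 2 for f, x in zip(frequencies, class_marks)
)
grouped_variance = weighted_squares / (n - 1)   # sample variance
grouped_sd = sqrt(grouped_variance)

print(n, grouped_mean, weighted_squares)
print(f"variance = {grouped_variance:.2f}, SD = {grouped_sd:.2f}")
```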
Special Properties of SD
- Chebyshev’s Theorem: For ANY distribution, at least \(1 - \frac{1}{k^2}\) of the data lies within k SDs of the mean (for any k > 1).
- For normal distributions: ~68% within 1 SD, ~95% within 2 SDs, ~99.7% within 3 SDs.
- SD responds to scaling but not to shifting:
  - Add a constant → SD unchanged
  - Multiply by a constant k → SD becomes |k| × old SD
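To get a feel for Chebyshev’s guarantee from the list above, this small sketch tabulates the bound for a few values of k. The function name is our own, and it rejects k ≤ 1 because the bound says nothing useful there.

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of data within k SDs of the mean, for k > 1."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is only informative for k > 1")
    return 1 - 1 / k ** 2

for k in (1.5, 2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%} of the data")
```

For k = 2 the bound is 75%, which is weaker than the ~95% figure for normal distributions. That is the price of a rule that holds for every distribution.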
Mean = 500, SD = 10.
(a) Add 10 to each value → new SD = 10 (unchanged!)
(b) Multiply each by –5 → new SD = |–5| × 10 = 50
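You can watch both rules at work on a tiny dataset. The three values below are hypothetical, chosen only so that the mean is 500 and the sample SD is 10 as in the example. They are not the textbook's data.

```python
from statistics import mean, stdev

# HYPOTHETICAL data picked to give mean = 500 and sample SD = 10
data = [490, 500, 510]

shifted = [x + 10 for x in data]   # (a) add 10 to each value
scaled = [x * -5 for x in data]    # (b) multiply each value by -5

print("original:", mean(data), stdev(data))
print("after adding 10:", stdev(shifted))
print("after multiplying by -5:", stdev(scaled))
```

Shifting moves every value and the mean together, so the deviations stay the same. Scaling stretches every deviation by |k|, and the sign disappears when the deviations are squared.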
7. Coefficient of Variation (CV) – Compare Spread Fairly! ⚖️
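The CV expresses the SD as a percentage of the mean:
\[ \text{CV} = \frac{\text{SD}}{\text{Mean}} \times 100\% \]
Use it to compare dispersion across different datasets, even if they’re in different units!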
Firm A: Mean wage = 52.5 Birr, Variance = 100 → SD = 10 → CV = (10/52.5)×100 ≈ 19.05%
Firm B: Mean wage = 47.5 Birr, Variance = 121 → SD = 11 → CV = (11/47.5)×100 ≈ 23.16%
→ Firm B has greater variability in wages, even though its SD isn’t much larger!
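Since the firms report variance rather than SD, the calculation has two steps: take the square root, then divide by the mean. A minimal sketch follows (the function name `cv_from_variance` is hypothetical):

```python
from math import sqrt

def cv_from_variance(mean_value, variance_value):
    """Coefficient of variation in percent, starting from the variance."""
    sd = sqrt(variance_value)
    return 100 * sd / mean_value

cv_a = cv_from_variance(52.5, 100)   # Firm A
cv_b = cv_from_variance(47.5, 121)   # Firm B

print(f"Firm A: {cv_a:.2f}%  Firm B: {cv_b:.2f}%")
print("More variable:", "Firm B" if cv_b > cv_a else "Firm A")
```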
8. Let’s Practice! Full Exercises with Answers ✍️
Monthly wages in two firms:
| Measure | Firm A | Firm B |
|---|---|---|
| Mean wage | 52.5 | 47.5 |
| Variance | 100 | 121 |
As above: CV of Firm A ≈ 19.05%, CV of Firm B ≈ 23.16% → Firm B is more variable.
A student’s average score is 65 (n=10). One score was misread as 40 instead of 80. Find the correct average.
Wrong total = 65 × 10 = 650
Correct total = 650 – 40 + 80 = 690
Correct mean = 690 / 10 = 69
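The same correction works for any misread value: rebuild the total, swap the wrong value for the right one, and divide again. Here is a short sketch, with a function name of our own choosing:

```python
def corrected_mean(reported_mean, n, wrong_value, right_value):
    """Fix a mean after one observation was recorded incorrectly."""
    wrong_total = reported_mean * n
    right_total = wrong_total - wrong_value + right_value
    return right_total / n

print(corrected_mean(65, 10, wrong_value=40, right_value=80))
```

A shortcut gives the same answer: the mean moves by (80 - 40) / 10 = 4, from 65 to 69.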
Find MD about mean, median, and mode for:
| Class | 40–44 | 45–49 | 50–54 | 55–59 | 60–64 | 65–69 | 70–74 |
|---|---|---|---|---|---|---|---|
| Frequency | 7 | 10 | 22 | 15 | 12 | 6 | 3 |
This requires detailed table work. Steps:
- Find class marks (42, 47, 52, 57, 62, 67, 72)
- Compute |X – mean|, |X – median|, and |X – mode| for each class mark
- Multiply each by its frequency and sum
- Divide by n = 75
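The steps above can be sketched in Python as follows. Two things here are assumptions, because the exercise does not spell them out: the grouped median and mode use the usual interpolation formulas, and each class boundary is taken as the lower limit minus 0.5 (so 50–54 starts at 49.5). If your textbook defines them differently, adjust those two functions.

```python
lower_limits = [40, 45, 50, 55, 60, 65, 70]
frequencies = [7, 10, 22, 15, 12, 6, 3]
width = 5
class_marks = [lo + 2 for lo in lower_limits]   # 42, 47, ..., 72
n = sum(frequencies)

grouped_mean = sum(f * x for f, x in zip(frequencies, class_marks)) / n

def grouped_median():
    """ASSUMED formula: L + ((n/2 - cf) / f) * w on the median class."""
    half, cumulative = n / 2, 0
    for lo, f in zip(lower_limits, frequencies):
        if cumulative + f >= half:
            return (lo - 0.5) + (half - cumulative) / f * width
        cumulative += f

def grouped_mode():
    """ASSUMED formula: L + (d1 / (d1 + d2)) * w on the modal class."""
    i = frequencies.index(max(frequencies))
    before = frequencies[i - 1] if i > 0 else 0
    after = frequencies[i + 1] if i < len(frequencies) - 1 else 0
    d1, d2 = frequencies[i] - before, frequencies[i] - after
    return (lower_limits[i] - 0.5) + d1 / (d1 + d2) * width

def mean_deviation(center):
    """Frequency-weighted average absolute distance from the center."""
    total = sum(f * abs(x - center) for f, x in zip(frequencies, class_marks))
    return total / n

for label, center in (("mean", grouped_mean),
                      ("median", grouped_median()),
                      ("mode", grouped_mode())):
    print(f"MD about the {label} ({center:.2f}) = {mean_deviation(center):.3f}")
```

About the mean, the weighted absolute deviations add up to 474, so MD = 474 / 75 = 6.32. The median and mode results depend on the assumed formulas.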
You Did It! 🌟
Congratulations! You’ve now mastered how to measure not just the “center” but also the “spread” of data. 🎓
Remember:
- Use range for a quick snapshot (but don’t trust it fully).
- Use QD when you care about the middle 50% and want to ignore outliers.
- Use MD for simple, intuitive average deviation.
- Use SD for the most powerful and mathematically convenient measure, especially when comparing groups via CV.
Keep practicing, and soon these tools will feel like natural extensions of your statistical thinking. You’ve got this! 💪
— Your friendly stats teacher 🙂