MEASURES OF CENTRAL TENDENCY

Chapter 3: Measures of Central Tendency – Statistics for Ethiopian Students

Chapter 3: Measures of Central Tendency

Hello, dear students! 😊 Have you ever wondered how a single number can tell you so much about a whole group of data? Like, “What’s the typical score in your class?” or “How much does an average Ethiopian family earn?” That’s exactly what measures of central tendency help us find! In this chapter, we’ll explore the mean, median, mode, and even quartiles—with real examples, clear explanations, and fun questions along the way. Ready? Let’s go! 🚀

What Is a Measure of Central Tendency?

Imagine you have 50 classmates, and everyone scored differently on a math test. Saying all 50 scores is messy! But if you say, “The average score was 72,” suddenly everyone gets the big picture. 🧠

A measure of central tendency is a single value that represents the center or “typical” value of a data set. Think of it as the “best representative” of the group. It makes huge piles of numbers easy to understand.

For this to work well, a good average should:

  • Be clearly defined
  • Use all the data
  • Not get too messed up by extreme values (like one billionaire in a village!)
  • Be easy to calculate
  • Allow further mathematical treatment (like combining the averages of two groups)

The Summation Notation (Σ)

Before we dive into averages, let’s learn a super useful shortcut mathematicians use: the summation symbol Σ (sigma).

If you have data values: \(X_1, X_2, X_3, \dots, X_n\), then instead of writing \(X_1 + X_2 + \dots + X_n\), we write:

\[ \sum_{i=1}^{n} X_i \]

This just means: “Add up all the X’s from the first to the last.” Simple, right? 😃

Example: Scores of 5 students: 5, 7, 7, 6, 8.

Then:

\[ \sum_{i=1}^{5} X_i = 5 + 7 + 7 + 6 + 8 = 33 \]

Properties of Summation

Here are some useful rules:

\[ \begin{aligned} &1.\quad \sum_{i=1}^{n} k = nk \quad \text{(adding a constant \(k\), \(n\) times)} \\ &2.\quad \sum_{i=1}^{n} kX_i = k \sum_{i=1}^{n} X_i \\ &3.\quad \sum_{i=1}^{n} (aX_i + b) = a\sum X_i + nb \\ &4.\quad \sum (X_i + Y_i) = \sum X_i + \sum Y_i \end{aligned} \]
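If you'd like to convince yourself that these rules hold, here is a minimal Python sketch that checks each one numerically. The scores in `X` are the five student scores from the example above; the list `Y` and the constants `k`, `a`, `b` are made-up values chosen only for illustration.

```python
# X comes from the chapter's example; Y, k, a, b are illustrative assumptions.
X = [5, 7, 7, 6, 8]
Y = [1, 2, 3, 4, 5]
n = len(X)
k, a, b = 3, 2, 4

# Rule 1: adding the constant k, n times, gives n*k
rule1 = sum(k for _ in range(n)) == n * k

# Rule 2: a constant factor can be pulled outside the sum
rule2 = sum(k * x for x in X) == k * sum(X)

# Rule 3: sum of (a*X_i + b) equals a*sum(X) + n*b
rule3 = sum(a * x + b for x in X) == a * sum(X) + n * b

# Rule 4: summing pairwise totals equals adding the two separate sums
rule4 = sum(x + y for x, y in zip(X, Y)) == sum(X) + sum(Y)

print(sum(X), rule1, rule2, rule3, rule4)  # 33 True True True True
```

A numeric check like this is not a proof, but it is a quick way to catch a misremembered rule.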

Important Point → Why Summation Matters

Summation isn’t just math gymnastics! It’s the backbone of statistics. Every average, every formula uses it. Once you’re comfy with Σ, formulas become easier to read and use.

Real-life Example: A teacher adds all student scores to find the class average. That’s summation in action!

Question: If \(X = \{2, 4, 6\}\) and \(Y = \{1, 3, 5\}\), what is \(\sum (X_i - Y_i)\)?

Answer: \((2-1) + (4-3) + (6-5) = 1 + 1 + 1 = 3\)

Types of Measures of Central Tendency

There are four main types:

  1. Mean (Arithmetic, Geometric, Harmonic)
  2. Median
  3. Mode
  4. Quantiles (Quartiles, Deciles, Percentiles)

Which one you use depends on your data and what you want to learn. Let’s explore each one!

1. The Arithmetic Mean

This is what most people call “the average.” Add all the numbers and divide by how many there are.

For Raw Data

\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \]

Example: Numbers: 2, 7, 8, 2, 7, 3, 7

Sum = 2 + 7 + 8 + 2 + 7 + 3 + 7 = 36

Count = 7

Mean = \(36 \div 7 \approx 5.14\)

For Ungrouped Frequency Data

If some numbers repeat, we use frequency.

\[ \bar{X} = \frac{\sum f_i X_i}{\sum f_i} \]

| \(X_i\) | \(f_i\) | \(f_i X_i\) |
|---|---|---|
| 2 | 2 | 4 |
| 3 | 1 | 3 |
| 7 | 3 | 21 |
| 8 | 1 | 8 |
| Total | 7 | 36 |

Mean = \(36 \div 7 \approx 5.14\)
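Here is a short Python sketch showing that the raw-data formula and the frequency formula give the same mean for the seven numbers used above. The variable names are just illustrative choices.

```python
from collections import Counter

data = [2, 7, 8, 2, 7, 3, 7]

# Raw-data formula: add everything, divide by the count
raw_mean = sum(data) / len(data)

# Frequency formula: count how often each value occurs, then use sum(f*x) / sum(f)
freq = Counter(data)
freq_mean = sum(f * x for x, f in freq.items()) / sum(freq.values())

print(round(raw_mean, 2), round(freq_mean, 2))  # 5.14 5.14
```

`Counter` builds the frequency table for us, so the second calculation mirrors the \(f_i X_i\) column of a frequency table.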

For Grouped Data (Class Intervals)

We use the midpoint (class mark) of each class:

\[ \bar{X} = \frac{\sum f_i X_i}{\sum f_i}, \quad \text{where } X_i = \text{class midpoint} \]

Example: Age distribution

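| Class | Frequency \(f_i\) | Class Mark \(X_i\) | \(f_i X_i\) |
|---|---|---|---|
| 6–10 | 35 | 8 | 280 |
| 11–15 | 23 | 13 | 299 |
| 16–20 | 15 | 18 | 270 |
| 21–25 | 12 | 23 | 276 |
| 26–30 | 9 | 28 | 252 |
| 31–35 | 6 | 33 | 198 |
| Total | 100 | | 1575 |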

Mean = \(1575 \div 100 = 15.75\)
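The grouped-data calculation can be sketched in Python as follows. The function name `grouped_mean` and the `(lower limit, upper limit, frequency)` layout are assumptions made for this example; the numbers are the age distribution above.

```python
def grouped_mean(classes):
    """Mean of grouped data; classes is a list of (lower, upper, frequency)."""
    total_f = sum(f for _, _, f in classes)
    # The class mark is the midpoint of the class limits
    total_fx = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total_fx / total_f

ages = [(6, 10, 35), (11, 15, 23), (16, 20, 15),
        (21, 25, 12), (26, 30, 9), (31, 35, 6)]

age_mean = grouped_mean(ages)
print(age_mean)  # 15.75
```

Keep in mind that this is an estimate: it treats every observation in a class as if it sat exactly at the class mark.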

Important Point → Coding to Simplify Calculations

When numbers are large, we can “shift” them using an assumed (guessed) mean \(A\), then adjust later.

Let \(d_i = X_i - A\), then: \(\bar{X} = A + \frac{\sum f_i d_i}{\sum f_i}\)

Why? It turns big numbers into smaller ones, so there is less chance of arithmetic errors!

Real-life Example: Calculating average monthly income when values are in the thousands—use coding to make it easier.

Question: Deviations from assumed mean 7 are: 1, -1, -2, -2, 0, -3, -2, 2, 0, -3. Find the true mean.

Answer: Sum of deviations = -10, n = 10 → mean of deviations = -1 → true mean = 7 + (-1) = 6
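As a rough sketch, the assumed-mean (coding) method looks like this in Python. The helper name `mean_from_deviations` is hypothetical, and the choice \(A = 18\) for the age data is simply a convenient guess, not something prescribed by the method.

```python
def mean_from_deviations(assumed_mean, deviations, freqs=None):
    """True mean = A + sum(f*d) / sum(f); freqs defaults to 1 for each value."""
    if freqs is None:
        freqs = [1] * len(deviations)
    return assumed_mean + sum(f * d for f, d in zip(freqs, deviations)) / sum(freqs)

# The question above: deviations taken from an assumed mean of 7
deviations = [1, -1, -2, -2, 0, -3, -2, 2, 0, -3]
true_mean = mean_from_deviations(7, deviations)
print(true_mean)  # 6.0

# The age distribution, coded around an assumed mean A = 18
marks = [8, 13, 18, 23, 28, 33]
freqs = [35, 23, 15, 12, 9, 6]
A = 18
age_mean = mean_from_deviations(A, [x - A for x in marks], freqs)
print(age_mean)  # 15.75
```

Whatever value you pick for \(A\), the final answer is the same; a guess near the middle just keeps the deviations small.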

Special Properties of the Arithmetic Mean

  1. Sum of deviations from mean = 0: \(\sum (X_i - \bar{X}) = 0\)
  2. Sum of squared deviations is minimized at the mean.
  3. Combined Mean: If two groups have means \(\bar{X}_1, \bar{X}_2\) and sizes \(n_1, n_2\), then:
    \[ \bar{X}_c = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2} \]
  4. If you add a constant \(k\) to all values, new mean = old mean + \(k\).
  5. If you multiply all values by \(k\), new mean = \(k \times\) old mean.

Example: 30 girls average 60, 70 boys average 72. Class mean?

\[ \bar{X}_c = \frac{30 \times 60 + 70 \times 72}{100} = \frac{1800 + 5040}{100} = 68.4 \]
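The combined-mean formula extends to any number of groups. Here is a minimal sketch, where `combined_mean` and the `(size, mean)` pair layout are assumptions made for illustration:

```python
def combined_mean(groups):
    """groups is a list of (size, mean) pairs; returns the mean of the pooled data."""
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n

# 30 girls averaging 60 and 70 boys averaging 72
class_mean = combined_mean([(30, 60), (70, 72)])
print(round(class_mean, 1))  # 68.4
```

Notice the answer is not the simple average of 60 and 72 (which would be 66): the larger group pulls the combined mean toward its own mean.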

Weighted Mean

When some values are more important, we give them weights.

\[ \bar{X}_w = \frac{\sum W_i X_i}{\sum W_i} \]

Example: Exam scores with weights:

  • English (60), weight 1
  • Biology (75), weight 2
  • Math (63), weight 1
  • Physics (59), weight 3
  • Chemistry (55), weight 3

Weighted mean = \(\frac{1\cdot60 + 2\cdot75 + 1\cdot63 + 3\cdot59 + 3\cdot55}{1+2+1+3+3} = \frac{615}{10} = 61.5\)

Important Point → When to Use Weighted Mean

Use it when items have different importance—like GPA (hard courses count more) or CPI (food prices matter more than jewelry).

Real-life Example: Your final grade might be 40% midterm, 60% final. That’s a weighted mean!

Question: A student scores 80 on a quiz (weight 1) and 90 on a final (weight 3). What’s the weighted average?

Answer: \(\frac{1\cdot80 + 3\cdot90}{4} = \frac{350}{4} = 87.5\)
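Both weighted-mean calculations above can be reproduced with a few lines of Python. The function name `weighted_mean` is an illustrative choice.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w*x) / sum(w)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# English, Biology, Math, Physics, Chemistry with weights 1, 2, 1, 3, 3
exam_mean = weighted_mean([60, 75, 63, 59, 55], [1, 2, 1, 3, 3])
print(exam_mean)  # 61.5

# Quiz (weight 1) and final (weight 3)
quiz_final = weighted_mean([80, 90], [1, 3])
print(quiz_final)  # 87.5
```

If every weight is 1, the function reduces to the ordinary arithmetic mean.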

Merits and Demerits of Arithmetic Mean

| Merits ✅ | Demerits ❌ |
|---|---|
| Rigidly defined | Affected by extreme values |
| Uses all data | Can’t be used with open-ended classes |
| Good for further math | Not for qualitative data (e.g., beauty) |
| Stable across samples | May not reflect “typical” if skewed |

2. The Geometric Mean (G.M.)

Use this for **averaging ratios or growth rates** (like population growth, interest).

\[ G.M. = \sqrt[n]{X_1 \cdot X_2 \cdot \ldots \cdot X_n} \]

Or using logs:

\[ \log(G.M.) = \frac{1}{n} \sum \log X_i \quad \Rightarrow \quad G.M. = \text{Antilog}\left( \frac{\sum \log X_i}{n} \right) \]

Example: Find G.M. of 2, 4, 8

\(G.M. = \sqrt[3]{2 \cdot 4 \cdot 8} = \sqrt[3]{64} = 4\)
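The log formula is handy on a computer too, because multiplying many numbers can overflow. Below is a small sketch of that approach; `geometric_mean` here is our own helper (natural logs are used instead of base-10 logs, which gives the same result), and it is compared against Python's built-in `statistics.geometric_mean`.

```python
import math
import statistics

def geometric_mean(values):
    """Geometric mean via logarithms; every value must be positive."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs positive values")
    mean_log = sum(math.log(v) for v in values) / len(values)
    return math.exp(mean_log)  # the "antilog" step

gm = geometric_mean([2, 4, 8])
print(round(gm, 6))  # 4.0

# The standard library offers the same calculation (Python 3.8+)
print(round(statistics.geometric_mean([2, 4, 8]), 6))  # 4.0
```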

3. The Harmonic Mean (H.M.)

Use this for **averaging rates** (like speed, price per kg).

\[ H.M. = \frac{n}{\sum \frac{1}{X_i}} \]

Example: A cyclist goes to college at 10 km/h and back at 15 km/h. What’s the average speed for the whole trip?

Use the H.M., since the distance is the same in both directions:

\[ H.M. = \frac{2}{\frac{1}{10} + \frac{1}{15}} = \frac{2}{\frac{3+2}{30}} = \frac{2}{5/30} = \frac{60}{5} = 12 \text{ km/h} \]
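To see why the harmonic mean is the right tool here, the sketch below computes it and then cross-checks against total distance divided by total time. The one-way distance of 30 km is a made-up value; any positive distance gives the same answer.

```python
def harmonic_mean(values):
    """Harmonic mean: n divided by the sum of reciprocals (positive values only)."""
    if any(v <= 0 for v in values):
        raise ValueError("harmonic mean needs positive values")
    return len(values) / sum(1 / v for v in values)

avg_speed = harmonic_mean([10, 15])
print(round(avg_speed, 6))  # 12.0

# Cross-check: assume (hypothetically) the college is 30 km away
distance = 30
total_time = distance / 10 + distance / 15  # 3 hours out + 2 hours back
check = 2 * distance / total_time
print(check)  # 12.0
```

The arithmetic mean of the two speeds would be 12.5 km/h, which overstates the true average because more time is spent at the slower speed.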

Important Point → Relationship Between Means

For positive numbers: **H.M. ≤ G.M. ≤ A.M.**

They are equal only if all values are the same!

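Real-life Example: If a company’s profits grow 10%, then 20%, the average growth rate comes from the G.M. of the growth factors, not the A.M.: \(\sqrt{1.10 \times 1.20} \approx 1.149\), so about 14.9% per period rather than 15%.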

Question: Two numbers have A.M. = 10 and G.M. = 8. What are the numbers?

Answer: Let numbers be \(a, b\). Then \(a + b = 20\), \(ab = 64\). Solve: \(a = 16, b = 4\) (or vice versa)
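Here is a quick numerical check of the ordering H.M. ≤ G.M. ≤ A.M., using the pair 4 and 16 from the answer above. It relies on the `statistics` module from Python's standard library (version 3.8 or later for `geometric_mean`); the list of equal values is an extra illustration.

```python
import math
import statistics

numbers = [4, 16]
am = statistics.mean(numbers)            # arithmetic mean
gm = statistics.geometric_mean(numbers)  # geometric mean
hm = statistics.harmonic_mean(numbers)   # harmonic mean

print(am, round(gm, 6), round(hm, 6))  # 10 8.0 6.4
print(hm <= gm <= am)                  # True

# When all values are equal, the three means coincide
same = [5, 5, 5]
all_equal = (
    math.isclose(statistics.mean(same), statistics.geometric_mean(same))
    and math.isclose(statistics.mean(same), statistics.harmonic_mean(same))
)
print(all_equal)  # True
```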

4. The Mode

The **most frequent value** in a data set.

  • No mode: every value occurs only once
  • Bimodal: two values tie for the highest frequency
  • Multimodal: more than two values tie for the highest frequency

Data: 5, 3, 5, 8, 9 → Mode = 5

Data: 8, 9, 9, 7, 8, 2, 5 → Bimodal (8 and 9)

Data: 4, 12, 3, 6, 7 → No mode
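The three cases can be handled by one small function. This is a sketch under the chapter's convention that a data set whose values all occur once has no mode; the function name `modes` is our own.

```python
from collections import Counter

def modes(data):
    """Return a sorted list of modes; an empty list means 'no mode'."""
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:
        # Convention used in this chapter: all values unique -> no mode
        return []
    return sorted(value for value, count in counts.items() if count == top)

print(modes([5, 3, 5, 8, 9]))        # [5]
print(modes([8, 9, 9, 7, 8, 2, 5]))  # [8, 9]
print(modes([4, 12, 3, 6, 7]))       # []
```

Returning a list (rather than a single number) makes the bimodal and no-mode cases explicit.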

Mode for Grouped Data

Use this formula for continuous data:

\[ \hat{X} = L + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) w \]

Where:

  • \(L\) = lower boundary of modal class (class with highest frequency)
  • \(w\) = class width
  • \(\Delta_1 = f_{\text{mode}} - f_{\text{previous}}\)
  • \(\Delta_2 = f_{\text{mode}} - f_{\text{next}}\)

Example: Farm size distribution

| Size (hectares) | No. of farms |
|---|---|
| 5–15 | 8 |
| 15–25 | 12 |
| 25–35 | 17 |
| 35–45 | 29 |
| 45–55 | 31 |
| 55–65 | 5 |
| 65–75 | 3 |

Modal class = 45–55 (freq = 31)

\(L = 45\), \(w = 10\), \(f_{\text{prev}} = 29\), \(f_{\text{next}} = 5\)

\(\Delta_1 = 31 - 29 = 2\), \(\Delta_2 = 31 - 5 = 26\)

\[ \hat{X} = 45 + \left( \frac{2}{2 + 26} \right) \cdot 10 = 45 + \frac{20}{28} = 45 + 0.71 = 45.71 \]
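The grouped-mode formula can be sketched as below. The function name `grouped_mode` and the `(lower boundary, upper boundary, frequency)` layout are assumptions for this example. Two further assumptions are labeled in the code: a missing neighbour (when the modal class is first or last) is treated as frequency 0, and if several classes tie for the highest frequency, the first one is used.

```python
def grouped_mode(classes):
    """Mode of grouped data; classes is a list of (lower, upper, frequency)."""
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))  # assumption: first class with the top frequency
    lower, upper, f_mode = classes[i]
    # Assumption: a missing neighbouring class counts as frequency 0
    f_prev = freqs[i - 1] if i > 0 else 0
    f_next = freqs[i + 1] if i < len(freqs) - 1 else 0
    d1 = f_mode - f_prev
    d2 = f_mode - f_next
    if d1 + d2 == 0:
        raise ValueError("modal class is not distinct from its neighbours")
    return lower + d1 / (d1 + d2) * (upper - lower)

farms = [(5, 15, 8), (15, 25, 12), (25, 35, 17), (35, 45, 29),
         (45, 55, 31), (55, 65, 5), (65, 75, 3)]

farm_mode = grouped_mode(farms)
print(round(farm_mode, 2))  # 45.71
```

The mode lands close to the lower boundary 45 because the previous class (29 farms) is almost as crowded as the modal class, while the next class (5 farms) is nearly empty.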

Important Point → When Mode Shines

Mode is great for **categorical data** and **business decisions**—like “What’s the most common shoe size?” or “Most popular phone brand?”

It’s not affected by outliers!

Real-life Example: A clothing store stocks more of the modal size, not the average size—because that’s what sells!

Question: In a class, shoe sizes are: 38 (5 students), 39 (12), 40 (8), 41 (3). What’s the mode?

Answer: 39 (most frequent)

5. The Median

The **middle value** when data is ordered. Half are below, half above.

For Ungrouped Data

  • If \(n\) is odd: median = middle value
  • If \(n\) is even: median = average of two middle values
\[ \tilde{X} = \begin{cases} X_{\left(\frac{n+1}{2}\right)} & \text{if } n \text{ odd} \\ \frac{1}{2} \left[ X_{\left(\frac{n}{2}\right)} + X_{\left(\frac{n}{2}+1\right)} \right] & \text{if } n \text{ even} \end{cases} \]

Data: 6, 5, 2, 8, 9, 4 → ordered: 2, 4, 5, 6, 8, 9 → \(n=6\)

Median = \(\frac{5 + 6}{2} = 5.5\)
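The odd/even rule translates directly into code. A minimal sketch follows; the function name `median` is our own, and the result is compared with `statistics.median` from Python's standard library.

```python
import statistics

def median(data):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # position (n+1)/2 in 1-based counting
    return (ordered[mid - 1] + ordered[mid]) / 2  # positions n/2 and n/2 + 1

print(median([6, 5, 2, 8, 9, 4]))             # 5.5
print(statistics.median([6, 5, 2, 8, 9, 4]))  # 5.5
```

Note the index shift: the formula counts positions from 1, while Python lists count from 0.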

For Grouped Data

\[ \tilde{X} = L + \left( \frac{\frac{n}{2} - c}{f_{\text{med}}} \right) w \]

Where:

  • \(L\) = lower boundary of median class
  • \(c\) = cumulative frequency before median class
  • \(f_{\text{med}}\) = frequency of median class
  • \(w\) = class width
  • Median class = first class where cumulative freq ≥ \(n/2\)

Example: Marks distribution

| Class | Frequency | Cum. Freq |
|---|---|---|
| 40–44 | 7 | 7 |
| 45–49 | 10 | 17 |
| 50–54 | 22 | 39 |
| 55–59 | 15 | 54 |
| 60–64 | 12 | 66 |
| 65–69 | 6 | 72 |
| 70–74 | 3 | 75 |

\(n = 75\), so \(n/2 = 37.5\). Median class = 50–54 (cum.freq = 39 ≥ 37.5)

\(L = 49.5\), \(c = 17\), \(f_{\text{med}} = 22\), \(w = 5\)

\[ \tilde{X} = 49.5 + \left( \frac{37.5 - 17}{22} \right) \cdot 5 = 49.5 + \frac{20.5 \cdot 5}{22} \approx 49.5 + 4.66 = 54.16 \]
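The search for the median class and the interpolation step can be sketched together in Python. The function name `grouped_median` and the `(lower boundary, width, frequency)` layout are assumptions for this example; the lower boundaries (39.5, 44.5, and so on) are the class limits shifted down by 0.5, as in the worked example.

```python
def grouped_median(classes):
    """Median of grouped data; classes is a list of (lower_boundary, width, frequency)."""
    n = sum(f for _, _, f in classes)
    if n == 0:
        raise ValueError("frequency table is empty")
    target = n / 2
    cum = 0  # cumulative frequency before the current class
    for lower, width, f in classes:
        if f > 0 and cum + f >= target:
            # First class whose cumulative frequency reaches n/2
            return lower + (target - cum) / f * width
        cum += f
    raise ValueError("median class not found")

marks = [(39.5, 5, 7), (44.5, 5, 10), (49.5, 5, 22), (54.5, 5, 15),
         (59.5, 5, 12), (64.5, 5, 6), (69.5, 5, 3)]

marks_median = grouped_median(marks)
print(round(marks_median, 2))  # 54.16
```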

Important Point → Why Median Beats Mean in Skewed Data

Median ignores extremes! When a few very wealthy people pull the mean upward, “average wealth” can be misleading, while the median income tells the real story of “a typical person.”

Real-life Example (illustrative figures): Suppose house prices in Addis have mean = 5 million birr (skewed by luxury villas) but median = 1.2 million. The median is closer to what most people pay.

Question: Incomes (birr): 2000, 2500, 3000, 3500, 100000. What’s better: mean or median?

Answer: Median = 3000 is better. Mean = 22200 is distorted by the 100,000.

Merits and Demerits of Median

| Merits ✅ | Demerits ❌ |
|---|---|
| Not affected by outliers | Not based on all values |
| Works with open-ended classes | Not good for further algebra |
| Easy to understand | Less stable in small samples |

6. Quantiles: Quartiles, Deciles, Percentiles

These split data into equal parts:

  • Quartiles (Q): 4 parts → Q1 (25%), Q2 (50% = median), Q3 (75%)
  • Deciles (D): 10 parts → D1 to D9
  • Percentiles (P): 100 parts → P1 to P99

Formula for Grouped Data

\[ Q_i = L + \left( \frac{\frac{iN}{4} - c}{f_Q} \right) w \quad \text{(for quartiles)} \]

Same idea for deciles (\(iN/10\)) and percentiles (\(iN/100\)).

Example: Find Q1, Q2, Q3 from this data:

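| Values | Frequency | Cum. Freq |
|---|---|---|
| 140–150 | 17 | 17 |
| 150–160 | 29 | 46 |
| 160–170 | 42 | 88 |
| 170–180 | 72 | 160 |
| 180–190 | 84 | 244 |
| 190–200 | 107 | 351 |
| 200–210 | 49 | 400 |
| 210–220 | 34 | 434 |
| 220–230 | 31 | 465 |
| 230–240 | 16 | 481 |
| 240–250 | 12 | 493 |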

Total \(N = 493\)

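Q1: \(N/4 = 493/4 = 123.25\) → class = 170–180 (the first class whose cumulative frequency is ≥ 123.25)

\(L = 170\), \(c = 88\), \(f_Q = 72\), \(w = 10\)

\[ Q_1 = 170 + \left( \frac{123.25 - 88}{72} \right) \cdot 10 = 170 + 4.90 = 174.90 \]

Q2 (median): \(2N/4 = 246.5\) → class = 190–200, with \(c = 244\), \(f_Q = 107\)

\[ Q_2 = 190 + \left( \frac{246.5 - 244}{107} \right) \cdot 10 = 190 + 0.23 = 190.23 \]

Q3: \(3N/4 = 369.75\) → class = 200–210, with \(c = 351\), \(f_Q = 49\)

\[ Q_3 = 200 + \left( \frac{369.75 - 351}{49} \right) \cdot 10 = 200 + 3.83 = 203.83 \]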
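Since quartiles, deciles, and percentiles all use the same interpolation idea, one function can cover them all by taking the fraction of the data to cut off (0.25 for Q1, 0.9 for P90, and so on). This is a sketch; the name `grouped_quantile` and the `(lower boundary, upper boundary, frequency)` layout are assumptions made for illustration.

```python
def grouped_quantile(classes, fraction):
    """Quantile of grouped data; fraction is between 0 and 1 (0.25 gives Q1)."""
    n = sum(f for _, _, f in classes)
    target = fraction * n  # e.g. iN/4 for the i-th quartile
    cum = 0  # cumulative frequency before the current class
    for lower, upper, f in classes:
        if f > 0 and cum + f >= target:
            return lower + (target - cum) / f * (upper - lower)
        cum += f
    raise ValueError("quantile class not found")

table = [(140, 150, 17), (150, 160, 29), (160, 170, 42), (170, 180, 72),
         (180, 190, 84), (190, 200, 107), (200, 210, 49), (210, 220, 34),
         (220, 230, 31), (230, 240, 16), (240, 250, 12)]

q1 = grouped_quantile(table, 0.25)
q2 = grouped_quantile(table, 0.50)
q3 = grouped_quantile(table, 0.75)
print(round(q1, 2), round(q2, 2), round(q3, 2))  # 174.9 190.23 203.83

# The same function handles deciles and percentiles, e.g. the 90th percentile
p90 = grouped_quantile(table, 0.90)
```

Passing a fraction rather than separate quartile, decile, and percentile arguments keeps the three formulas in one place.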

Important Point → Why Quantiles Matter in Real Life

They show inequality! In Ethiopia, if P90 income is 10x P10, that tells us about economic gaps.

Schools use percentiles: “You scored in the 85th percentile” means you scored higher than about 85% of students! 🎓

Question: In a test, P25 = 45, P50 = 60, P75 = 80. What does this tell you about difficulty?

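Answer: Half the students scored 60 or below, and the middle 50% of scores run from 45 to 80. The gap from P50 to P75 (20 marks) is wider than the gap from P25 to P50 (15 marks), so scores are more spread out above the median than below it. The test separated the stronger students more than the weaker ones. Percentiles alone can’t tell you whether it was “easy” or “hard” without knowing the maximum mark.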

Summary & Final Thoughts

So there you have it! 🌟

  • Use mean for symmetric, numerical data.
  • Use median for skewed data or outliers.
  • Use mode for categorical data or “most popular.”
  • Use quartiles to understand spread and inequality.

Remember: No single average is “best.” Choose based on your data and goal! 😊

Practice Exercises (With Answers)

Exercise 1: Missing Frequencies

Marks of 75 students:

| Marks | No. of students |
|---|---|
| 40–44 | 7 |
| 45–49 | 10 |
| 50–54 | 22 |
| 55–59 | f4 |
| 60–64 | f5 |
| 65–69 | 6 |
| 70–74 | 3 |

If 20% of students scored between 55–59, find f4, f5, and the mean.

Total students = 75 → 20% of 75 = 15 → f4 = 15

So f5 = 75 – (7+10+22+15+6+3) = 75 – 63 = 12

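Now compute the mean using the grouped formula with class marks 42, 47, 52, 57, 62, 67, 72: \(\sum f_i X_i = 294 + 470 + 1144 + 855 + 744 + 402 + 216 = 4125\), so mean = \(4125 \div 75 = 55\)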

Exercise 2: Correcting a Mistake

Average weight of 10 students = 65 kg. Later, one weight was misread as 40 instead of 80. Find correct average.

Correct mean = Wrong mean + (Correct – Wrong)/n = 65 + (80 – 40)/10 = 65 + 4 = 69 kg

Exercise 3: Transformations

Mean of X is 500. What’s the new mean if:

(a) 10 is added to each number?

(b) Each number is multiplied by -5?

(a) New mean = 500 + 10 = 510

(b) New mean = -5 × 500 = -2500

You’ve made it through Chapter 3! 🎉 Keep practicing, and soon these concepts will feel like second nature. Remember, every great statistician started exactly where you are today. 👏
