Study Guide

Statistics Class 9: Mean, Median, Mode & Frequency Distribution

A complete guide to NCERT Chapter 14 — data presentation, graphical representation and measures of central tendency.

CBSE | Class 9
The SparkEd Authors (IITian & Googler) | 15 March 2026 | 12 min read

What is Statistics and Why Does It Matter?

Statistics is the branch of mathematics that deals with collecting, organising, analysing and interpreting data. In Class 9, NCERT Chapter 14 introduces you to the foundational tools of statistics.

Why care? Every field — from cricket rankings to election predictions to medical research — uses statistics. Understanding how to summarise data with a single number (mean, median or mode) is a life skill, not just an exam topic.

The chapter covers three main areas:
1. Collection and presentation of data (frequency distribution tables)
2. Graphical representation (bar graphs, histograms, frequency polygons)
3. Measures of central tendency (mean, median, mode)

Let's go through each one.

Collection and Presentation of Data

Raw data is unorganised information. For example, the marks of 30 students in a test:

56, 42, 78, 65, 42, 89, 56, 73, 42, 91, 78, 56, 65, 42, 78, 89, 56, 73, 65, 91, 42, 56, 78, 65, 73, 89, 42, 56, 78, 91

Looking at this, it is hard to draw any conclusion. That's why we organise data.

Frequency: The number of times a particular value occurs in the data.

Ungrouped Frequency Distribution Table:

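| Marks | Tally | Frequency |
|---|---|---|
| 42 | $\mid\mid\mid\mid\mid\mid$ | 6 |
| 56 | $\mid\mid\mid\mid\mid\mid$ | 6 |
| 65 | $\mid\mid\mid\mid$ | 4 |
| 73 | $\mid\mid\mid$ | 3 |
| 78 | $\mid\mid\mid\mid\mid$ | 5 |
| 89 | $\mid\mid\mid$ | 3 |
| 91 | $\mid\mid\mid$ | 3 |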

Now the data makes much more sense at a glance!
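The tallying step can also be checked mechanically. The sketch below is only an illustration (the variable names `marks` and `frequency` are our own, not part of the NCERT text); it counts how often each mark occurs in the raw data listed earlier.

```python
from collections import Counter

# Raw marks of the 30 students, copied from the list above.
marks = [56, 42, 78, 65, 42, 89, 56, 73, 42, 91,
         78, 56, 65, 42, 78, 89, 56, 73, 65, 91,
         42, 56, 78, 65, 73, 89, 42, 56, 78, 91]

# Counter maps each distinct mark to the number of times it occurs.
frequency = Counter(marks)

# Print the table in ascending order of marks, with one stroke per occurrence.
for value in sorted(frequency):
    count = frequency[value]
    print(f"{value:>5}  {'|' * count:<6}  {count}")
```

The frequencies should add up to 30, the total number of students, which is a quick check that no observation was missed.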

Grouped Frequency Distribution

When data has a large range, we group it into class intervals (also called bins).

Key Terms:
- Class interval: A range like 10-20, 20-30, etc.
- Class size / width: Upper limit - Lower limit. For 10-20, class size $= 10$.
- Class mark (mid-value): $\frac{\text{Upper limit} + \text{Lower limit}}{2}$. For 10-20, class mark $= 15$.
- Frequency: Number of observations falling in each class interval.

Example: The marks of 50 students (out of 100) are grouped:

| Class Interval | Frequency |
|---|---|
| 0-20 | 4 |
| 20-40 | 8 |
| 40-60 | 15 |
| 60-80 | 14 |
| 80-100 | 9 |

Convention: In 20-40, the observation 20 is included but 40 is not. This ensures each observation falls in exactly one class.

Cumulative Frequency: The running total of frequencies.

| Class | Frequency | Cumulative Frequency |
|---|---|---|
| 0-20 | 4 | 4 |
| 20-40 | 8 | 12 |
| 40-60 | 15 | 27 |
| 60-80 | 14 | 41 |
| 80-100 | 9 | 50 |
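As a rough sketch of how these quantities fit together, the snippet below recomputes the class marks and the cumulative frequencies for the 50-student example, and applies the "lower limit included, upper limit excluded" convention. The helper name `class_of` is hypothetical and used only for illustration.

```python
from itertools import accumulate

# Class intervals and frequencies from the 50-student example above.
classes = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
frequencies = [4, 8, 15, 14, 9]

# Class mark = (upper limit + lower limit) / 2
class_marks = [(lower + upper) / 2 for lower, upper in classes]

# Cumulative frequency = running total of the frequencies
cumulative = list(accumulate(frequencies))

def class_of(observation):
    """Return the class containing an observation (lower limit included, upper limit excluded)."""
    for lower, upper in classes:
        if lower <= observation < upper:
            return (lower, upper)
    return None  # the observation lies outside every listed class

print(class_marks)
print(cumulative)
print(class_of(20))  # 20 belongs to 20-40, not 0-20
```

The last cumulative frequency always equals the total number of observations, here 50.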


Graphical Representation of Data

Graphs make data visual and easy to compare. Class 9 covers three types.

1. Bar Graph:
- Used for ungrouped or categorical data.
- Bars of equal width with uniform gaps between them.
- Height of each bar = frequency of that category.

2. Histogram:
- Used for grouped (continuous) data.
- Bars are drawn without gaps (because the class intervals are continuous).
- Width of each bar = class size; height = frequency.
- If class sizes are unequal, use frequency density on the y-axis: $\text{Frequency density} = \frac{\text{Frequency}}{\text{Class width}}$.

3. Frequency Polygon:
- Formed by joining the mid-points of the tops of the bars in a histogram with straight lines.
- Can also be drawn independently: plot class marks on the x-axis and frequencies on the y-axis, then connect the points.
- Add a class interval with zero frequency on each side to close the polygon.

Key Difference: Bar graphs have gaps between bars; histograms do not. Bar graphs work for discrete data; histograms work for continuous data.
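To make the frequency polygon steps concrete, here is a small sketch that lists the points you would plot for the 50-student grouped table from the previous section. It assumes all classes have the same width; the names `points` and `polygon` are our own.

```python
# Grouped data from the 50-student example (equal class widths assumed).
classes = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
frequencies = [4, 8, 15, 14, 9]
width = classes[0][1] - classes[0][0]

# One point per class: (class mark, frequency).
points = [((lower + upper) / 2, f) for (lower, upper), f in zip(classes, frequencies)]

# Close the polygon with an imagined zero-frequency class on each side.
first_mark = points[0][0]
last_mark = points[-1][0]
polygon = [(first_mark - width, 0)] + points + [(last_mark + width, 0)]

for x, y in polygon:
    print(f"class mark {x:>6}: frequency {y}")
```

Joining these points in order with straight lines gives the polygon. The imagined class before 0-20 has a negative class mark, which is fine for drawing purposes even though no mark can actually be negative.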

Mean (Arithmetic Mean)

The mean is the most commonly used measure of central tendency. It gives the 'average' value.

For raw data:

$$\bar{x} = \frac{\text{Sum of all observations}}{\text{Number of observations}} = \frac{\sum x_i}{n}$$

For ungrouped frequency distribution:

$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$$

where $f_i$ is the frequency and $x_i$ is the observation.

Solved Example:
The following table shows the number of goals scored by a football team in 20 matches.

| Goals ($x_i$) | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Matches ($f_i$) | 2 | 5 | 6 | 4 | 2 | 1 |

$$\sum f_i x_i = (0 \times 2) + (1 \times 5) + (2 \times 6) + (3 \times 4) + (4 \times 2) + (5 \times 1)$$

$$= 0 + 5 + 12 + 12 + 8 + 5 = 42$$

$$\sum f_i = 2 + 5 + 6 + 4 + 2 + 1 = 20$$

$$\bar{x} = \frac{42}{20} = 2.1$$

The team scored an average of $2.1$ goals per match.
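The same calculation can be written as a short function. This is a minimal sketch for checking your arithmetic; the function name `frequency_mean` is our own choice, not a standard library call.

```python
def frequency_mean(values, frequencies):
    """Mean of an ungrouped frequency distribution: sum(f_i * x_i) / sum(f_i)."""
    total_fx = sum(f * x for x, f in zip(values, frequencies))
    total_f = sum(frequencies)
    return total_fx / total_f

# Goals table from the solved example above.
goals = [0, 1, 2, 3, 4, 5]
matches = [2, 5, 6, 4, 2, 1]

mean_goals = frequency_mean(goals, matches)
print(mean_goals)
```

Each product `f * x` corresponds to one entry of the $f_i x_i$ column you would write in the exam.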

Median

The median is the middle value when the data is arranged in ascending (or descending) order. It divides the data into two equal halves.

How to find the median:
1. Arrange all observations in ascending order.
2. If $n$ (number of observations) is odd: Median $= \left(\frac{n+1}{2}\right)^{\text{th}}$ observation.
3. If $n$ is even: Median $= \frac{1}{2}\left[\left(\frac{n}{2}\right)^{\text{th}} + \left(\frac{n}{2} + 1\right)^{\text{th}}\right]$ observations.

**Solved Example 1 (Odd $n$):**
Find the median of: 3, 7, 2, 9, 5, 11, 4.

Arranged: 2, 3, 4, 5, 7, 9, 11. Here $n = 7$ (odd).

Median $= \left(\frac{7+1}{2}\right)^{\text{th}} = 4^{\text{th}}$ observation $= 5$.

**Solved Example 2 (Even $n$):**
Find the median of: 12, 8, 15, 20, 6, 10.

Arranged: 6, 8, 10, 12, 15, 20. Here $n = 6$ (even).

Median $= \frac{1}{2}\left[3^{\text{rd}} + 4^{\text{th}}\right] = \frac{10 + 12}{2} = 11$.

When is median better than mean? When data has extreme values (outliers). For example, if incomes are $\{20000, 25000, 22000, 30000, 500000\}$, the mean ($119400$) is skewed by the outlier, but the median ($25000$) better represents the typical income.
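The odd and even rules can be sketched as one small function. This is an illustrative version written for this guide (Python's built-in `statistics.median` does the same job); note that list positions in Python start at 0, so the $4^{\text{th}}$ observation sits at index 3.

```python
def median(observations):
    """Median following the odd/even rule: sort first, then pick the middle."""
    ordered = sorted(observations)  # step 1: arrange in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        # odd n: the ((n + 1) / 2)-th observation, which is index n // 2
        return ordered[middle]
    # even n: average of the (n / 2)-th and (n / 2 + 1)-th observations
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([3, 7, 2, 9, 5, 11, 4]))     # solved example 1
print(median([12, 8, 15, 20, 6, 10]))     # solved example 2

# Income example: the outlier moves the mean far more than the median.
incomes = [20000, 25000, 22000, 30000, 500000]
print(sum(incomes) / len(incomes), median(incomes))
```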

Mode

The mode is the value that occurs most frequently in the data. It is the simplest measure of central tendency.

How to find the mode: Count the frequency of each value. The value with the highest frequency is the mode.

Solved Example:
Find the mode of: 2, 3, 5, 3, 7, 3, 8, 5, 3, 2, 5.

| Value | 2 | 3 | 5 | 7 | 8 |
|---|---|---|---|---|---|
| Frequency | 2 | 4 | 3 | 1 | 1 |

The value 3 has the highest frequency (4).

$\therefore$ Mode $= 3$.

Special Cases:
- No mode: If all values occur with equal frequency (e.g., 1, 2, 3, 4, 5).
- Bimodal: If two values have the same highest frequency (e.g., in $\{1, 2, 2, 3, 3, 4\}$, both 2 and 3 are modes).

When is mode useful? When you want the most 'popular' or 'common' value — e.g., the most common shoe size in a class or the most frequently ordered dish in a restaurant.
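A short sketch of the counting procedure, including the two special cases, is given below. The function name `modes` is our own, and it follows this guide's convention of reporting "no mode" when every value is equally frequent (Python's built-in `statistics.multimode` would instead return all the values in that case).

```python
from collections import Counter

def modes(observations):
    """Return the list of modes; an empty list means 'no mode'."""
    if not observations:
        return []
    counts = Counter(observations)
    highest = max(counts.values())
    # Every distinct value equally frequent (and more than one value): no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(value for value, c in counts.items() if c == highest)

print(modes([2, 3, 5, 3, 7, 3, 8, 5, 3, 2, 5]))  # solved example
print(modes([1, 2, 3, 4, 5]))                    # no mode
print(modes([1, 2, 2, 3, 3, 4]))                 # bimodal
```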

Mean vs Median vs Mode: When to Use Which

| Measure | Best When | Affected by Outliers? |
|---|---|---|
| Mean | Data is evenly distributed, no extreme values | Yes, heavily |
| Median | Data has outliers or is skewed | No |
| Mode | You want the most common value; categorical data | No |

Empirical Relationship (for moderately skewed data):

$$\text{Mode} \approx 3 \times \text{Median} - 2 \times \text{Mean}$$

This formula is handy for quick estimation and sometimes appears in objective-type questions.

Solved Example:
If the mean of a data set is $24$ and the median is $26$, estimate the mode.

$$\text{Mode} \approx 3(26) - 2(24) = 78 - 48 = 30$$
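For quick checks of objective-type answers, the empirical relationship can be wrapped in a one-line helper. This is only a sketch (the name `estimate_mode` is hypothetical), and the result is an estimate for moderately skewed data, not an exact mode.

```python
def estimate_mode(mean, median):
    """Empirical estimate: Mode is approximately 3 * Median - 2 * Mean."""
    return 3 * median - 2 * mean

print(estimate_mode(mean=24, median=26))  # the solved example above
```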

More Solved Examples for Practice

Example 1: Mean and median from raw data

The runs scored by 11 players in a cricket match are: 6, 15, 120, 50, 100, 80, 10, 15, 8, 7, 1. Find the mean and median.

Solution:

$$\bar{x} = \frac{6 + 15 + 120 + 50 + 100 + 80 + 10 + 15 + 8 + 7 + 1}{11} = \frac{412}{11} = 37.\overline{45}$$

Arranged: 1, 6, 7, 8, 10, 15, 15, 50, 80, 100, 120. Here $n = 11$ (odd).

Median $= 6^{\text{th}}$ value $= 15$.

Notice the big difference between the mean ($37.45$) and median ($15$). The mean is pulled up by the century-scorers (100 and 120). In this case, the median better represents the typical score.
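If you want to verify this answer on a computer, Python's standard `statistics` module provides `mean` and `median`. The snippet below is just a check of the worked solution; the variable names are our own.

```python
from statistics import mean, median

# Runs scored by the 11 players in Example 1.
runs = [6, 15, 120, 50, 100, 80, 10, 15, 8, 7, 1]

runs_mean = mean(runs)      # 412 / 11
runs_median = median(runs)  # middle (6th) value after sorting

print(round(runs_mean, 2), runs_median)
```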

Example 2: Finding Missing Frequency

Problem: The mean of the following distribution is $50$. Find the missing frequency $f$.

| $x_i$ | 10 | 30 | 50 | 70 | 90 |
|---|---|---|---|---|---|
| $f_i$ | 17 | $f$ | 32 | 24 | 19 |

Solution:

$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = 50$$

$$\sum f_i x_i = 17(10) + f(30) + 32(50) + 24(70) + 19(90)$$
$$= 170 + 30f + 1600 + 1680 + 1710 = 5160 + 30f$$

$$\sum f_i = 17 + f + 32 + 24 + 19 = 92 + f$$

$$50 = \frac{5160 + 30f}{92 + f}$$

$$50(92 + f) = 5160 + 30f$$

$$4600 + 50f = 5160 + 30f$$

$$20f = 560$$

$$f = 28$$
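The same algebra works for any problem of this type. Writing $S$ for the sum of the known $f_i x_i$, $K$ for the sum of the known frequencies, $m$ for the given mean and $x$ for the value whose frequency is missing, the equation $m(K + f) = S + fx$ rearranges to $f = \frac{S - mK}{m - x}$. The sketch below applies that rearrangement; the function name `missing_frequency` and the use of `None` to mark the gap are our own conventions.

```python
from fractions import Fraction

def missing_frequency(values, frequencies, target_mean):
    """Solve for the single missing frequency (marked as None) given the mean."""
    index = frequencies.index(None)
    x_missing = values[index]
    known_fx = sum(f * x for x, f in zip(values, frequencies) if f is not None)
    known_f = sum(f for f in frequencies if f is not None)
    if target_mean == x_missing:
        raise ValueError("mean equals the value with the missing frequency; f cannot be determined")
    # From mean * (known_f + f) = known_fx + f * x_missing
    return Fraction(known_fx - target_mean * known_f) / Fraction(target_mean - x_missing)

values = [10, 30, 50, 70, 90]
frequencies = [17, None, 32, 24, 19]

f = missing_frequency(values, frequencies, target_mean=50)
print(f)
```

Using `Fraction` keeps the answer exact, so a result that is not a whole number signals either an arithmetic slip or a question with inconsistent data.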

Exam Tips & Common Mistakes

Mistake 1: Forgetting to arrange data before finding the median. The median requires data in ascending order. Jumping straight to the middle value of unsorted data gives the wrong answer.

Mistake 2: Confusing histograms and bar graphs. Histograms have no gaps and are for continuous grouped data. Bar graphs have gaps and work for discrete or categorical data.

**Mistake 3: Wrong formula for even $n$ median.** When $n$ is even, you need the average of the two middle values, not just one of them.

Strategy: In the exam, if a question gives a frequency table and asks for the mean, set up a clean table with columns for $x_i$, $f_i$ and $f_i x_i$. This organised approach prevents arithmetic errors and earns full marks.

Marks Tip: Statistics questions in CBSE Class 9 are typically 3-mark or 5-mark questions. They are considered scoring because the method is straightforward — just be careful with arithmetic.

Summary & Next Steps

Here is a compact recap of everything from NCERT Chapter 14.

Data Presentation: Raw data $\rightarrow$ Frequency distribution (ungrouped / grouped) $\rightarrow$ Graphical representation (bar graph / histogram / frequency polygon).

Measures of Central Tendency:
- Mean $= \frac{\sum f_i x_i}{\sum f_i}$
- Median = Middle value (after sorting)
- Mode = Most frequent value

Key Insight: Mean is affected by extreme values; median and mode are not. Choose the right measure based on the data.

Want to practise statistics problems with instant feedback? Head to the SparkEd Statistics practice page for adaptive questions. Use the SparkEd Math Solver for step-by-step calculation checks, or ask the SparkEd Coach to explain any concept in more detail.
