Representing data - Eduqas: Cumulative frequency diagrams – Higher

Data is represented in many different forms. Using bar charts, pie charts and frequency diagrams can make information easier to digest.


Cumulative frequency diagrams – Higher

A cumulative frequency table shows a running total of the frequencies. A cumulative frequency diagram reproduces this table as a graph.

The table below shows the lengths of 40 babies at birth.

To calculate the cumulative frequencies, add the frequencies together.

| Length (cm) | Frequency | Cumulative frequency |
|---|---|---|
| \(30 < l \leq 35\) | 4 | 4 |
| \(35 < l \leq 40\) | 10 | 14 (\(4 + 10 = 14\)) |
| \(40 < l \leq 45\) | 11 | 25 (\(14 + 11 = 25\)) |
| \(45 < l \leq 50\) | 12 | 37 (\(25 + 12 = 37\)) |
| \(50 < l \leq 55\) | 3 | 40 (\(37 + 3 = 40\)) |

The cumulative frequency tells us, for example, that 14 babies have a length greater than 30 cm and at most 40 cm.
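The running total can also be checked with a few lines of Python. This is only a minimal sketch: the variable names `classes`, `frequencies` and `cumulative` are illustrative, and the data is copied from the table of baby lengths.

```python
from itertools import accumulate

# Class intervals (lower, upper) in cm and their frequencies, from the table of baby lengths.
classes = [(30, 35), (35, 40), (40, 45), (45, 50), (50, 55)]
frequencies = [4, 10, 11, 12, 3]

# Running total: each entry is the sum of all the frequencies up to and including that class.
cumulative = list(accumulate(frequencies))

for (lower, upper), cf in zip(classes, cumulative):
    print(f"{lower} < l <= {upper}: cumulative frequency {cf}")
```

The last cumulative frequency should always equal the total frequency, which here is 40.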

A cumulative frequency diagram is drawn by plotting the cumulative frequency against the upper class boundary of the respective group. The upper class boundaries for this table are 35, 40, 45, 50 and 55.

Cumulative frequency is plotted on the vertical axis and length is plotted on the horizontal axis.
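As a rough sketch, the points to plot can be built by pairing each upper class boundary with its cumulative frequency. The names below are illustrative. The starting point \((30, 0)\) is an assumption based on the common convention that no baby is shorter than the lowest class boundary; it is not stated in the text above.

```python
# Upper class boundaries (cm) and cumulative frequencies from the table.
upper_boundaries = [35, 40, 45, 50, 55]
cumulative = [4, 14, 25, 37, 40]

# Each point is (length on the horizontal axis, cumulative frequency on the vertical axis).
points = list(zip(upper_boundaries, cumulative))

# Assumption: start the graph at the lowest class boundary with a cumulative frequency of 0.
points = [(30, 0)] + points

for length, cf in points:
    print(f"plot ({length}, {cf})")
```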

Length vs Cumulative frequency graph

Finding the median and interquartile range from a cumulative frequency diagram

A cumulative frequency diagram is a good way to estimate the median (the middle value) and the interquartile range.

To find the median, work out \(\frac{1}{2}\) of the total frequency. Find this value on the vertical axis (the cumulative frequency axis). Draw a line across until it meets the curve. Draw a vertical line from that intersection to meet the horizontal axis. This will be the median.

The interquartile range is the difference between the upper quartile and the lower quartile. Together with the median, the quartiles split the data into four equal parts. The interquartile range is a measure of how spread out the data is. It is more reliable than the range because it is not affected by extreme values, whether very high or very low. It looks only at the middle 50% of the data, ignoring the extremes.

To find the lower quartile, use the same method as for the median, except use \(\frac{1}{4}\) of the total frequency rather than \(\frac{1}{2}\). To find the upper quartile, use \(\frac{3}{4}\) instead.

Example

In the earlier example, there were 40 babies.

For the median:

\(\frac{1}{2}\) of 40 = 20. Read across at 20 and down. The median length is about 43 cm.

For the lower quartile:

\(\frac{1}{4}\) of 40 = 10. Read across at 10 and down. The lower quartile is about 38 cm.

For the upper quartile:

\(\frac{3}{4}\) of 40 = 30. Read across at 30 and down. The upper quartile is about 47 cm.

The interquartile range is the upper quartile minus the lower quartile, so for this data the interquartile range is \(47 - 38 = 9\) cm.
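The read-across-and-down method can be imitated in code. The sketch below is not the graphical method itself: it joins the plotted points with straight lines (linear interpolation) instead of a smooth curve, so its answers are estimates that may differ slightly from a hand-drawn graph. The function name `read_off` and the starting point \((30, 0)\) are assumptions made for illustration.

```python
def read_off(points, target):
    """Estimate the length at which the cumulative frequency reaches `target`,
    using straight lines between neighbouring points."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1 and y1 > y0:
            # Move along the segment in proportion to how far target lies between y0 and y1.
            return x0 + (x1 - x0) * (target - y0) / (y1 - y0)
    raise ValueError("target is outside the range of the cumulative frequencies")

# (upper class boundary, cumulative frequency); the point (30, 0) is an assumed start.
points = [(30, 0), (35, 4), (40, 14), (45, 25), (50, 37), (55, 40)]
total = points[-1][1]  # total frequency, 40

lower_quartile = read_off(points, total / 4)      # read across at 10
median = read_off(points, total / 2)              # read across at 20
upper_quartile = read_off(points, 3 * total / 4)  # read across at 30
iqr = upper_quartile - lower_quartile

print(round(lower_quartile), round(median), round(upper_quartile), round(iqr))
```

Rounded to the nearest centimetre, these estimates agree with the values of 38 cm, 43 cm, 47 cm and 9 cm read from the graph.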

Cumulative frequency graph showing how to read off the lower quartile, median and upper quartile