Scatter graphs

Identify correlation and use lines of best fit.

Pearson Edexcel, GCSE Maths, Statistics, Foundation and Higher
Visual model

Correlation and line of best fit

Figure labels: positive correlation; line of best fit; trend, not exact.
Look for the overall pattern.
Use the line of best fit for estimates.
Do not extrapolate too far beyond the data.
Gold-standard guide
20 mins

What you will learn

Identify correlation and use lines of best fit.
Use a clear step-by-step method for scatter graphs.
Check your answer and avoid the most common exam mistake.
Useful before you start
  • Core number skills
  • Earlier statistics skills
  • Showing clear working
Core knowledge

Know the rule, then use it

These are the short notes. Read each one, then check you can use it in the worked example below.

Method

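Correlation: positive means both variables increase together; negative means one decreases as the other increases; no correlation means there is no pattern. Strength: strong (points close to the line) or weak (points spread out). Line of best fit: a balanced line through the mean point (x̄, ȳ).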

Step 1

Identify the type of correlation

As revision hours increase, exam scores increase → positive correlation

Step 2

Assess the strength

If points lie close to a straight line, the correlation is strong

Step 3

Describe drawing the line of best fit

Draw a straight line with roughly equal numbers of points on each side

Watch out

Students often draw the line of best fit from one corner of the graph to the other instead of balancing the points on each side.
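
One way to test a line before you commit to it is to count the points above and below it. The short Python sketch below does this for a made-up data set. The hours, scores, both lines and the function name count_sides are assumptions for illustration, not taken from an exam question.

```python
# Hypothetical data: revision hours and exam scores (made up for illustration).
hours = [2, 4, 5, 7, 8, 10]
scores = [37, 45, 54, 62, 70, 79]


def count_sides(xs, ys, gradient, intercept):
    """Count the points above and below the line y = gradient * x + intercept."""
    above = sum(1 for x, y in zip(xs, ys) if y > gradient * x + intercept)
    below = sum(1 for x, y in zip(xs, ys) if y < gradient * x + intercept)
    return above, below


# A line drawn to balance the points: y = 5.5x + 25 (assumed).
print(count_sides(hours, scores, 5.5, 25))  # (3, 3)

# A corner-to-corner line from (0, 0) to (10, 80): y = 8x (assumed axes).
print(count_sides(hours, scores, 8, 0))  # (5, 1)
```

The balanced line leaves three points on each side. The corner-to-corner line leaves five points above and only one below, so it does not follow the trend of the data.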

Correlation

Correlation describes association, not proof of causation.

Line of best fit

Interpolate inside the data range; be careful with extrapolation.

Worked example

Describe the correlation in a scatter graph where higher revision hours are associated with higher exam scores. Draw a line of best fit and explain how to use it.

1

Identify the type of correlation: As revision hours increase, exam scores increase → positive correlation.

2

Assess the strength: If points lie close to a straight line, the correlation is strong.

3

Describe drawing the line of best fit: Draw a straight line with roughly equal numbers of points on each side. It should pass through or near (mean hours, mean score).

4

Use the line of best fit for prediction: Interpolation (reading off a value within the data range) is reliable. Extrapolation (extending the line beyond the data) is unreliable, as the pattern may not continue.

Final answer

Strong positive correlation; line of best fit passes through the mean point
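
The difference between interpolation and extrapolation in step 4 can be sketched in Python. The line y = 5.5x + 25, the data range of 2 to 10 hours and the function name estimate are all made up for illustration.

```python
def estimate(x, gradient, intercept, x_min, x_max):
    """Read a value off the line and say whether x is inside the data range."""
    kind = "interpolation" if x_min <= x <= x_max else "extrapolation"
    return gradient * x + intercept, kind


# Assumed line of best fit y = 5.5x + 25, drawn from data with 2 to 10 hours.
print(estimate(6, 5.5, 25, 2, 10))   # (58.0, 'interpolation')
print(estimate(20, 5.5, 25, 2, 10))  # (135.0, 'extrapolation')
```

The estimate at 20 hours is 135, which would be impossible if the exam were marked out of 100 (an assumption here). This is the kind of unreliable answer that extrapolation can produce.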

Question ladder

Build up to the hardest questions

Do them in order. If you miss a step, read the solution, then redo the question without looking.

Worked (reasoning)

Describe the correlation in a scatter graph where higher revision hours are associated with higher exam scores. Draw a line of best fit and explain how to use it.

4 marks, 4 mins
Worked solution
  1. Identify the type of correlation: As revision hours increase, exam scores increase → positive correlation.
  2. Assess the strength: If points lie close to a straight line, the correlation is strong.
  3. Describe drawing the line of best fit: Draw a straight line with roughly equal numbers of points on each side. It should pass through or near (mean hours, mean score).
  4. Use the line of best fit for prediction: Interpolation (reading off a value within the data range) is reliable. Extrapolation (extending the line beyond the data) is unreliable, as the pattern may not continue.
Final answer

Strong positive correlation; line of best fit passes through the mean point

Mark points
  • M1: identify the type of correlation
  • M1: assess the strength
  • M1: describe drawing the line of best fit
  • M1: use the line of best fit for prediction
  • A1: Strong positive correlation; line of best fit passes through the mean point
Watch out

Students draw the line of best fit from one corner to another rather than balancing points on each side. The line must have roughly equal scatter above and below it; it is not a 'connect-the-dots' line.

Diagnostic (recall)

What type of correlation would you expect between the age of a car and its value?

1 mark, 2 mins
Worked solution
  1. Think about how the two variables change: as a car gets older, its value usually falls.
  2. One variable increases while the other decreases, so the correlation is negative.
  3. Final answer: negative correlation.
Final answer

Negative correlation

Mark points
  • M1: recognise that as one variable increases the other decreases
  • A1: Negative correlation

Easy (procedure)

A line of best fit passes through (5, 40) and (10, 70). Find the equation.

2 marks, 3 mins
Worked solution
  1. Find the gradient: (70 − 40)/(10 − 5) = 30/5 = 6.
  2. Substitute one point into y = 6x + c: 40 = 6 × 5 + c, so c = 10.
  3. Write the equation: y = 6x + 10. Check with the other point: 6 × 10 + 10 = 70.
Final answer

Gradient = (70−40)/(10−5) = 6; y = 6x + 10

Mark points
  • M1: find the gradient using (change in y)/(change in x)
  • A1: Gradient = (70−40)/(10−5) = 6; y = 6x + 10
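
The gradient and intercept calculation for this question can be checked with a short Python sketch. The function name line_through is hypothetical; the two points are the ones given in the question.

```python
def line_through(p, q):
    """Return (gradient, intercept) of the straight line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    gradient = (y2 - y1) / (x2 - x1)   # change in y divided by change in x
    intercept = y1 - gradient * x1     # rearranged from y1 = gradient * x1 + c
    return gradient, intercept


gradient, intercept = line_through((5, 40), (10, 70))
print(f"y = {gradient:g}x + {intercept:g}")  # y = 6x + 10
```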

Medium (reasoning)

Explain why extrapolation is unreliable.

3 marks, 4 mins
Worked solution
  1. Define extrapolation: using the line of best fit to estimate a value outside the range of the data.
  2. Note that the line is based only on the observed data, so there is no evidence about the relationship beyond that range.
  3. Conclude: the relationship shown by the data may not continue beyond the observed range.
Final answer

The relationship shown by the data may not continue beyond the observed range

Mark points
  • M1: state that extrapolation means estimating outside the range of the data
  • A1: The relationship shown by the data may not continue beyond the observed range

Hard (problem solving)

The mean hours studied = 7, mean score = 62. Must the line of best fit pass through (7, 62)?

3 marks, 5 mins
Worked solution
  1. Recall the rule: the line of best fit is a balanced line through the mean point (x̄, ȳ).
  2. Identify the mean point: (mean hours, mean score) = (7, 62).
  3. Conclude: yes, the line of best fit should pass through (7, 62).
Final answer

Yes — the mean point lies on the line of best fit

Mark points
  • M1: identify (7, 62) as the mean point (x̄, ȳ)
  • A1: Yes — the mean point lies on the line of best fit
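
For a line of best fit that is calculated rather than drawn by eye (the least-squares line, which goes beyond GCSE), the claim can be checked numerically. The sketch below uses made-up data chosen so that the mean point is (7, 62); the data values and the function name least_squares are assumptions for illustration.

```python
def least_squares(xs, ys):
    """Return (gradient, intercept, mean_x, mean_y) for the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept, mean_x, mean_y


# Hypothetical data with mean hours 7 and mean score 62.
hours = [3, 5, 7, 9, 11]
scores = [40, 52, 61, 73, 84]

gradient, intercept, mean_x, mean_y = least_squares(hours, scores)
print(mean_x, mean_y)  # 7.0 62.0
# Value of the line at the mean number of hours, rounded to avoid float noise.
print(round(gradient * mean_x + intercept, 6))
```

The second value printed matches the mean score, so the calculated line passes through the mean point. A line drawn by eye should be drawn through or very near that point too.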

Exam-style (multi-step)

Distinguish between correlation and causation.

4 marks, 6 mins
Worked solution
  1. Define correlation: two variables tend to change together, which shows an association.
  2. Define causation: a change in one variable directly causes the change in the other.
  3. Conclude: correlation describes association, so on its own it does not prove causation.
Final answer

Correlation means two variables change together; causation means one directly causes the change in the other

Mark points
  • M1: state that correlation describes an association between two variables
  • A1: Correlation means two variables change together; causation means one directly causes the change in the other

Grade 9 stretch (problem solving)

Explain why using a line of best fit to predict far outside the data range may be unreliable.

4 marks, 7 mins
Worked solution
  1. Identify that the prediction is an extrapolation.
  2. Explain that the observed relationship may not continue.
Final answer

It is an extrapolation, so the pattern may change beyond the observed data.

Mark points
  • C1: identify extrapolation
  • C1: explain unreliability
Watch out

Do not rush straight into arithmetic. Select the relevant method and show a complete chain of working.

Timed checkpoint
12 mins - 9 marks

Switch between skills

Set a timer and attempt all four questions before opening any answers. This is closer to the way skills appear in a real paper.

1. Scatter graphs (2 marks): What type of correlation would you expect between the age of a car and its value?
Answer

Negative correlation

2. Collecting and sampling data (2 marks): Why might a questionnaire question be biased?
Answer

Leading wording, only offering responses that agree, or not including a 'no' option

3. Averages and range (2 marks): The mean of 5 numbers is 12. Four of them are 8, 14, 10, 15. Find the fifth.
Answer

13

4. Grouped data and estimated mean (3 marks): A survey records 5 responses in the group [10, 20) and 15 responses in the group [20, 30). Estimate the mean across both groups.
Answer

Use midpoints 15 and 25: (5×15 + 15×25)/20 = 450/20 = 22.5
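
The arithmetic in answers 3 and 4 can be checked with a few lines of Python. This is only a sketch for checking your working; the variable names are made up for illustration.

```python
# Question 3: the five numbers must total 5 x 12 = 60.
known = [8, 14, 10, 15]
fifth = 5 * 12 - sum(known)
print(fifth)  # 13

# Question 4: estimated mean using the midpoint of each group.
groups = [((10, 20), 5), ((20, 30), 15)]  # ((lower, upper), frequency)
total = sum((low + high) / 2 * freq for (low, high), freq in groups)
count = sum(freq for _, freq in groups)
estimated_mean = total / count
print(estimated_mean)  # 22.5
```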

Mastery check
  • I can explain the method for scatter graphs.
  • I can show clear working without skipping key steps.
  • I can avoid this mistake: drawing the line of best fit from one corner to another rather than balancing points on each side. The line must have roughly equal scatter above and below it; it is not a 'connect-the-dots' line.
Official exam-board sources

This guide follows the Pearson Edexcel GCSE Mathematics 1MA1 specification. Practice questions are original Learnova questions shaped around official content and exam skills.
