Scatter Plots


What a scatter plot shows, how to read correlation, and why a pattern is never proof of cause.

A scatter plot answers one question better than any other chart: are these two numeric things related? Instead of summarising the data into bars or slices, it plots every observation as a single point, positioned by its value on the horizontal (x) axis and its value on the vertical (y) axis. The result is a cloud of dots whose shape tells the story — whether the two variables rise together, pull against each other, or drift independently.

Figure: Each dot is one observation, positioned by its x variable along the horizontal axis; a cloud that rises left-to-right shows a positive relationship.

How a scatter plot works

Both axes carry a continuous numeric scale — there are no categories. To place a point you take one observation that has been measured on two variables, read its first value along the x axis and its second value up the y axis, and drop a dot where they meet. Repeat for every observation and you have built the plot. Because position is the most precisely read visual cue we have, the eye can detect trends, gaps, and stragglers in the cloud almost instantly, even across hundreds of points.

The key insight is that you are not reading any single dot — you are reading the collective shape. One point on its own says little; a few hundred points arranged into a clear diagonal band says a great deal.
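The placement steps described above can be sketched in a few lines of Python. This is only an illustrative text-grid renderer, not how any particular charting tool works; the function name text_scatter and the sample data (hours studied against test score) are hypothetical.

```python
def text_scatter(xs, ys, width=20, height=8):
    """Render paired values as a small text grid, one '*' per observation."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and the same length")

    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Avoid dividing by zero when every value on an axis is identical.
    x_span = (x_max - x_min) or 1
    y_span = (y_max - y_min) or 1

    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Read the first value along the x axis...
        col = round((x - x_min) / x_span * (width - 1))
        # ...and the second value up the y axis. Row 0 is the top of the
        # grid, so larger y values map to smaller row numbers.
        row = (height - 1) - round((y - y_min) / y_span * (height - 1))
        grid[row][col] = "*"
    return ["".join(line) for line in grid]


# Hypothetical data: hours studied (x) and test score (y) for six students.
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 55, 61, 70, 74, 83]
plot = text_scatter(hours, scores)
print("\n".join(plot))
```

With this sample the stars climb from the bottom-left corner to the top-right, which is the rising cloud described above.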

Reading correlation: the three basic shapes

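Correlation is just the word for the direction and tightness of that cloud. Three patterns cover most of what you will see: positive (the cloud rises left-to-right), negative (the cloud runs downhill), and no correlation (a scattered blob).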

Figure (panels: negative, no correlation): A downhill cloud is negative correlation; a scattered blob shows no relationship.

Two further qualities matter. Strength is how tightly the points hug an imaginary trend line: a narrow band is a strong relationship, a wide spread is a weak one. Form is whether the trend is a straight line or a curve — some variables rise quickly then level off, which a straight-line summary would miss.
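One common way to put a number on direction and strength is the Pearson correlation coefficient, usually written r. The guide itself does not name a specific measure, so treat this as one possible choice; the function name pearson_r and the sample values are made up for illustration. The sign of r gives the direction, and values near +1 or -1 indicate a tight straight-line band. The third example hints at why form matters: a perfect curve can still score r = 0.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable never changes")
    return sxy / math.sqrt(sxx * syy)


# Illustrative values only.
rising = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])    # straight uphill line
falling = pearson_r([1, 2, 3, 4, 5], [10, 8, 6, 4, 2])   # straight downhill line
curved = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])   # U-shaped curve

print(rising, falling, curved)
```

Because r only summarises straight-line trends, it is worth looking at the plot itself before trusting the number.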

Outliers and clusters

Beyond the overall trend, two features are worth hunting for. Outliers are points that sit far from the rest of the cloud — unusually high, unusually low, or simply away from the pattern. They can flag a data-entry error, a genuinely exceptional case, or the most interesting observation in the whole set, so they deserve a second look rather than automatic deletion.
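As a rough aid to that second look, the sketch below flags points that are unusually high or low on either axis, using a modified z-score built from the median and the median absolute deviation (MAD). This is one heuristic among many, and the 3.5 threshold is a conventional rule of thumb rather than anything the guide prescribes. It will not catch a point that sits away from the pattern while staying inside the normal range on both axes. The function names and data are hypothetical.

```python
from statistics import median


def modified_z_scores(values):
    """Distance of each value from the median, in robust (MAD-based) units."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # More than half the values are identical; this heuristic cannot
        # rank anything, so nothing is scored as unusual.
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]


def flag_outliers(xs, ys, threshold=3.5):
    """Return the indices of points that are extreme on the x or y axis."""
    zx = modified_z_scores(xs)
    zy = modified_z_scores(ys)
    return [
        i for i, (a, b) in enumerate(zip(zx, zy))
        if abs(a) > threshold or abs(b) > threshold
    ]


# Illustrative data: the last point has an unusually high y value.
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [10, 12, 11, 13, 12, 14, 40]
flagged = flag_outliers(xs, ys)
print(flagged)  # prints [6]
```

The function only returns indices; deciding whether a flagged point is an error or a genuine exception is still a judgement call.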

Clusters are clumps of points that group together, hinting that the data contains distinct subgroups. A cloud that splits into two separate clumps may mean you are unknowingly mixing two different populations, and analysing them together can hide or invent a relationship. Colouring points by a third attribute often makes such groups visible.

When to use a scatter plot

A good test

If your question is "does this go up when that goes up?" — and both "this" and "that" are numbers you measured on the same items — a scatter plot is the right tool. If one of your axes is really a set of named categories, you want a bar chart instead.
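The test above can be expressed as a small check on the data itself. This is a simplified sketch: the helper names is_numeric and suggest_chart are hypothetical, and real data often needs more care (for example, numbers stored as text, or numeric codes that are really category labels).

```python
from numbers import Real


def is_numeric(value):
    """True for ints and floats; bool is excluded even though it subclasses int."""
    return isinstance(value, Real) and not isinstance(value, bool)


def suggest_chart(xs, ys):
    """Suggest a chart type from whether each axis holds measured numbers."""
    if len(xs) != len(ys):
        raise ValueError("each item needs one x value and one y value")
    x_numeric = all(is_numeric(v) for v in xs)
    y_numeric = all(is_numeric(v) for v in ys)
    if x_numeric and y_numeric:
        return "scatter plot"
    if x_numeric != y_numeric:
        # Exactly one axis is a set of named categories.
        return "bar chart"
    return "no suggestion"


print(suggest_chart([1.5, 2.0, 3.2], [10, 14, 19]))       # scatter plot
print(suggest_chart(["north", "south", "east"], [3, 4, 5]))  # bar chart
```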

When not to use a scatter plot

Scatter plots are powerful but specialised, and using them for the wrong job makes them confusing.

The mistake that matters most: correlation is not causation

Watch out for

A scatter plot can show two variables moving together, but it can never prove that one causes the other. The cause might run the opposite way, both might be driven by an unseen third factor, or a tidy diagonal might be pure coincidence in a small sample. Treat a strong pattern as a question worth investigating — not as an answer. Watch too for a single extreme outlier dragging an apparent trend out of thin air, and for hidden clusters faking a relationship that vanishes once the groups are separated.
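The outlier warning is easy to demonstrate with numbers. The sketch below fits an ordinary least-squares trend line to a trendless blob of four points, then again after adding one extreme point. The function name least_squares_slope and the data are made up for illustration.

```python
def least_squares_slope(points):
    """Slope of the ordinary least-squares line through (x, y) pairs."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("slope is undefined when every x value is the same")
    return sxy / sxx


# Illustrative data: four points with no trend, then one extreme addition.
blob = [(1, 1), (1, 2), (2, 1), (2, 2)]
with_outlier = blob + [(10, 10)]

slope_blob = least_squares_slope(blob)
slope_with = least_squares_slope(with_outlier)

print(slope_blob)            # flat: no trend in the blob alone
print(round(slope_with, 2))  # steep upward trend created by one point
```

A quick habit that follows from this: refit the trend with the most extreme point removed and see whether the pattern survives.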

Make a scatter plot

Ready to plot your own data? The free scatter plot maker lets you enter or paste X and Y values, label the axes, and export a PNG or SVG, with no signup. Or see how scatter plots compare with the other options in the complete guide to chart types.