How to Summarize Data for a Chart

ARTICLE

Most data needs aggregating before it can be charted — here is how to count, sum, and average it the right way.

A chart shows a few summarized numbers, but most data arrives as many raw rows. A spreadsheet might have one row per sale, per visit, or per measurement — hundreds or thousands of them. You cannot meaningfully draw a bar for every row. The step almost everyone skips is the most important one: aggregating those rows into a small table of summary values that a chart can show. This guide covers how to do that, and how the summary you pick determines the chart you get.

Why raw rows need summarizing first

Imagine a table with one row for every order: a date, a region, and an amount. There might be 900 rows. A chart of 900 bars is unreadable, and it would not answer any real question anyway. What you actually want to see is something like total sales per region, or the number of orders per month — a handful of values, each rolling up many rows. That rolling-up is aggregation: collapsing many records into one number per group.

There are two honest exceptions. A scatter plot plots one point per record on purpose, to show the relationship between two numeric columns. And a histogram takes all the individual numeric values and groups them into ranges for you, so the "summarizing" is built into the chart. For nearly everything else — bar charts, line charts, pie charts — you summarize before you plot.

The three core summaries: count, sum, average

Almost every aggregation is one of three operations:

The choice of average matters. Suppose five values are 2, 3, 3, 4, and 28. The mean is 40 ÷ 5 = 8, but four of the five values sit below it — the single large value of 28 has dragged the mean upward. The median, the middle value when sorted, is 3, which represents the typical case far better. As a rule: reach for the mean when values are roughly symmetric, and the median when the data is skewed or has outliers. Our mean, median & mode calculator works all three out for you from a pasted list.

Don't average an average

If you already have monthly averages and want a yearly figure, you generally cannot just average the twelve monthly averages — that ignores how many records sat behind each month. Go back to the raw rows and compute the overall average from scratch, or weight each month by its count.

Grouping and counting categories

Most summaries are computed per group: you pick a column to group by, then count or sum or average within each group. Group by region and sum the amount, and you get one total per region. Group by month and count the rows, and you get orders per month. The grouping column is usually categorical — region, category, or a time period — and the summarized number becomes the height of each bar.

The figure below shows the move: many raw rows on the left collapse into one count per category on the right, which is exactly what a bar chart needs.

Raw rows A B A C A B Count per group A = 3 B = 2 C = 1 Bar chart 0 A B C
Six raw rows become three counts, which become three bars. The summary is the bridge from data to chart.

How the summary decides the chart

Once the data is summarized, the chart is nearly chosen for you, because each shape of summary table maps to a chart type:

This is why summarizing and chart choice are really one decision. When you decide to count orders per month, you have decided on a line chart; when you sum revenue per region, you have decided on a bar chart. If you are weighing options, how to choose a chart connects each summary to a chart type in detail.

The practical workflow is short: identify the question, pick the column to group by, pick whether to count, sum, or average, build the small summary table, and plot it. When the summary is ready, head to make a chart, paste the few summarized values, and you are done — no raw rows required.