Unfortunately, and more often than is good for my mental health, I encounter data being ineffectively displayed in stacked bar charts. Which phenomenon leads to my “Question of the Day”: When should we present our healthcare data in a stacked bar chart versus some other display form? (A quick thanks, before I forget, to data-viz expert Steve Few for his recent insightful post on this subject.)
As with all charts, we need to think first about the different types and characteristics of the data we are working with. (Are we seeing a time series? An interval? Nominal?) What do we need to tell our viewers? Do we need them to:
- Understand the distribution of the data and whether or not it is skewed?
- See how the data is trending over time?
- Compare the parts of a whole or the sum of two or more of the same data parts for different groups?
Once we have considered both our data type and our message, we can confidently select the right chart design for the job.
In the following example, we need to clearly show the age distribution of a group of patients. If we use a horizontal stacked bar chart, it will be close to impossible to quickly and easily compare age groups and determine if they are distributed normally, or if they skew towards younger or older ages. Compounding this problem is the use of color: very similar shades of blue and grey are assigned to groups with very different percentages. For example, the age group 10-19 years (21%) is displayed in grey, as are the 60-69 age group (4%) and the 80+ cluster (1%).
The appropriate way to display the age distribution of the population of interest is with a histogram like the one below.
Displaying the data like this makes it easy for the viewer to directly compare the values in the different age categories by looking at the height of the bars, and to understand if the patients are skewing younger (as in the display above) or older.
Stated another way, a histogram is perfectly designed to enable us to compare the size of the bars and see the shape and direction of the data.
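For readers who prepare their own data, here is a minimal sketch of the binning step behind such a histogram. It assumes ten-year age bands with an open-ended 80+ group, as in the example above; the function names (`age_bin`, `age_histogram`) and the list of ages are hypothetical and invented purely for illustration.

```python
from collections import Counter

def age_bin(age, width=10, top=80):
    """Return the label of the bin an age falls into, e.g. '10-19' or '80+'."""
    if age >= top:
        return f"{top}+"
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def age_histogram(ages, width=10, top=80):
    """Count patients per age bin, keeping the bins in ascending age order."""
    counts = Counter(age_bin(a, width, top) for a in ages)
    labels = [f"{low}-{low + width - 1}" for low in range(0, top, width)]
    labels.append(f"{top}+")
    # Empty bins are kept (count 0) so the shape of the distribution stays visible.
    return [(label, counts.get(label, 0)) for label in labels]

# Hypothetical ages, not taken from any real patient data.
ages = [3, 12, 15, 17, 18, 24, 26, 31, 35, 42, 47, 55, 63, 71, 84]

# Print a rough text histogram: one '#' per patient in each bin.
for label, count in age_histogram(ages):
    print(f"{label:>5} | {'#' * count} {count}")
```

Because every bar starts at the same baseline, the counts can be compared directly, which is exactly what the stacked version prevents.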
Trends Over Time
Another mistake I see on a regular basis is the use of a stacked bar chart to display trends over time for different parts of a whole or a category of data, as in the chart below.
Unfortunately, this approach permits accurate viewing and interpretation only at the very bottom or first part of the stacked bar (starting at 0). A viewer cannot in fact accurately or easily see how categories change over time, because each part of the bar begins and ends at a different place on the scale.
In order to correctly interpret what she sees in such a design, the user must do a mental calculation (a sort of math gymnastics) involving the beginning and end points of each section of the bar, for each time-frame.
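That mental calculation can be sketched in a few lines. The example below is only an illustration: the categories, periods, and values are hypothetical, and `segment_bounds` is a helper name I made up. It shows where each segment of a stacked bar begins and ends on the scale, and the subtraction a viewer has to perform to recover each value.

```python
from itertools import accumulate

def segment_bounds(values):
    """Return (start, end) positions on the axis for each stacked segment."""
    ends = list(accumulate(values))
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))

# Hypothetical category values for two time periods (invented for illustration).
stacks = {
    "Q1": {"A": 30, "B": 25, "C": 45},
    "Q2": {"A": 40, "B": 20, "C": 40},
}

for period, parts in stacks.items():
    bounds = segment_bounds(parts.values())
    for (name, value), (start, end) in zip(parts.items(), bounds):
        # Only the first segment starts at 0; every other value needs a subtraction.
        print(f"{period} {name}: drawn from {start} to {end}, "
              f"value = {end} - {start} = {value}")
```

Only category A can be read straight off the scale. For B and C the viewer has to subtract one position from another, and then repeat that for every time period.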
She must then hold those pieces of data in memory, while simultaneously trying to understand how the data has changed through time, and attempting to compare it to the same information for all the other sections of the bar. (Merely describing this onerous process makes me tired.) The best way to show trends over time is with a line graph like the one below.
Such a graph allows the viewer to see whether something is increasing or decreasing, improving or getting worse; and how it compares to other parts of the whole. I have been challenged a few times by folks who believe that the stacked bar chart is better suited to showing that the displayed data is part of a whole; however, one can highlight that aspect easily by labeling the chart and lines clearly, as in the example above.
Comparing Parts of a Whole and Sums of Parts
At this point, you may be asking, “OK, when is it appropriate to use a stacked bar chart?”
Well, let me tell you. Whenever you need to show two – and two only – parts of a whole, a stacked bar chart does the trick quite nicely, and can also be a space-saver if you have limited real estate on a dashboard or report. The display works well precisely because the viewer doesn’t have to do the math gymnastics described above: the two parts can easily be seen and compared.
Depending on the layout of a report, you can play around with vertical or horizontal bars as in the two different displays below to determine what will work best for your specific report or dashboard. I often prefer to use horizontal bars, because they allow me to place my labels once and add additional information in alignment with the bars (such as figures or line graphs) to show trends over time.
Trying to compare the sum of parts using two stacked bars, however, generates yet another problem. As I have said, it is very difficult to understand how big each part of the bars is, never mind to compare one bar to the other in some meaningful way. And the piling on of different colors, as in the graph below, is just distracting: it requires looking back and forth between the two bars as we try to hold colors and numbers in our short-term memory, a task none of us is very good at. More likely than not we give up this cumbersome task, and the message is lost.
There is, all the same, a way to compare the sum of the same parts using a stacked bar chart. The following example (which I found on Steve Few’s site) compares different clinics’ payor mix (for the same payors):
It is important to note in this example that the parts are arranged in the same order in both bars, and that the two payor groups to be compared are at the very bottom of each bar. This design permits an effortless grasp of beginning and end points. And here the use of color separates those parts from the others, drawing the viewer’s attention to the comparison to be understood.
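The reason this arrangement works can be shown with a little arithmetic. In the sketch below, the payor names, the percentages, and the helper `bottom_sum` are all hypothetical and are not taken from the chart. When the parts of interest sit at the bottom of each bar, in the same order, their sum is simply the point on the scale where the last of them ends, so no subtraction is needed.

```python
def bottom_sum(parts, highlighted):
    """Sum the highlighted parts, which must be the bottom-most segments of the bar."""
    names = list(parts)
    if names[:len(highlighted)] != list(highlighted):
        raise ValueError("highlighted parts must sit at the bottom, in the same order")
    return sum(parts[name] for name in highlighted)

# Hypothetical payor-mix percentages for two clinics (invented for illustration).
# Dictionary order stands in for the bottom-to-top stacking order of each bar.
clinic_a = {"Medicare": 35, "Medicaid": 20, "Commercial": 30, "Self-pay": 15}
clinic_b = {"Medicare": 25, "Medicaid": 15, "Commercial": 45, "Self-pay": 15}
focus = ["Medicare", "Medicaid"]

for name, clinic in (("Clinic A", clinic_a), ("Clinic B", clinic_b)):
    # The combined share is the top edge of the second segment, measured from 0.
    print(f"{name}: {' + '.join(focus)} ends at {bottom_sum(clinic, focus)}%")
```

If the same two parts were placed in the middle of the stack, the viewer would be back to subtracting start points from end points for each bar.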
As with all data visualization, the goal is to create charts and graphs that help people see the story in mountains of data without doing math gymnastics, color-matching, or anything else that strains not-always-reliable (and always over-taxed) short-term memory and pre-attentive processing.
Bottom line? Stop and think about the type of data you need to communicate, what you want your viewers to consider, and the best data visualization to accomplish these tasks.