Visualizing Data

Just what the world needs — another blog.

Well, when it comes to sharing best practices for displaying healthcare data visually, and for finding and telling the story buried in your data, that is EXACTLY what the world needs: a blog that delivers the information and help you've just got to have but don't have easy access to.

And as much as I love the sound of my own voice (and I do, ask anyone) I encourage you to contribute your thoughts, questions and examples (HIPAA compliant please — I don't look good in stripes).

Let the blogging begin.

Raising the Bar on Stacked Bar Charts

Unfortunately, and more often than is good for my mental health, I encounter data being ineffectively displayed in stacked bar charts. Which phenomenon leads to my “Question of the Day”: When should we present our healthcare data in a stacked bar chart versus some other display form? (A quick thanks, before I forget, to data-viz expert Steve Few for his recent insightful post on this subject.)

As with all charts, we need to think first about the different types and characteristics of the data we are working with. (Are we seeing a time series? An interval? Nominal?) What do we need to tell our viewers? Do we need them to

  • Understand the distribution of the data and whether or not it is skewed?
  • See how the data is trending over time?
  • Compare the parts of a whole or the sum of two or more of the same data parts for different groups?

Once we have considered both our data type and our message, we can confidently select the right chart design for the job.


In the following example, we need to clearly show the age distribution of a group of patients. If we use a horizontal stacked bar chart, it will be close to impossible to quickly and easily compare age groups and determine if they are distributed normally, or if they skew towards younger or older ages. Compounding this problem is the use of color: the shades of blue and grey are very similar, yet they encode very different percentages. For example, the age group 10-19 years (21%) is displayed in grey, as are the 60-69 age group (4%) and the 80+ cluster (1%).


The appropriate way to display the age distribution of the population of interest is with a histogram like the one below.


Displaying the data like this makes it easy for the viewer to directly compare the values in the different age categories by looking at the height of the bars, and to understand if the patients are skewing younger (as in the display above) or older.

Stated another way, a histogram is perfectly designed to enable us to compare the size of the bars and see the shape and direction of the data.
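For readers who like to tinker, here is a minimal Python sketch of the same idea, drawn in plain text with the bars lying on their sides. Only the 10-19 (21%), 60-69 (4%), and 80+ (1%) values come from the example above; every other percentage, and the names `AGE_DISTRIBUTION`, `text_histogram`, and `share`, are my own assumptions for illustration.

```python
# Hypothetical age distribution (percent of patients per age group).
# Only the 10-19 (21%), 60-69 (4%), and 80+ (1%) values appear in the text;
# the remaining values are invented so that the total is 100%.
AGE_DISTRIBUTION = [
    ("0-9", 19), ("10-19", 21), ("20-29", 17), ("30-39", 14),
    ("40-49", 11), ("50-59", 9), ("60-69", 4), ("70-79", 4), ("80+", 1),
]


def text_histogram(groups, symbol="#"):
    """Return one line per group: label, a bar of one symbol per percent, value."""
    width = max(len(label) for label, _ in groups)
    return [f"{label:>{width}} | {symbol * pct} {pct}%" for label, pct in groups]


def share(groups, labels):
    """Total percentage for the named age groups."""
    return sum(pct for label, pct in groups if label in labels)


if __name__ == "__main__":
    print("\n".join(text_histogram(AGE_DISTRIBUTION)))
    print("Under 30:", share(AGE_DISTRIBUTION, {"0-9", "10-19", "20-29"}), "%")
```

Because every bar starts from the same baseline, the skew toward younger patients shows up at a glance, with no color key to decode.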

Trends Over Time

Another mistake I see on a regular basis is the use of a stacked bar chart to display trends over time for different parts of a whole or a category of data, as in the chart below.


Unfortunately, this approach permits accurate viewing and interpretation only at the very bottom or first part of the stacked bar (starting at 0). A viewer cannot in fact accurately or easily see how categories change over time, because each part of the bar begins and ends at a different place on the scale.

In order to correctly interpret what she sees in such a design, the user must do a mental calculation (a sort of math gymnastics) involving the beginning and end points of each section of the bar, for each time-frame.

She must then hold those pieces of data in memory, while simultaneously trying to understand how the data has changed through time, and attempting to compare it to the same information for all the other sections of the bar. (Merely describing this onerous process makes me tired.) The best way to show trends over time is with a line graph like the one below.


Such a graph allows the viewer to see whether something is increasing or decreasing, improving or getting worse; and how it compares to other parts of the whole. I have been challenged a few times by folks who believe that the stacked bar chart is better suited to showing that the displayed data is part of a whole; however, one can highlight that aspect easily by labeling the chart and lines clearly, as in the example above.
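To make the "math gymnastics" concrete, here is a small Python sketch of the subtraction a stacked bar forces on the viewer, next to the simple first-to-last change that a line graph shows directly. The quarterly payor-mix numbers and the function names are hypothetical; they are not taken from the charts in this post.

```python
# Hypothetical quarterly payor mix (percent of patients); not from the post.
QUARTERS = ["Q1", "Q2", "Q3", "Q4"]
SERIES = {
    "Medicare": [40, 38, 37, 35],
    "Commercial": [35, 38, 40, 44],
    "Medicaid": [25, 24, 23, 21],
}


def stacked_segments(series):
    """For each period, list (category, start, end) for every stacked segment."""
    n_periods = len(next(iter(series.values())))
    result = []
    for i in range(n_periods):
        bottom = 0
        segments = []
        for name, values in series.items():
            segments.append((name, bottom, bottom + values[i]))
            bottom += values[i]
        result.append(segments)
    return result


def value_from_segment(start, end):
    """The subtraction a viewer must do mentally for each stacked segment."""
    return end - start


def trend(values):
    """Change from the first period to the last, which a line graph shows directly."""
    return values[-1] - values[0]


if __name__ == "__main__":
    for quarter, segments in zip(QUARTERS, stacked_segments(SERIES)):
        print(quarter, segments)
    for name, values in SERIES.items():
        print(f"{name}: {trend(values):+d} points from Q1 to Q4")
```

Notice that only the bottom category starts at zero in every quarter; every other segment's start point drifts, which is exactly why its trend is so hard to read.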

Comparing Parts of a Whole and Sums of Parts

At this point, you may be asking, “OK, when is it appropriate to use a stacked bar chart?”

Well, let me tell you. Whenever you need to show two – and two only – parts of a whole, a stacked bar chart does the trick quite nicely, and can also be a space-saver if you have limited real estate on a dashboard or report. The display works well precisely because the viewer doesn’t have to do the math gymnastics described above: the two parts can easily be seen and compared.

Depending on the layout of a report, you can play around with vertical or horizontal bars as in the two different displays below to determine what will work best for your specific report or dashboard. I often prefer to use horizontal bars, because they allow me to place my labels once and add additional information in alignment with the bars (such as figures or line graphs) to show trends over time.



Trying to compare the sum of parts using two stacked bars, however, generates yet another problem. As I have said, it is very difficult to understand how big each part of a bar is, never mind comparing one bar to the other in some meaningful way. And the piling on of different colors, as in the graph below, is just distracting: it requires looking back and forth between the two bars as we try to hold colors and numbers in our short-term memory – a task none of us is very good at. More likely than not we give up this cumbersome task, and the message is lost.


All the same, there is a way to compare the SUM of the same parts using a stacked bar chart. In the following example (which I found on Steve Few’s site), I can compare different clinics’ payor mix (for the same payors) with a stacked bar chart like this:


It is important to note in this example that the parts are arranged in the same order on both bar charts, and the two payor groups to be compared are at the very bottom of the charts. This design permits an effortless grasp of beginning and end points. And here the use of color separates those parts from the others, drawing the viewer’s attention to the comparison to be understood.

As with all data visualization, the goal is to create charts and graphs that help people see the story in mountains of data without doing math gymnastics, color-matching, or anything else that strains not-always-reliable (and always over-taxed) short-term memory and pre-attentive processing.

Bottom line? Stop and think about the type of data you need to communicate, what you want your viewers to consider, and the best data visualization to accomplish these tasks.

Posted in Communicating Data to the Public, Data Visualization, Design Basics, Graphs, Newsletters

My Secret Tip for Testing Data Visualizations

This past Sunday my husband, Bret, our pup, Juno, and I headed out to Deer Island in Massachusetts Bay. We love this walk because of the fantastic views it affords of the Bay and of Boston, and because the island’s history is always a fun and fascinating topic of conversation.

For example, on this excursion, Bret and I talked about Trapped Under the Sea, Neil Swidey’s riveting book about the nearly-10-mile-long Deer Island Tunnel, built hundreds of feet below the ocean floor in Massachusetts Bay. It helped transform the Harbor from the dirtiest in the country to the cleanest – and its construction led to the tragic (and completely avoidable) deaths of five men.

As we rounded the southwest corner of the island, Boston revealed itself to us, and we stopped to see how many landmarks we could identify, along with an interesting fact about each to liven things up (yes, we do try to one-up each other).

We’ve done this numerous times over the years, but on this occasion, the exercise started me thinking about what I was seeing in an entirely new and different way. I began on my left looking at Fort Independence, then moved my eyes to the right to see the Prudential and John Hancock buildings, then the Bunker Hill Monument, followed by the Zakim Bunker Hill Memorial Bridge and the Logan Airport Control Tower.

That’s when it hit me: I was creating sentences and weaving them into a narrative about my beloved city using visual landmarks as cues, just as I do with my healthcare data visualizations.


I’ve developed a habit when I’m designing or testing reports and dashboards: I imagine that I’m in front of the individual or group they’re intended for. Speaking aloud (yes, I do talk to myself on a regular basis), I test whether, using the figures and graphs as my guide, I can create a cohesive, fluent, and compelling narrative.

The reason I do this (and that I encourage you to do it, too) is that I’ve learned that if I can tell a guided story about the data and information on the reports and dashboards I’m designing, then the people in my intended audience will be able to as well. Conversely, if I find myself struggling and stumbling, then I know I need to go back to the drawing board and either refine what I’ve created or, yes, ditch it and start over.

Consider the following prototype CEO Monitoring Dashboard that my team and I at HealthDataViz (HDV) created using fabricated data. I’ve added a few examples of the sentences and narrative I wrote as we were developing and testing it.


I always begin my descriptions with an introduction or executive summary about the level of data being displayed (Summary Overview vs. Subject-Area Specific, for example); the intended audience; and overall objectives and end use.

Next, I carefully survey the data being displayed, moving primarily from left to right and top to bottom – or, depending on the layout of the dashboard and leveraging the way that our eyes cover a page, beginning at the top left, moving to the right and then down the right-hand column and back up along the left-hand one.

Perhaps most important, I include very specific examples supported by data points. Selecting just the right ones for my review may be the hardest and most time-consuming part of this self-check I do, but it is absolutely essential for testing that what I have displayed is correct and makes sense – and that I can explain it in simple, brief terms.

Here is an abbreviated example of what I mean (pretend you’re in the room listening while I practice):

Summary Overview

  • This Hospital CEO Dashboard takes into account the current environment that hospital CEOs have to navigate – one shaped by Value Based Purchasing (VBP) and public reporting, and where financial, clinical, information technology, and patient satisfaction results are all inextricably linked.

Top Left – One-Month Results and Summary Performance

  • On the upper left side of the dashboard, we can see that the Actual Average Daily Census for December was 4% below Budget (254 versus 264); and that as shown in the trend graph, this performance is reflective of the past twelve months’ performance, culminating in a YTD below-budget result of 8%.

Top Right – Payor Mix

  • It is also interesting to note changes to the hospital’s year to date (YTD) payor mix displayed in the bar graph at the top right of the dashboard. For example, in the current year, Commercial Insurance represents approximately 50% of all hospital payors as compared to 40% in the previous year.

Middle Right – Quality and Patient Satisfaction

  • On the HCAHPS survey question “Would recommend this hospital,” approximately 80% of the patients responding for this specific hospital said “yes” as displayed by the horizontal blue bar. This result misses the hospital’s target of 90% (represented by the vertical black line), and places the hospital in the 75th percentile nation-wide, as signified by the underlying horizontal stacked bar in shades of grey (no, not the movie, people – the bar chart!).

Bottom Right – EHR Compliance

  • In this display, we can see that Medicine and Pharmacy are performing better than their target levels at 100% compliance, and that Pathology and Urology have the worst compliance rates, at only 60% each.

Bottom Left – Hospital Specific Key Metrics

  • Two specific metrics that the CEO wants to monitor are the hospital’s 30-day readmission rates, and Supply Expenses as a percentage of Net Operating Expenses compared to target.

Middle Left – Mortality O/E Ratio

  • This display reveals that for the last three months displayed, the O/E ratios are statistically unusually high (more deaths recorded than we would have expected, and the confidence interval does not include one). In October, the ratio was approximately 1.5; in November 1.8; and by December, it had climbed to 2.0. We have also coded these statistically significant O/E ratios in red to draw attention to them.
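Part of this self-check is confirming that the numbers I say aloud match the arithmetic. Here is a minimal Python sketch of two such checks: the census variance (254 versus 264) and the rule that an O/E ratio is flagged only when its confidence interval excludes one. The observed and expected counts and the interval bounds are hypothetical, chosen only to reproduce the ratios of 1.5, 1.8, and 2.0 quoted above; the function names are mine.

```python
def pct_variance(actual, budget):
    """Percent difference of actual from budget (negative means below budget)."""
    return (actual - budget) / budget * 100


def oe_ratio(observed, expected):
    """Observed-to-expected ratio; above 1 means more events than expected."""
    return observed / expected


def excludes_one(ci_lower, ci_upper):
    """True when the confidence interval lies entirely above or entirely below 1."""
    return ci_lower > 1 or ci_upper < 1


# Hypothetical monthly inputs: (month, observed deaths, expected deaths,
# CI lower bound, CI upper bound). Only the resulting ratios match the text.
MONTHS = [
    ("Oct", 60, 40, 1.12, 1.88),
    ("Nov", 72, 40, 1.38, 2.22),
    ("Dec", 80, 40, 1.56, 2.44),
]

if __name__ == "__main__":
    print(f"Census vs. budget: {pct_variance(254, 264):.1f}%")
    for month, obs, exp, low, high in MONTHS:
        flag = "red" if excludes_one(low, high) else "neutral"
        print(f"{month}: O/E = {oe_ratio(obs, exp):.1f} ({flag})")
```

The census check comes out at roughly -3.8%, which rounds to the 4% I quote in the narrative; if it had not, I would know to go back and fix either the display or my script.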

I cannot encourage you enough to start using this review-and-read-aloud technique to challenge yourself and clarify whether you have created a dashboard that makes sense and provides insights for your audiences that will lead them to take prompt, effective action. It is a simple, fast, and inexpensive way to get the answers you need for yourself and your own confidence and serenity.

The process may not always be easy: when you have to really, truly describe what you have created in a clear and compelling manner, using detailed explanations with examples from the data, I’ll bet you’ll find it challenging – perhaps even rather frustrating – the first few times you try. But keep at it: in the long run, you will discover that it helps you to create much better and more comprehensive reports and dashboards.

And if you ever need a break to clear your head, I have the perfect walk in mind to do so.

Posted in Best Practices, Communicating Data to the Public, Dashboards

David Bowie vs. Alan Gilbert

I recently watched Charlie Rose interview New York Philharmonic conductor Alan Gilbert (pictured at right).

During the interview, Gilbert described how he slowly and in small increments moved the Philharmonic musicians (and by extension their audiences) from a traditional view to a thoroughly new one of how concerts may be performed.

First, Gilbert had the musicians crumple up pieces of paper and throw them at him at the end of a concert. (I know: radical stuff! Call the concert police!)

Over the next ten (yes, ten) years, he made other small changes in the performances, and along the way earned enough trust from both musicians and audiences to permit occasional rowdy behavior, as when the musicians don wild, themed costumes and move freely around the stage while playing.

It was interesting to hear about these amusing wrinkles in standard concert behavior; but what struck me most vividly was Gilbert’s willingness to “meet [the musicians] where they were,” and the patience and skill it took to make that journey.

By contrast, we, his fans, had to meet David Bowie where he was. The singer, composer, musician, artist, and creative spirit died earlier this year, just days after releasing yet another ground-breaking, game-changing album.

Often described as a visionary chameleon, Bowie hit the ’70s music scene with a look, sound, and attitude that caught most people by surprise. His physical appearance alone made the Beatles and the Stones look like schoolboys in short pants.

He didn’t just bend conventional ideas on gender, art, music, performance, costume, and genre; he took them on a Turkish Taffy roller coaster ride inside a house of mirrors. Unlike Alan Gilbert, he showed up living and performing his vision, and invited us to come along, or not. We had to meet Bowie where he was, not the other way around.

I would love to conduct my career in the spirit of David Bowie, but I’m a pragmatist (with a mortgage) working in an industry that moves at its own pace. As a result, I’m continually learning new ways to meet people where they are. I imagine most of you are, too.

In that spirit, I offer here three approaches I’ve found helpful when I work with groups that need me to meet them where they are:

  • Use Small Demonstrations of Data Visualization Best Practices.

    Identify existing reports and dashboards that may be improved by replacing a poor display device (like a pie chart) with a bar chart; or changing a three- (or more) part stacked bar chart to a small multiple display.

    By making such modest changes, you can begin to move people toward using and understanding the best practices of data visualization at a pace they find comfortable and easy to accept.

  • Focus on Areas with the Most Evident ROI.

    Identify spots where data is urgently and immediately needed to prevent the imposition of a penalty or to mitigate risk, and is not being well reported (i.e., data reports are onerous and hard to use).

    One example might be performance metrics tied to CMS Annual Payment Updates, or data analysis to identify and manage patient risk factors for outcomes such as morbidity or mortality. Such weak places are often great starting targets, because the return on investment can be easily quantified.

    People are under pressure to actively manage information and avoid ineffective or confusing results, and therefore more open to new ways of seeing, thinking, and analyzing.

  • Look for Blank Canvases.

    Seek the areas and groups where no dashboards or reports exist. If you’re lucky enough to find such a situation, you’ll get the chance to use the best practices of data visualization from the start, and to capture people’s support for those best practices right away.

Listening to Alan Gilbert describe small but powerful steps toward realizing his vision for the Philharmonic was good for me: it helped me remember that although I do believe in change, I sometimes need to retreat – as most of us do – to the reassuring fallback position “Change is good. You go first!” all while humming to Bowie playing softly in the background: “Ch-ch-ch-ch-Changes, turn and face the strain, Ch-ch-Changes…”

Posted in Best Practices, Know Your Audience, Newsletters

And Around We Go… Again

As I mature (and boy, is aging a high price to pay for maturity), I find I have very little need or even desire to win an argument (never mind every argument), or to prove that I’m right about something.

I suppose that’s true in part because I understand that we all see the world in different ways, and in part because it seems to take a very long time for even solid, compelling evidence about anything to persuade people to change their firmly held beliefs. (And I admit that sometimes I count myself among those folks.)

It’s also why I’ve written very rarely (even though I’m occasionally tempted to say something) on “why I’m a card-carrying member of the ‘Better not use pie charts’ club.”

There are many expert voices, and there is plenty of evidence, on this topic.

The data-visualization pioneer Edward Tufte said that “pie charts should never be used”; William Cleveland referred to pie charts as “pop charts” because they are commonly found in pop culture media rather than in science and technology writing. Data-visualization expert Stephen Few wrote the widely-read and frequently-referenced essay “Save the Pies for Dessert.” All the same, I feel the need to add my voice to the chorus in the hope of improving healthcare data visualizations.

What pushed me over the edge?

A free e-book from a software vendor (that should have been my first clue) which, in spite of well-established expert opinions and evidence about why pie charts are not as effective as other display devices, presents advice on avoiding the misuse of pie charts – that is, it purports to explain how to use pie charts correctly. And around we go again – oh, my aching head!

Let’s walk through what is suggested and why those suggestions constitute bad advice; and then let’s turn to the part left out: how to display data better with nary a pie chart in sight.

Here are some excerpts (I’m paraphrasing):

Example 1

The book says…

“Don’t squeeze too much information into a pie chart: the slivers get too thin, and the audience confused.”

I say…

Use a bar chart like the one in Example 1, below. We humans find it very difficult to judge the size of the angle in a pie chart. With a bar chart, we can immediately tell the size of the data being encoded by the length of each bar. It’s then easy to directly compare the lengths of the bars, and determine which values are larger or smaller.

We can also add a comparison or target line if we need to, which we can’t do on a pie chart.

We can label each value being displayed directly rather than making our viewers match a color-coded key to each slice of the pie, all while trying to hold the information in short-term memory as they look back and forth from the chart to the key. (Try it, and you’ll see what I mean!)




Example 2

The book says…

“Order your slices from largest to smallest for easiest comparison.”

I say…

Okay, this is just silly!! Simply use a ranked bar chart like the one in Example 2, below.




Example 3

The book says…

“Avoid using pie charts side by side – it’s an awkward way to compare data.”

I say…

Yep, you guessed it: use bar charts. And if you need to encode additional comparison data, try a bullet graph (a modified form of the bar chart). In addition to being a better way to display data, a bar chart allows additional context for visualizations.

In the example below, by using a bar chart and leveraging the fact that my viewers read from left to right, I label the data once and accomplish all of the following. I can

  • show the number of cases eligible for the measure (the denominator);
  • display compliance compared to target;
  • note the difference between the current quarter performance and the target; and
  • record how each clinician has performed over the last four quarters.

You simply can’t do all this – quickly, clearly, and in a modest display space – with a pie chart. Look at the results in Example 3, below.




Here’s the bottom line – pretty much anything you can do with a pie chart, you can do better with a bar chart. This is especially true for the types of displays we create in healthcare.

Bar charts make it easy to:

  • directly compare the sizes of data groups displayed.
  • directly label the data.
  • rank the data.
  • include comparison or target data.
  • include additional contextual data.

As is clear from this last example above, bar charts are also far superior when used on a dashboard. They take up less space than pie charts and (as previously noted) make it possible to display much additional contextual data, such as performance over time.
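Here is a minimal Python sketch of the ranking, direct-labeling, and target-comparison points, rendered as a plain-text bar chart. The clinician names, compliance rates, target, and the `ranked_bars` helper are all hypothetical; they are not the data behind the examples in this post.

```python
# Hypothetical compliance rates (percent) and target; not from the examples.
COMPLIANCE = {"Dr. A": 82, "Dr. B": 95, "Dr. C": 71, "Dr. D": 90}
TARGET = 90


def ranked_bars(values, target, scale=2):
    """Rank largest to smallest, label each bar directly, and note the gap to target.

    Each '#' stands for `scale` percentage points.
    """
    rows = sorted(values.items(), key=lambda item: item[1], reverse=True)
    width = max(len(name) for name in values)
    lines = []
    for name, pct in rows:
        bar = "#" * (pct // scale)
        gap = pct - target
        lines.append(f"{name:<{width}} | {bar} {pct}% ({gap:+d} vs target)")
    return lines


if __name__ == "__main__":
    print("\n".join(ranked_bars(COMPLIANCE, TARGET)))
```

Everything a pie chart would push into a color key (who, how much, how far from target) sits on the same line as the bar it describes.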

Every so often I come across a forum where people still rant on about how maligned pie charts are. I admit I find them – both the people and the pie charts – infuriatingly amusing. Yes, the charts can be fun on an infographic, or useful for teaching young children the concept of part-to-whole, but for me and the work I do the evidence is in – forever – and pie charts are out.

Posted in Design Basics, Graphs, Newsletters

Best Available Incomplete Information (BAII)

When I was a teenager, I had one terrible habit that drove my mother over-the-edge crazy. (OK: I had more than one. But hey, “driving your mother crazy” is part of the official job description for “teen-age girl.” I looked it up.)

My particular expertise was in the fine craft of strategically omitting information that would’ve assuredly had a negative impact on my desired outcome.

For example, I would ask if I could go to my best friend Tracy’s house for the night, but I would leave out the fact that we would be stopping by bad-boy Tom’s house for a “my parents are away” party. This fact would of course have resulted in my having to stay home – that is, in my view, in the worst outcome imaginable. (Yes, I did consider law school early in life.)

In my defense, there were times when I didn’t know bad-boy Tom was having a party until after I’d received permission to go to Tracy’s house. On these occasions I asked for my Mom’s consent based on the best available incomplete information. Of course, as is the way with all mothers, she eventually found out where I’d been (even on the occasions when no police were involved). As a result, each of my subsequent requests for permission to go out elicited an ever more rigorous line of inquiry from her.

My (now) fond memory of these mother-daughter tussles was prompted by a recent article I read in the New York Times: “The Experts Were Wrong About the Best Places for Better and Cheaper Health Care.” Let me tell you why.

Until recently, the largest and best data-set available for the analysis and study of healthcare delivery in the U.S. was that based on Medicare claim data. Private-insurance statistics have long been almost entirely inaccessible for the same type of analysis and scrutiny, as they are held and managed by private companies that are not required to make them public.

This situation has left us scant choice but to make assumptions and decisions about how our healthcare system does and should deliver care using what I have come to term “best available incomplete information (BAII).”

As highlighted in the article I’ve cited above, Medicare data have revealed enormous amounts of information about regional differences in Medicare spending, which are driven mostly by the amount of healthcare patients receive, not the price per service.

Even more important, Medicare data reveal that places delivering lots of medical services to patients often do not have any better health outcomes than those locations delivering less medical care at lower cost.

These findings based on Medicare data have, by and large, been reduced to one simple message: if all healthcare systems could deliver care in the same way these low-cost ones do, the country’s notoriously high medical costs could be controlled, and might even decline.

On the face of it, this makes perfect sense. What’s missing, however, is how these systems are performing on the delivery of care to their non-Medicare patients. Are the results observed in one cohort of patients (Medicare) also the results for all other non-Medicare cohorts (private insurance, self-pay, etc.)? Data newly available from the Health Care Cost Institute (HCCI) about a large number of private insurance plans offer new hope that we may begin to answer these and other important questions more fully.

As a first high-level analysis described by the Times article reveals, places in the U.S. that have been heralded for low-cost, high-quality care delivered to Medicare patients are not necessarily performing in the same way for their private-insurance patients.

You can see these findings displayed in the side-by-side choropleth maps below.


Displaying the data like this reveals that, for example, although Alaska’s per capita Medicare spending is average as compared to all other areas in the U.S., its per capita private-insurance spending is above average. The data reveal a similar pattern for several other areas in states such as Idaho, Michigan, and New Hampshire, where Medicare costs are either average or below average, but private-insurance spending is above average.

This isn’t the only observable difference. Interestingly, in places like my home state of Massachusetts, the opposite of the above is true: Medicare spending is above average, while spending on private insurance is average across the state.

The Times article displays this information on the maps above and also in this simple but effective graphic (click here to check a place near you).


I find these new data wildly interesting, am certain they will result in new findings, and devoutly hope that they will also lead to greater transparency in and other improvements to our healthcare system.

But this new information also serves as a serious and important reminder that we are all making decisions using the best available incomplete information, and only that. As a result, we have to try to get better at understanding what it can and cannot tell us, and at deciding how we will act when new information becomes available.

After I read the Times article and thought about its title, I found myself annoyed at what seemed a rather negative headline: “The Experts Were Wrong…” In fact, the experts were right about what the BAII they had at the time revealed. Was it the full story? Absolutely not. Do we know that full story yet? We do not: even this new analysis is missing data on patients insured by Blue Cross & Blue Shield and Medicaid, as well as on the under- and the un-insured. To put it another way: “We still don’t know what we don’t know.”

It seems to me that the only sensible path to improving our healthcare system is to commit ourselves to continually seeking new data, information, and knowledge to support better-informed decisions, and to seek the courage to adjust our sails and lead change by following – even when that path may be disappointing, confusing, or difficult – where the data lead.

Posted in Data Analysis, Newsletters

Really Big Goals

Like a lot of people, I am a big goal-setter. I especially love BHAGs (pronounced “Be-hags”?): Big, Hairy, Audacious Goals.

You know: the ones based on no logic or well-developed plan whatsoever, but rather conjured up by the sheer and (sometimes) delusional belief that “Somehow, I will find a way.”

Exhibit A: starting a business with a kid headed off to college and a big-ass mortgage (a technical term in my household). To be fair, I also set a lot of smaller and saner goals for myself: the amount of money I wanted to save for retirement each year; the number of trips to the gym each week.

As 2015 ends, I find myself going back over the year to consider what I accomplished compared to the goals I set, and visualize what I hope to accomplish in 2016. As is often the case, my mind wanders to different data visualization techniques and how I might display actual vs. desired progress using graphs.

Yes, I hear you: “Memo to self: add ‘Get out more!’ to my 2016 goal list.”

A graph I often use to show how well a group performs compared to a goal or benchmark is a deviation graph.

(If you are a regular subscriber to this newsletter, you’ll recall that I have written more than one article about these types of graphs; you can check out those articles by clicking here.)

I especially like them on monitoring dashboards, because the absolute value of a changing goal or benchmark is not displayed – only the difference or deviation of actual performance from it is shown, as in the following example.



Displaying the information like this allows the viewer to quickly and easily answer questions such as “Are we over or under budget on revenue or expenses?” or to evaluate medication reconciliation versus a target without worrying about the actual goal or performance values, as they often change over time or are different for a group or category of similar metrics (department budgets, for one).

Such a display lets viewers know if performance is above or below goal, and by how much.

Sometimes, a goal is set for a longer time frame, and we wish to display its actual value compared to performance. Most often a line graph like the one shown here is used for this type of display.



While this is a perfectly acceptable way to show the data, it doesn’t clarify how far from target we are.

This is where a deviation graph – one that displays the actual target value and the difference, or deviation, of actual performance from it, such as the one below – can help.



This data display makes clear that the target is 90% on medication reconciliation, and how far below (orange bars) or above (blue bars) monthly results are. It’s also possible to see actual performance by comparing the ends of the bars to the Y-axis.
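
To make the idea concrete, here is a rough text-only sketch of a deviation display around a fixed target. The 90% target comes from the example above; the monthly values are hypothetical (they are not the values in the figure), and `deviation_row` is an illustrative helper.

```python
TARGET = 90.0  # the 90% medication reconciliation target named in the text

# Hypothetical monthly results (percent); not the values from the original figure.
monthly = {"Jan": 86.0, "Feb": 88.5, "Mar": 91.0, "Apr": 93.5}


def deviation_row(month, actual, target=TARGET, width=10):
    """Return one text row: bars left of '|' are below target, bars right are above."""
    deviation = actual - target
    # Two characters per percentage point, capped at the column width.
    bar = "#" * min(width, round(abs(deviation) * 2))
    left = bar.rjust(width) if deviation < 0 else " " * width
    right = bar if deviation > 0 else ""
    return f"{month} {left}|{right} {deviation:+.1f}"


for month, actual in monthly.items():
    print(deviation_row(month, actual))
```

The vertical bar plays the role of the target line, and the signed number at the end of each row carries the exact deviation.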

All three of these displays work, as long as you choose among them by answering these key questions:

  • Are target values fixed or variable?
  • Is it enough to simply monitor deviation, or must actual values be displayed?

As I write this, and consider my options for displaying my own performance compared to my 2015 goals, I am beginning to get a little spooked. I may after all be awash in orange (below target!!) for some time.

But then I remember a famous comment by American author, salesman, and motivational speaker Zig Ziglar: “If you aim at nothing, you will hit your target every time.” Back to that BHAG list – and onward.


Postcard from New Zealand

It has been almost a month since my return to the States, following a truly gratifying professional engagement with the Canterbury Health District and Health Informatics New Zealand (HINZ).

If you’ve ever had the pleasure of traveling to New Zealand, you know all the accolades about it are true… true and oh so very, very true! The landscape is spectacular, the people are lovely and yes, of course, there are far more sheep in New Zealand than there are people. Lots and lots of sheep, like this darling little lamb at the Walter Peak High Country Farm in Queenstown (which, of course, they only let me hold after they had served me his brother for lunch – clearly a brilliant strategy to reduce the number of requests for vegan meals.)


Given all the sheep and lambs we saw (it is spring in New Zealand now so there are even more lambs than usual!), it is no surprise that I started to think yet again about the great utility of small multiples to display our healthcare data.

If you are up on your data-visualization terms, you know that it was Edward Tufte, a statistician and Yale University professor, and a pioneer in the field of information design and data visualization, who coined the term “small multiples.” (You may be familiar with other names for this type of display: Trellis Chart, Lattice Chart, Grid Chart or Panel Chart.)

I think of small multiples as displays of data that use the same basic graphic (a line or bar graph) to display different parts of a data set. The beauty of small multiples is that they can show rich, multi-dimensional data without attempting to jam all the information into one, highly complex chart like this one:


Now take a look at the same data displayed in a chart of small multiples:


What problems does a small-multiples chart help solve?

  1. Multiple Variables. Trying to display three or more variables in a single chart is challenging. Small multiples enable you to display a lot of variables, with less risk of confusing or even losing your viewers.
  2. Confusion. A chart crammed with data is just plain confusing. Small multiples empower a viewer to quickly grasp the meaning of an individual chart and then apply that knowledge to all the charts that follow.
  3. Difficult Comparisons. Small multiples also make it much easier to compare constant values across variables and reveal the range of potential patterns in the charts.

Now, before you construct a small-multiples data display, here are a few additional pointers:

  1. Arrangement. The arrangement of small-multiples charts should reflect some logical measurement or organizing principle, such as time, geography, or another interconnecting sequence.
  2. Scale. Icons and other images in small-multiple displays should share the same measure, scale, size, and shape. Changing even one of these factors undermines the viewers’ ability to apply the understanding gained from the first chart to subsequent charts or display structures.
  3. Simplicity. As with most things in life, simplicity in the small-multiples chart is crucial. Users should be able to easily process information across many charts, and see and understand the story in the data.
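
The arrangement and scale pointers above can be sketched in a few lines. This is only an illustration under stated assumptions: the department names and values are hypothetical, and `small_multiples_layout` is a made-up helper that computes a grid size and one y-axis range shared by every panel.

```python
import math


def small_multiples_layout(series_by_panel, columns=3):
    """Return (rows, columns, shared_y_range) for a grid of small multiples."""
    if not series_by_panel:
        raise ValueError("at least one panel is required")
    # One range computed across ALL panels, so every chart uses the same scale.
    all_values = [v for values in series_by_panel.values() for v in values]
    shared_y_range = (min(all_values), max(all_values))
    rows = math.ceil(len(series_by_panel) / columns)
    return rows, columns, shared_y_range


# Hypothetical monthly counts for four departments.
panels = {
    "Cardiology": [12, 15, 14],
    "Oncology": [8, 9, 11],
    "Orthopedics": [20, 18, 22],
    "Pediatrics": [5, 7, 6],
}

rows, columns, y_range = small_multiples_layout(panels, columns=3)
print(f"{rows} x {columns} grid, shared y-axis from {y_range[0]} to {y_range[1]}")
```

Whatever charting tool you use, the key design choice is the same: compute the axis range once from all the data rather than letting each panel pick its own.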

I still go a little soft when I think of holding that darling lamb and patting its ears as it fell asleep in my arms. And while it is highly likely that this sweet memory will fade and I may eventually eat lamb once again, I will always remember seeing pasture after pasture of these gentle creatures and will continue to relate them to small multiples to display data!


My Bin or Yours?

I am writing this newsletter from my coach seat (28F, window) on United flight UA1292 from Boston to San Francisco. Funny how inspiration for a newsletter about healthcare data visualization and histogram bins can show up in the tightest spots.

This particular snug niche was the few cubic inches I desperately sought in an overhead bin, so I could stow my carry-on. As I did so, I was aggravated by the way luggage and other belongings had been shoved into places where it was clear they didn’t have a snowball’s chance in hell of fitting. Hello? If your things hang out of the bin and the door won’t shut, there’s a problem!

Then there’s the whole armrest fiasco. Is it mine, or is it the territory of the person next to me? Where does my boundary begin and his or hers end – when am I in the right space, and when have I illegally crossed the armrest border?

All of this got me thinking about the intervals, or “bins,” on histograms – the charts used to show the distribution of numerical data and to estimate the probability distribution of a continuous (quantitative) variable. Histograms are really useful, but – as with airplane bins – you need to be careful not to fall into “your bin or mine?” confusion.

A histogram is a type of graph most commonly used to show frequency distributions, or how often each different value in a set of data occurs. It looks much like a bar chart, but there are either no, or minimal, spaces between its bars, a feature which helps remind the viewer that the variables are continuous.

As a result, bins are usually specified as “consecutive, non-overlapping intervals” of a variable. The bins (intervals) must be adjacent, and are usually of equal size.

Histograms are very useful when you need to:

  • Display the distribution of continuous data (ages, days, time, etc.).
  • See if the data is distributed relatively evenly, is skewed (unbalanced), or has some other interesting shape, as in some of the following examples:


In a Normal Distribution, data tends to be around a central value with no bias left or right (often referred to as a bell curve because its shape is similar to that of a bell).


Skewed Distributions commonly have one tail of the distribution considerably longer or drawn out relative to the other. A “skewed right” distribution has a tail on the right side, a “skewed left” one, on the left. The above histogram shows a distribution skewed right.

Clearly, histograms are a great choice when you wish to display and communicate data distribution quickly and easily – but again, don’t fall into that “my bin or yours?” trap. Often I see data displayed in a histogram like this one, which I created using data from the National Vital Statistics Reports, v. 64, No. 1, January 15, 2015*:


Histogram (1) displays the percentage of low-risk cesarean deliveries (C-Sections) by maternal age in the U.S. in 2013. Note that the X axis has divided maternal ages into bins; if you look closely, you’ll catch the “my bin or yours?” trap. If a woman is 30 years old when she has a C-Section, does she belong in the third bin (25-30 years) or the fourth (30-35)?

Once you catch this, it seems easy enough to fix.


In histogram (2), I changed the bin labels to eliminate this overlap – but in doing so, I may have created a new problem. If the data captures the exact age of women (i.e., years and months), and a woman is 24.5, 29.7, or another “in between” age when she has a C-Section, which bin is she in? We might just make an assumption and move on, but there’s a better way.


In the final histogram, (3), the addition of the “greater than,” “less than,” and “equal to” symbols provides the clarity we need to avoid the trap about where data with this level of detail falls in the distribution.

Now we can see that if a woman is age 24.5 when she has a C-Section, she is in the second bin; if 25.7, she is in the third one. Trap avoided: we have clearly labeled each five-year bin, thereby eliminating confusion.
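
If you assign values to bins in code, the same rule can be made explicit. Below is a minimal sketch, assuming five-year bins where each bin includes its lower edge and excludes its upper edge. The exact edges and the open-ended first and last bins are my assumptions for illustration; the text above only confirms that 24.5 falls in the second bin and 25.7 in the third.

```python
import bisect

# Assumed lower edges of the bins; the first bin is "< 20", the last is ">= 40".
EDGES = [20, 25, 30, 35, 40]
LABELS = [
    "< 20",
    ">= 20 and < 25",
    ">= 25 and < 30",
    ">= 30 and < 35",
    ">= 35 and < 40",
    ">= 40",
]


def bin_index(age):
    """Return the 0-based bin for an age; a value on an edge goes to the upper bin."""
    return bisect.bisect_right(EDGES, age)


def bin_label(age):
    return LABELS[bin_index(age)]


for age in (24.5, 25.7, 30):
    print(f"age {age}: bin {bin_index(age) + 1} ({bin_label(age)})")
```

With this convention a woman who is exactly 30 lands in the fourth bin and nowhere else, which is precisely the ambiguity the relabeled axis removes.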

The devil is always in the details, isn’t he? And yes, details matter if you are serious about making the story in your data (and bins!) clear, and if you want to avoid the “whose bin?” trap.

As for my personal bin and armrest struggles, flying first class may be the only solution.


Love, Time, Environment

As my daughter Annie’s tenth birthday approached, our friend Erik asked her what special gift she was hoping for. Without hesitation, Annie responded, “A crane.” We all turned and looked at her in astonishment. What would a ten-year-old girl want with a crane?

“I want a crane,” she calmly continued, “so I can lift up my piano and smash it to the ground.”

Stunned silence – followed by uproarious laughter – greeted her matter-of-fact but implacable pronouncement. In that moment, I saw my dream of vicariously becoming a concert pianist smashed (metaphorically) to smithereens.

This story has become legend among our family and friends because it says so much about Annie’s succinct style of expression, and creates an indelible image of how she really felt about learning to play the piano (as compared with my romantic fantasy about her learning to play).

But the anecdote has also stayed with me all these years because of the big life lesson it taught me. I believed that if I just gave her everything she needed – a beautiful instrument, lessons, encouragement – she would become a really great pianist.

What I failed to understand, however, was that the piano had not captured her imagination. She didn’t love it – clearly, she didn’t even like it – and she was never, ever going to move beyond playing a few simple (albeit charming) tunes.

Additionally, I came to realize that I didn’t really love the piano, either; as a result, my encouragement was cursory at best. I had no burning desire to create an environment that wholeheartedly nurtured and supported her learning to play, and love, the instrument. I had to face it: there was a real dearth of piano-playing passion in our lives.

I don’t regret having spent my time and money on any of it because I have faith (and evidence) that it raised Annie’s awareness and appreciation of music and beauty. But that’s pretty much the extent of my return on investment – except, of course, in the way that the piano episode has informed my professional work, particularly concerning the way people learn.

As you know, I have created curricula specific to health and healthcare professionals that teach the best practices of data visualization and the fundamentals of analysis and statistics.

Each time I conduct a workshop or training, I can pretty accurately predict which participants will love the material, and will continue to research and practice ways to improve their dashboards and reports, and which won’t.

Here are some clues to correctly identifying the successful ones:

  • The team lead, director, manager, or supervisor is in the course alongside the team, fully interested, engaged, and encouraging. Even if the leaders are unlikely ever to create a report or dashboard themselves, they are signaling their commitment to and support for the process.
  • I have the successful attendees’ full attention: phones out of sight; eyes on the examples I present; focused consideration of what they are seeing; articulate, involved communication with me about it; enthusiastic interest in the subject.
  • When it’s time to complete a group case study, they dig in and hang on. I see them opening books, talking to their colleagues, checking in with me, pro-actively putting pencil to paper to sketch out multiple strategies.
  • When I encourage them to think of a report they currently produce, and how they might improve it using what they have learned, they jump at the chance to re-imagine it, eagerly soliciting my feedback along the way.
  • After the course is over, they stay engaged and interested, sending me e-mails with reports or dashboards attached that they have re-designed and that have been, they tell me, effective and well received. And they continue to drop me notes on occasion, to ask advice or recommend a useful article.

Here’s the bottom line. Becoming good at something – creating powerful health and healthcare reports and dashboards, or just about anything in life – requires three things: [a] an interest in or love for the subject; [b] training bolstered by practice (10,000 hours of it, according to Malcolm Gladwell’s Outliers); and [c] a supportive and nurturing environment in which to develop and refine your knowledge and skill.

This last point is especially important for managers to understand. You can send me teams of people, and I can raise their awareness and with luck ignite the fire of their imaginations about the best practices of data visualization and healthcare analytics.

But if you don’t share in that interest, or neglect to arrange things so that those who do can encourage, inform, and cheer on their colleagues, no amount of training in isolation is going to improve your health and healthcare reports and dashboards. (I’m good, but I’m not that good.)

By the way: if you know anyone who might be interested in a brand-new, barely-been-played, threatened-within-an-inch-of-its-life piano, drop me a line, won’t you?


What’s Hue?

Recently someone sent me the bar chart below, and asked what I thought of it (as if I needed an invitation to comment!).


My immediate reaction was to ask why the author had used different shades – or “hues,” as they are called in graphic design – of black on the bars.

The heights of the bars clearly show that revenue has increased each month, so it’s redundant (and distracting) to use color hues to display the same increases. Of course it’s also true that hues often highlight variations in volumes, rates, and other measurements, but here they aren’t needed for that purpose, either.

The change in color is redundant (am I repeating myself?). Simply display the bar-graph data like this:


This is perhaps a good time to ask, “when, how, and why do hues work best in data visualization?” As I suggested above, we might want to show changes in volumes or rates with them, as on this choropleth map from the Dartmouth Atlas:

2010 Part D Medicare Enrollment Cohort
Percent Filling at Least One Prescription For a Dementia Medication


By using different hues of yellow to brown (from the lightest shade for the lowest percentages, deepening to dark brown for the highest), we can illustrate that in 2010 the percentage of Medicare beneficiaries enrolled in Part D (the prescription drug program) who had filled at least one prescription for a dementia medication was much higher in Southern California than in Northern California (for example).

The use of hues on this type of map helps us quickly and easily see and compare low and high values, and even to better grasp the full “what and why” behind the display. This use of color hues makes complete sense. (To see and learn more about this type of data display, take a look at The Dartmouth Atlas of Health Care.)

Hues also often work well to dynamically direct viewers’ attention to a metric that signals an urgent situation requiring immediate attention. This is particularly useful and important on dashboards or in summary reports; I’ve discussed these frequently in previous newsletters and posts.

In the issue of 15 May 2015, for example, I used a dot indicator in dark red to draw attention to those measures that fall furthest below the national comparison, then incrementally lightened the hue to match the diminishing differences between actual performance and the standard.
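
One way to produce this kind of graded indicator is to interpolate between a light and a dark color according to the size of the gap. The sketch below is an assumption-laden illustration, not the method used for the dashboard: the two endpoint colors, the gap values, and the `ramp_color` helper are all hypothetical.

```python
def ramp_color(gap, max_gap, dark=(139, 0, 0), light=(250, 220, 220)):
    """Return a hex color: light when the gap is 0, darkest when gap >= max_gap."""
    if max_gap <= 0:
        raise ValueError("max_gap must be positive")
    # Clamp the position on the ramp to the range 0..1.
    t = min(max(gap / max_gap, 0.0), 1.0)
    rgb = tuple(round(l + t * (d - l)) for d, l in zip(dark, light))
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Hypothetical gaps (percentage points below the national comparison).
gaps = {"Measure A": 12.0, "Measure B": 6.0, "Measure C": 0.0}
worst = max(gaps.values())

for measure, gap in gaps.items():
    print(f"{measure}: gap {gap:.1f} -> {ramp_color(gap, worst)}")
```

Because the color is tied to a single quantity, the darkest dots pull the eye to the largest shortfalls first.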


In another example (you can see it in our HDV website portfolio by clicking here), I used arrow-shaped indicators and a range of black tones to show changes (increases, decreases) in a hospital’s payor mix from one year to the next.


Note that the selection of a neutral color for icons (instead of more emotion-laden colors such as red or green) allows the viewer to quickly see changes in the data without conveying any judgment on the value of the change. This is especially important in a display presenting an element such as payor mix, where the same changes may be good in one situation and bad in another.

Further, avoiding colors such as red and green makes the display easier to understand for viewers with color vision deficiencies, who cannot distinguish certain colors accurately.

The example offered to me for comment (the first graphic, above) demonstrates a fundamental understanding of the use of color hues to show differences in volume. That’s a good thing.

But as with everything in data-viz (okay, and perhaps in life in general), true mastery resides in knowing precisely when and how to use (or not to use) a technique, so you can get your point across without distracting or losing your audience.
