Visualizing Data

Just what the world needs — another blog.

Well, when it comes to sharing best practices for displaying healthcare data visually, and finding and telling the story buried in your data, that is EXACTLY what the world needs: a blog that delivers the information and help you've just got to have, but don't have easy access to.

And as much as I love the sound of my own voice (and I do, ask anyone) I encourage you to contribute your thoughts, questions and examples (HIPAA compliant please — I don't look good in stripes).

Let the blogging begin.

Design Thinking Required

A colleague recently sent me an article by Atul Gawande, MD: “Why Doctors Hate Their Computers” (huh — you don’t say!).

If you’re familiar with Dr. Gawande’s work, you know that his articles and books are both thoughtful and thought-provoking (he’s a multi-award-winning author and a MacArthur Fellow for good reason). I read the article with interest, recognizing many of the frustrations with new Electronic Health Records (EHR’s) noted by the physicians he interviewed.

But then there was this pronouncement by one Chief Clinical Officer in response to clinicians’ frustrations: “[W]e think of this as a system for us and it’s not,” he said. “It’s for the patients.” I was struck speechless (which also means that hell may actually freeze over).

Hold up! EHR’s are and should be for use by, and the benefit of, BOTH physicians and patients.

And let’s be clear: there’s no “which came first” chicken-or-egg conundrum here. Aside from some electronic portals that allow patients to enter their personal and medical history, the majority of EHR data and information comes directly from clinicians and other providers caring for patients.

Therefore, patients will realize the full benefit of the new EHR’s only if they are intuitive and easy to use by clinicians — that is, only if clinicians are able to use them optimally.

Clearly, we need EHR’s for all of the reasons noted in Gawande’s article, and I fully understand the frustration the Chief Clinical Officer and others feel when they introduce new technology to a sometimes hostile audience. (I am after all a charter member of the “Innovators who are Deeply Maligned and Misunderstood” club.)

However, this particular comment by the CCO causes me great consternation and reinforces my conviction that Design Thinking methodologies are neither well understood nor being used to create better systems for ALL stakeholders.

Rather, and all too often, the methodology of Design Thinking is missing entirely. (Sadly, any reference to it is also missing from the Gawande piece.) Instead, teams with no training or experience in Design Thinking develop systems using mostly traditional engineering approaches. Then, when users find these systems difficult and frustrating to use, the response tends to be:

bang head here and carry on

Design Thinking is a process for creative problem-solving that puts people at the center of the design process through empathy. It is human-centered, collaborative, experimental, and optimistic. The process is not linear, but usually follows these steps:

Empathize. Empathy in Design Thinking requires you to observe the people you are designing for, interact with them to understand their points of view, and immerse yourself, so that you can experience what they experience. It helps move teams away from self-referential thinking. (Check out my post about personas and how they help us empathize here.)

Define. As you define a problem in Design Thinking, you are looking for interesting and compelling insights about the people you are designing for, and how they think about the work they do. This search provides focus and a framework for your problem-solving efforts. It allows you to be intentional in your designs. (Check out my post about mental models here.)

Ideate. By building lots of design solutions, you allow for unexpected and radical solutions to design challenges. You also harness the collective perspectives and strengths of your design team. (Check out my post about the power of sketching ideas here.)

Prototype. A prototype is a way to test for functionality, and permits further insight from the people you are designing for. Prototyping allows you to fail quickly and cheaply by piloting ideas before fully implementing them. (Check out my must-read recommendation, a piece by Alan Cooper on interactive interface design, here.)

Test. By testing your ideas with actual people, you can both refine the concepts and learn more about the people and their needs and desires. Testing also allows you to examine whether or not your design solution is solving your original design problem, or if you need to re-examine your prototypes. (I highly recommend Steve Krug’s books on usability testing.)

In closing, I think about Cooper’s first book, The Inmates are Running the Asylum, which should be required reading for everyone involved in creating these systems.

Cooper argues convincingly that designing interactive software-based systems is a specialty as demanding as the construction of them. He forcefully and correctly asserts (and his assertions are further borne out by Dr. Gawande’s article) that the cost of bad design is incalculable, as it robs us of time, customer loyalty, competitive advantage, and opportunity.

Bottom line: it is long past time to stop blaming system users by labeling them “disgruntled” and “uncooperative,” loftily declaring that they simply “need to get with the program.”

Instead, we must get our collective act together and use Design Thinking methodologies to create systems that humans love to use.

Posted in Communicating Data to the Public, Know Your Audience, Newsletters

A Profoundly Moving Data Display, Revisited

Perhaps you’ve visited the Vietnam Veterans Memorial. There are more than 58,000 names engraved on panels of polished black granite commemorating the Americans who died or were listed as Missing in Action in the Vietnam War. The 250-foot long walls are each ten feet tall at their apex and gradually slope down to ground level. Viewers see their own reflections in the stone as they read the names inscribed there.

The obvious (and perhaps most neutral) way to list the names would have been alphabetically by last name. Instead, designer Maya Ying Lin chose to list them chronologically by date of death (or day reported missing).

Ordering the soldiers by date of death serves to place them near one another as they may have fallen on the battlefield. It helps other soldiers who served at the same time remember those whose deaths occurred during their own tour of duty. It encourages visitors to contemplate the sacrifice of each soldier, and to wonder at the connection of other visitors to the memorial.

The simple, beautiful, and brilliant design of this memorial is really something quite extraordinary in its dignified and engaging presentation of seemingly straightforward information — the names of soldiers.

The Vietnam Veterans Memorial is a profound example of meaningful data visualization, and of the importance of design in communicating the message in our statistics. Alphabetically, the names are just that — data. When listed chronologically, as here, the same information tells a deeply moving story.

The Vietnam Veterans Memorial gets it exactly right on all levels. It weaves a narrative that draws viewers in, connecting them to one another, and both leading and permitting them to reflect on their feelings about the war. It is a prism through which memories of and thoughts on our most controversial and divisive conflict may forever change as it re-molds the most firmly held beliefs — raising awareness, suggesting answers, perhaps stimulating new questions. It can even inspire action, changing what people do in the voting booth, in their career choices, in their communities or as volunteers.

The Vietnam Veterans Memorial sets the bar pretty high for the rest of us, and that is a good thing, because in doing so, it reminds us that even in our day-to-day work of reporting healthcare data, the way we do that really matters.

It matters because we need people to pay attention to important information, and to engage with it — to change how they think and work because of what we have shown them. Data presentation (on black stone or white board) matters because the only way we can affect our systems of care is to be moved to action by what the presentation reveals — to action that will make those systems, and the people they serve, better.

Posted in Communicating Data to the Public, Data Visualization, Know Your Audience, Newsletters

Serious Talk About Bubbles

This past summer our beloved daughter, Annie, was married to our new, very favorite son-in-law, Douglas. There were bubbles blown by the guests as the bride and groom finalized their vows, and flowing Champagne (lots and lots of Champagne). It was everything we hoped for in every way imaginable. Quite simply, in the words of our sweet girl, “the day was perfect.” Indeed it was.

Now, dear reader, you know where this is going… all those bubbles got me thinking about the bubble charts I see in my work with clients. Alas, unlike Annie and Doug’s wedding, they’re not so perfect.

Consider the following example that was on a Hospital Report Card I recently received:

My initial reaction upon viewing this display was “WHYYYY????” (imagine my whining tone for emphasis).

Why would anyone display data in this manner?

I have a couple of guesses: it’s eye-catching and “fun.” The software lets you do it, so it must be right. While those may be real factors governing the creation of this display, that doesn’t mean they’re acceptable, or in accordance with data visualization best practices. Here are a few reasons why:

  • Not all of the bubbles are clearly labeled with the category of data being displayed, nor could they be, given the size and spacing of each one.
  • Even if the first problem could be remedied by the addition of a color-coded key, the key would be so long that no mere human could ever hold all of its information in short-term memory while viewing the display.
  • The colors are certainly bright and shiny, but they’re also distracting, and add no value to the viewer’s overall understanding of the data.
  • The value of each category of data is not labeled, and it is difficult (if not impossible) to make direct comparisons between one category value and another.
  • Categories can’t be ranked or ordered in any logical way.

As always, our first and over-arching objective must be to show the data and the story in it.

Enter the far less showy and oh-so-sensible, ever-practical bar chart. Displaying data using a bar chart affords us the ability to show the entire label for each value. Additionally, with the use of only one color, the viewer is no longer distracted by trying to understand what the different colors mean (nothing), and instead can see the shape of the data. It is also possible to directly label the value of each bar being displayed, and to rank the results or display them in some other potentially meaningful way, such as alphabetically by category.

Now let’s consider another scenario where bubbles hinder our ability to show the data clearly and in context.

Imagine that we have been asked to create a display for a provider group that delivers services for patients (male and female) diagnosed with reproductive issues. The display needs to include the number of cases in each category for:

  • the current month,
  • the year to date compared with the previous year to date and the difference, and
  • a twelve-month rolling trend.

For this example, let’s also assume that we are displaying data by calendar year, and that the last month of available data is for June of the current year.
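The three figures requested above can all be derived from one series of monthly case counts. Here is a minimal sketch, assuming the counts are held in a dictionary keyed by (year, month). The function name `summarize_cases`, the data layout, the choice of 2018 as the current year, and the sample numbers are all made up for illustration.

```python
def summarize_cases(monthly, year, month):
    """Return current-month, year-to-date, prior year-to-date, difference,
    and a twelve-month rolling trend for one category.

    `monthly` maps (year, month) -> case count (hypothetical layout).
    Missing months count as zero.
    """
    current = monthly.get((year, month), 0)
    ytd = sum(monthly.get((year, m), 0) for m in range(1, month + 1))
    prior_ytd = sum(monthly.get((year - 1, m), 0) for m in range(1, month + 1))

    # Walk back twelve months, ending at the current month.
    trend = []
    y, m = year, month
    for _ in range(12):
        trend.append(monthly.get((y, m), 0))
        m -= 1
        if m == 0:
            y, m = y - 1, 12
    trend.reverse()  # oldest month first

    return {
        "current_month": current,
        "ytd": ytd,
        "prior_ytd": prior_ytd,
        "difference": ytd - prior_ytd,
        "rolling_12": trend,
    }


# Made-up counts: 10 cases per month in the prior year, 12 per month this year.
sample = {(2017, m): 10 for m in range(1, 13)}
sample.update({(2018, m): 12 for m in range(1, 7)})
summary = summarize_cases(sample, 2018, 6)
```

Running the same function once per category (male, female) would give one row of values per label, which is the layout a bar-based display needs.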

Employing these techniques, we can use our label once and display the current month’s case count, followed by year-to-date versus prior year-to-date, and the 12-month rolling trend. Now the viewer can easily see that there are more cases for male than female in the current and year-to-date data, and that this seems to be an ongoing trend.

I know, I know: bars are boring; bubbles are fun. But that is not the point. The goal here — always — is to convey the data in a clear and compelling manner that will make the story in it stand out, and move people to inquire further, learn something new, and when appropriate take action.

If all the same you’re really feeling the need for some fun bubbles, I’ve got a case of leftover Champagne that I’d be thrilled to share.

Posted in Communicating Data to the Public, Dashboards, Data Visualization, Newsletters

2018 Summer Reading List for the Health and Healthcare Data Geek

I’ve been thinking of including a book (or two) about using maps to display data on my 2018 Summer Reading List, but it was my old pup Juno who sent me the sign I needed as to which book(s). Now I just have to wait for her to finish the chapter she was reading (before she decided to take a nap), so I can get this one back. (Yes, she really is that smart: she even has a tag, given her by my brother Tim, that reads, “I’m the smart one.” Indeed.)

Cartographer’s Toolkit: Colors, Typography, Patterns

Gretchen N. Peterson

More and more often we work with clients who are enamored of maps — regardless of whether or not it makes any sense to include them on their dashboards, reports, or infographics.

Maps are the “cool” feature that everyone seems to want. Unfortunately, they are often reduced to mere filtering and navigation tools (click your state and go to another page of data) rather than helping viewers gain insights into important information (rates of disease prevalence in a country, state, or city; access to healthcare; and the like). Additionally, when maps display this type of important data, they are often used incorrectly.

This just won’t do. We need to endeavor to create really great maps that impart important insights. Peterson’s Cartographer’s Toolkit is a great first step to learning how.

It includes Choropleth and heat maps and numerous historical references and examples, making it more engaging and instructive than you might imagine. I recommend adding this gem to your reference library. I will warn you, though: if your pup’s anything like my Juno, you may have to wrestle it back from her hot little paws.

Designing Better Maps: A Guide for GIS Users (2nd edition)

Cynthia A. Brewer

If you are creating maps, this book is a must-have. As the title clearly signals, designing BETTER maps is the goal, and this comprehensive guide helps achieve it by teaching us to create maps that communicate geospatial data effectively.

Expert cartographer Cynthia Brewer offers clear, easy-to-understand information and direction about the basics of good cartography, including layout, design, scales, projections, color selection, font choices, and symbol placement. As an accompaniment to all of this information, she also makes a color selector available for public use. I highly recommend adding this book to your reference library.

The Ghost Map: The Story of London’s Most Terrifying Epidemic — and How It Changed Science, Cities, and the Modern World

Steven Johnson

Back in December, I wrote a newsletter about The Ghost Map and how much I loved it. If you missed that newsletter, or have just been procrastinating about reading this book, now’s the time!

Johnson writes about the cholera epidemic and Dr. John Snow in a thoroughly engaging and compelling way. He goes beyond the immediate details of the 1854 epidemic to describe in vivid detail related subjects like the history of toilets, the upgrading of London’s sewer system, the importance of population density for a disease that travels in human excrement, and the positive as well as the negative aspects of urbanization itself. Never before Victorian London existed, Johnson teaches the reader, had 2.4 million primates of any species lived together within a 30-mile perimeter. The conditions he describes stagger the imagination. This is the stuff of a Dickens novel.

Johnson also describes how Snow used some of the earliest Geographic Information System (GIS) methods to support his arguments. It seems only fitting that I should include his book on a list that includes two other books about displaying data using maps!

Design Matters with Debbie Millman

I am a podcast junkie, and one of my very favorite podcasts is Design Matters with Debbie Millman. True, this podcast is not directly on point for data visualization topics.

That said, and as described on the website, it is “The world’s first podcast about design and an inquiry into the broader world of creative culture through wide-ranging conversations with designers, writers, artists, curators, musicians, and other luminaries of contemporary thought.” Whew. That’s a mouthful, but it’s true. And I love Debbie’s tagline: “And remember, we can talk about making a difference, we can make a difference, or we can do both.” YES!

Just yesterday, on a flight from Washington, D.C. to Boston, I listened to an interview Debbie did with artist and graphic designer Paula Scher about her design career — a career that has spanned everything from album covers to the iconic Citibank logo. I was so mesmerized that I actually hit replay and listened to it again and then again (I admit to being more than a little bit obsessed). Check it out and, just for fun, check out Paula’s book of her map paintings and artwork. Cool stuff!

Here’s the bottom line: if you have a bit of down time this summer (or any time), I highly recommend these interesting reads. Taken together, they will broaden and deepen your thinking and help you gain an even fuller appreciation of how to display data (correctly!) using maps.

Posted in Books, Newsletters

Visualizing Annual Reports and the Donaghue Foundation

Recently I was engaged to help the Donaghue Foundation re-imagine parts of its 2017 Annual Report. I was thrilled to work on this project, as the important work the Foundation does is completely in alignment with what I believe about our collective need to improve our healthcare system and people’s health.

As the report says, the Foundation envisions “continued improvement in people’s health as a result of converting research into practical benefit.” It will be “an imaginative, collaborative, and engaged participant in the process that begins with rigorous health research and ends in realized health benefits and by doing so will give the vision of Ethel Donaghue its best expression.”

The initial request by Donaghue’s leadership was to add data visualizations in different places throughout the report: simple enough. But as I read of the grant awards and the research that Donaghue supports, I wanted to do much more. I wanted to make its important funding of researchers’ work “visible” and engaging.

Luckily, the Foundation decision-makers agreed, so we collaborated to highlight and give visual form to the problems the project teams tackle; the ways their research proposes to improve health and healthcare; and how they will report and disseminate their results and ideas.

To accomplish these crucial tasks, we created the Greater Value Portfolio Research Spotlight.

First, we built a template to follow for each of the grants to be included in the Spotlight section, studying the grant applications to see where we could use graphics and data visualizations to engage readers.

As you can see in the example above, the header includes the name of the project and the researchers’ names and credentials, followed by a one-line statement of its scope and a slightly longer summary of parameters, how the project contributes to improving value in health and healthcare, and how much money the Donaghue Foundation awarded.

In order to grab the reader’s attention, we designed a bold graphic that says clearly, “This is the adverse health outcome the team is working to understand, improve, perhaps even eliminate.” In this example, that outcome is the number of Americans dying daily from opioid-related overdoses.

The next section describes the underlying problem associated with the adverse outcome on which the team will focus, along with a simple bar graph displaying the problem.

In this project, it’s the opioid-prescribing patterns of physicians — specifically, the pill burden, or number of days that a prescription covers. Physicians who are high-intensity prescribers are more likely to have patients who use opioids longer.

The last section of the Spotlight describes the project approach: what the researchers will do, and how they intend to do it.

In this example, part of the plan is to try to change physician prescribing patterns (i.e., prescribing fewer pills, reducing the pill burden) by incorporating alerts into electronic medical records using a concept called Relative Social Ranking. The RSR displays physicians’ opioid-prescribing patterns compared to those of their peers. Again, we created a simple bar graph display to illustrate the concept.

The last part of the Spotlight details how the project teams propose to raise awareness and disseminate the results of their work.

In a world full of bad news, I find hope in the good and important efforts that groups like the Donaghue Foundation are making. I’m also excited and gratified that the Foundation engaged me to help more people “see” the great work it funds and supports. In fact, I already have a copy of the Annual Report in my desk drawer.

On days when I’m looking for inspiration, I take it out and remind myself of all the vital, useful developments happening on the front lines.

Posted in Communicating Data to the Public, Data Visualization, Know Your Audience, Newsletters

Only Charlotte Needs a Web

Recently my colleague and HealthDataViz Senior Consultant Janet Steeger sent me the graph below and an associated article, with the subject line “Blog Posting?”.

Incredulousness was quickly followed by inspiration. How, we wondered, could super-smart people use such dumb graphs to display their data? In response, Janet came up with a great re-design (see below).

In a nutshell, the article is about the Value-Based Health Care Delivery (VBHCD) initiative at Harvard Business School, which engages with leading healthcare provider organizations to measure and manage patient-level costs over complete cycles of care for a variety of medical conditions.

The Radar Chart displayed above (sometimes referred to as a Spider Chart, Web Chart, Polar Chart, or Star Plot) is supposed to help us understand and compare the results of three different providers on cost, complications, and patient outcomes that are part of a study conducted by VBHCD.

Why this chart doesn’t work

A radar chart allows the viewer to compare multiple quantitative variables; some would also argue that it is most useful for viewing data outliers and displaying performance. Here, however, are a few of the reasons why this chart simply doesn’t work:

The values being displayed are measured in different units, and have therefore been converted to a common scale so that they can be displayed together on this chart. The notes at the bottom of the chart indicate that a score of 100 equals the lowest cost, which to my perception is completely counter-intuitive, annoying and confusing.

Not only is it hard to visually compare the lengths of the different spokes shown; it is also hard to hold the scale in our memories and accurately judge the radial distances.

The lines connecting the data may occlude (obstruct) one another as the values grow closer, or if they are identical. It’s inevitable that when multiple series are plotted, some values will eventually be on top of and therefore blocking each other.

The area of the shapes presented increases as a square of the values rather than linearly. This may cause us to misinterpret the data displayed, because a small difference in the values results in a significant change in the area, so the difference is visually exaggerated.
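A small calculation makes the area effect concrete. This is only a sketch, assuming a radar chart with evenly spaced spokes, where the polygon's area is the sum of the triangles formed by adjacent spokes. The function name `radar_area` and the sample values are hypothetical.

```python
import math


def radar_area(values):
    """Area of the polygon a radar chart draws for `values`,
    assuming the spokes are evenly spaced around the circle."""
    n = len(values)
    angle = 2 * math.pi / n
    # Each pair of adjacent spokes forms a triangle with area
    # 0.5 * r1 * r2 * sin(angle between them).
    return sum(
        0.5 * values[i] * values[(i + 1) % n] * math.sin(angle)
        for i in range(n)
    )


base = [50, 50, 50, 50, 50]
doubled = [100, 100, 100, 100, 100]

# Doubling every value quadruples the area of the shape.
ratio = radar_area(doubled) / radar_area(base)
```

By the same arithmetic, raising every value by 10% enlarges the shape by about 21%, which is why small differences in the data can look larger than they are.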

Now we get to the part about the inspiration that followed our astonishment over anyone’s using such a misleading display format. Let’s agree to leave these radar charts that look very much like Charlotte’s web to our friends the spiders, and consider a better way to display this data (with thanks again to Janet for her re-design).

By using bar charts arranged one right below the other, and displaying the actual data (versus the converted form used to display the data in the radar chart), we can easily see and understand that:

Surgeon B has the highest cost for Total Hospital Stay and Post Acute Care, the highest Postoperative Occurrence Rate and Readmission Rate, and the lowest Patient Improvement Scores across the board. (Which brings to mind Ricky Ricardo’s iconic line “Lucy, you got some ‘splainin’ to do.” But I digress.)

We can also quickly and easily see that although Surgeon C’s Total Hospital Stay Cost is the second highest at $14,000 (it is still $4,000 less than Surgeon B’s), his/her Postoperative Occurrence and Readmissions rates are by far the lowest at 4.0% and 2.0% respectively, and Patient Improvement Scores are the highest.

At first glance it would appear that the extra $4,000 for Total Hospital Care connected with Surgeon C was money well spent. We’d want to dig into the data a bit more to see if we couldn’t learn a thing or two to share with the other surgeons to help them achieve the same or similar results.

Here again, dear Subscriber, we are faced in the radar chart with a display that may look cool, but that simply doesn’t work. I think of what a wise little spider once told her dear friend Wilbur:

“Trust me, Wilbur. People are very gullible. They’ll believe anything they see in print.”

E. B. White, Charlotte’s Web

Just because some very smart people decided to use a seemingly cool chart, we need not be so gullible as to believe it is the best way to display our data.

Posted in Communicating Data to the Public, Data Visualization, Newsletters

Can’t See the Forest for the Treemaps

I’m sure you’ve heard the expression “Can’t see the forest for the trees.” It describes someone far too involved in the details of a problem or situation to grasp its entirety; (s)he has lost sight of overall goals — the “big picture.”

Thinking about this challenge in the context of data visualization, I can recall many displays created by analysts so enamored of the software they’re using and what it can do that they miss the larger objective: creating visualizations that show the story in the data simply, clearly, and compellingly.

And yes, not so coincidentally, these are sometimes displays of data using a visualization technique called Treemaps (forest, trees, Treemaps — get it?).

It’s important and extraordinarily helpful to understand the genesis of a visualization technique to ensure we are using it correctly. Why was a technique conceived? What problem was it designed to help us solve?

In the case of Treemaps, it was during the 1990’s that Ben Shneiderman of the University of Maryland imagined a new technique to display space-constrained visualizations of hierarchies — or, more simply, to visualize large quantities of hierarchical data far too numerous to be displayed more simply and effectively in a bar graph.

Here’s an example from Steve Few’s book Now You See It, which displays hierarchical stock market data using Shneiderman’s Treemap technique.

Look closely, and you will see the different levels of data being displayed:

  • Level 1 The whole visualization represents the entire stock market.
  • Level 2 Next, different stock market sectors (financial, healthcare, etc.) are displayed and labeled in the secondary level of rectangles.
  • Level 3 Inside each of these rectangles, the smallest rectangles represent individual stocks within each sector.

Additional information is encoded by:

  • Making the size of each rectangle representative of the size of the respective sectors and stocks being displayed.
  • Using different saturations of the colors blue and red to encode both the current price of the stock and its change in price since the previous day (blue for gains, red for losses).
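To show how nested rectangles can be sized in proportion to the data, here is a rough sketch of the "slice-and-dice" layout associated with Shneiderman's early Treemap work. It is not necessarily the layout used in the example above, and it covers only the size encoding, not color. The `(label, size, children)` node structure, the function name, and the numbers are assumptions made up for illustration.

```python
def slice_and_dice(node, x, y, w, h, depth=0, out=None):
    """Lay out `node` inside the rectangle (x, y, w, h).

    `node` is a (label, size, children) tuple, a hypothetical structure.
    Children split the parent rectangle in proportion to their size,
    alternating between cuts along the width and along the height.
    Returns a list of (label, depth, x, y, w, h) tuples.
    """
    if out is None:
        out = []
    label, size, children = node
    out.append((label, depth, x, y, w, h))
    total = sum(child[1] for child in children)
    offset = 0.0
    for child in children:
        share = child[1] / total
        if depth % 2 == 0:  # cut along the width
            slice_and_dice(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:  # cut along the height
            slice_and_dice(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out


# Made-up hierarchy: a market, two sectors, and two stocks in one sector.
market = (
    "Market", 100, [
        ("Financial", 60, [("Stock A", 40, []), ("Stock B", 20, [])]),
        ("Technology", 40, []),
    ],
)
rects = slice_and_dice(market, 0, 0, 100, 100)
```

Each rectangle's area (width times height) ends up proportional to its size value, so a sector worth 60% of the market fills 60% of the canvas.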

In the example above, we can see the entire stock market and the relative size of the different sectors (Technology, Consumer Goods, Healthcare). We can also see (for example) that the Financial sector is the largest in the stock market, and that Citigroup’s stock represents a large portion of this sector.

The light blue color conveys that Citigroup had gains, but not as much as some others, which are displayed in a darker blue. By comparison, we can see the relative size of the Technology sector (bottom row) and that Microsoft represents the largest portion of that sector. The red color shows that it had losses.

Treemaps are a relatively complex type of visualization technique designed to solve the challenge of how to display complex categories and sub-categories (hierarchies) of data. The mistake I see most often, however, is that the health and healthcare data being displayed in a Treemap don’t present the same challenge.

For example, I’ve seen simple categories of data, like the rates of adults diagnosed with diabetes in the top 25 states, displayed in a Treemap when a bar chart would be more appropriate.

At first glance, a Treemap for such basic data may look cool, but the novelty wears off very quickly because we can’t easily understand data ranking or compare rates between states; and we can’t add further contextual data of interest.

Now consider the same data in a simple bar graph where we can rank the data, compare each state’s diabetes rate, and label everything directly. We can also include additional contextual information, such as the average for the entire country, using a vertical line overlaid on the bars.
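The ranking, direct labeling, and reference line described above can be sketched in a few lines of plain text output. This is an illustration only: the function name `ranked_bars` and the state names and rates are made up, and as a simplification the reference line marks the average of the values shown rather than a true national average.

```python
def ranked_bars(rates, width=40):
    """Return text lines for a bar chart of `rates` (label -> percent),
    ranked from highest to lowest, with a '|' marking the average."""
    ranked = sorted(rates.items(), key=lambda item: item[1], reverse=True)
    top = ranked[0][1]
    average = sum(rates.values()) / len(rates)
    marker = min(width - 1, round(average / top * width))
    label_width = max(len(label) for label in rates)
    lines = []
    for label, value in ranked:
        filled = round(value / top * width)
        cells = ["#" if i < filled else " " for i in range(width)]
        cells[marker] = "|"  # reference line at the average
        lines.append(f"{label:<{label_width}} {''.join(cells)} {value:.1f}%")
    lines.append(f"| = average of the values shown ({average:.1f}%)")
    return lines


# Made-up rates, not real diabetes data.
lines = ranked_bars({"State A": 12.0, "State B": 9.0, "State C": 15.0})
print("\n".join(lines))
```

One color (here, one character), one label per bar, values printed directly, and a single reference line: everything a viewer needs to rank and compare.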

Here’s the bottom line: if we focus only on the “trees” of the different functionalities and seemingly cool visualizations that many new software applications allow us to create easily — without understanding why they were conceived, and what problems they’re designed to solve — we more often than not will miss the bigger objective of creating clear, accurate, compelling views of crucial data.

However, if we commit to understanding the underlying structure of our data and the best visualization to convey the meaning buried in it, we will be able to see both the forest and the trees.

Posted in Communicating Data to the Public, Data Visualization, Newsletters

Intersecting Indicators and Time to Think

I want to send a very big and public thank you to the Health Informatics and Financial Reporting leadership at Johns Hopkins Children’s Hospital in St. Petersburg, Florida, for instructing their teams to put an away message on e-mail, shut off their phones, and spend two days immersed in my data visualization training. Man, do they get it!! People need uninterrupted time to learn new skills in an engaged and focused manner and — wait for it — to THINK.

But here’s the deal: it’s not just training that requires that time. Finding solutions to problems, whether simple or gnarly, also requires time to THINK. (Contrary to popular belief, those problems can’t be solved in the blink of an eye by FM — Freaking Magic.)

Let’s look at a visualization I’ve been re-designing that illustrates this idea. I was able to say right away, “This thing is hard to understand” — but that didn’t mean I could come up with a solution equally quickly. That part took me some quiet, uninterrupted time to sketch and consider possible solutions.

This visualization displays the intersection of two indicators from a public health survey. The first indicator, a composite of respondents who answered “yes” to six questions about mood, anxiety, and depression in the 30 days preceding the survey, is labeled “Serious psychological distress.” It is crossed with the indicator “Health insurance coverage” (indicated by answers to the question, “Do you have a health insurance policy?”).


The challenge of this visualization is that the viewer has to try to match the “yes” responses displayed in the bars of one indicator with those in the other, then work out whether respondents who experienced serious psychological distress had health insurance.

If you look closely, you will notice that the labels are hard to decipher: the two bars on the left are whether or not respondents answered yes or no to “Yes,” Psychological distress; the two on the right are whether they responded yes or no to “Yes,” they had health insurance (I think). Then you have to match the blue to the blue bar and the red to the red one. I admit I’m still wildly confused.

As a first step, I arranged the responses to the two indicators in a simple matrix, then just studied them for a bit.

The simple act of creating this matrix helped me realize that the denominators for the Second Indicator (Health insurance) are the YES and NO responses to the First Indicator (Serious psychological distress). This may sound like a minor insight, but it was a key first step to figuring out how to better display the information.

Next, I started to sketch possible ways to display the results using a simple Excel sheet to create the following visual without any competing distractions. I allowed myself some time just to think about it.

In this new visual, I’ve directly displayed the results of stratifying Indicator 1 results by Indicator 2 results. Now we can see that:

  • Of the respondents who answered YES (they had experienced serious psychological distress), only 5.7% had a health insurance policy, compared with 89.1% of those who reported no serious distress.
  • It may be worthwhile to analyze in greater detail those who suffered serious distress and do not have health insurance. What are their demographics (age, sex, education, income level)? Where do they live? Such analysis can help us focus efforts to create or deliver (for example) new resources and support services.
  • There is power in taking time to create detailed, plain-language labels to eliminate the mystery of what is being displayed.
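The matrix step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the response counts are invented, and `stratify` is a hypothetical helper name, not anything from the survey's own tooling. The point it shows is the key insight from the matrix: the YES and NO groups of the first indicator serve as the denominators for the second.

```python
from collections import Counter

def stratify(records):
    """Stratify insurance answers by distress answers.

    records: iterable of (distress, insured) pairs, each 'YES' or 'NO'.
    Returns {distress: {insured: percent}}, where each distress group
    is the denominator for its own row.
    """
    counts = Counter(records)
    table = {}
    for distress in ("YES", "NO"):
        total = counts[(distress, "YES")] + counts[(distress, "NO")]
        if total == 0:
            continue  # no respondents in this group, so no percentages
        table[distress] = {
            insured: 100 * counts[(distress, insured)] / total
            for insured in ("YES", "NO")
        }
    return table

# Hypothetical responses for illustration only; not the survey's data.
records = (
    [("YES", "YES")] * 3 + [("YES", "NO")] * 7
    + [("NO", "YES")] * 45 + [("NO", "NO")] * 5
)
table = stratify(records)
for distress, row in table.items():
    print(f"Serious distress {distress}: {row['YES']:.1f}% insured, {row['NO']:.1f}% not insured")
```

Each row sums to 100%, which is what makes the two groups directly comparable.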

I love meeting, training, and collaborating with all our clients; but it is especially gratifying when those same clients give the projects we’re working on their full attention — and allow all of us time to think. That’s really the only magic required for creating great work.

Posted in Communicating Data to the Public, Data Analysis, Data Visualization, Newsletters | Leave a comment

Ghost Map

A few weeks ago, I delivered the keynote address at the First Annual Information and Quality Services Center (IQSC) Educational Event, put on by the Dallas Fort Worth Healthcare Foundation (DFWHC) in Dallas, Texas. It was a great meeting, and afforded me the chance to reconnect with folks I’d collaborated with at Baylor Scott & White, and to meet new and interesting people, one of whom introduced me to the book The Ghost Map by Steven Johnson.

Hearing about this book brought me up short — how in the world had I missed it, and what the hell else had I been missing in my busyness? Clearly it was time for me to slow down, turn my head to the left and right, and take in the bigger world around me.

Anyone who has ever joined one of our trainings or read other newsletters will recall that we always talk about Dr. John Snow and his famous work plotting cholera deaths on a map, thereby helping to make the visual connection between where people were getting their water, and deaths attributable to this lethal disease.

That’s pretty much where the story ends in most descriptions of Snow’s work (including, sadly, mine), leaving out many interesting and instructive details. Fortunately, it’s these missing bits that Johnson writes about in a thoroughly engaging and compelling way.

First, he goes beyond the most immediate details of the 1854 epidemic to describe in vivid detail related subjects like the history of toilets, the upgrading of London’s sewer system, the importance of population density for a disease that travels in human excrement, and the positive as well as the negative aspects of urbanization itself.

Never before Victorian London, Johnson teaches the reader, had 2.4 million primates of any species lived together within a 30-mile perimeter. The conditions he describes are quite simply staggering to the imagination; this is really the stuff of a Charles Dickens novel.

Next, Johnson describes how Snow used some of the earliest Geographic Information System (GIS) methods to support his arguments.

The map below is one that many of us are familiar with, but the analysis that Snow did was far more complex than this simple rendering conveys. Snow drew Thiessen (Voronoi) polygons around the wells, defining straight-line least-distance service areas for each. That is, the polygons defined individual areas of influence around each of a set of points, and also displayed the area closest to each point relative to all other points. (I know: you thought this would be an easy holiday read. Not so much.)

By analyzing the data in this way, Snow was able to understand that a large majority of the cholera deaths fell within the Thiessen polygon surrounding the Broad Street pump, and a large portion of the remaining deaths were on the Broad Street side of the polygon surrounding the bad-tasting Carnaby Street well.
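If the polygon idea feels abstract, it may help to know that a point lies inside a site's Thiessen (Voronoi) polygon exactly when that site is its nearest one by straight-line distance. The sketch below uses that equivalence to count deaths per service area without drawing any polygons. The pump names and all coordinates are hypothetical placeholders, not Snow's data, and the function names are mine.

```python
import math
from collections import Counter

# Hypothetical coordinates for illustration only; not taken from Snow's map.
pumps = {"Pump A": (0.0, 0.0), "Pump B": (10.0, 0.0)}
deaths = [(1.0, 1.0), (2.0, -1.0), (4.0, 3.0), (9.0, 1.0)]

def nearest_pump(point, pumps):
    """Name of the pump closest to point by straight-line distance.

    This is the same as asking which Voronoi cell the point falls in.
    """
    return min(pumps, key=lambda name: math.dist(point, pumps[name]))

def deaths_per_cell(deaths, pumps):
    """Count deaths falling inside each pump's straight-line service area."""
    return Counter(nearest_pump(d, pumps) for d in deaths)

cells = deaths_per_cell(deaths, pumps)
for name, count in cells.most_common():
    print(f"{name}: {count} deaths")
```

Swapping `math.dist` for a walking distance along streets would give the street-network version of the same analysis, although that requires a model of the street layout that this sketch does not attempt.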

He then re-drew the service area polygons (using nothing more than a pencil and string) to reflect the shortest routes along streets to wells, thereby revealing that an even larger proportion of the cholera deaths fell within the shortest-travel-distance area around the Broad Street pump. A modern-day version of this technique is displayed in the following version of Snow’s original map:

Source: GIS Analyses of Snow’s Map

Although I found the entire book fascinating, given the work I do, I thought that this additional detail about how Snow analyzed his map using Thiessen (Voronoi) polygons was perhaps the most interesting and instructive. We live in a world where technology allows us to see satellite images of our planet and download way-finding applications to our mobile phones in a matter of seconds.

The possibilities of how to use that data in a proactive and positive manner are pretty staggering.

In public health, for example, it is critically important to have a continuous picture of mobility patterns and their fluctuations, particularly during emergencies such as disasters or the outbreak of a potential pandemic. That picture supports decision-making and helps assess the impact of government measures and restrictions, so that interventions can have the greatest effect.

This type of data has historically been collected via manual health surveys. With mobile phones now in use worldwide, however, it may become possible to capture and track infections and diseases in populations, along with people’s movements, and then to display those data and patterns using advanced GIS technology. The opportunities to see, understand, and prevent human suffering are nothing short of revolutionary.

Here’s the bottom line: if you have a bit of down time over the holidays (or anytime), I highly recommend this interesting and easy read. I feel certain it will broaden and deepen your thinking as well.

Posted in Data Analysis, Data Visualization, Newsletters | 1 Comment

Fabulously Terrible

Years ago when I first worked at Mass General Hospital (MGH) in Boston, I had a really great boss, Dr. Dan Rosenthal. Like many smart people, he was also very funny. I especially loved his ability to use adjectives in a way that made you think he was going to tell you one thing, only to hit you with something entirely different.

Once, for example, I helped present a project that my team had been working on. In reviewing the results, Dan said, “Kathy, that was a spectacular…failure.” Ha! Very funny and very true: it had been exactly that. Another time, I asked how he’d done in the half-marathon he’d run over the weekend. “Kathy, people are truly fantastic…liars,” he said. “At the 10-mile marker, everyone told me how great I looked, and that I shouldn’t worry: the finish line was close.” In my book, that is LOL good.

I was reminded of this last story while pulling together a recent presentation on data visualization – specifically about how data and information can be less than truthful, even downright misleading, when graphical displays are used incorrectly. A few examples follow.

A Quaker Oats ad claims that the cereal is a “Cholesterol Hunter,” and that by eating oatmeal every day for 30 days you can significantly reduce your cholesterol level. Here’s the trickery in the bar graph shown to support this claim: the scale begins at 196 and covers such a narrow range that the change from week one to week four looks much larger than it actually is. The Center for Science in the Public Interest called out the maker of Quaker Oats for its misleading claim and illustration, making it clear that it may take as many as four oat-product servings a day to get any benefit, and adding that a cholesterol level of 196 is still considered high.

As I’ve written elsewhere, bar graphs must start at zero to prevent this sort of distortion. You can read more about why, and see alternative displays here.
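To put a rough number on that distortion, here is a small sketch comparing how tall one bar looks relative to another when the axis starts at 196 versus at zero. The baseline of 196 comes from the ad described above; the two cholesterol readings are hypothetical values I picked for illustration, since the ad's actual figures aren't reproduced here.

```python
def apparent_ratio(first, second, baseline=0.0):
    """Ratio of the drawn bar heights when the axis starts at baseline."""
    return (first - baseline) / (second - baseline)

# Hypothetical cholesterol readings for illustration only.
week_one, week_four = 210.0, 200.0

truncated = apparent_ratio(week_one, week_four, baseline=196.0)
honest = apparent_ratio(week_one, week_four, baseline=0.0)

print(f"Axis starting at 196: first bar looks {truncated:.2f}x the second")
print(f"Axis starting at 0:   first bar looks {honest:.2f}x the second")
```

With these made-up readings, a drop of under 5% is drawn as a bar more than three times taller than its neighbor, which is exactly the kind of exaggeration a zero baseline prevents.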

There is also this example, from a Masters of Media blog post by Joram Binsbergen of the University of Amsterdam. He discusses an infographic used by the New South Wales Ministry of Health to illustrate an apparently significant increase in the number of nurses recruited by that state’s healthcare system from 2008 to March 2013. On closer examination, however, the scale is completely off.

Four “people” icons on the far left of the graphic represent 43,147 nurses; on the far right, forty such figures stand for 47,500, making an increase of about 10% look like a 900% one.

As Binsbergen points out in his post about this display, it is useful to consult Edward Tufte, who introduced the “Lie Factor” in his book The Visual Display of Quantitative Information: “The representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the quantities represented.”
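Tufte defines the Lie Factor as the size of the effect shown in the graphic divided by the size of the effect in the data, where each effect is the relative change from the first value to the second. The sketch below applies that definition to the icon counts and nurse totals quoted above; the function names are mine, and this is a back-of-the-envelope check rather than anything from Binsbergen's post.

```python
def effect_size(first, second):
    """Relative change from first to second, as a fraction of first."""
    return (second - first) / first

def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Tufte's Lie Factor: effect shown in the graphic / effect in the data."""
    return effect_size(graphic_first, graphic_second) / effect_size(data_first, data_second)

shown = effect_size(4, 40)             # icons drawn: 9.0, i.e. a 900% increase
actual = effect_size(43_147, 47_500)   # nurses counted
factor = lie_factor(4, 40, 43_147, 47_500)

print(f"Increase shown by the icons: {shown:.0%}")
print(f"Increase in the data:        {actual:.1%}")
print(f"Lie Factor:                  {factor:.1f}")
```

A Lie Factor of 1 means the graphic is faithful to the data; the further it strays from 1, the greater the distortion.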

My last example, from Andrew Gelman’s blog, “Statistical Modeling, Causal Inference, and Social Science” (sounds like a snooze, but it’s not – honest!), uses a misleading graphic in a story on medical fund-raising. Purporting to show how much money is raised to help fight specific diseases as compared to the mortalities attributable to those diseases, the zippy graphic is beyond distorting – it devolves into “hot mess” territory.

If, for example, you look at the large purple bubble on the right and match it to the key above it, heart disease appears to be the leading cause of death (596,577 deaths). I say “appears to be” because the key isn’t labeled to represent mortalities; rather, it says, “Heart Disease|Jump Rope for Heart (2013).”

Now look at the much smaller purple bubble on the left, labeled $54.3 million. If you’re puzzled, that’s a good sign: confusion signals awareness of a large, awkward problem. That left-hand bubble – it would seem – represents just one fundraiser for heart disease (“Jump Rope for Heart”), as compared to all deaths from heart disease.

A quick internet search of “heart disease” led me to this summary document, which made me dismiss the illustration in its entirety.

You may never create a graph, chart, or table, but you will, like most of us, assuredly consume, even depend on, graphic displays. As with all consideration of important and complicated information, I encourage you to slow down, take a breath, and ask yourself, “Does this make sense? Can I believe what is being presented?” And what is more important, “Can I learn something true here, or make a confident decision based on what I think I’m seeing?”

Remember, even information displayed in a pretty graph can be incredibly misleading. Or, as Dan Rosenthal would probably say, “fabulously terrible.”

Posted in Communicating Data to the Public, Data Analysis, Newsletters | Leave a comment