Visualizing Data

Just what the world needs — another blog.

Well, when it comes to sharing best practices for displaying healthcare data visually, and to finding and telling the story buried in your data, that is EXACTLY what the world needs — a blog that delivers the information and help you've just got to have, but don't have easy access to.

And as much as I love the sound of my own voice (and I do, ask anyone), I encourage you to contribute your thoughts, questions, and examples (HIPAA compliant, please — I don't look good in stripes).

Let the blogging begin.

Intersecting Indicators and Time to Think

I want to send a very big and public thank you to the Health Informatics and Financial Reporting leadership at Johns Hopkins Children’s Hospital in St. Petersburg, Florida, for instructing their teams to put an away message on e-mail, shut off their phones, and spend two days immersed in my data visualization training. Man, do they get it!! People need uninterrupted time to learn new skills in an engaged and focused manner and — wait for it — to THINK.

But here’s the deal: it’s not just training that requires that time. Finding solutions to problems, whether simple or gnarly, also requires time to THINK. (Contrary to popular belief, those problems can’t be solved in the blink of an eye by FM — Freaking Magic.)

Let’s look at a visualization I’ve been re-designing that illustrates this idea. I was able to say right away, “This thing is hard to understand” — but that didn’t mean I could come up with a solution equally quickly. That part took me some quiet, uninterrupted time to sketch and consider possible solutions.

This visualization displays the intersection of two indicators from a public health survey. The first indicator, a composite of respondents who answered “yes” to six questions about mood, anxiety, and depression in the 30 days preceding the survey, is labeled “Serious psychological distress.” It is crossed with the indicator “Health insurance coverage” (indicated by answers to the question, “Do you have a health insurance policy?”).

The challenge of this visualization is that the viewer has to try to match the “yes” responses displayed in the bars of one indicator with those in the other, and then decide whether respondents who experienced serious psychological distress had health insurance.

If you look closely, you will notice that the labels are hard to decipher: the two bars on the left show whether respondents answered yes or no to “Yes,” psychological distress; the two on the right show whether they answered yes or no to “Yes,” they had health insurance (I think). Then you have to match the blue bar to the blue bar and the red to the red. I admit I’m still wildly confused.

As a first step, I arranged the responses to the two indicators in a simple matrix, then just studied them for a bit.

The simple act of creating this matrix helped me realize that the denominators for the Second Indicator (Health insurance) are the YES and NO responses to the First Indicator (Serious psychological distress). This may sound like a minor insight, but it was a key first step to figuring out how to better display the information.
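
If you like to work through this kind of cross-tabulation in code, here is a minimal sketch of the same two steps (build the matrix, then stratify), assuming a hypothetical respondent-level extract with illustrative column names (distress, insured) rather than the survey's actual variables.

```python
import pandas as pd

# Hypothetical respondent-level extract; column names and values are illustrative only.
responses = pd.DataFrame({
    "distress": ["Yes", "No", "No", "Yes", "No", "No", "No", "No"],
    "insured":  ["No",  "Yes", "Yes", "Yes", "No", "Yes", "Yes", "No"],
})

# The simple matrix: counts for every combination of the two indicators.
matrix = pd.crosstab(responses["distress"], responses["insured"])
print(matrix)

# Stratify Indicator 2 (insurance) by Indicator 1 (distress): row percentages,
# so each distress group (YES / NO) serves as its own denominator.
stratified = pd.crosstab(responses["distress"], responses["insured"],
                         normalize="index") * 100
print(stratified.round(1))
```

The row-percentage step is exactly the denominator insight above: the YES and NO answers to the first indicator become the denominators for the second.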

Next, I started to sketch possible ways to display the results using a simple Excel sheet to create the following visual without any competing distractions. I allowed myself some time just to think about it.

In this new visual, I’ve directly displayed the results of stratifying Indicator 1 results by Indicator 2 results. Now we can see that:

  • Of those respondents who answered YES (they had experienced serious psychological distress), only 5.7% had a health insurance policy, compared to 89.1% of those who had no serious distress and did have health insurance.
  • It may be worthwhile to analyze in greater detail those who suffered serious distress and do not have health insurance. What are their demographics (age, sex, education, income level)? Where do they live? Such analysis can help us focus efforts to create or deliver (for example) new resources and support services.
  • There is power in taking time to create detailed, plain-language labels to eliminate the mystery of what is being displayed.

I love meeting, training, and collaborating with all our clients; but it is especially gratifying when those same clients give the projects we’re working on their full attention — and allow all of us time to think. That’s really the only magic required for creating great work.


Ghost Maps

A few weeks ago, I delivered the keynote address at the First Annual Information and Quality Services Center (IQSC) Educational Event, put on by the Dallas Fort Worth Healthcare Foundation (DFWHC) in Dallas, Texas. It was a great meeting, and afforded me the chance to reconnect with folks I’d collaborated with at Baylor Scott & White, and to meet new and interesting people, one of whom introduced me to the book The Ghost Map by Steven Johnson.

Hearing about this book brought me up short — how in the world had I missed it, and what the hell else had I been missing in my busyness? Clearly it was time for me to slow down, turn my head to the left and right, and take in the bigger world around me.

Anyone who has ever joined one of our trainings or read other newsletters will recall that we always talk about Dr. John Snow and his famous work plotting cholera deaths on a map, thereby helping to make the visual connection between where people were getting their water, and deaths attributable to this lethal disease.

That’s pretty much where the story ends in most descriptions of Snow’s work (including, sadly, mine) leaving out many interesting and instructive details. Fortunately, it’s these missing bits that Johnson writes about in a thoroughly engaging and compelling way.

First, he goes beyond the most immediate details of the 1854 epidemic to describe in vivid detail related subjects like the history of toilets, the upgrading of London’s sewer system, the importance of population density for a disease that travels in human excrement, and the positive as well as the negative aspects of urbanization itself.

Never before Victorian London, Johnson teaches the reader, had 2.4 million primates of any species lived together within a 30-mile perimeter. The conditions he describes are quite simply staggering to one’s imagination — this is really the stuff of a Charles Dickens novel.

Next, Johnson describes how Snow used some of the earliest Geographic Information System (GIS) methods to support his arguments.

The map below is one that many of us are familiar with, but the analysis that Snow did was far more complex than this simple rendering conveys. Snow drew Thiessen (Voronoi) polygons around the wells, defining straight-line least-distance service areas for each. That is, the polygons defined individual areas of influence around each of a set of points, and also displayed the area closest to each point relative to all other points. (I know: you thought this would be an easy holiday read. Not so much.)
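
For the GIS-curious, here is a minimal sketch of the idea behind those polygons, using made-up coordinates rather than Snow's actual data: assigning each death to its nearest pump by straight-line distance is equivalent to asking which Thiessen polygon it falls inside. Snow's refinement, described below, would swap the straight-line metric for shortest-path distance along the street network.

```python
import numpy as np
from scipy.spatial import cKDTree

# Illustrative coordinates only; not Snow's actual data.
pumps = np.array([[0.0, 0.0],    # think "Broad Street"
                  [3.0, 1.0],    # think "Carnaby Street"
                  [1.0, 4.0]])
deaths = np.array([[0.5, 0.2], [0.4, -0.3], [2.8, 1.1], [0.1, 0.9]])

# Nearest pump by straight-line distance = membership in that pump's
# Thiessen (Voronoi) polygon.
_, nearest_pump = cKDTree(pumps).query(deaths)
deaths_per_pump = np.bincount(nearest_pump, minlength=len(pumps))
print(deaths_per_pump)   # deaths falling in each pump's service area
```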

By analyzing the data in this way, Snow was able to understand that a large majority of the cholera deaths fell within the Thiessen polygon surrounding the Broad Street pump, and a large portion of the remaining deaths were on the Broad Street side of the polygon surrounding the bad-tasting Carnaby Street well.

He then re-drew the service area polygons (using nothing more than a pencil and string) to reflect the shortest routes along streets to wells, thereby revealing that an even larger proportion of the cholera deaths fell within the shortest-travel-distance area around the Broad Street pump. A modern-day version of this technique is displayed in the following version of Snow’s original map:

Source: GIS Analyses of Snow’s Map

Although I found the entire book fascinating, given the work I do, I thought that this additional detail about how Snow analyzed his map using Thiessen (Voronoi) polygons was perhaps the most interesting and instructive. We live in a world where technology allows us to see satellite images of our planet and download way-finding applications to our mobile phones in a matter of seconds.

The possibilities of how to use that data in a proactive and positive manner are pretty staggering.

In public health, for example, it is of paramount importance to obtain a continuous picture of mobility patterns and fluctuations, particularly during emergencies (such as disasters or the outbreak of a potential pandemic), in order to support decision-making, assess the impact of government measures and restrictions, and maximize the effect of interventions.

This type of data has historically been collected via manual health surveys. But with mobile phones in use worldwide, we can potentially capture and track infections and diseases in populations, along with their movements, and then display those same data and patterns using advanced GIS technology. The opportunities to see, understand, and prevent human suffering are nothing short of revolutionary.

Here’s the bottom line: if you have a bit of down time over the holidays (or anytime), I highly recommend this interesting and easy read. I feel certain it will broaden and deepen your thinking as well.


Fabulously Terrible

Years ago when I first worked at Mass General Hospital (MGH) in Boston, I had a really great boss, Dr. Dan Rosenthal. Like many smart people, he was also very funny. I especially loved his ability to use adjectives in a way that made you think he was going to tell you one thing, only to hit you with something entirely different.

Once, for example, I helped present a project that my team had been working on. In reviewing the results, Dan said, “Kathy, that was a spectacular…failure.” Ha! Very funny and very true, it had been just exactly that. Another time, I asked how he’d done in the half-marathon he’d run over the weekend. “Kathy, people are truly fantastic…liars,” he said. “At the 10-mile marker, everyone told me how great I looked – and that I shouldn’t worry: the finish line was close.” In my book, that is LOL good.

I was reminded of this last story while pulling together a recent presentation on data visualization – specifically about how data and information can be less than truthful, even downright misleading, when graphical displays are used incorrectly. A few examples follow.

A Quaker Oats ad claims that the cereal is a “Cholesterol Hunter,” and that by eating oatmeal every day for 30 days you can significantly reduce your cholesterol level. Here’s the trickery in the bar graph shown to support this claim: the scale begins at 196 and spans such a narrow range that the change from week one to week four looks much larger than it actually is. The Center for Science in the Public Interest called Quaker Oats manufacturer General Foods out for its misleading claim and illustration, making it clear that it may take as many as four oat-product servings a day to get any benefit, and adding that a cholesterol level of 196 is still considered high.

As I’ve written elsewhere, bar graphs must start at zero to prevent this sort of distortion. You can read more about why, and see alternative displays here.

There is also this example, from a Masters of Media blog post by Joram Binsbergen of the University of Amsterdam. He discusses an infographic used by the New South Wales Ministry of Health to illustrate an apparently significant increase in the number of nurses recruited into that Australian state’s healthcare system from 2008 to March 2013. On closer examination, however, the scale is completely off.

Four “people” icons on the far left of the graphic represent 43,147 nurses; on the far right, forty such figures stand for 47,500 – making a roughly 10% increase in the data look like a 900% increase in the graphic.

As Binsbergen points out in his post about this display, it is useful to consult Edward Tufte, who introduced the “Lie Factor” in his book The Visual Display of Quantitative Information: “The representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the quantities represented.”
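
As a rough worked example of Tufte's Lie Factor, take the icon counts above as the graphic's encoding of size (an assumption about how the infographic is meant to be read); a faithful graphic would score close to 1:

```latex
\text{Lie Factor}
  = \frac{\text{size of effect shown in graphic}}{\text{size of effect in data}}
  = \frac{(40 - 4)/4}{(47{,}500 - 43{,}147)/43{,}147}
  \approx \frac{9.0}{0.10}
  \approx 89
```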

My last example, from Andrew Gelman’s blog, “Statistical Modeling, Causal Inference, and Social Science” (sounds like a snooze, but it’s not – honest!), uses a misleading graphic in a story on medical fund-raising. Purporting to show how much money is raised to help fight specific diseases as compared to the mortalities attributable to those diseases, the zippy graphic is beyond distorting – it devolves into “hot mess” territory.

If for example you look at the large purple bubble on the right and match it to the key above it, heart disease appears to be the leading cause of death (596,577 deaths). I say “appears to be” because the key isn’t labeled to represent mortalities; rather, it says, “Heart Disease|Jump Rope for Heart (2013).”

Now look at the much smaller purple bubble on the left, labeled $54.3 million. If you’re puzzled, that’s a good sign: confusion signals awareness of a large, awkward problem. That left-hand bubble – it would seem – represents just one fundraiser for heart disease (“Jump Rope for Heart”), as compared to all deaths from heart disease.

A quick internet search of “heart disease” led me to this summary document, which made me dismiss the illustration in its entirety.

You may never create a graph, chart, or table, but you will, like most of us, assuredly consume, even depend on, graphic displays. As with all consideration of important and complicated information, I encourage you to slow down, take a breath, and ask yourself, “Does this make sense? Can I believe what is being presented?” And what is more important, “Can I learn something true here, or make a confident decision based on what I think I’m seeing?”

Remember, even information displayed in a pretty graph can be incredibly misleading. Or, as Dan Rosenthal would probably say, “fabulously terrible.”


The Hammock Chronicle (2017 Edition)

Greetings from Bustins Island, Maine, where every summer I mix a batch of watermelon mojitos, climb into the hammock, and reflect on the year to date, the work ahead, and the Sox’ standing in the ALE (as of 8/25, first place – 5 games ahead of the Yankees).

On the professional front, I am heartened by the fact that data is increasingly being analyzed and used to improve our health and systems of care, and by growing awareness of the best practices of data visualization and visual intelligence. There is all the same a lot of work to be done!

The good news I’m hearing from clients is they’ve improved their performance on many quality measures and are consistently meeting their goals. What remains to master is how to monitor their results using something other than a bar chart.

Though useful and easy to interpret most of the time, bar charts can make it difficult to see very small differences between multiple measures on a dashboard or report; and they can take up a lot of space better used to display additional important metrics. Clients often ask, pleadingly, if “just this once” they can ignore the best practice of starting a bar graph at zero. My inner Dana Carvey kicks in with “…wouldn’t be prudent …not gonna do it” (yes, it sounds funnier when he says it). Won’t be auditioning for the host slot on SNL anytime soon, though, even with help from those mojitos; so back to visualizations.

As you can see in this first display, the points showing actual performance for all four measures versus their respective targets appear quite closely grouped, so it’s hard to tell them apart. This hindrance in turn tempts us to break the best-practice rule of starting a bar graph at zero, and instead start it at a higher value. To remind ourselves why this is a bad idea, let’s review that rule.

Bars display and compare the size of different values; if they begin at a value greater than zero, the true size of what they are designed to measure is distorted. Starting the scale at a value > 0 and decreasing the increments across the chart or down the bars (that is, along the X or the Y axis) to magnify the values can make the differences displayed seem much larger than they actually are. (Take a look at my earlier post on this topic here. You may decide to eat less oatmeal after you do).

This illustration shows the effect of interval distortion.

Consider these options for resolving this conflict between visualization guidelines and the need for a quick fix.

Points and Lines

Using points and lines to display values like these can be a good alternative to a bar chart because it frees us from having to start the scale at zero. The points-and-lines format also allows us to display viewer-required details in far less space on a dashboard or report than a bar graph does.

Deviation Graph

One of my favorite ways to display performance versus target data on a dashboard or report is with a deviation graph, which illustrates the relative difference between two values. Though each performance metric may have a different target figure, this graph shows only the distance from target for each, so we can line them all up one after the other (see below). A deviation graph is neat, clear, and space-saving. (Two more articles on this useful visualization appear here and here.)
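
Here is a minimal matplotlib sketch of the idea, with made-up measure names, performance figures, and targets; it plots the deviation in percentage points, though dividing by each target would give a relative (percent-of-target) version instead.

```python
import matplotlib.pyplot as plt

# Hypothetical performance-vs-target figures, for illustration only.
measures = ["Measure A", "Measure B", "Measure C", "Measure D"]
actual   = [0.92, 0.88, 0.95, 0.81]
target   = [0.90, 0.90, 0.93, 0.85]

# Plot only the distance from target, so measures with different targets
# can sit side by side on one compact axis.
deviation = [(a - t) * 100 for a, t in zip(actual, target)]

fig, ax = plt.subplots(figsize=(5, 2.5))
ax.barh(measures, deviation)
ax.axvline(0, color="black", linewidth=1)   # the zero line is "at target" for every measure
ax.set_xlabel("Percentage points from target")
plt.tight_layout()
plt.show()
```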

Small Multiple Line Graphs

Depending on viewer requirements, you might also consider displaying the kind of data we’ve been discussing in small multiple line graphs. Unlike bar graphs, line graphs do not have to begin at zero, because the line isn’t comparing the sizes of two values. Instead, lines display values changing or trending over time.

If for example you need to show how your organization has performed over time compared to each measure’s unique target, you can create separate graphs for each, careful to make the scales begin and end with the same values, then arrange them in sequence. This technique makes it easy to see how the measures have changed, and compare measures both to targets, and to each other. (Two previous small multiples posts are here and here.)
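
And a minimal sketch of the small-multiples approach, again with hypothetical monthly figures; the detail that matters is that every panel begins and ends with the same y-axis values.

```python
import matplotlib.pyplot as plt

# Hypothetical monthly rates for four measures, each with its own target.
months = list(range(1, 13))
series = {
    "Measure A": ([88, 87, 89, 90, 91, 90, 92, 93, 92, 94, 93, 95], 90),
    "Measure B": ([70, 72, 71, 74, 75, 77, 76, 78, 80, 79, 81, 83], 80),
    "Measure C": ([95, 94, 96, 95, 97, 96, 96, 97, 98, 97, 98, 98], 95),
    "Measure D": ([60, 62, 65, 64, 67, 69, 70, 72, 71, 74, 76, 78], 75),
}

# One small panel per measure, all sharing the same scale so they compare at a glance.
fig, axes = plt.subplots(1, len(series), figsize=(10, 2.5), sharey=True)
for ax, (name, (values, target)) in zip(axes, series.items()):
    ax.plot(months, values)
    ax.axhline(target, linestyle="--", linewidth=1)   # this measure's own target
    ax.set_title(name, fontsize=9)
    ax.set_ylim(50, 100)                              # identical start and end values everywhere
plt.tight_layout()
plt.show()
```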

In the past decade or so, the health and healthcare industries (among others) have increased their awareness of the need to discover and embrace best practices of caring for patients and delivering services. This has also happened with data visualization.

As we become more familiar and comfortable with the science behind and research on visual intelligence and cognition, we too must embrace these best practices to ensure that our visualizations are correctly displaying the data, that our message is clear, and that people are informed, inspired, and above all moved to action to improve the healthcare we help deliver.

Speaking of caring for people, the hammock is beckoning. I need a nap before tonight’s game. Life is good.

P.S. For those who ask, “What about wide variation data?” – a different, albeit related subject – I call your attention to a not-so-recent post I wrote, here.


2017 Summer Reading List for the Health and Healthcare Data Geek

I sincerely wish I could avoid typing the following: it is July 2017, which means it is time again for my annual book recommendations, as well as a swing in the hammock with a watermelon mojito (no regrets there).

Seriously? I’m going to hire a search and rescue team to find my life and bring it back to me (the edited version, of course), because I have no idea where it has zoomed off to! I know I was just celebrating my 21st birthday a few moments ago… denial is more than a river in Egypt (or so I’m told).

Here is some of what I’ve been reading. I think you’ll enjoy these titles, too.

Truman

I know: you’re probably wondering why this book is on the list. Although Ron Chernow’s Alexander Hamilton is all the rage these days (and for good reason – I read the book and loved it; helped to pass the time waiting for the next millennium to land tickets to the Broadway show), there is a wealth of other great books about the history of our country, including a special-interest subset recounting the lives and work of champions of universal healthcare for U.S. citizens. Until I read David McCullough’s book, I thought Lyndon Johnson had been the original promoter and architect of the Medicare program: I had not fully understood the huge role that Truman played.

Additionally, Truman’s famous defeat of Dewey in the 1948 election was extraordinarily fascinating in its demonstration of how pollsters can get it so wrong (yes, history repeats itself). The significant political repercussions highlighted the pitfalls of survey samples and results, and the need for rigor in this type of work.

Willful Ignorance – The Mismeasure of Uncertainty

The author, Causalytics founder Herbert I. Weisberg, Ph.D., weaves engaging stories about important thinkers, and how the problems they worked to solve using statistical methods helped propel scientific research. But this book is more than just a historical view of these efforts: it’s also a cautionary tale about the mountains of simplified studies and statistics that result in frequent reversals of scientific findings and recommendations.

As the title suggests, the fallacy of regarding probability as the full measure of our uncertainty is contributing to an oncoming crisis. Weisberg says, about clinical research and care, that our current methodological orthodoxy plays a major role in deepening the division between scientific researchers and clinical practitioners. Prior to the Industrial Age, research and practice were better integrated. Investigation was generally more directly driven by practical problems, and conducted by those directly involved in solving them. But then as scientific research became more specialized and professionalized, the perspectives of researchers and clinicians began to diverge. In particular, their respective relationships to data and knowledge have resembled each other less and less.

If you work with statistical methods, especially probabilities, or you have to understand them well enough to explain them (and their limitations), then this is a really top-notch book that you should seriously consider taking the time to read.

The Big Book of Dashboards: Visualizing Your Data Using Real-World Business Scenarios

I state my disclaimer right up front: my HealthDataViz team and I contributed to this book. That is of course what makes this title a particularly great addition to your health|health care reference shelf (no brag; just fact)!

Each chapter follows a standard layout that I like a lot. Each begins with a summary of the big picture that the dashboard addresses, followed by the specific metrics displayed and related scenarios illustrated. “How People Use This Dashboard” is next, supported by different visuals on the dashboard as examples. A “Why This Works” section rounds out the chapter.

The authors also include a data visualization best practices summary by displaying what NOT to do – very useful! No red, yellow, green pies, donuts, bubbles, or word clouds, please!

A Civil Action

If you have never read this book, this is the summer to do so. I absolutely love it, and any time the movie based on it is on television, I drop everything and watch (just ask my husband, Bret – there is no dissuading me). A Civil Action is the true story of the quest by a somewhat idealistic young lawyer to collect damages from two corporate giants, Beatrice Foods and W.R. Grace, for allegedly polluting the water in Woburn, Massachusetts, a Boston suburb, with carcinogens. The case considers a cluster of leukemia victims in Woburn (the disease claimed the lives of at least six children), and the tremendous challenge of reconciling a preponderance of experiential or circumstantial evidence with scientific results. How do you prove causation in the courtroom? Is it possible or even correct to try and do so?

I love, love, love this book and am certain it will find its way into my summer bag yet again this year – a perfect read for a few hours in the hammock.

Coming Soon!! Tableau for Health and Healthcare, v. 3

Many of you have been asking, and the answer is a resounding YES! The HealthDataViz team has been hard at work updating our Tableau manual (a very special thanks to Janet, Ann, Marnie, Dan, Jim, Peter, and our very own GrammarLady, Anne). Data sources and examples have been updated to reflect what we’ve learned while training folks to successfully use Tableau, and there are some new tips and tricks. It’s due out in the coming weeks, so stay tuned for our announcement.

Finally, let me take a moment to thank all my faithful subscribers and our clients. You are the best and biggest reason for what we do, and we are deeply grateful for your support. We look forward to presenting engaging new ideas and fresh approaches, and collaborating on innovative projects with you.


Scope of Responsibility Changes the View

I’ve been teaching a lot of data visualization workshops lately. Inevitably, when I reach the part of the day when I ask participants how they gather requirements to build a monitoring dashboard, I get the same rote, data-analyst-centric response: “I ask my customers what questions they want answered.”

My job (or cross to bear; you decide) is to then firmly nudge them toward a new approach, one that requires them to ask instead, “What is your role and scope of responsibility? As you work in that role, what decisions must you make to achieve your goals and objectives?”

Dashboards exist to help people visually monitor – at a glance – the data and information they need to achieve one or more goals and objectives quickly and easily. This is considerably different from analyzing data to answer a specific question or to uncover potentially interesting relationships in that data.

With this construct about the purpose of a dashboard in mind, let’s consider examples of two different prototype Emergency Department (ED) dashboards designed using the same source data. We’ll ask end users to describe their role (position) in the ED, the scope of their responsibility there, and what summary information they need in deciding how to meet their goals and objectives. We’ll call this the RSD [Role, Scope, Decisions] approach.

Example A: Emergency Department Operations Manager

Here, the ED Operations Manager’s role and scope of responsibility are to ensure that patients arriving at the ED receive timely and appropriate care, and that the ED doesn’t become overloaded, thereby causing unduly long patient wait times or diversion to another facility.

Given these parameters, the chronological frame for the dashboard below is present|real time, and is focused on where and for how long actual ED patients are in the queue to receive care.

In the upper left-hand section of this dashboard is a summary ED Overload Score (70), overlaid on a scale of No to Extreme Overload. Under this summary are elements of the score: ED Triage (10 points), Seen by MD|Waiting for Specialty (10 points), Specialty Patients Waiting (20 points) and Waiting for In-Patient Bed (30 points). This summary provides a mechanism for the manager to monitor both the risk of overload and the key factors driving the score higher.
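
For readers who build these kinds of composites, here is a minimal sketch of how such a score might be assembled; the component names and point values come from the example above, while the overload bands are assumptions for illustration only.

```python
# Component contributions from the dashboard example above.
components = {
    "ED Triage": 10,
    "Seen by MD | Waiting for Specialty": 10,
    "Specialty Patients Waiting": 20,
    "Waiting for In-Patient Bed": 30,
}

overload_score = sum(components.values())   # 70 in the example

# Translate the score onto a No-to-Extreme scale (threshold values are assumed).
bands = [(25, "No"), (50, "Moderate"), (75, "Severe"), (float("inf"), "Extreme")]
band = next(label for limit, label in bands if overload_score < limit)

# Surface the components driving the score, largest first.
drivers = sorted(components.items(), key=lambda kv: kv[1], reverse=True)

print(f"ED Overload Score: {overload_score} ({band} overload)")
for name, points in drivers:
    print(f"  {name}: {points} points")
```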

Additional information on the dashboard helps the Manager analyze the patient census across census categories and see how many cubicles are currently in use versus available for examination and treatment. The middle section uses bar graphs to display average wait times, in minutes and by patient triage level, for eight (8) categories such as Arrival to MD Evaluation (compared to a hospital goal) and ED Length of Stay (LOS). The lower left-hand display projects when additional cubicles will be available (blue signals available cubicles; orange, a shortage); the lower right-hand one shows information on patient wait times by sub-specialty.

All of this dashboard’s metrics are designed to help the ED Operations Manager identify active and potential bottlenecks, and to act to meet the objectives: delivering timely and appropriate care, and avoiding ED overload.

Example B: Emergency Department Executive Director

Here, the Executive Director’s role and scope of responsibility are to ensure not only that the ED team is providing timely and appropriate care, but also that reimbursement is not forfeited because pay-for-performance (annual|contractual|third-party|value-based-purchasing) goals are missed.

In response to this role’s needs, display time frames include both Month to Date and Current and Previous YTD performance, allowing the Director to stack current performance against agreed-upon targets for metrics tied to third-party reimbursement, as well as potential opportunities for improvement.

At the top of the dashboard is a table of summary performance metrics (number of patients seen and treated; time in diversion due to ED Overload) for the current month vs. current and previous years, and change over time. The middle bar charts provide the Director with current-month and YTD performance compared to targets for the metrics often tied to third-party, value-based (pay-for-performance) reimbursement. The deviation graphs at the bottom of the dashboard provide context for monthly performance compared to targets, trended over time.

In this dashboard, summary metrics help the ED Executive Director monitor overall performance, identify areas for improvement in delivering timely care and avoiding ED Overload, and ensure that reimbursement is not lost.

Shifting from asking your customers what questions they need to answer to asking them to describe scope, role, and decisions may seem like a distinction without a difference. It isn’t. Framing inquiries this way stimulates everyone to step back and examine what is required to support a universal, shared goal: acquiring the right information – at a glance – to work toward goals and objectives, and hit those targets, quickly, confidently, and well.


Stop Hunting Unicorns and Start Building Teams

Guests in our home are often very generous with their compliments on my cooking skills. While I sincerely wish those compliments were deserved, the sad (and, okay, shocking) truth is that they are not.

I’m not a great cook: rather, I am an excellent assembler of food that other people have created. I know where to shop, and the way to put together terrific dishes, and I know how to pour a generous glass of wine (or three). These skills appear to convince people that I know how to cook.

Here’s another thing I’m great at assembling: fun, smart, wildly talented, highly collaborative, and productive professional teams. What’s my secret? I know that unicorns aren’t real.

Unfortunately many health and healthcare organizations, rather than working to assemble these types of teams, persist in hunting unicorns. They assume that one person can possess every skill required to create compelling and clear analysis and reporting.

These organizations need to stop the fairy-tale hunt, and start building data-analytics and communications teams. The idea that any one analyst or staff person will ever have every single bit of knowledge and skill in health and healthcare, technical applications, and data visualization and design required to deliver beautiful and compelling dashboards, reports, and infographics is just – well, sheer lunacy.

3 Tips for Building Data-Analytics and Reporting Teams

Tip 1: Search For Characteristics & Core Competencies

To build a great team, you need to understand what characteristics and core competencies are required to complete the work. Here’s where to begin:

  • Curiosity. When teams are curious, they question, probe, and inquire. Curiosity is a crucial impetus for uncovering interesting and important stories in our health and healthcare data. Above all else, you need a team of curious people! (Read my previous post about this here.)
  • Health & Healthcare Subject-Matter Expertise. Team members with front-line, boots-on-the-ground, clinical, operational, policy, financial, and research experience and expertise are essential for identifying the questions of interest and the decisions or needs of the stakeholders for and to whom data is being analyzed and communicated.
  • Data Analysis and Reporting. Without exception, at least one member of your team must have math, statistics, and data-analysis skills. Experience with data modeling is a plus if you can find it; at a minimum, some familiarity with the concept of modeling is very helpful. The ability to use data-analysis, reporting, and display tools and applications is also highly desirable, but another more technically trained IT team member may be able to bring this ability to the table if necessary.
  • Technical: IT & Database Expertise. Often, groups will confuse this skill area with data-analysis and reporting competence. Data and database architecture and administration require an entirely different set of skills from those needed for data analysis, so it’s important not to conflate the two. You’ll need team members who know how to extract, transform, and load (ETL) data and architect it for analysts to use. And while you may sometimes find candidates who have both skill-sets, don’t assume that the presence of one means a lock on the other.
  • Data Visualization & Visual Intelligence. Knowledge of best practices and awareness of current research is required to create clear, useful, and compelling dashboards, reports and infographics. But remember, these skills are not intuitive; they must be learned and honed over time. And although it is not necessary for every team member to become an expert in this field, each should have some awareness of it to avoid working at cross-purposes with team members employing those best practices. (That is, everyone should know better than to ask for 3D red, yellow, and green pie charts.)
  • Project Management. A project manager with deep analytic, dashboard, and report-creation experience is ideal – and like the mythical unicorn nearly impossible to find. But don’t let that discourage you. Often a team member can take on a management role in addition to other responsibilities, or someone can be hired who, even without deep analytics experience, can keep your projects on track and moving forward.

Tip 2: Be Prepared to Invest in Training and/or External Resources

  • Why? Because they don’t teach this stuff in school.

    At present, formal education at institutions of higher learning about the best practices of data visualization, and state-of-the-art visualization and reporting software applications is scarce, and competition to hire qualified data analysts is fierce. As a result, you must be prepared to invest in training the most appropriate team members in many of these new skills, and/or working with qualified external resources.

Tip 3: Have A Compass. Set a Course. Communicate It Often.

  • The primary challenge for your team is not to simply and boldly wade into the data and find something interesting. Rather, team efforts should be aligned with the organization’s goals. This means that you must establish and communicate clear direction and objectives for everyone to deliver on from Day One. Having a compass and setting a well-defined course also help keep your teams from getting caught up in working on secondary or tertiary problems that are interesting, but unlikely to have significant impact on the main goal.

I do wish that data-analysis and reporting unicorns were real! Life would be so much simpler. But they aren’t and never will be, so I let go of that fantasy long ago. You should, too.


Mental Models

Whenever I teach my “data visualization best practices” courses, I always include an introductory overview about mental models – an explanation of a person’s thought process about how something works in the real world. I do this because understanding mental models can help us construct an effective approach to solving problems and accomplishing tasks.

First, I ask course participants to think about, then describe, how they read a printed book.

The responses always include such observations as, “I look at the Table of Contents; then I turn the pages from right to left. I read the words on the pages from left to right and top to bottom. If a passage holds particular interest, I often underline it; if I come across an unfamiliar word, I sometimes look it up in a dictionary.”

Once we have gone through this exercise, I ask how they read a book on a Kindle or other electronic device. Their responses are almost identical to the first set. Turning pages and exploring the text are faster and easier on an e-reader (if less tactilely satisfying) – but they are essentially the same processes.

Next, I ask them to weigh in on how successful they believe Amazon would have been had its designers created an e-reader that required people to process a book in an entirely new way – for example, by starting on the last page, turning pages from left to right, and reading from bottom to top. How many of you, I ask, would have even considered reading on a Kindle? Not a single hand is ever raised.

This simple, familiar example makes the point: it’s really difficult, if not impossible, to get people to change the way they think about doing something – especially when that way is familiar, and works.

As a result, uncovering and understanding the mental models of the viewers of our dashboards and reports – the way they use data and information to support their work – is essential to designing and building something of value. Quite simply, before we ever sit down to our design work at a computer screen, we must endeavor to learn as fully as possible the process by which our internal and external customers use data to make decisions about the work they do.

Let’s consider a simple example: post-discharge referrals to home health care providers by a local hospital.

How might a discharge or case manager think about – what is the mental model for – determining which patients to refer for services and where to refer them? It is highly likely (and has been confirmed based on previous work analyzing one such group’s mental model) that these managers think about and want to know the answers to such questions as:

  • are all patients who could benefit from home health care services – say, patients who might be at increased risk for readmission within 30 days – receiving referrals to them?
  • which providers are geographically closest to a patient’s home?
  • how well do different agencies perform by quality-of-care measures?
  • how do patients rate different agencies on satisfaction surveys?

Using the questions gleaned from our example discharge or case manager’s mental model as a guide, we created the following three interactive dashboards to display, highlight, and clarify data in alignment with these questions.

The first dashboard filters for a particular hospital and desired date. The top section displays summary metrics that drill down by hospital service line. The map pinpoints the ZIP code locations of home health agencies with referrals, while a bar graph quantifies referrals per agency. Each Provider Name is a hyperlink to the Home Health Agency Comparison dashboard.

The second dashboard, “At Risk by DRG,” opens with a summary narrative capturing statistics on missed opportunities – that is, patients who may be at risk for readmission and for whom home health care may help reduce that risk; a visual trend line highlights these figures. The display also breaks the data out by category and drills down to a specific DRG level. To the right is a payer heat map that uses color to identify those at highest risk.

“Home Health Agency Comparison,” the third dashboard, shows – with an easy-to-use, side-by-side comparison tool – how HHAs perform on publicly reported quality metrics.

Far too often we blame ourselves when we fail to grasp how something new to us works, or can’t make any sense of the information we have been given in a dashboard or report. Most of the time, though, we are not to blame. Rather, the product designer or data analyst has failed to understand our mental model – the way we interact with or think about things in the real world. We end up looking for this:

And worse than banging our heads against the foolishness of paying for and being handed something we don’t want and won’t use is the inevitable result that we will simply revert to what we know: a book printed on paper, or an Excel spreadsheet – thereby missing the potential to do more and see better in a new and exciting way.

And wouldn’t that be a shame?

P.S. To view all three of these examples as interactive dashboards, click here.


Red, Yellow, Green… Save It For The Christmas Tree

Listen up folks… it is time for a red, yellow, green color intervention of the most serious kind. The use of red, yellow, green to indicate performance on your reports and dashboards has reached a crisis level and can no longer be ignored.

It is time for some serious professional help.

Here is your choice: go into color rehab treatment and clean up your act, or risk losing your stakeholders’ attention and – even more damaging – risk obscuring important information they require to make informed decisions.

And just to be clear – you are absolutely risking these things by overusing and incorrectly using red, yellow, green color coding in your reports and dashboards. (And besides, red, yellow, green is SO last season.)

I can read your thoughts – “but that is what people ask for – they want to emulate a stoplight – they LIKE red, yellow, green.” And I liked cheap beer until I tasted the good stuff.

Let’s consider how the use of these colors is hurting your reports and what you can do to fix it.

1. Did you know that approximately 10% of all men and 1% of all women are colorblind? Yes, it is sad, but true. So, where most of us see this:

traffic light

Our colorblind colleagues see this:

traffic light - colorblind

Which means that when you publish a report that looks like this to the majority of us:

Medical Center Results 2010 (cells color-coded red, yellow, and green in the original)
Acute Myocardial Infarction (AMI) | Q1 | Q2 | Q3 | Q4
Aspirin at Arrival | 88% | 83% | 78% | 83%
Aspirin Prescribed at Discharge | 38% | 86% | 60% | 86%
ACEI or ARB for LVSD | 40% | 70% | 53% | 83%
Adult Smoking Cessation Advice/Counseling | 80% | 80% | 80% | 80%
Beta-Blocker Prescribed at Discharge | 89% | 92% | 89% | 87%
Fibrinolytic Therapy Received Within 30 Minutes of Hosp Arrival | 98% | 98% | 98% | 97%
Primary PCI Received Within 90 Minutes of Hospital Arrival | 86% | 86% | 86% | 65%

There are about 10% of the men and 1% of women who will only see this:

Medical Center Results 2010 (the same table, as it appears when the reds and greens cannot be distinguished)
Acute Myocardial Infarction (AMI) | Q1 | Q2 | Q3 | Q4
Aspirin at Arrival | 88% | 83% | 78% | 83%
Aspirin Prescribed at Discharge | 38% | 86% | 60% | 86%
ACEI or ARB for LVSD | 40% | 70% | 53% | 83%
Adult Smoking Cessation Advice/Counseling | 80% | 80% | 80% | 80%
Beta-Blocker Prescribed at Discharge | 89% | 92% | 89% | 87%
Fibrinolytic Therapy Received Within 30 Minutes of Hosp Arrival | 98% | 98% | 98% | 97%
Primary PCI Received Within 90 Minutes of Hospital Arrival | 86% | 86% | 86% | 65%

2. Additionally, without a column that indicates what the red, yellow, and green thresholds mean (goal or benchmark data), the viewer has no way of knowing what rate causes a measure to change color. What rate will change a color in this report to green? Or yellow? Or (oh horrors!) red?

And since when is red a “bad” color? It simply means stop on a traffic light – a very good thing for managing traffic. Red can symbolize fire, passion, heat and in many countries it is actually a symbol of good luck… but I digress.

Using all that red, yellow, and green also breaks the big rule of data display design, which is:

Increase the DATA INK and decrease the Non-Data INK

The data, data, data is what it is all about – not colors, gridlines and fanciful decoration.

So what can you do without your stoplight colors to draw viewers’ attention to important data? Plenty…

You can eliminate all of the non-data ink and add data-ink to the areas of importance by:

  • Italicizing and bolding
  • Using soft hues of color to highlight data
  • Applying simple enclosures to denote data that belongs to a group needing attention.

You can do all of these things as I have below or just one or two depending on how much data you have in your table.

Medical Center Results 2010 (areas needing attention set off with bolding, soft shading, and enclosures in the original)
Acute Myocardial Infarction (AMI) | Q1 | Q2 | Q3 | Q4 | Target
Aspirin at Arrival | 88% | 83% | 78% | 83% | 80%
Aspirin Prescribed at Discharge | 38% | 86% | 60% | 86% | 80%
ACEI or ARB for LVSD | 40% | 70% | 53% | 83% | 80%
Adult Smoking Cessation Advice/Counseling | 80% | 80% | 80% | 80% | 80%
Beta-Blocker Prescribed at Discharge | 89% | 92% | 89% | 87% | 85%
Fibrinolytic Therapy Received Within 30 Minutes of Hosp Arrival | 98% | 98% | 98% | 97% | 95%
Primary PCI Received Within 90 Minutes of Hospital Arrival | 86% | 86% | 86% | 65% | 85%

This method of displaying the data is much easier on the eyes and brain – it is far less jarring and allows the viewer to focus on the information that is important.

You could also simply sort and categorize the data to show where improvement is required versus where things are going well. Consider the following example report for Q3 results:

Medical Center Results 2010
Acute Myocardial Infarction (AMI) | Q1 | Q2 | Q3 | Target
Measures Requiring Improvement:
Aspirin at Arrival | 88% | 83% | 78% | 80%
Aspirin Prescribed at Discharge | 38% | 86% | 60% | 80%
ACEI or ARB for LVSD | 40% | 70% | 53% | 80%
Measures that Meet or Exceed Target:
Adult Smoking Cessation Advice/Counseling | 80% | 80% | 80% | 80%
Beta-Blocker Prescribed at Discharge | 89% | 92% | 89% | 85%
Fibrinolytic Therapy Received Within 30 Minutes of Hosp Arrival | 98% | 98% | 98% | 95%
Primary PCI Received Within 90 Minutes of Hospital Arrival | 86% | 86% | 86% | 85%

By arranging the report in this way, I have eliminated the viewer’s need to hunt, peck, and synthesize to find the measures that require improvement. They are at the top of the report, clearly and simply displayed.
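
For anyone producing this kind of report from data, here is a small sketch of the sort-and-categorize step, using the Q3 figures and targets from the table above.

```python
import pandas as pd

# Q3 results and targets from the example report above.
df = pd.DataFrame({
    "Measure": [
        "Aspirin at Arrival",
        "Aspirin Prescribed at Discharge",
        "ACEI or ARB for LVSD",
        "Adult Smoking Cessation Advice/Counseling",
        "Beta-Blocker Prescribed at Discharge",
        "Fibrinolytic Therapy Received Within 30 Minutes of Hosp Arrival",
        "Primary PCI Received Within 90 Minutes of Hospital Arrival",
    ],
    "Q3":     [78, 60, 53, 80, 89, 98, 86],
    "Target": [80, 80, 80, 80, 85, 95, 85],
})

# Partition on the latest quarter vs. target so the measures needing attention
# sit at the top of the report.
df["Meets Target"] = df["Q3"] >= df["Target"]

print("Measures Requiring Improvement:")
print(df[~df["Meets Target"]][["Measure", "Q3", "Target"]].to_string(index=False))
print("\nMeasures that Meet or Exceed Target:")
print(df[df["Meets Target"]][["Measure", "Q3", "Target"]].to_string(index=False))
```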

Now go back and take a look at the red, yellow, green table – check your pulse and note if your jaw is clenched. Look at the newly designed data tables – I bet you feel calmer already.

And if you were wondering how colorblind people manage to drive, it is because of the order of the lights: they know that red is first, then yellow, then green. If the lights are arranged horizontally, though, all bets may be off and you should proceed with caution… lots and lots of caution…


Twitter Me This

Time for a confession: I’ve been a Twitter skeptic from day one.

Even though I understand how it works (140-character electronic updates – “Tweets” – that people post for their followers – friends, family, political junkies – and that fill the gaps between other types of communications, such as e-mail and blog postings), I’ve still wondered, “Why would I want to do that?”

It’s only after experiencing Twitter over time that I’ve come to understand its value. And these real-world experiences have made me care about Twitter in a way that neutral facts or statistics never could. 140 characters cleverly arranged are much more than friendly updates. In some cases, they have enormous influence – good, bad, and occasionally ugly (you know it’s true). Tweets can be powerful.

In reflecting on my skepticism about Twitter, I also realized that I had been a bit of a hypocrite (a Twittercrite?): almost daily, I use display devices such as Sparklines (to name only one) that condense lots of data into one concise display – a sort of “Twitter for data visualizations.”

And as happens with Twitter, once I began using them regularly, it became clear that, deployed in a clever and correct way, this “condensing and concentrating” type of display tool could empower me to deliver far more information on my dashboards and reports than could other methods.

Edward Tufte coined the term “Sparkline” in his book Beautiful Evidence: “These little data lines, because of their active quality over time, are named sparklines – small, high-resolution graphics usually embedded in a full context of words, numbers, images. Sparklines are datawords: data-intense, design-simple, word-sized graphics” (47).

Typically displayed without axes or coordinates, Sparklines present the trends and variations associated with frequent “sparks” of data in a simple, compact way. They can be small enough to insert into a line of text, or several Sparklines may be grouped as elements of a Small-Multiple chart. Here are a few examples.

Example 1: Patient Vital Signs

Here, 24-hour Patient Vital Signs (blood pressure, heart-rate, etc.) are displayed in the blue Sparkline, along with the normal range of values, displayed in the shaded bar behind them. To the right of the Sparkline is a simple table that shows the median, minimum, and maximum values recorded in the same 24-hour time-frame.
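
Here is a minimal matplotlib sketch of that kind of sparkline, with hypothetical hourly heart-rate readings and an assumed normal range of 60 to 100.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical hourly heart-rate readings over 24 hours (illustrative only).
rng = np.random.default_rng(0)
heart_rate = 75 + 8 * np.sin(np.linspace(0, 2 * np.pi, 24)) + rng.normal(0, 2, 24)
normal_low, normal_high = 60, 100            # assumed normal range

fig, ax = plt.subplots(figsize=(2.5, 0.5))   # word-sized, sparkline-scale
ax.axhspan(normal_low, normal_high, alpha=0.2)   # shaded normal range behind the line
ax.plot(heart_rate, linewidth=1)
ax.axis("off")                               # sparklines drop axes and coordinates
plt.tight_layout()
plt.show()

# The companion table values for the same 24-hour window.
print(f"median {np.median(heart_rate):.0f}   min {heart_rate.min():.0f}   max {heart_rate.max():.0f}")
```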

This basic display delivers a lot of valuable information to care-givers monitoring patients, making it clear that during the same period around the middle of each day, all of the patients’ vital signs fall outside normal ranges.

Example 2: Deviation from Clinic Budget

In this second example, we used a deviation Sparkline to show whether use of available surgical-center hours at three different locations is above or below budget. We added two colors to the Sparkline to make clear the difference between the two values (blue for “above”; orange for “below”) within a rolling 12-month time-frame.
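
A minimal sketch of the two-color deviation idea, drawn here as small deviation bars with made-up monthly figures; blue marks months above budget and orange months below, echoing the colors described above.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical deviation from budgeted surgical-center hours, rolling 12 months.
rng = np.random.default_rng(1)
months = np.arange(12)
deviation = rng.normal(0, 15, 12)            # hours above (+) or below (-) budget

fig, ax = plt.subplots(figsize=(2.5, 0.6))   # sparkline-scale
ax.bar(months, np.where(deviation >= 0, deviation, 0), color="steelblue")   # above budget
ax.bar(months, np.where(deviation < 0, deviation, 0), color="darkorange")   # below budget
ax.axhline(0, linewidth=0.5, color="gray")
ax.axis("off")
plt.tight_layout()
plt.show()
```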

Example 3: Deviation From Hospital Budget

Here we created a deviation Sparkline to show the departure from hospital budget numbers across several metrics (“Average Daily Census” and “Outpatient Visits,” for two examples), but instead of using brighter colors to indicate where performance falls, as we did in Example #2, we have chosen a pale gray shade to indicate when actual daily performance drops below projected targets.

Please note as well that in each of these three examples, we have embedded the Sparklines into the display and provided context through the use of words, numbers, and icons. We do this because most of the time Sparklines cannot stand on their own; rather, they require some additional framework to convey information and signal value to the viewer.

Finally: although I have been a Twitter skeptic, HDV does have a Twitter account at @vizhealth and Tweets occasionally about things that interest us, or what the company is up to. Take a look!
