Visualizing Data Fluency

The real question that’s plagued the world has never been “Should I visualize the data?” It’s always been “How should I visualize the data?” Because it’s not that people don’t want to see things visually; it’s a matter of how “data fluent” they are and how creative you can be in displaying the data.

For the sake of this post (which may never be completed) I’m going to use a set of data that represents Healthcare Quality Measures. I will gradually unveil deeper and deeper dives into the data (Analytics and Data Discovery), and as I go forward hopefully you will realize that one’s “data fluency” is going to limit what you can do, while at the same time driving you to increase your own and encouraging others to do the same.

Let’s start really simply and imagine that we want to show Ed Zecutive the # of measures that we are dealing with in our dashboard.

Wow, that’s a big number of records, and it’s a completely accurate account of the volume of data we are dealing with. Two things to consider: Ed is a busy guy. Does he have time to read that many digits, and does he really care about that kind of precision? If Ed can interpret the following KPI (with M for million) instead … wouldn’t this allow him to consume the value faster? (See the sketch below for one way to build it.)
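For the Qlik-curious, here is a minimal sketch of the kind of KPI measure expression that does that abbreviation. The field name Records is a hypothetical stand-in for whatever the fact table actually calls it:

    // Hypothetical KPI expression: renders roughly 62,500,000 as "62.5M"
    Num(Sum(Records) / 1000000, '#,##0.0') & 'M'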

Ed also asks to see “our compliance.” Which of the following do you think answers Ed’s question? Do you think he really means he wants to see both? Does the color indicate anything to you? What does it imply?

Now we have to ask ourselves whether we need to waste that much screen real estate to show Ed those 2 numbers. But if I show him only 1 or the other, which is most important? If he is “data fluent” enough, we can combine them in a single KPI object that shows both, in a way that puts the % front and center (always most important) while the raw number is there in case he wants it. All in a single compact KPI that saves space.

That may well be all that Ed Zecutive cares to see. But what about Candi Stryper? Won’t she want to drill in and see the number for the most recent compliance versus last year’s compliance? Of course she will. If she is used to “filters” we can do something like the following, where we let her simply click “Yes” or “No” to indicate what she wants to see.

But perhaps she’s afraid to “filter” anything for fear she’ll do something wrong. Or what if we simply want to save her time, as opposed to just doing the minimal work necessary to give her what she asked for? Then we can calculate and show her both values at once, something like the set analysis sketch below.
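In Qlik Sense terms, a minimal sketch of those two side-by-side measures might use set analysis. Year, Compliant and MeasureID are assumed field names, not necessarily what the real application uses:

    // Current year compliance rate (assumed field names)
    Sum({<Year = {$(=Max(Year))}>} Compliant)
      / Count({<Year = {$(=Max(Year))}>} MeasureID)

    // Prior year compliance rate
    Sum({<Year = {$(=Max(Year)-1)}>} Compliant)
      / Count({<Year = {$(=Max(Year)-1)}>} MeasureID)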

Ok, you are screaming “Well duh, if you could show both numbers don’t give her a filter in the first place, just show her both numbers.” To which my response is “It’s never about 2 numbers instead of 1, or 1 simple filter.” Her very next question is going to be “Can you show me those numbers for each system?” Of course we can. Now all Candi has to do is process all of those cells and do some mental gymnastics.

Here is where the rubber starts meeting the road. If we can increase Candi’s data fluency so that she can handle something like this, we can present what I’m about to show instead of the table above.

Each cell represents a different system and contains SO much more information. Clearly the green immediately lets her know that all 7 named systems have improved year over year. The one that hasn’t … is “unknown,” meaning the data doesn’t identify the system name for those quality records.

What is the next thing you notice about each cell of this KPI?

The sizes of the triangles of course. I didn’t have to explain that to you, and she didn’t need me to explain it to her either. Numbers tend to blend together size-wise; there isn’t much visual difference between 0.01 and 0.09, but by golly if the arrow is 9 times the size, wow, that makes a difference in her ability to immediately focus on big/small. Don’t believe me? Scroll back to the table version and honestly consider whether your eyes are immediately drawn to systems 2 and 5.

On any given day she may want more meat on the visual bones. The lower right corner shows the % of compliance for the current year. Above that is the change year over year. In the lower left corner that difference is expressed as a percentage change: System 2 has improved 16% year over year, whereas System 5 is virtually unchanged (when rounded) but did in fact improve slightly. (A sketch of that calculation follows.)
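For the curious, one way to sketch that lower-left percentage, reusing the assumed field names from the earlier sketch:

    // Year over year change, expressed as a % of the prior year (assumed fields)
    (Sum({<Year = {$(=Max(Year))}>} Compliant)
       - Sum({<Year = {$(=Max(Year)-1)}>} Compliant))
      / Sum({<Year = {$(=Max(Year)-1)}>} Compliant)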

But if Candi isn’t ready to handle this kind of a chart we are wasting our time using it. The fact that we can produce it, doesn’t mean that end users like Candi can consume it.

…. Updated 8/31/2017

Introducing the Tree Map

The great news is that I heard from Candi and she completely got it. When I told her that she could actually click on any of the KPIs to “drill down” into the Practices for that system she was ecstatic as well. She said she had never dreamed of having that much information at her fingertips. But she called back a short time later and said that her boss wanted to know if they could see Systems and Practice groups together instead of having to drill into each. Apparently they had seen a Pivot Table at some point and liked that kind of flexibility. Of course we could show a pivot table but, again, that isn’t very visually appealing and it forces you to do lots of mental gymnastics.

I told her that viewing 2 dimensions at the same time is absolutely presentable in what is called a Tree Map. Each System can be shown with blocks inside of it for each practice. (Any 2 dimensions.) The blocks are sized by the value represented. “What about year over year?” she asked. “We really love being able to see that.” Hehe, I knew where she was going and told her that we can color the blocks based on whether or not each practice within each system had improved year over year, along the lines of the sketch below.
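A minimal sketch of that color logic as a Qlik color expression, again with assumed field names; the real application may well do something fancier:

    // Green if the practice improved year over year, red otherwise (assumed fields)
    If(Sum({<Year = {$(=Max(Year))}>} Compliant)
         >= Sum({<Year = {$(=Max(Year)-1)}>} Compliant),
       Green(), Red())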

Those of you who are familiar with this type of thing are probably waving your finger at me saying “But Qlik Dork, you can’t use red and green because people are color blind; you should have used a different color scheme.” I can use any colors I want, we all can, but rather than focusing so much on color blindness, the point of this article is to focus on moving people along the Data Consumption Continuum and helping them become more Data Fluent. For those used to Red and Green Scorecards that are manually produced in Excel, a shift to another color scheme will slow down their ability to immediately recognize “good versus bad.” The fact is that the red/green jumped out at you before you ever looked at the numbers themselves. You immediately knew which was getting worse and which was getting better.

You will notice that the blue/grey version contains more practices than the red/green version. That’s because the world isn’t always as simple as what we want to display in a given amount of space, and this is no exception. You see, the systems that Candi and her boss work with have lots and lots of practice groups within each system. I mean lots. The above were simple captures of just a small region of the actual tree map in the application I’m building for this post, which is below.

So why didn’t I just start with the full screen shot?

Great question. I’m not sure that you could have consumed it. Because as you will see, many of the cells don’t show the practice names and values. I figured that might freak you out because you aren’t ready for the next step in the journey to data fluency.

The entire chart is meant to take you from what you are familiar with. You might tell me that you get every name and every value in a pivot table. To which my comeback will be “Liar, liar pants on fire.” In fact you can’t see them all at the same time; you have to scroll through. You have to take action to move around.

What smart responsive objects like those in Qlik Sense do is present to you the high level overview and as much detail as the real estate will allow. You can see the systems sorted by the highest ranking of compliance. Then you can see how many practices each has and get a really quick understanding of how many practices within each system have improved year over year.

What I’m going to suggest is that you can actually obtain more information without the numbers than you can with them. Let me give you some examples:

Notice System 7 does not have the best compliance, but every single practice has improved year over year.

System 5 is kind of mid range of the systems in terms of its overall compliance, but the vast majority of the practices within the system did worse this year than last. Perhaps the volumes of those that did better pulled it up.

Now that you are seeing things that wouldn’t jump out at you otherwise, let’s keep going. Out of the top 24 practices in terms of compliance at System 2, only 5 of them performed worse this year than last.

Stark comparisons and insights that we can obtain regarding System 2, System 5 and System 7 that we would very likely miss entirely if we simply saw numbers in a pivot table. None of which really involved knowing or caring about the values or the practice names.

However, I do need to ensure that people are ready to deal with this concept. They have to be “data fluent” enough to understand that the system isn’t going to simply display .00002 font sizes that wouldn’t be readable anyway. They have to understand that they can zoom to see more data. They have to be willing to want to gain the insights that numbers alone don’t show. Because if they are focused on numbers they might be missing the much bigger picture.

…. Updated 9/1/2017

How do I know what to target?

That’s the call I received this morning from Esta Mate. A Junior Business Intelligence Analyst assigned to the Quality Improvement team. Apparently Esta sorted a list of the practices with the worst percentages of compliance and started working with the lowest group. They laughed at her when she told them what she was there for and thus the call to me.

What Esta had missed in looking at the list was that when you are trying to move the dial of quality for a large system you have to account for not only the percentages, but the number of items. If a practice has a compliance of only 12%, but they only have 100 quality measures all together, out of our 63 million total, getting them to 100% isn’t really going to move the dial. Counterintuitively, we might actually want to focus on the group doing the best already if they have more overall records.

But how do you relate 2 different measures to someone so that they can consume both numbers at 1 time? A great way to do that is a Scatter Plot. It means that Esta will have to learn how to consume both measures at 1 time, though, and make inferences.

Being the Data Fluent person you are can you read the scatter plot below and determine which System we should focus our efforts on?

Oftentimes we frame the data around the thing we are already thinking about. Our goal is “Compliance,” so we measure compliance.

What if, in order to improve, we shifted our focus to “non-compliance” and plotted that in a scatter plot instead? Does this help at all in determining which system we should focus on initially?

Poor Esta is pretty new. She sees the benefit of a Scatter Plot, and we ended up talking about its usage for other types of measures, but she still wasn’t confident using it as a way to explain to her boss why she wanted to focus on System 6. BTW, is that the one you chose?

I then made another suggestion to Esta. Since she understood the concept of focusing on the non-compliant measures instead of the compliant measures, perhaps we could avoid the complications of two measures by looking at the overall % of non-compliant measures, instead of making her guess the relationships to the whole. Meaning: look at all of the measures that were non-compliant, then figure out who had what % of them.

The following very simple bar chart shows that System 6 has over 30% of the overall system’s non-compliant measures, while the next closest were System 1 and System 3. (A sketch of the measure behind it follows.)
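A sketch of that share-of-the-whole measure using Qlik’s TOTAL qualifier; Compliant and MeasureID remain assumed field names:

    // Each system's share of all non-compliant measures (assumed fields)
    Count({<Compliant = {0}>} MeasureID)
      / Count({<Compliant = {0}>} TOTAL MeasureID)

With System as the chart dimension, the TOTAL qualifier ignores the dimension, so each bar divides that system’s non-compliant count by the grand total.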

But a system is just made up of practice groups. So I built the bar chart so that she could drill into each system. When she drilled into System 6 she could clearly see that Practice 399 was the practice within the system that needed her focus.

My eyes got a little salty when Esta asked if she could have the same kind of bar chart without having to drill into the System first, in case there were practices that had a lot of non-compliance but whose overall system wasn’t near the top of the list. You better believe I gave her that immediately. How about that for a question? Esta won’t be a Junior BI Analyst for long, and her data fluency is growing as well.

…. Updated 9/8/2017

Quit taunting me!!!

I’ve enjoyed the flexibility that writing and updating has given me as I’ve played with this data set, while also tackling what I believe to be a real issue that impacts User Adoption. I realize that many of you are probably getting tired of my updating an existing blog every couple of days so I’ll end your suspense now.

You are welcome to play with the application I actually built to see one potential real world approach to visualizing 62.5 million quality records for 2.76 million patients, covering 8 health systems, with 685 Practice Groups employing 5 thousand physicians. Simply click this link and you can start playing with the application on our demo site.

Your turn

If you would like to try your hand at Visualizing this data set just email me at Dalton.Ruer@Qlik.Com and I will send you a link to the QVF used for the demo application and the data set (CSV files) you can download.

I would love to read your thoughts, experiences and approaches to increasing Data Fluency within your organizations and how you’ve seen it impact the User Adoption of your applications.


Visualizing Population Health from a Community Perspective

In my previous post I asked you to consider Population Health from a Global Perspective. I understand completely how hard that is to do. The world can be a scary enough place without having to imagine how we can improve “health” care all around the world.

In this post I’m going to bring it down a notch and talk about Visualizing Population Health from a Community Perspective.

My first “Community” Health Impression

A few years ago the health system I worked with could not negotiate a contract with one of the Big 5 insurers. It affected me from a business standpoint, as we stood to lose substantial income, but my wife’s insurance was with that particular insurer as well. What our CEO did has had a profound impact on me ever since.

She wrote a letter to every single patient with that insurance and sent it to every single employee as well. The point of her letter was basically … “The mission of our health system is to serve the health needs of this community. While our negotiations have failed with _______ we want you to know that regardless of the fact that they no longer consider us to be “in your network” we care about your health more than the finances, want to continue our relationship with you, and will continue to accept the lower payments from your insurer so that you aren’t impacted, because you are still our community.”

Of course as CEO her letter didn’t have a really long run-on sentence like mine, but that was the gist of it. She absolutely wanted the community to know that they were our mission. AND she wanted every employee to remember that as well. Mission statements on walls are nice. But when they get lived out they can be life changing. Make no mistake, there was a financial impact that was felt, but the warmth of our hearts drew us together as a system more than ever and our vision of “community” became solidified then and there.

FQHCs

Growing up I was pretty poor. The concept of fancy-schmancy physician offices was totally unknown to me. We went to what we called “Free Clinics.” Free was what we could afford, so they were our family’s version of a Primary Care Provider.

Today they are called Federally Qualified Health Centers. They still operate on shoestring budgets, primarily from grants, and are still lacking the marble rotundas of premier physician practices. I suppose many consider them to be the safety nets for communities like the one I lived in growing up.

However, that’s not the point of this blog. If you look at their logo you will see something pretty amazing … they call out the Q as “Quality” rather than Qualified. Why is that important to me? Because they are the poster child for Population Health in my book. While the large physician groups and large health systems are still trying to understand the concept of quality initiatives and “health” not “sickness” (things like MACRA, MIPS, MAPS, MUMPS, BUMPS and BRUISES; some liberty in names has been taken), FQHCs have been focusing on those things for a long time.

The lesson to learn is in the WHY?

As I’ve shared in my previous posts Population Health is about keeping people Healthy rather than treating sickness. If you get to bill for every sickness your business model is naturally going to focus on charging for sickness. Many systems don’t even have cost accounting systems to know what their costs are for their procedures. If you are going into the red you raise the prices to compensate. Cutting costs? If you don’t know what the costs are you can’t possibly figure out where the variances are that are putting you in the red.

But if you are on a tight budget. I mean one that you can’t change. Guess what? Your operating model is to keep people from getting sick so that they don’t use up your scarce resources.

Community Health Center of Southeast Kansas

I recently met a coffee bean named Janae Sharp who is passionate about putting other coffee beans together. (If you are wondering what I mean by coffee beans, be sure to read my first post.) She even has a site, the Healthcare Scene Blog, that is like a super blog for healthcare. I will be sharing more about Janae and her work in my next post:

Visualizing Population Health from a Personal Perspective

For the sake of this post I bring her up because within no time of meeting she introduced me to another coffee bean, Karlea Trautman, a consultant who works with lots of FQHCs, one of which is CHC-SEK. Her passion for these organizations is like my passion for data … way out there. So as you can imagine we had lots to talk about and all of it was enjoyable.

When she mentioned that data regarding the quality of the nation’s health centers was available online … boom … my attention was really piqued. The Health Resources & Services Administration maintains a phenomenal nationwide database that is freely downloadable.

As the Qlik Dork I figured I owed it to you to do some research on this data, and well I flat out enjoy playing with new 0’s and 1’s. So before going to bed that night I downloaded all 4 years of nationwide data and ingested it into Qlik Sense and started playing. I had no idea what to look for so my natural first step was to look at the center that we had spent so much time talking about.

What I saw was pretty amazing. Continuous improvement across the board on clinical indicators. The things that keep people in the COMMUNITY HEALTHY. What Karlea and CHC-SEK are doing is making an impact on the community. Their numbers are not only high for the State of Kansas, their numbers are high in contrast to the entire country.

Oh shoot. I just realized that simply seeing their numbers doesn’t help you understand how they fit into the “Community.” So how about this instead. The dots represent the CHC-SEK sites. The census tracts are color coded by per capita income. The darker the color, the higher the average income; the lighter the color, the lower the income. All of the areas these clinics are in are just above or below the poverty line.

Of course we might want to zoom in to see even more details. The dot pictured is their main clinic in Pittsburg.

Here is a thought. A crazy one perhaps. But Karlea agreed so if I’m way off base, at least I’m not alone … what if instead of thinking of these clinics as the “safety nets” for the poor … we flipped the script and started applauding the true “health” care that they are practicing and the mega systems turned to them for advice?

What if as the rest of the world struggles to grasp for answers to what “Population Health” is around the world and what Social Determinants of Health are … these FQHC’s have had the answers for years?

How can I wrap this section up in a really cool way? What if I told you that CHC-SEK recently appointed a Director of Population Health, Mallory Roberson, to help them continue moving forward? How awesome is that?

Ladies and Gentlemen we are now arriving in New Ulm, Minnesota

If you aren’t sure where New Ulm, Minnesota is, don’t worry. I had no idea either and I play with maps almost daily. So I figured before talking about them I would give you a visual of where they were located as a point of reference.

Kind of out of the way, right? Which is perfect for my story. A completely ordinary community that has been radically changed as a few coffee beans decided to focus on Population Health in their Community.

How did they get restaurants to offer better foods?

How did they encourage people to think about their health instead of waiting to deal with their sickness?

I couldn’t begin to do it justice. So take a few minutes and enjoy this incredible video of how a community you’ve never heard of, but will never forget, has been radically transformed by Population Health.

Entertainment or Inspiration?

That’s the question only you can answer.


Update: 8/29/2017 – Felt like I needed to update this post after the finalized 2016 data was released on the UDS site. CHC-SEK once again managed to improve their numbers and I’m happy to report that they were also the proud winner of a National Quality Award. This link is for the award site for Kansas, but all you have to do is adjust the last two characters to see your state: https://bphc.hrsa.gov/programopportunities/qualityimprovement/awards.aspx?state=KS

I also reached out to Jason Wesco, the Executive Vice President for the Community Health Center of Southeast Kansas, and he was willing to be a guest for my second Dork Cast. He shared his secret with me about how he “transforms organizations thru the use of data.” He said the trick is “consistently being curious.” Be sure to pay attention to the part in the video where you see the coal miners. Jason shares from the heart why community health centers are so vital to this country. So thankful for his willingness to give his time so generously. Click here to watch our interview.



Visualizing Population Health from a Global Perspective

The Problem(s) with Population Health

I’ve got a nickel that says you are probably undertaking a Population Health Initiative. Everyone else in the world is, so it’s safe to say you are as well. As the Qlik Healthcare team travels the globe we’ve seen 3 common factors that impede most folks.

  1. Data is Everywhere – There are so many publicly available data sources to pull from. The problem is that there are too many 0’s and 1’s and it’s hard to deliver a comprehensive view.
  2. If you can get that, the next issue quickly becomes how you utilize that comprehensive view to find the members/patients that are most at risk.
  3. Finally, how can you track the efficacy of the numerous community-level programs to find out if your time and resources are having the greatest impact?

The Basics of Population Health

I guess I should slow my roll a bit … as I may not get a nickel from you, since you very well may never have even heard the term “Population Health.” Like with many things, it’s the kind of phrase that is easier to define by describing what it’s not. It’s the opposite of the sickness care that is now practiced in many areas of the world. You know what I mean. You have a medical problem, so you go and have that problem treated, then you go back to your unhealthy lifestyle.

At the core of Population Health is the notion that it’s far better to practice the things that will keep you healthy instead of treating the consequences of poor choices. For a diabetic it might mean keeping regular appointments and monitoring blood sugar. For pre-diabetics it might mean, this is crazy, changing their diets before having to deal with the severe consequences, in terms of sickness and finances, of becoming diabetic. It means getting checkups for colon cancer, breast screenings and so on.

In an ideal world every single person on the face of the earth would eat healthy, exercise, have no stress and live a 2 minute walk from a physician. They would be a lot healthier and it would cost a whole lot less to treat them.

Unfortunately Qlik Dork’s universe doesn’t exist. The universe we live in has its complications. Other stuff comes up and “health” isn’t a priority, or there are factors that prevent people from being as healthy as they would like to be … hence the need for Population Health.


Social Determinants of Health

A recent study revealed that up to 90% of patient risk is directly attributed to social determinants of health, individual behavioral patterns and genetics.

But what in the world does “Social Determinants of Health” (SDOH) mean? I would like to suggest the most primary SDOH would have to be the very word “Health.”

I would suggest at its core you think of the simple fact that the very term “Health Care” is totally incorrect. It’s Sickness Care. We don’t have “Health Insurance,” we have Sickness Insurance. We don’t have “Electronic Health Records,” we have Electronic Sickness Records.

If we can’t distinguish between those things we probably aren’t going to truly be “healthy.” Once we understand the difference between Health and Sickness, then we can look at other Social Determinants of Health: the things that keep people from being healthy.

In my previous post, Visualizing Population Health, I introduced the absolute basic social determinant of health outside of vocabulary … basic drinking water. I wrote that post to help everyone see immediately that there are conditions regarding the environment a person lives in that can dictate their health. And that correcting SDOH can be far less expensive than treating the sicknesses that result.

Felt compelled to go there before you mocked me for the SDOH that is primary in my life. The total lack of sidewalks. You see, walking and listening to music is a huge de-stressor for me, and as you know walking is just flat out good for the human body. But living off a rural highway with no sidewalks puts my life at risk when I walk. Jumped right from basic drinking water to a middle aged, middle income issue. That’s the point of SDOH: they affect everyone … including me and you.

When I visited the Qlik office in Sweden I was shocked and impressed to see that their sidewalks were more like roads themselves. They had lanes just for biking and lanes just for walking. EVERYWHERE I went. They believe in walking and riding bikes, and it shows. The lack of belief in walking and riding bikes surrounds me nearly everywhere I go in the United States. Why? Because so many still believe that sickness care is valuable, but health care isn’t. The good news is that the County Health Rankings documents a score for Walkability, so should I decide to move I will certainly choose an area of the country that demonstrates its belief in caring for one’s health before becoming sick. Of course they also document so many more.

They aren’t the only publicly available data source for SDOH; another popular one is the CDC’s Social Vulnerability Indicators. These sources and others like them provide data that lets you begin understanding “how” your patients/members are living, not just where they live. They can also help you see the areas where your patients/members may be at the most risk.

The list goes on and on and on

Another SDOH is absolutely air quality. Living in Hotlanta I deal with allergies about 75% of the year due to 8 gajillion varieties of pollen. Nuisance headaches, tired feelings … totally beyond my control, totally environmental. What about when it gets worse … when it affects people with asthma? My mother grew up in a coal mining town and is now dealing with the last stages of COPD. Totally beyond her control.

What about hidden and very dangerous SDOH? Lead paint. Lead pipes. Asbestos. Carbon monoxide. Radon.

What about crime? Would you consider that a social determinant of health?

What about the lack of a car in a family? Is that something you would consider a social determinant of health? Hard to pick up that prescription that was handed to you if you are in a taxi and it will cost $50+ to sit and wait 20 minutes at the pharmacy.

What about the inability to speak the native language? My wife and I have had occasion to visit pharmacies while traveling the world. What if they don’t understand us, or we don’t understand them and take the wrong thing?

What about a basic misunderstanding or complete lack of education on things as fundamental as hand hygiene?

How about fundamental health issues as they change over time? I’ve recently been doing data science research on comorbidities and discovered that Vitamin D deficiency is linked to a number of serious problems. (I’m not a clinician so I had no idea. But data science doesn’t lie so I know now.) Fortunately my wife has a phenomenal PCP who got her onto Vitamin D supplements awhile back. But what about those women in society who haven’t been told about that, or a myriad of other issues, and instead face the consequences of those illnesses?

The lists go on and on of things that in most cases are way beyond anyone’s choices, things they are forced to live with which cause negative impacts to their health. Regardless of where on the socioeconomic scale you are, SDOH affect us all. Seriously, with a lack of education, lack of communication, lack of interoperability between practices and the abundance of commercials promoting the wrong things … what do you really expect our “health” to be?

I would love to be able to purchase true HEALTH INSURANCE but unfortunately I can’t. My sickness insurance kicks in only once I’ve become sick.

But wait … there’s hope

There are many, many people out there working diligently to bring SDOH to the forefront. I follow #SDOHImpact driven by Mandi Bishop on Twitter, who daily drives the issues home in a way that anyone can understand. If your appetite has been whetted to dig in, jump on board.

Physicians like Dr. David Nash and Dr. Fred Goldstein are pouring their life’s work into helping others realize that it’s not only cheaper but better to look at the “HEALTH” of a population as something that should occur prior to issues arising and treating their sickness.

Lots and lots of resources. Lots and lots of data available. Which sort of brings it back to my opening … the problem is how can I possibly consume it all in a way that actually provides insights? Lists of data in Excel don’t translate well to people’s minds.

I’m about to do something totally crazy, kind of like what you would expect of the Qlik Dork, I’m going to suggest that the answer to consuming all of this Population Health and all of this SDOH data lies within a 7 Layer Chocolate Cake.

The 7 Layer Chocolate Cake

If you will indulge me for a moment I’d like to suggest that the answer lies within a 7 Layer Chocolate Cake. It’s a chocoholic’s dream because instead of just 1 chocolate taste, it lights up every single one of the gajillions of taste buds in their mouth. I think the same is true for consuming data. If I can layer multiple types of data together then it lights up every single one of the gajillions of triggers in our brains.

Like a chocolate pastry chef in the world’s finest pâtisserie, Qlik GeoAnalytics can create as many layers as the taste buds in your brain can consume. You want to visualize your members/patients? Add them. You have a taste for Social Determinants of Health? Pour some in. Need a nice ganache on the top? Carefully layer on a live source of air quality data.

Getting hungry yet? Me too so I better move on.

You see knowing that zip code 12345 has terrible living conditions is 1 thing, but seeing your patients/members living in that zip code brings the data to life.

Seeing your patient/member in an area that is 32 minutes from the nearest pharmacy tells a different story than just plotting them on the screen.

Once the data starts being visualized together, in unison, you begin seeing HOW your patients/members are living and not just WHERE they are living. You begin seeing and understanding their HEALTH RISKS prior to dealing with their SICKNESS.


In my, perhaps geospatially jaded, mind I believe that the single best way to really drive home the points of Population Health and of Social Determinants of Health is to paint a vivid layered portrait of the data on a map.

Population Health is a Global issue

Working on the Qlik Healthcare team is an amazing opportunity for a coffee bean like me because I get to see “health” care from a global perspective. I get to consume data from sources around the world and you know I love consuming data.

And boy did I put some data on the barbie working with this Australian Census data. Gotta tell you, the Aussies are serious about their Census SDOH data. To the tune of 8,492 COLUMNS of data. Capitalized the word COLUMNS so that you would know it wasn’t a typo of the word rows. Seriously, 8,492 different columns for the Qlik Dork to consume. My style is to be comical, but don’t let that fool you … Visualizing Population Health from a Global Perspective is not a game to me.

I began by suggesting that one of the serious issues is that there is too much data. How would you possibly go about visualizing 8,492 columns of data? This just might be the best 8 minutes you ever spend. Enjoy!

If you are intimidated by Population Health from a Global Perspective I can certainly understand that. It’s hard enough trying to deal with the pressures of your own organization without trying to solve the entire world’s problems. Never fear, my next 2 posts will be:

Visualizing Population Health from a Community Perspective

Visualizing Population Health from a Personal Perspective

PS – Yes I’m aware that my image was a 24 Layer Chocolate Cake. Kind of proves the point that the Qlik Dork loves his chocolate and always wants to consume more layers of data than the average data junkie.


Visualizing Population Health

Wintality

Several years ago I read the term Wintality via posts from Auburn softball players. I liked the term so much that I proceeded to write not 1, not 2, but 8 posts on my own softball related website.

Each post began with a made up definition of the word such as:

[Wintality] – win-tal-i-ty – noun; The act of mentally attacking everything on the field as though it may be the last time you ever play the game. “That player’s wintality is just infectious.”

Recently I decided to post the word with a hashtag in a Tweet just to catch people’s attention. The person it made the most impact on was me. You see, I accidentally clicked on #Wintality and as expected Twitter took me to the page for it.

Normally I’m not a ‘rah-rah-rah’ motivational speech kind of reader because there are more than enough books on consuming data to keep me busy. But seeing a book by Baylor Barbee called Wintality was too much of a coincidence to resist, so I had to purchase it.

I’ve found the book to be very transformative and therapeutic. If you read the book you’ll understand when I say that “I’m a lion on a surfboard” and that isn’t easy. If you don’t read the book you’ll just have to guess what I mean.

Are you a Carrot, Egg or Coffee Bean?

That is a question that Baylor asks early on in the book. Most people would be freaked out by a strange question like that and put the book down, but having written a post years ago called “What Kind of Bird Are You?” I was all about digging to see where Baylor’s creativity was going to go. Of course, as the Qlik Dork I ended up realizing this crazy question had a lot of applicability to the Business Intelligence and Data Consumption communities. Let’s see if you agree.

A carrot is an object that starts hard but when put under the pressure of boiling water becomes soggy and limp. Still called a carrot but clearly merely a shadow of its former self. Kind of like the workers in this field who started out with a vision, wanting to change the world, but end up apathetic and believing nothing they do can change anything. They are the ones behind closed doors that will tell you things like “just go along to get along,” “don’t make waves,” “it’s no use to offer ideas.”

An egg is an object that was soft, had a heart, but under the pressure of boiling water becomes hardened. Kind of like the workers in this field who clearly no longer have any empathy nor patience for others. It’s not that they don’t think they can help, they just see no reason to “waste their time.”

In other words, under the pressure of the boiling water they are changed. They yield to their environment. A coffee bean, on the other hand, knows who it is and when put under the pressure of boiling water changes the water. It takes ordinary water and converts it into one of the most desired substances on earth.

This isn’t a therapy session so I won’t ask you to take the time now to determine which you are most like. I know in my own life I’m the coffee bean. Everyone probably wants to be a coffee bean and “change the world” but I can assure you it’s not easy. Water doesn’t want to change.

What’s this have to do with Population Health?

Great question … just hang tight a while longer.

Mercy Ships

Mercy Ships is a charitable organization that could be referred to as a floating hospital. Medical professionals volunteer their time to help those in great need around the world. I first heard about them via a colleague, Joe Warbington, who traveled with them, as they are one of Qlik’s CSR partners.

More recently I heard about them via a gentleman named Scott Harrison. Scott also traveled with Mercy Ships for a year. On one trip (this is important later on so pay attention) the team he was with was going to conduct 1,500 surgeries. I won’t share the graphics of the type of facial tumors they were removing, so just visualize mouth tumors that were literally suffocating the patients. The kind of things you just don’t see in the United States. While walking around the community Scott was touched by the fact that women and children were literally pulling their water from swamps. While he is not a clinician, it doesn’t take one to realize that what we would label a health hazard and run from is the source of water for 10% of people in this world.

Scott went to the clinicians to share his thought that drinking that sewage can’t be good, and their response was something to the effect of “Well duh, 52% of the disease in developing nations is a direct result of the water they have to drink” (my inflection), but then, in his words, they challenged him: “So what are you going to do about it?” Scott could have chosen to be a carrot and just said “this is a global problem, there is nothing a little old guy like me can do about it” and just lived in apathy. Scott could have been a non-empathetic egg who simply focused on the surgeries. But Scott’s a “coffee bean.”

What’s this have to do with Population Health?

I’m getting there just give me a little more time it will all make sense soon.

Charity: Water

Ironically, in describing the conditions Scott used the phrase “children shouldn’t have to drink water that looks like coffee.” So he started Charity: Water. A mission focused solely on raising awareness of the problem and raising funds to provide clean drinking water to those around the world. To date his organization has already completed 23,377 projects and provided clean water to 7,128,152 people around the world, which can be visualized here:


What’s this have to do with Population Health?

I’m glad you asked. I’m about to write 3 posts all about Population Health and Social Determinants of Health.

Visualizing Population Health from a Global Perspective

Visualizing Population Health from a Community Perspective

Visualizing Population Health from a Personal Perspective

One could easily decide that “fee for service” is the way things have always been in healthcare and this whole population health mumbo jumbo is unneeded. One could easily argue that it’s not a hospital’s responsibility that people do the right thing in their own lives. One could easily argue that it’s not financially up to insurance companies to reimburse people for doing the right thing for their own health. One could easily become a soggy carrot, or a hardened egg, when it comes to healthcare.

But coffee beans like Dr. David Nash have been fighting the battle for population health because they know full well … it’s the right thing to do and they have gradually proven to healthcare systems that from a financial perspective it’s cheaper to focus on health rather than on the costs of sickness.

Remember I mentioned the 1,500 surgeries? While Mercy Ships donates those procedures, imagine the costs for 1,500 surgeries. Now tie that in with Scott’s work at Charity: Water. It costs only $10,000 to provide a clean drinking source. Providing “health” is both right and cheaper. Clean water is one of the social determinants of health that I will talk about in my coming posts.

Population Health and Social Determinants of Health aren’t just new buzzwords.  They are in fact the proper terms to use when talking about HEALTHcare instead of SICKNESScare.

Visualizing Population Health

Visualizing Population Health is easy. Next time you take a sip of your Starbucks, imagine the 10% of people in the world whose only water supply looks like your coffee. Washing their hands in water with cholera. Ingesting water with leeches in it. Too graphic for you?

Ok, don’t think about all those people around the world … just focus on those in Flint, Michigan and realize that our “health” is a product of the environment around us. Realize that changing our current fee-for-service system isn’t going to be easy. It involves change, and with change of this magnitude there is going to be a lot of pressure. How you handle the change is up to you. I sincerely hope that you will join the population health movement and prove to be a coffee bean.


Visualizing Data at REST

My boy Sir Isaac Newton is famous for a few laws he wrote about motion. His first such law on the topic says:

An object at rest will remain at rest unless acted on by an unbalanced force. An object in motion continues in motion with the same speed and in the same direction unless acted upon by an unbalanced force.

Apparently back in his day motion was not only a big deal but it was also apparently out of control and needed to have some laws written to govern it. If I lived back then I would have been mad about apples falling on my head as well and probably would have come up with some pretty awesome laws of my own.

But the Qlik Dork is living in a day and age where 0’s and 1’s are falling out of the sky and are totally out of control. Not sure there is any governing body to prevent me from doing so, therefore I figured it was time for me to dictate a few Laws of Data:

Qlik Dork’s First Law of Data – “Data at REST is stupid. Put that data into motion by visualizing it and your company will pick up momentum.”

Qlik Dork’s Second Law of Data – “Data at REST is expensive. Put that data into motion by visualizing it and the data will start paying for itself.”

Qlik Dork’s Third Law of Data – “Data at REST is useless. Unless you expect an 8th day of the week to be added to the calendar I’ve got bad news for you … your ‘Someday’ won’t get here. So start making use of your data ‘Today.’”

BIGger DATA

Big Data is kind of old news. How old? Even I jumped on the bandwagon all the way back in September of 2016. In this post I’m not going to talk about Big Data instead as the heading suggests I’m going to write about BIGger DATA.

Marketing slides would say that Qlik has 10 points of integration with Cloudera. Woo-hoo. You can actually get the value out of all that Big Data with Qlik. But those of you reading this blog aren’t novices. You already know that Qlik is a data ingesting beast and already knew that Big Data is simply another source of data. Qlik’s Associative Model will gladly allow you to associate your 0’s and 1’s from cocktail napkins, spreadsheets, flat files, databases, EDW’s and of course Big Data sources.

Kind of crazy that I can actually make light of that point, but let’s face it, we already know that. Nothing new to look at there. Where I’m going today is beyond even that. I’m suggesting that the data Cloudera collects about your ingestion of Big Data is also a source of meaningful data, one which is now just sitting at rest but can and should be visualized so you can get value out of it as well. In other words, BIGger DATA.

Don’t laugh too hard, but my use of the uppercase when writing the word REST isn’t accidental. It’s intentional. What I’m driving at in this post is using the Qlik REST data connector to take advantage of all of the data sitting there, at rest, about what you are doing with your Big Data. Sneaky huh??? That’s how I roll.

Visualizing Data at REST

Let me lay the groundwork for what I’m talking about. Cloudera provides a nice Cloudera Manager tool that you can utilize to see a lot of different things about your implementation. I can see what’s going on across the system.

I can also drill into different systems to see what’s going on with them specifically.

There is data behind those things. Meaningful. Important. Useful. Data. Visualizing all of that wonderful information inside Qlik provides you the ability to get an overall and subsystem visual as well:

No big deal you say? Happy to click around and hunt and peck to find everything, are you? Well, how would you find any issues in any of the systems that have to do with memory, or see their history? In Qlik I would simply use the Smart Search feature to find anything that has to do with the term “memory,” but that’s just me.

What about the metadata collected about the queries that are used to pull the Big Data? Doesn’t have value? OF COURSE IT DOES, so you know I want to visualize that as well. Not only can we show you which queries were fired and how many times they were fired, we can show a distribution plot of how long the queries took each time. Glad I was the guy who selected “Hello World” in 18 ms and not 5.5 minutes. Hate to be that guy.

More importantly, I’d hate to be the team that is just leaving all this data lying at rest on your system. If you don’t understand why … see my 3 Laws of Data above.

Using REST to get your data at rest about your Big Data usage

I’d like to begin this section by sharing that all of the real work in coding that you will see was done by my buddy, and Cloudera implementation stud David Frericks. Yes, there are others at Qlik that are bigger data dorks than I. And David is the guy the Qlik Dork goes to.

Did you even know that Cloudera had a fully supported set of REST APIs that could be used to pull this wonderful data that is only resting as of now? Guessing the answer is no, or it would be redundant to write the question. You can check it all out right here.

Let me walk you through a very simple REST API call. Let’s say we want to get all of the Impala queries that were run on the system. There is a REST API for that.

http://your_VM_IP_here:7180/api/v15/clusters/Cloudera QuickStart/services/impala/impalaQueries?from=2017-01-01&limit=1000&offset=0

We would implement that call using the Qlik Rest Connector like so. Notice that I’ve filled in my Cloudera system IP address and put my Cloudera system credentials in.

That will then allow me to see what that REST API call will surface in terms of data.

I say heck yeah, go get that for me … and boom, it loads that data. It’s important to understand what’s happening behind the scenes at this point. A script is written that will execute and process that call, along the lines of the sketch below.
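I can’t paste the exact script the wizard generated here, so treat this as a trimmed, hypothetical sketch of the pattern; the connection name and JSON field names are assumptions based on the API call above:

    // Assumed name of the REST connection created for the Impala queries endpoint
    LIB CONNECT TO 'Cloudera_Impala_REST';

    RestConnectorMasterTable:
    SQL SELECT
        "queryId",
        "statement",
        "durationMillis"
    FROM JSON (wrap on) "root";

    // Keep a clean Qlik table and toss the connector's working table
    ImpalaQueries:
    LOAD queryId, statement, durationMillis
    RESIDENT RestConnectorMasterTable;

    DROP TABLE RestConnectorMasterTable;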

Phooey you say? TMI you say?

No my friends, here is where the rubber meets the road … with script you can overcome any kind of REST API barriers that may exist with Cloudera or with others … say those pesky times when they implement a MAXIMUM number of rows being returned.

Qlik can loop and get all of the rows for you, something like the following sketch.
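Here is a minimal sketch of what that loop can look like in the load script. Everything in it, the connection name, the field names, and the 1,000 row page size, is an assumption layered on top of the example URL above:

    // Empty seed table so each page can be concatenated onto it (assumed fields)
    ImpalaQueries:
    LOAD * INLINE [queryId, statement, durationMillis];

    LIB CONNECT TO 'Cloudera_Impala_REST';  // assumed connection name

    Let vLimit    = 1000;  // the API's maximum rows per call
    Let vOffset   = 0;
    Let vPrevRows = -1;

    // Keep paging until a call adds no new rows
    Do While NoOfRows('ImpalaQueries') > $(vPrevRows)
        Let vPrevRows = NoOfRows('ImpalaQueries');

        Concatenate (ImpalaQueries)
        LOAD queryId, statement, durationMillis;
        SQL SELECT "queryId", "statement", "durationMillis"
        FROM JSON (wrap on) "root"
        WITH CONNECTION (Url "http://your_VM_IP_here:7180/api/v15/clusters/Cloudera QuickStart/services/impala/impalaQueries?from=2017-01-01&limit=$(vLimit)&offset=$(vOffset)");

        Let vOffset = $(vOffset) + $(vLimit);
    Loop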

But wait, there’s more. What about those times where you have 1 REST API call that returns some key information and you then need to go get the details using another REST API call? Or when you have multiple systems implemented and want to get all of the information about the systems, the calls that were fired, their history, their blah-blah-blah-blah. Yeah, Qlik lets you do all of that because of its ability to do the ETL on the fly.

When I started the section with a shout out to my friend David Frericks it wasn’t in jest. Dude provided a serious baseline of work in making the Cloudera REST APIs fly and has produced some great work.

Follow up

“Hope isn’t a Strategy.” I read that recently in a book called “Wintality” by Baylor Barbee and thought it was an awesome quote to summarize what I see happening right now in most people when it comes to big data. Lots of HOPE. They are hoping that the time will come when they will learn, understand, and begin diving into “Big Data.” I’m assuming that you are reading this post because you are the kind that is more action oriented and you have formulated, or are formulating, your strategy for visualizing your Big Data and hopefully now your Big’ger’ Data as I’ve laid it out for you.

Be sure to check out http://cloudera.qlik.com/ to see some great examples of how Qlik brings Big Data to life.

Hungry for even more? Want to see a phenomenal example of how Qlik can call SOLR Search on the fly using the REST API, ingest the data and produce a web application using the Qlik Sense Visualization API calls then check this out. http://cloudera.qlik.com:3000



The World “is” Flat

Oh sure, those nerdy science types will give you explanations and “supposed” evidence that the world isn’t flat. Blah-blah-blah.

But let’s face it … we are humans, and the way our brains work it’s simply easier to see things in 2 dimensions. So unless you have a newfangled holographic imaging system, even a sphere shaped globe appears to you as 2 dimensions.

Oh sure, those nerdy data visualization types will tell you that your eyes interpret shades of light and dark as distance … but let’s face it … that just makes the situation worse for us. People fool us in paintings with shades of light/dark into thinking something is 3 dimensional when in fact it isn’t.

Solid Logical Argument for the World being Flat

Don’t trust them. I propose that in our weakened human state of brain power it’s easier if we just get our maps on a piece of paper that is flat rather than some hologram. Hard to argue with solid logic right? If it’s easier … then it must be true.

Alright! You win. The world isn’t flat. But trust me it would be a whole lot easier if it was so when you get to the end of this post you are going to wish you had simply agreed with me.

Round?

Here is the problem: not only is the world not flat, it isn’t round either. Did I mention that measuring it continually evolves as we get more and more precise instrumentation? Meaning that as we gain technical proficiency our perception of the shape of the globe actually changes.

It gets better. That shape you think continents have … they change.

But wait there’s more … there is this thing called Continental Drift which means those great big things that you think are locked down, are actually moving.

Where are you?

All this poses a serious issue for us … because a lot of how we define ourselves has to do with “where we are.” Tell me where you are so that I can come and verify you really exist.

Seriously! Go for it. Where are you?

You see my point yet? How are you even going to tell me where you are in a world that is ever changing its shape and position?

Enter Latitude and Longitude

Oh look, there is a smarty pants in Illinois pulling out his phone right now and he says he is at X Latitude and Y Longitude. Super duper. That helps me a lot, if you can tell me what Datum was used to calculate the Latitude and Longitude you just gave me.

Because if you are just telling me a Lat/Long value it doesn’t really help me reproduce where you are to verify that you exist. I’ve got to be using the same system.

Never heard the word “datum” before? Yeah, me neither until I recently ventured into this whole geo analytics stuff. But don’t worry, you can find all you ever wanted to know by clicking here to read this great resource. It only took me 3-4 reads before I could pronounce most of the words, so feel free to give it a shot.

Precision Matters

Once you are comfortable telling me your Latitude and Longitude, and can tell me if that was determined by the NAD27, NAD83 or WGS84 Datum, I can get close to your location. Close to proving you exist may or may not be good enough though.

Ok let’s be real … none of you need me to come and prove you exist. But let me ask you this “Do you care if you can find that address you are trying to get to?” Because Latitude and Longitude can actually come in 2 flavors: Centroid and Polygon.

No. For real. It’s not as simple as you think. Take for instance a location that is part of some giant outdoor shopping mall, or a giant apartment complex. As you read about Centroid and Polygon calculations you will discover that sometimes the lat/long of an address is calculated as a guesstimate based on the entire range of addresses. (For your continued reading pleasure as you dive into Geo Analytics, be sure to favorite this awesome GIS Dictionary site thanks to ESRI.)

If you check out public sites like DataLists.com you will find that they provide you the latitude and longitude in both flavors.

Get the “point”

You get the pun there? It would be funnier if you could see my facial expressions as I said “Get the point,” but hey, this is a written blog and I haven’t started my Dork Casts video channel yet.

Regardless of flat/round/sphere/ellipsoid … there are a lot, and I mean a lot, of publicly available data sets out there and frankly I want you to tap into as many as you can. In the embedded video you will see that I find one such site and say to myself “Gee I’d like to visualize this data.” It’s a “Shapefile” that contains the polygons (shapes) as well as the data that I need for coloring.

Here is where the Qlik GeoAnalytics application comes in really handy … I can load the shapefile in its ZIP format, ingest the data and the shapes, and get right to visualizing the data. Oh yeah, I actually walk you through all that Datum stuff in the video as well, so perhaps it will help you make sense of it.

Proving you exist by your location is kind of a joke. But visualizing the world around you is serious business. If you are interested in population health like me then you likely want to tap into all sorts of Social Determinants of Health. But even if you are using your analytics/visualization skills in a totally different field … the point is there is geo data out there … just waiting for you to explore.


Removing the clutter

Ask any Data Visualization expert and one of the best pieces of advice they will give you is “remove the clutter” so that your “data can tell the story.” Would it be going too far if I suggested that perhaps you might even want to remove the underlying map when you are doing Geo Analytics?

I recently came across a post from a data visualization “guru” that I follow named Ken Flerlage. His post was entitled Visualizing Earthquakes. As always, Ken’s post was very clean and informative and could be printed as an “infographic” with little additional work.

Normally I would press “like” and move on, but for some reason I was very intrigued with something that caught my eye. So unlike the multitude of times in the past, I felt like I “had” to get involved, so I pulled the data set down myself and started playing with the data using Qlik GeoAnalytics.

North and South America

The first thing that caught my eye with Ken's image, and that jumped off the screen when I visualized it myself, was that for as large an area as North and South America cover, the fault lines clearly lie along the Pacific Ocean side of both continents. Hard to miss, isn't it?

In fact, here is where the title for this blog post jumped into my head … Did I even need the map? Did the map add value? Or did the map in this case actually detract from the story that might jump into your mind if it wasn't even there?

So I decided to experiment and I removed the map. You tell me. Does the map need to be there, or is there a pretty cool story that lies in the data itself?

Hopefully the question in your head is … Are there other areas in the world where the distinction is so incredibly clear cut?

 

Africa

What I found was a resounding … yes, there are. If you knew that the following image was for another continent, could you guess which one?

If you guessed Antarctica, you might be geographically challenged. I’m guessing that you nailed it immediately … Africa.

Very clean northern and eastern border lines with very few points on the western edges of what is a gigantic continent.

With "normal" dashboards you probably already knew that removing the clutter was an important aspect of really telling a visual story with your data. But when you started reading, would you have ever imagined that removing the map itself might actually be a good thing to do?

But wait! The story gets more interesting, and this is why I believe data visualization is so intriguing. Yes, less is more, but sometimes more is more as well. There are times where visualizing the same thing in multiple ways can make a huge impact. You see, while you could take the image above and lay it right down along the continent of Africa on a map, the fact is that the points are inland from the coast. Yet they follow the coastline unbelievably well, wouldn't you say? And just like North and South America, the earthquakes spare the entire western edge of Africa almost entirely.

Infographics vs Analytics

I can't recall, in the 18 months of doing this blog, ever explaining the difference between visual analytics and an infographic. This is a great time. An infographic is meant to convey a story from the author's perspective. Visual for sure. But also static. If you read Ken's post you will see that he goes the extra mile in conveying information even about the sizing of the points. For my analysis I simply used a linear scaling of the point sizes based on the magnitude of the quakes. Ken does a great job of visually helping readers understand that the points should really range a gigantic amount in size. Informative. For my "analytics" I only cared about helping the end user doing the analytics realize that some points were 2.0 and others were 8.5. So I used scale and color but I didn't go the "extra mile" of "informing with detail" as Ken did. I've pulled some images and shared the "story" that I thought was intriguing, but the application is very much one that is built for analytics.

If I wanted to go the extra mile, and I had research to back my "coastline" theories, I could certainly turn it into an infographic, but that's not what I do. That extra mile is what reporters and people like Ken do when documenting the story they want to share visually. The infographic is a form of data visualization intended to inform and answer questions, while analytics are intended to answer the questions you may have and also prompt new ones.

For instance, in my application you would see the following image and you could immediately count 8 to answer the question "How many earthquakes has Australia had?" But more importantly, hopefully it would spur you to ask "Why, on a continent so large, have they only had 8 earthquakes, nearly all along the southern edge, when nearly all of the surrounding area is covered with earthquakes?"

While many online infographics are becoming interactive to some degree, analytics would enable you to filter to what you want: "Show me only earthquakes over 6.5 in magnitude." "How many square miles were affected by earthquakes in the USA?" "How much money did earthquakes cost from 1990-2000 vs 2000-2010?" "Which decade from 1900 until present had the most earthquakes?" "If you added the magnitudes together, what would a line chart look like over time?"

In this case, that time component might have its own story to share. So of course you would want to see the earthquakes animated over time. In this video I quickly cover the same points above, but I also capture the animation. In one screen I animate the decades from 1900 until current, and the points are sized and colored based on magnitude. Again I'm only using a linear scaling, not a logarithmic scale, as the point sizes for some quakes would be so big they would hide others. Each decade simply displays the earthquakes for that decade. I then do the same animation, but instead I color the points based on the decade. 1900 is a light yellow, while 2010-current is a deep red. The goal is that I can then aggregate the earthquake points starting in 1900, and you can tell where the points for 1910-1920 are as they are added, and so on.

Analytics would then involve running the same things but coloring the points based on the dollar values that were involved. Or the number of deaths inflicted. Always, always, always searching for ways to visualize in the best way for the data itself to tell the story and help you answer the next question you have.

Popcorn Time

As you enjoy some popcorn while watching this short video of the animation take the time to ponder a philosophical question: What clutter can you remove from your life so that “your personal” story shines brighter?

 

 


What in the world were you thinking???

I can tell you that as the father of two daughters, the grandfather of 7, and a 20-year veteran coach/instructor for thousands of adolescent female athletes, I've probably said "What in the world were you thinking?" at least a thousand times. You know what I mean … children so often do things that just completely defy all logic or known thought processes.

The irony is that as adults we say this mostly in jest as we roll our eyes. All the while knowing full well the problem wasn't what they thought, but the fact that they didn't think. They simply allowed themselves to be distracted by something else.

Two years ago I began this blogging journey and I've greatly enjoyed every minute of research, every post and every conversation that was sparked about Data Visualization topics. But I have to be honest: watching the battle of hype versus hope unfold right before my eyes on the Data Science and Big Data fronts has kind of driven me crazy. So as this blogging journey is about me, I find that I need to begin at least intermixing what I'm learning and feeling about Data Science and Big Data in with my posts on Data Visualization.

The American Recovery and Reinvestment Act of 2009 pushed $20 billion into data-producing factories in the form of EHR systems. Contrary to the common myth, data storage isn't cheap. You need bigger data centers, with more racks of disks, which require more power, which requires more cooling, which requires more backups, more network bandwidth both internally and externally for redundancy, and more staff to manage the infrastructure. Ugh!

Not really sure what they were thinking. To my knowledge real factories don't produce goods that can't be consumed. Yet here many of you sit, 7 years later, with data centers full of unused 0's and 1's. Producing them at a frantic pace, but doing nothing with them. Because the push was to collect data, but there was no plan for how to utilize the data.

Data Science


Over the past several years I have spent a great many hours consuming free training about Data Science via Coursera. Why would I read “Data Science for dummies” when geniuses like Roger Peng and Jeffrey Leek of Johns Hopkins are teaching Data Science courses. Free courses! Free courses that I can take from the comfort of my own sofa I should add. When they recently authored Executive Data Science – A guide to training and managing the best Data Scientists, I figured I could afford to pay for their book since I had already MOOChed off of their expertise so much. I bring up their book because they had a profound concept that you may want to write in permanent ink on your monitor … “The key word in data science is not data, it is science. Data science is only useful when the data are used to answer a question. That is the science part of the equation.”

No wonder these guys are professors at Johns Hopkins. Seriously, as I start this series on Big Data and Data Science I wanted to ensure that we are all on the same footing. When I refer to the term "Data Science" it's always, always, always going to be in regard to applying science to data to answer some business question.

Data Science, like anything new, has been greatly over-hyped for sure. Many businesses jumped in with both feet and lots of money, praying that they would magically uncover a "Beer and Diapers" or "predicting pregnancy" story of their own that would help their company make a billion dollars the following quarter. What in the world were they thinking? Data science isn't black magic that you just conjure up answers with … it's science. It follows scientific principles. It takes discipline.

Unfortunately, due to some of those failures (which were to be expected given the lack of reasoning), many, many more companies are sitting on the sidelines, watching their business lose money hand over fist while ignoring the fact that data science is available. They don't understand how data science works, so they simply ignore it instead. What in the world are they thinking?

Is data science for everyone? Of course not. But tucking your head in the sand while other companies use it as a competitive asset just isn't a good business practice. If you want to separate "hype" from "hope" so you know whether it is right for you, then start with "What is the question I am trying to answer with data?" Follow that up with "Do I even have the data I need to answer it?" If the answer to both is yes, then allow the science to lead you to the answers that are hidden in your data.

Big Data

One of the reasons for so many dashed hopes and dreams is that some organizations started building massive data lakes thinking "the more data I have, the better the answers I can get." They had no business questions in mind; they just figured that if they assembled enough data files together on disk drives, problems would somehow solve themselves. Quite simply, they ignored the science and focused on the data. I don't want you to make the same mistake.

If you are going to undertake anything new like Data Science or Big Data, you have to understand that major changes like this require organizational change as well. They aren't just a technical matter. If you are going to go with a Big Data solution then for goodness sakes please start by following sound advice like that found in Benjamin Bowen's book titled Hadoop Operations. He makes it clear that organizations must combine three facets of strategy: Technical, Organizational and Cultural.

The difficulty for many who have succeeded in Analytics but are afraid to jump into Big Data is the simple fact that it's hard to truly understand what Big Data really is. I can't blame someone for not wanting to invest in something they can't understand. At least "science" is a word that people can relate to, and that's why Peng/Leek led with their phrase immediately as they began their book. It gives you a point of reference.

Unfortunately Big Data is an entirely different beast. I wish I could write something profound like "The most important word in Big Data is big" or "The most important word in Big Data is data" to help you focus. But the truth is the most important word in "Big Data" is neither big, nor data. The best way to describe it is actually a set of 3 words: Volume, Velocity and Variety. However, the hard part even for the Qlik Dork to explain is that none of them alone captures the concept; you need to refer to them in combination, and here is why:

Volume – Just because your organization has Gajigbytes of data doesn’t mean you need to turn to Big Data. Relational database systems, especially Teradata, can be grown to be as large as you will ever need so it’s not just volume that forces the issue.

Velocity – Simply means the speed with which the data is coming. There are all sorts of interfaces that handle rapidly moving data traffic so again, that alone doesn’t constitute a need for Big Data.

Variety – In the context of the Big Data field it is most often used to refer to the differences between structured and unstructured data. Unstructured data would be things like documents, videos, sound recordings etc. Don't let me shock you when I say this, but I was storing those things in SQL Server 20 years ago as BLOBs (binary large objects). So guess what, again this "variety" by itself isn't what big data is about.

So what then is Big Data? It is a combination of all 3 of those things, and oh by the way, you also need to include business components like time and money. Big Data is centered around the fact that you can use commodity hardware, including much cheaper disks than you would typically use for a large Storage Area Network (SAN) disk infrastructure. The reason it is typically considered "faster" in terms of storage is that it doesn't deal with transactions and rows; it simply deals with big old blocks of data, so massive files are a breeze to store. The fact that it is block/file oriented means it doesn't really matter what you throw at it. A stack of CSV or XLS or XML files, a bunch of streaming video or HL7 or sound … no problem. You throw and go.

So you can store a wide variety of data, quicker and at less cost than you would using a traditional RDBMS-type system. A bonus is also the time savings, because nobody in IT really needs to be involved in the process once the infrastructure is put in place. You can have data available and within no time your analysts or your data scientists can begin consuming it. No requirements documents. No prioritization process. No planning meetings. Very little overhead. And oh by the way, it allows the business to actually own the process of solving the problems that the business has. Crazy concept, I know.

Examples

Enough of my musing, let’s just get down to a few practical examples.

Vaccinations and Side Effects

This week I met two of the most wonderful young Data Scientists. Liam Watson and Misti Vogt just graduated from Cal State Fullerton and delivered a presentation at the Teradata Conference in Atlanta, Georgia on a phenomenal use of data regarding the side effects of vaccinations. In the coming weeks I will be presenting their research and application, but I wanted to quickly plant a seed regarding their work that I think makes an excellent pitch for those of you who may be on the fence about proceeding with Data Science or Big Data.

Much of the "science" of what they did revolved around data that parents completed to report side effects after getting their child vaccinated. The form, like so many in healthcare and other industries, is the typical check-this-box-for-this-condition, check-that-box-for-that-condition … Other (please type in) kind of thing. The check boxes would be considered structured data. The "other" would certainly be considered unstructured: 0's and 1's that get manufactured in our EHR factories and left to accumulate dust.


If these two had used Static Reporting they would have had no choice but to simply ignore the "other" category and count up how many of A, B, C, D or E were checked. But let's face it, if these two were ordinary I wouldn't be talking about them. Instead they chose the path of using Data Science (which says you can't leave data behind just because it doesn't fit your simple report query model and isn't clean) and they needed to use Big Data because it provides them with so many wonderful text analytics functions.

What they uncovered was that White Blood Cell Disorder, which came from the hand-typed "Other" text box, was the third-highest side effect. To me that's like gold. It's a discovery that quite simply would be overlooked in a traditional environment because it didn't fit the "we can only deal with structured data" mold.

There is a lot of time and effort expended in tracking physicians and beating them over the head if they don't sign off on documentation in a timely manner. I certainly understand that without their signature the organization doesn't get paid. But I can't help but wonder what gold may be lying in the textual notes that physicians dictate daily. Don't believe your organization is ready for Data Science and Big Data to mine for that gold? Not sure what you are thinking.

Zika

I recently recorded a video showcasing a stunning use of Data Science and Big Data that was created by two of Qlik's partners, Bardess Group and Cloudera. The application demonstrates the impact that accumulating data quickly from a wide variety of sources like weather, flights, mosquito populations, suspected and reported Zika infections and supply chain data can have when brought to bear on a problem like Zika.

Right now most organizations are still struggling to understand their own costs and their own clinical variances. Move to a population health model? Unthinkable for them, as they can't produce the static reports nor consume them fast enough to understand their own patients, let alone begin consuming data from payers, the census bureau etc.

As you watch the video and you hear the variety of data sources involved in the Zika demo, imagine the time and energy that would have to go into a project to do the same thing in a traditional way. As much as I “like” the work they’ve done to help with the Zika virus issue (and the work is continuing with aid agencies and hospitals), I “love, love, love” the use case it makes for the healthcare world that we need to embrace Data Science and Big Data not run from it because neither fits our current working models.

Summary

Blaise Pascal, the 17th century mathematician, once wrote "People almost invariably arrive at their beliefs not on the basis of proof, but on the basis of what they find attractive." We have science that can help us find truth in data, and yet we continue to perpetuate treatment plans based on myths and hearsay.

We know our current organizational structures are failing to keep pace with the onslaught of changes and the amounts of data we are generating. But instead of changing to grow cultures that are more data fluent, organizations are converting employees to 2×2 cubes so that they can "collaborate" more. No more data is being consumed, but at least the status quo is maintained and employees now get to hear each other's endless conversations with spouses and children.

Would I be wrong if I guessed that your organization has a backlog of hundreds of reports, while the previous 10,000 are seldom if ever read? What if I guessed that the morale of the report writers is at an all-time low because new requests are far outpacing their ability to generate them?

In his book Big Data for Executives, author David Macfie puts it pretty eloquently: "In a traditional system the data is always getting to you after the event. With Data Science/Big Data the goal is to get the information into your hands before the event occurs." Put simply, static reporting and traditional processes just aren't designed to handle the crisis of overrun data centers. I'm not sure what in the world the organizations that are doubling down on static reports are thinking.

To be honest I’m not entirely sure what in the world I was thinking taking so long to write this as my thoughts have been bubbling up for so long. If you have yet to actually begin researching or are among those burying your head in the sand and ignoring Data Science and Big Data then you know what is coming … What in the world are you thinking?


Visualizing Data that does not exist … aka Readmissions Dashboard

Many who make requests seem to have a belief that Business Intelligence is magic. They lose their ability to listen to logic and reason and simply ask you to do the impossible.


Pulling data from 18 different sources, many of which you don't even have access to? Child's play, like pulling a rabbit from a hat.

Turning bad into good and interpreting the meaning of the data? A little tougher, kind of like making your stunning assistant float in midair.

Creating a readmissions dashboard? Hey, we aren't Houdini.

That data doesn’t even really exist. Oh sure it exists in the minds of the people who want you to produce it out of thin air, but I’ve yet to see a single Electronic Health Record that stored readmission data. They only store admission data, not RE-admission data.

Patient Name | Admission Date | Discharge Date
John Doe     | 1/1/2016       | 1/4/2016
John Doe     | 1/7/2016       | 1/10/2016
John Doe     | 1/30/2016      | 2/4/2016

Those who want dashboards for Readmissions look at data like the above and talk to you like you are insane, because in their minds it is clear as day that John Doe was readmitted on 1/7, 3 days after his first visit, and was then readmitted again on 1/30, 20 days after his second visit.

You try to explain to them that there is nothing in any of those rows of data that says that. They have filled in the missing data in their minds, but in reality it doesn't exist in the EHR. They respond that all you need to do is have the "report" do the same thing: compare the admission date to the discharge date for subsequent visits. You respond with "Let's say I could make SQL, which is a row-based tool, magically compare rows. What should I do about the following, which is more like the real data?"

Patient Name | Admission Date | Discharge Date | Patient Type
John Doe     | 1/1/2016       | 1/4/2016       | Inpatient
John Doe     | 1/7/2016       | 1/10/2016      | Outpatient
John Doe     | 1/30/2016      | 2/4/2016       | Inpatient

They say "Oh that's easy, when you get to the visit on 1/30 just skip the visit from 1/7 because it's an outpatient row and we don't really care about those, and compare the 1/30 admission to the 1/4 discharge." To which you respond "Well that's easy enough. Now I'll not only somehow make SQL, which can't compare rows, magically compare rows, but if it is an outpatient row I'll tell SQL to skip it and compare to something 2 rows above, or maybe 3 rows above, or 10 rows above."

Just then you remember the reality is more complicated than that. In reality you aren't just comparing all inpatient visits (other than for fun); what you really care about is whether the visits were for the same core diagnosis or not.

Enc ID | Patient Name | Admission Date | Discharge Date | Patient Type | Diagnosis
1      | John Doe     | 1/1/2016       | 1/4/2016       | Inpatient    | COPD
2      | John Doe     | 1/7/2016       | 1/10/2016      | Outpatient   | Stubbed toe
3      | John Doe     | 1/30/2016      | 2/4/2016       | Inpatient    | Heart Failure
4      | John Doe     | 2/6/2016       | 2/10/2016      | Inpatient    | COPD
5      | John Doe     | 2/11/2016      | 2/16/2016      | Inpatient    | Heart Failure

You don't want to compare the 1/30 visit to the 1/4 discharge because the diagnoses aren't the same. You only want to compare the 2/6 visit to the 1/4 discharge, and you need to compare the 2/11 visit with the 2/4 discharge.

If you think this is like making a 747 disappear before a crowd of people on all sides, just wait it gets worse.

Not only does the EHR not include the "readmission" flags, it doesn't really tell you what core diagnosis the visit should count as. Instead, what it really stores is a table of 15-25 diagnosis codes:

Enc ID | ICD9_1 | ICD9_2 | ICD9_3 | ICD9_4 | ICD9_… | ICD9_25
1      | 491.1  | 023.2  | 33.5   | V16.9  | 37.52  |

Good thing for your company you used to be a medical coder, so you actually understand what the mysterious ICD9 or ICD10 codes stand for. You know, for instance, that 491.1 really means "Mucopurulent chronic bronchitis." It would be nice if that correlated directly to saying "This patient visit is for COPD." But since we are uncovering magic, why not explain the whole trick. You see, the primary diagnosis code must be any of the following:

491.1, 491.20, 491.21, 491.22, 491.8, 491.9, 492.0, 492.8, 493.20, 493.21, 493.22, 494.0, 494.1, 496

Then the visit may be the result of COPD, but you also have to check all of the other diagnosis codes and ensure that none of them contain any of the following other diagnosis codes:

33.51, 33.52, 37.51, 37.52, 37.53, 37.54, 37.62, 37.63, 33.50, 33.6, 50.51, 50.59, 52.80, 52.82, 55.69, 196.0, 196.1, 196.2, 196.3, 196.5, 196.6, 196.8, 196.9, 197.0, 197.1, 197.2, 197.3, 197.4, 197.5, 197.6, 197.7, 197.8, 198.0, 198.1, 198.2, 198.3, 198.4, 198.5, 198.6, 198.7, 198.81, 198.82, 198.89, 203.02, 203.12, 203.82, 204.02, 204.12, 204.22, 204.82, 204.92, 205.02, 205.12, 205.22, 205.82, 205.92, 206.02, 206.12, 206.22, 206.82, 206.92, 207.02, 207.12, 207.22, 207.82, 208.02, 208.12, 208.22, 208.82, 208.92, 480.3, 480.8, 996.80, 996.81, 996.82, 996.83, 996.84, 996.85, 996.86, 996.87, 996.89, V42.0, V42.1, V42.4, V42.6, V42.7, V42.81, V42.82, V42.83, V42.84, V42.89, V42.9, V43.21, V46.11

If you have ever been asked to produce a Readmissions Dashboard you probably understand why I’ve correlated this to magic. Every time you think you know how to grab the rabbit by the ears to accomplish the trick, the rabbit changes into an elephant.

Fortunately your assistant isn't the traditional 6-foot blonde; your assistant is Qlik. I'm going to explain how to make the 747 disappear in three easy steps that any of you will be able to reproduce:

Step 1

The heavy lifting for this trick actually involves the ICD9/10 codes. If you combine the 15-25 diagnosis codes into 1 field, then you can use it to more easily compare the values and determine what core diagnosis you need to assign to each encounter. Qlik helps you accomplish that with simple concatenation as you are loading your encounter diagnosis data:

ICD9_Diagnoses_1 & ', ' & ICD9_Diagnoses_2 & ', ' & ICD9_Diagnoses_3 & ', ' & ICD9_Diagnoses_4 & ', ' & ICD9_Diagnoses_5 & ', ' & ICD9_Diagnoses_6 & ', ' & ICD9_Diagnoses_7 & ', ' & ICD9_Diagnoses_8 & ', ' & ICD9_Diagnoses_9 & ', ' & ICD9_Diagnoses_10 & ', ' & ICD9_Diagnoses_11 & ', ' & ICD9_Diagnoses_12 & ', ' & ICD9_Diagnoses_13 & ', ' & ICD9_Diagnoses_14 & ', ' & ICD9_Diagnoses_15 as [All Diagnosis]
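One caveat worth noting, and this is my observation rather than something from the original script: wildcard searches against that concatenated string can cross code boundaries, e.g. '*33.5*' would also match 33.52 or 133.5. A hedged sketch of a tighter variant wraps the field in the same ', ' delimiter on both ends, so every search pattern can include its delimiters:

// My own variant, not the original script: delimiters on both ends of the field
', ' & ICD9_Diagnoses_1 & ', ' & ICD9_Diagnoses_2 /* … through ICD9_Diagnoses_15 … */ & ', ' as [All Diagnosis]
// so each pattern carries its delimiters and partial codes can't collide:
// WildMatch([All Diagnosis], '*, 33.51, *')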

Step 2

One of the really nifty tricks that Qlik can perform in data loading is a preceding load. A preceding load simply means you have the ability to write code that refers to fields that don't exist yet and won't exist until the code is actually run. The following code is abbreviated slightly so that it's easier to follow logically, but the entire set of code is attached to the post so that you can download it. The "Load *" right below Encounters tells Qlik to load all of the fields from the second load statement first, then come back and do the code below. This way we can construct the [All Diagnosis] field and refer to it within this code. You could repeat all of the concatenation logic for all 5-10 of the core diagnoses you want to track, or you could load the encounters and simply do a subsequent join load, but you don't have to. The preceding load makes your life easy and works super fast.

Encounters:

// This is the preceding load
Load *,
// If the primary matches then it's possibly COPD, and if none of the other 14 are one of the values listed then it definitely is COPD
IF ( Match([ICD9 Diagnoses 1] , '491.1', '491.20' … '493.21', '493.22', '494.0', '494.1', '496') > 0
And WildMatch([All Diagnosis], '*33.51*', '*33.52*', '*37.51*' … '*V43.21*', '*V46.11*') = 0, 'COPD',

// If we found COPD great, otherwise we need to check for Sepsis
IF (Match([ICD9 Diagnoses 1] , '003.1', '027.0', … '785.52' ) > 0
And WildMatch([All Diagnosis], '*33.50*', '*33.51*' … '*V43.21*', '*205.32*') = 0, 'Sepsis',
'Nothing')) as [Core Diagnosis];

// This is the regular load from the database or file
LOAD
MRN,
EncounterID,
……
[ICD9 Diagnoses 1],
[ICD9 Diagnoses 2] …..

Step 3

The final step, which many believe to be the hardest, is actually the easiest to do within Qlik. In fact, truth be told, when I was a young whippersnapper starting out on my Qlik journey I tried to do everything in SQL because I knew it so well, and did minimal ETL within Qlik itself until I found out about this Qlik ETL function. The function is simply called "Previous." It does exactly what it sounds like … it allows you to look at the previous row of data. Seriously, while you are on row 2 you can check the value of a field on row 1. In practice it works just like this:

IF(MRN = Previous(MRN) …..

How cool is that? How do I use it for solving this readmissions magic trick? Just like this:

IF(MRN = Previous(MRN), 'Yes', 'No') as [Inpatient IsReadmission Flag],

If the MRN of the row I'm on now is the same as the MRN of the previous row, then yes, this is a readmission; otherwise no, this is not a readmission, it is a new patient's first admission. Actually, that's the simplified version of my code.
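If you'd like to see Previous() with your own eyes before trusting it, here is a minimal self-contained sketch you can paste into an empty app. The MRNs and dates are made up; the point is that Previous() reads the prior input row once the resident load is ordered:

// Hypothetical sample data just to exercise Previous()
Test:
LOAD * INLINE [
MRN, AdmitDate
100, 2016-01-01
100, 2016-01-07
200, 2016-02-01
];

// Ordered by patient then date, so consecutive rows for the same
// patient sit together and Previous(MRN) is meaningful
Flags:
LOAD
    MRN,
    AdmitDate,
    IF(MRN = Previous(MRN), 'Yes', 'No') as IsReadmission
RESIDENT Test
ORDER BY MRN, AdmitDate;

DROP TABLE Test;  // avoid a synthetic key with the sample table

The second row for MRN 100 flags as 'Yes'; the first rows for each patient flag as 'No'.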

My code actually thinks through how the results would need to be visualized. Besides an easy human-language Yes/No flag, someone is going to want a count of the readmissions, right? Does the Qlik Dork want charts or expressions that would have to use IF statements to check whether the flag = Yes? Of course not. I want a single field that is both human readable (Yes/No) and computer readable for counting (1/0). That's where the magic of the DUAL function comes into play. It gives me a single field that can be used for both needs.

IF(MRN = Previous(MRN), Dual('Yes', 1), Dual('No', 0)) as [Inpatient IsReadmission Flag],

Using the Dual data type allows me to provide the end user with a list box while also allowing me to provide very fast performing expressions:

Sum([Inpatient IsReadmission Flag])
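And because the numeric side of that dual is a clean 1/0, the very same field hands you a rate as easily as a count. A small example of my own (not from the original post) using the field defined above:

Avg([Inpatient IsReadmission Flag])  // readmission rate between 0 and 1; format it as a percentage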

How does the entire Readmissions load work? After loading the encounters, and allowing the preceding load to qualify the encounters into core diagnosis types, I simply do a self-join to the encounter table, referring only to the inpatient records and ordering the data by the MRN and the admission date and time.

Left Join (Encounters)
LOAD
EncounterID,
IF(MRN = Previous(MRN), Dual('Yes', 1), Dual('No', 0)) as [Inpatient IsReadmission Flag],
IF(MRN = Previous(MRN), Previous([Discharge Dt/Tm])) as [Inpatient Previous Discharge Date],
IF(MRN = Previous(MRN), Previous(EncounterID)) as [Inpatient Previous EncounterID],
IF(MRN = Previous(MRN), NUM(Interval([Admit Dt/Tm]-Previous([Discharge Dt/Tm])), '#,##0.00')) as [Inpatient Readmission Difference],
IF(MRN = Previous(MRN), IF(Interval([Admit Dt/Tm]-Previous([Discharge Dt/Tm])) <= 30.0, Dual('Yes', 1), Dual('No', 0)), Dual('No', 0)) as [Inpatient IsReadmission within 30]
Resident Encounters
Where [Patient Type] = 'Inpatient'
Order by MRN, [Admit Dt/Tm];

If you are paying attention you'll notice that the above is simply our "for fun" count showing all inpatient readmissions, and has nothing to do with any of the core diagnoses. In order to perform that trick I do the same basic steps, but I enhance my where clause to only look for encounters that have a core diagnosis of COPD, and I simply name my flags and other fields differently.

Left Join (Encounters)
LOAD
EncounterID,
IF(MRN = Previous(MRN), Dual('Yes', 1), Dual('No', 0)) as [COPD IsReadmission Flag],
IF(MRN = Previous(MRN), Previous([Discharge Dt/Tm])) as [COPD Previous Discharge Date],
IF(MRN = Previous(MRN), Previous(EncounterID)) as [COPD Previous EncounterID],
IF(MRN = Previous(MRN), NUM(Interval([Admit Dt/Tm]-Previous([Discharge Dt/Tm])), '#,##0.00')) as [COPD Readmission Difference],
IF(MRN = Previous(MRN), IF(Interval([Admit Dt/Tm]-Previous([Discharge Dt/Tm])) <= 30.0, Dual('Yes', 1), Dual('No', 0)), Dual('No', 0)) as [COPD IsReadmission within 30],
IF(MRN = Previous(MRN), IF(Interval([Admit Dt/Tm]-Previous([Discharge Dt/Tm])) <= 90.0, 'Yes', 'No'), 'No') as [COPD IsReadmission within 90]
Resident Encounters
Where [Patient Type] = 'Inpatient' and [Core Diagnosis] = 'COPD'
Order by MRN, [Admit Dt/Tm];

And just when you think I've pulled as much handkerchief out of my sleeve as it can possibly hold, I do the same steps, this time for Sepsis.

Left Join (Encounters)
LOAD
EncounterID,
IF(MRN = Previous(MRN), Dual('Yes', 1), Dual('No', 0)) as [Sepsis IsReadmission Flag],
IF(MRN = Previous(MRN), Previous([Discharge Dt/Tm])) as [Sepsis Previous Discharge Date],
IF(MRN = Previous(MRN), Previous(EncounterID)) as [Sepsis Previous EncounterID],
IF(MRN = Previous(MRN), NUM(Interval([Admit Dt/Tm]-Previous([Discharge Dt/Tm])), '#,##0.00')) as [Sepsis Readmission Difference],
IF(MRN = Previous(MRN), IF(Interval([Admit Dt/Tm]-Previous([Discharge Dt/Tm])) <= 30.0, Dual('Yes', 1), Dual('No', 0)), Dual('No', 0)) as [Sepsis IsReadmission within 30],
IF(MRN = Previous(MRN), IF(Interval([Admit Dt/Tm]-Previous([Discharge Dt/Tm])) <= 90.0, 'Yes', 'No'), 'No') as [Sepsis IsReadmission within 90],
IF(MRN = Previous(MRN), IF(Interval([Admit Dt/Tm]-Previous([Discharge Dt/Tm])) <= 120.0, 'Yes', 'No'), 'No') as [Sepsis IsReadmission within 120],
IF(MRN = Previous(MRN), IF(Interval([Admit Dt/Tm]-Previous([Discharge Dt/Tm])) > 120.0, 'Yes', 'No'), 'No') as [Sepsis IsReadmission > 120]
Resident Encounters
Where [Patient Type] = 'Inpatient' and [Core Diagnosis] = 'Sepsis'
Order By MRN, [Admit Dt/Tm];

And then for AMI. And then for CHF. And then for … oh, you know the handkerchief can go on forever, and eventually we end up with a data model that includes all of these awesome fields that didn't exist when we began, so that we can actually do our work.
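If repeating that block for every diagnosis bothers you, the same pattern can be driven from a loop. This is my own sketch, not part of the original script (and it's abbreviated to two of the flag fields); dollar-sign expansion of the loop variable names the fields on each pass:

FOR EACH vDx IN 'COPD', 'Sepsis', 'AMI', 'CHF'

Left Join (Encounters)
LOAD
EncounterID,
// $(vDx) expands to the current diagnosis, e.g. [COPD IsReadmission Flag]
IF(MRN = Previous(MRN), Dual('Yes', 1), Dual('No', 0)) as [$(vDx) IsReadmission Flag],
IF(MRN = Previous(MRN), IF(Interval([Admit Dt/Tm]-Previous([Discharge Dt/Tm])) <= 30.0, Dual('Yes', 1), Dual('No', 0)), Dual('No', 0)) as [$(vDx) IsReadmission within 30]
Resident Encounters
Where [Patient Type] = 'Inpatient' and [Core Diagnosis] = '$(vDx)'
Order by MRN, [Admit Dt/Tm];

NEXT vDx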


Voila a Readmissions Dashboard

Not only can we then provide a really nice looking dashboard which includes accurate statistics, we can do it using very simple expressions that are incredibly fast.


Click this link to get the entire Readmissions Code start script: ReadmissionsCodeScript


Have you ever wondered …

Have you ever wondered what events happen to patients after a particular surgery is performed?


Well I did. Like, I seriously can't sleep when I start wondering about things like that. I start believing crazy things, like that we can change the world by using analytics. What do you do when you get crazy analytical questions in your head? Do you just let them go, or do you dig and scratch and claw until you pull the data together and solve the puzzle?

In this case, even though it's just a hypothetical example for a blog post, I still worked crazy hours setting up the data, building the application, filming the video and writing this post. Why? Because I think there is huge value in tracking not just the variances in costs and timing for individual procedures, but in analyzing an entire series of events as well.

Notice I used the word "events" and not just "procedures." Certainly it would be nice to know if having 1 procedure leads to another procedure in 75% of the cases for a physician. But wouldn't it also be nice to know how often a procedure leads to a patient having a Code Blue? Or having to have a tube placed? You know … KEY MEDICAL EVENTS in a patient's stay. Or … even their return after a stay?

Ok, now that we all agree that me working crazy hours to set this up was a valuable exercise, let's examine what I will demonstrate in my video.

  1. I use an Aster NPath SQL-MR query just like in a previous post to process a set of surgical event data that I’ve loaded.
  2. I also take advantage of Qlik's ability to do some cool ETL things on the fly, and I capture the First Event and the Last Event, so that in the UI I can choose which procedure I want to start with. Likewise, in your world, you could select the last event to occur and find the various paths that preceded that event's occurrence.
  3. While I was at it I also load in some sample patient demographic information to demonstrate that the advanced analytics you can do with Teradata Aster doesn’t have to be visualized in a vacuum. Of course you will want to take advantage of the Qlik Associative model and load data from as many sources as needed.
  4. The application consists of two basic screens. The first is a blah-blah-blah "you can filter the data using demographic information and see the results of the NPath query visualized in a Sankey Diagram" screen, just like you would expect. The second screen is more of an "Are you kidding me, I didn't know you could do Alternate States in Qlik Sense like you can in QlikView" kind of thing you would expect from a Qlik Dork. I demonstrate the ability to compare the event paths between different patient sets thanks to the great extensions built by Svetlin Simeonov.

I could have just shared the video, but where is the fun in that? I had to do a little creative setup so that you would understand what you were watching.

Do I think you are going to run right out and start building an application like this to analyze surgical events?

Of course I do. I’m a dreamer. I wouldn’t put this kind of effort into something if I didn’t believe it would spark an interest in at least a few of the readers to really start putting advanced analytics to work. Perhaps not for this specific situation but certainly there is some other big problem you’ve wanted to tackle that is like this. You have all of the pieces you need at your fingertips … so GET GOING!

 
