GNS Healthcare Blog

The Data Analytics Pyramid – Climbing to Optimization & Inference


The availability of data combined with the power of Artificial Intelligence (AI) is causing disruption and raising questions across a number of industries, including healthcare. The electronic medical record has provided digitized health information. Genomic data is now working its way into datasets. There have been impactful innovations in the areas of targeted intervention, drug and device development, software applications, population health approaches, collaboration among healthcare stakeholders, and precision medicine.

While these new developments are exciting, what matters most is still determining which technologies produce real results: improving patient care outcomes and lowering overall healthcare costs. The question remains: where is the value?

Collecting and analyzing data is a good start, but it is not enough to answer that important question. We need to use the ever-growing repository of data as fuel to reverse engineer the underlying system that produced the data in the first place. To truly benefit from the promise of precision medicine, we must leverage this data, provider by provider, prescription by prescription, and procedure by procedure, in a way that allows us to simulate the many millions of possible decisions and their outcomes within that system.


Climbing the Data Analytics Pyramid

So how do we cut through the noise surrounding AI, big data, machine learning, and data analytics to achieve an advanced level of discovery?

One way to gain clarity is by examining the work of Tom Davenport, a distinguished professor of information technology and management at Babson College and Harvard University. Davenport’s 2007 book, Competing on Analytics, describes how companies in a variety of industries, including financial services, retail and gaming, gain a competitive advantage by adopting increasingly sophisticated data analytics and machine learning tools.

The book introduces a pyramid with a series of levels, similar to Maslow’s Hierarchy of Needs, that shows the available tool set and key questions that can be answered at each step.

At the base of the pyramid are the Standard Reports that most organizations are continuously generating. They provide a historical description for a population and answer the question: “What happened?” In a healthcare context, an insurance company could be asking, “What was the cost of diabetes in my patient population last year?”

The next level up is Statistical Analysis, which attempts to derive patterns and trends from the historical data. It answers the question: “Why is this happening?” Staying with the diabetes example, the insurance company might ask, “What percentage of those diabetics were admitted to the hospital?”

Moving up another level of data analytics takes us to Forecasting and Extrapolation. This studies the data further to answer questions like “What if these trends continue? What will be the cost of diabetes in my patient population next year and the year after?”

Next comes Predictive Modeling, which answers the question: “What happens next? Which diabetics in my population are going to get admitted to the hospital, and which are going to develop chronic kidney disease?” Many people think this is the ultimate level of machine learning and AI, but as Davenport points out, that isn’t the case. The question we have to ask ourselves is this: What is the point of predicting the future if we can’t change it? Predicting outcomes in itself is not sufficient.

That’s why we need to rise to the top of the pyramid, mining the data to enable Optimization and Inference. This is where the true discoveries happen. It’s not just about predicting what’s going to happen, but about revealing what will happen under many possible future scenarios and interventions. This ultimate level answers the question: “What will happen if I do this or that?”


Achieving Precision Medicine

At this apex of the pyramid, the enormous computing power that exists today can be combined with sophisticated mathematical algorithms to answer the “What if” questions using causal machine learning (CML), an advanced form of AI. Using CML, healthcare can go beyond correlation to uncover causality. CML empowers users to examine a nearly limitless number of hypotheses to determine what happens with various drug treatments, medical procedures and care plans.
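To make the gap between correlation and causality concrete, here is a minimal sketch in Python. It is not CML and not GNS's method: the causal structure is written by hand, and every probability is invented for illustration, whereas a causal machine learning approach would try to learn that structure from data. The toy model assumes a hypothetical hidden factor (disease severity) that makes patients both more likely to be treated and more likely to be admitted to the hospital.

```python
import random


def simulate(n, rng, force_treatment=None):
    """Return (treated, admitted) pairs from a hand-written toy causal model.

    All probabilities are made up for illustration; none come from real data.
    """
    rows = []
    for _ in range(n):
        severe = rng.random() < 0.5  # hypothetical hidden confounder
        if force_treatment is None:
            # Observational world: sicker patients are treated more often.
            treated = rng.random() < (0.8 if severe else 0.2)
        else:
            # Interventional world: we decide who is treated ("What if I do this?").
            treated = force_treatment
        # In this toy model, treatment lowers admission risk by 10 points.
        p_admit = (0.6 if severe else 0.2) - (0.1 if treated else 0.0)
        admitted = rng.random() < p_admit
        rows.append((treated, admitted))
    return rows


def admission_rate(rows, treated):
    """Share of patients admitted within the treated or untreated group."""
    group = [admitted for t, admitted in rows if t == treated]
    return sum(group) / len(group)


rng = random.Random(0)

# Correlation: compare treated and untreated patients as they were observed.
observed = simulate(100_000, rng)
naive_diff = admission_rate(observed, True) - admission_rate(observed, False)

# Causation: simulate treating everyone versus treating no one.
do_treat = simulate(100_000, rng, force_treatment=True)
do_none = simulate(100_000, rng, force_treatment=False)
causal_diff = admission_rate(do_treat, True) - admission_rate(do_none, False)

print(f"observed difference in admission rate: {naive_diff:+.3f}")
print(f"difference under intervention:         {causal_diff:+.3f}")
```

In the observed data, treated patients are admitted more often, simply because they were sicker to begin with. Under intervention, the same treatment lowers admissions. A purely predictive model trained on the observed data would pick up the first pattern; answering “What if I do this?” requires the second.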

Reaching the top of the pyramid makes it possible to identify the underlying causal mechanisms that initially gave rise to the data. We can then use these data-driven models to perform hypothesis-free experiments in silico. Answering these ultimate “What if I do this?” questions is where real impact can be made, such as determining the right intervention at the right time for the right patient. In other words, precision medicine.
