Why the Data, Not Just the Tools, Needs to Be the Focus for AI in Healthcare
Remember when IBM’s Watson supercomputer won “Jeopardy!”? It wasn’t that long ago, but the euphoria that accompanied the feat, and the disappointment that followed when the technology failed to live up to its commercial aspirations, should loom large in the minds of healthcare and Life Sciences researchers who are now being inundated with pitches about how generative AI will transform healthcare.
There is no doubt that the technology has matured since then, and the game-show parlor tricks have given way to real-world achievements in new-drug discovery, healthcare technology development, and the way Life Sciences teams approach research. However, it is important to put that value in perspective by focusing less on the shiny new AI tools coming to market and more on the underlying data used to develop and train them.
The Data Science of AI
AI and large language models are not new. The underlying thesis behind the technology (ingesting massive volumes of data, identifying patterns in that data, and generating predictive outcomes based on pattern recognition) has been around since the earliest days of data science. What is new is the nearly boundless, widely accessible computing power required to process this information at scale in fractions of a second. But, just as a single miskeyed data field or duplicate entry could derail a data scientist’s research in the ’80s and ’90s, so too can bad data in a generative AI model proliferate quickly, turning today’s promising idea into tomorrow’s headache. A predictive model based on general-interest, publicly available data scraped from the internet is prone to incorrect assumptions about specialized questions, because it is not being informed by the right dataset.
Healthcare Presents Unique Data Challenges
This is what makes healthcare insights such a challenge for generative AI. The underlying information, typically from medical claims, electronic health records (EHR), lab results, and other structured and unstructured data sources, is notoriously challenging to work with. At Komodo, we’ve had a front-row seat to this challenge, as we’ve spent the last nine years building a Healthcare Map™ that tracks the complete healthcare journeys of over 330 million patients across the U.S.
The Healthcare Map is the most comprehensive and detailed source of patient journey data in the industry, but getting to that point has not been easy. In fact, over the course of our work building the Healthcare Map, we’ve encountered all manner of challenges with data quality. Whether it’s a structural break in the time series being used, patient mastering and tokenization issues whereby a single patient is represented by multiple tokens or multiple patients are captured in a single token, or even inconsistencies in how data is keyed in by providers, there are countless ways in which flawed data can create wildly inaccurate results.
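To make the tokenization problem concrete, here is a minimal Python sketch of how both failure modes might be flagged. It is an illustration under stated assumptions, not a description of how the Healthcare Map is built: the function name `find_token_conflicts`, the sample records, and the crude "fingerprint" of birth date, sex, and ZIP code are all hypothetical, and real patient mastering typically relies on far more robust, often probabilistic, matching.

```python
from collections import defaultdict


def find_token_conflicts(records):
    """Flag the two tokenization failure modes described above.

    `records` is an iterable of (token, fingerprint) pairs. The fingerprint
    is a hypothetical stand-in for whatever attributes identify a person.
    Returns (split, merged):
      split  - fingerprints that map to more than one token
      merged - tokens that map to more than one fingerprint
    """
    tokens_by_patient = defaultdict(set)
    patients_by_token = defaultdict(set)
    for token, fingerprint in records:
        tokens_by_patient[fingerprint].add(token)
        patients_by_token[token].add(fingerprint)

    split = {p: sorted(t) for p, t in tokens_by_patient.items() if len(t) > 1}
    merged = {t: sorted(p) for t, p in patients_by_token.items() if len(p) > 1}
    return split, merged


# Made-up sample data: (token, (birth date, sex, ZIP code))
records = [
    ("tok_A", ("1980-02-11", "F", "94107")),
    ("tok_B", ("1980-02-11", "F", "94107")),  # same person, second token
    ("tok_C", ("1975-07-30", "M", "10001")),
    ("tok_C", ("1992-12-05", "F", "60614")),  # different person, same token
    ("tok_D", ("2001-03-19", "M", "73301")),
]

split, merged = find_token_conflicts(records)
print("One patient, many tokens:", split)
print("One token, many patients:", merged)
```

In this toy data, the first patient is split across `tok_A` and `tok_B`, while `tok_C` merges two different people. Either error would distort any patient-level count or longitudinal analysis built on top of the tokens.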
Weigh the Source
Addressing those inaccuracies and rooting out the kinds of duplicates, missing encounters, fragmented patient journeys, and anomalies that stymie most AI-powered research tools are at the core of research-grade data science. Our team is dedicated to remedying these issues and creating a more rigorous data discipline. With solutions like MapLab™, which joins our Healthcare Map with fully integrated insight-workflow solutions, powerful analytics, and highly customizable tools, we’re making it possible for Life Sciences teams to unlock the power of AI-driven research rooted in a robust, specialized data platform.
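As a rough illustration of what rooting out duplicates and fragmented journeys can involve, the Python sketch below flags exact duplicate claim rows and unusually long gaps in a patient's timeline. Everything in it is an assumption made for the example: the tuple layout, the function names `find_duplicates` and `find_gaps`, the sample claims, and the 365-day threshold. A long gap is only a candidate for review, since a patient may simply not have sought care.

```python
from collections import Counter
from datetime import date


def find_duplicates(claims):
    """Return claim rows that appear more than once, in sorted order."""
    counts = Counter(claims)
    return sorted(claim for claim, n in counts.items() if n > 1)


def find_gaps(claims, max_gap_days=365):
    """Return (patient, earlier, later) for consecutive service dates
    that are more than `max_gap_days` apart (hypothetical threshold)."""
    dates_by_patient = {}
    for patient, service_date, _code in claims:
        dates_by_patient.setdefault(patient, set()).add(service_date)

    gaps = []
    for patient, dates in sorted(dates_by_patient.items()):
        ordered = sorted(dates)
        for earlier, later in zip(ordered, ordered[1:]):
            if (later - earlier).days > max_gap_days:
                gaps.append((patient, earlier, later))
    return gaps


# Made-up sample data: (patient id, service date, diagnosis code)
claims = [
    ("p1", date(2021, 1, 5), "E11.9"),
    ("p1", date(2021, 1, 5), "E11.9"),  # exact duplicate row
    ("p1", date(2021, 6, 1), "E11.9"),
    ("p2", date(2020, 3, 1), "I10"),
    ("p2", date(2022, 9, 15), "I10"),   # more than a year after the prior visit
]

duplicates = find_duplicates(claims)
gaps = find_gaps(claims)
print("Duplicate rows:", duplicates)
print("Possible missing encounters:", gaps)
```

Checks like these are deliberately simple. Their value is in running before any model sees the data, so that a duplicated claim or a broken timeline is caught as a data issue rather than learned as a pattern.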
The real focus of the generative AI revolution in healthcare needs to be on the provenance of the data being used to train these models and the expertise of those doing the training. The details matter — nuances are critical when it comes to both the inputs and the outputs of generative AI models. As we navigate this evolving landscape, it is imperative that we prioritize building a foundation of high-fidelity, high-quality data to fully realize AI’s potential in healthcare and reduce the burden of disease.
For more information about how Komodo is using AI to power healthcare, check out our latest piece with COTA CEO Miruna Sasu, “From Data to Discovery: How AI Is Accelerating Progress in Oncology.”