How we helped a clinical terminology company map medical descriptions to a consistent vocabulary faster using machine learning.
Something consultants often experience as they go from client to client is that each company, department, or even team has its own jargon or dialect. Every organization has different assumptions, acronyms, interpretations, and shorthand based partly on the work its members do, partly on “someone said it that way once, and it stuck,” and partly on “this is just the way I say it.”
Healthcare teams are no exception to this, but they have unique concerns about continuity of care, patient safety, tracking diseases, learning the most effective treatments, regulatory compliance and even getting reimbursed for care provided. These concerns in healthcare make it critical to reach a common understanding across teams despite the difference in dialects.
Our client, a clinical terminology company, is in the business of helping create that shared understanding. They help translate the dialects of more than 4,500 hospitals and 500,000 physicians into a consistent, common language. It’s a daunting task. Take medical procedure descriptions, for example — our client maintains a growing list of over 350,000 distinct medical procedures, often dealing with nuances as minute as the difference between “physical therapy for 15 minutes” and “physical therapy up to 15 minutes.”
Unsurprisingly, it took highly trained, hard-to-find experts to map millions of procedure descriptions into this shared vocabulary. The scarcity of these experts was a rapidly growing concern for our client. For medical procedures alone, it saw over 1,000 new descriptions each month. To keep up with this growth, our client turned to natural language processing (NLP) and machine learning (ML) to make the task less labor-intensive and began to build a data science team to take this on.
Enter Centric: Using Machine Learning to Map New Descriptions
While searching for a second data scientist, our client engaged Centric Consulting for three months to embed with their team, advise on machine learning and data science, and accelerate results. The combined team focused on increasing the capacity of the experts responsible for mapping new medical procedure descriptions to the common vocabulary.
The process these experts followed had two layers that every new description would go through:
- Experts research each description and propose a mapping to one of the 350,000+ standard procedure descriptions.
- Senior experts review each proposed mapping to approve or correct it.
A tool was already available to the first layer of experts to help identify potential mappings. It gave them a list of 10 potential mappings, but the correct mapping appeared in the list only 50% of the time. Our goal for improving capacity was to develop a better recommendation tool that also reported a level of “confidence” for each recommendation, so that the team could:
- Pass through very high confidence recommendations without any expert review.
- Skip the first level of expert research and go straight to the final review for high confidence recommendations.
- Present a list of strong recommendations to the first level of research for the remaining new descriptions.
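The three-way routing described above can be sketched in a few lines of Python. This is a minimal illustration, not the client's implementation: the threshold values, the `Recommendation` class, and the tier names are all hypothetical, since the article does not state the actual cut-offs.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical thresholds: the actual cut-offs are not given in the article.
AUTO_APPROVE_THRESHOLD = 0.98
FINAL_REVIEW_THRESHOLD = 0.85


@dataclass
class Recommendation:
    procedure_id: str   # identifier of a standard procedure description
    confidence: float   # model confidence in the range [0, 1]


def route(recommendations: List[Recommendation]) -> str:
    """Pick the review tier for one new description.

    `recommendations` is expected to be sorted by descending confidence.
    """
    if not recommendations:
        # Nothing to suggest: the description needs full expert research.
        return "expert_research"
    top_confidence = recommendations[0].confidence
    if top_confidence >= AUTO_APPROVE_THRESHOLD:
        return "auto_approve"      # pass through without expert review
    if top_confidence >= FINAL_REVIEW_THRESHOLD:
        return "final_review"      # skip first-level research
    return "expert_research"       # show the list to first-level experts
```

In practice, thresholds like these would be tuned on mappings the senior experts have already approved, trading off the share of descriptions automated against the error rate the business can accept.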
Our client’s first data scientist had developed a potential replacement for this tool using machine learning. This approach relied on BERT, an open-source deep learning language model developed by Google, combined with a K-Nearest Neighbors (KNN) model to identify sentences with similar meanings. It was much better than the original tool, placing the correct term in the top-10 list 72% of the time. Centric’s Senior Data Scientist recommended an ensemble approach and developed two additional recommendation algorithms.
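To make the embed-then-search idea concrete, here is a minimal sketch of a nearest-neighbor recommender and the "correct term in the top-k list" metric. It makes two loud assumptions: a toy character-trigram vector stands in for the BERT sentence embedding, and a brute-force cosine search stands in for the KNN model. The vocabulary entries and function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, float]:
    """Toy stand-in for a BERT embedding: L2-normalised character trigram counts."""
    padded = f"  {text.lower()}  "
    counts = Counter(padded[i:i + 3] for i in range(len(padded) - 2))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {gram: c / norm for gram, c in counts.items()}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two already-normalised sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(weight * b.get(gram, 0.0) for gram, weight in a.items())


def top_k(query: str, vocabulary: Dict[str, str], k: int = 10) -> List[Tuple[str, float]]:
    """Return the k standard procedures most similar to `query`, best first."""
    q = embed(query)
    scored = [(pid, cosine(q, embed(desc))) for pid, desc in vocabulary.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


def hit_rate_at_k(labelled: List[Tuple[str, str]], vocabulary: Dict[str, str], k: int = 10) -> float:
    """Fraction of (description, correct_id) pairs whose correct id is in the top k."""
    hits = sum(
        any(pid == correct for pid, _ in top_k(text, vocabulary, k))
        for text, correct in labelled
    )
    return hits / len(labelled)


# Hypothetical vocabulary entries, for illustration only.
VOCABULARY = {
    "P001": "physical therapy for 15 minutes",
    "P002": "chest x-ray, two views",
    "P003": "knee arthroscopy",
}
```

Calling `top_k("phys therapy 15 min", VOCABULARY, k=2)` returns the two closest standard descriptions with their similarity scores. With real sentence embeddings and a vocabulary of 350,000+ entries, an indexed nearest-neighbor structure would likely replace the brute-force loop.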
The Results: Higher Success Rates of Correct Mapping
When we combined the results of the three algorithms in an ensemble, we were able to achieve dramatic results:
- The client could map 10% of new descriptions to a single recommendation with enough accuracy to pass through without expert review.
- The client could map 28% of new descriptions to a top-3 recommendation list with enough accuracy to be sent directly to the final review.
- For the remaining 62% of new descriptions, the client could present a top-10 recommendation list in which the correct procedure appeared 74% of the time.
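The article does not say how the three algorithms' outputs were combined. As one plausible illustration of an ensemble over ranked lists, the sketch below uses reciprocal rank fusion, a common rank-combination technique that is assumed here and not confirmed as the method Centric used. The function name, the constant `k = 60`, and the procedure identifiers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60, top_n: int = 10) -> List[str]:
    """Merge several ranked lists of procedure ids into one list.

    Each list contributes 1 / (k + rank) to the score of every id it
    contains, so ids ranked highly by several algorithms rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, procedure_id in enumerate(ranking, start=1):
            scores[procedure_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id for a deterministic order.
    ordered = sorted(scores, key=lambda pid: (-scores[pid], pid))
    return ordered[:top_n]
```

For example, fusing `["A", "B", "C"]`, `["B", "A", "D"]`, and `["B", "C", "A"]` puts `"B"` first, because two of the three lists rank it at the top. Agreement between algorithms is also a natural signal to feed into a confidence score.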
As an embedded member of our client’s team, Centric helped deliver several other technical items:
- Used Amazon Web Services (AWS) to implement a scalable, message-driven service to produce BERT and KNN models
- Wrote a custom extension to the scikit-learn Python package to add functionality to K-Nearest Neighbors models
- Implemented a procedure to train a custom BERT model from scratch using AWS EC2 spot instances to minimize cost
- Implemented a Python program to convert TensorFlow 2.0 models to PyTorch models
- Proposed an architecture for a fully automated refresh of the BERT-driven KNN models
By the end of the engagement, Centric had improved the consistency of the recommended medical description mappings by delivering multiple improvements to the current tool. These improvements gave our client more confidence in the results while also increasing the number of mappings completed.