Medical Terminology Machine Learning Client Story

Clinical Terminology Company Uses Machine Learning to Improve Efficiency

Enter Centric: Using Machine Learning to Map New Descriptions

While searching for a second data scientist, our client engaged Centric Consulting for three months of machine learning and data science support. Centric embedded with the client’s team to advise them and accelerate results. The combined team focused on increasing the capacity of the experts responsible for mapping new descriptions for medical procedures to the current common vocabulary.

These experts followed a two-layer process for every new description:

  1. Experts research each description and propose a mapping to one of the 350,000+ standard procedure descriptions.
  2. Senior experts review each proposed mapping to approve or correct it.

A tool was already in place to help the first layer of experts identify potential mappings. It returned a list of 10 potential mappings, but the correct mapping appeared in that list only 50% of the time. To improve capacity, we set out to develop a better recommendation tool that would also report a level of “confidence” for each recommendation, so that the client could:

  1. Pass through very high confidence recommendations without any expert review.
  2. Skip the first level of expert research and go straight to the final review for high confidence recommendations.
  3. Present a list of strong recommendations to the first level of research for the remaining new descriptions.
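The three confidence tiers above can be sketched as a simple routing function. This is only an illustration: the threshold values, the `Route` names, and the assumption that confidence is a single score between 0 and 1 are hypothetical and are not taken from the client’s system.

```python
from enum import Enum


class Route(Enum):
    AUTO_APPROVE = "pass through without expert review"
    FINAL_REVIEW = "skip first-level research, go to final review"
    RESEARCH = "send to first-level research with a recommendation list"


# Hypothetical cutoffs; in practice these would be tuned against
# mappings that senior experts have already approved.
AUTO_APPROVE_MIN = 0.98
FINAL_REVIEW_MIN = 0.85


def route(confidence: float) -> Route:
    """Choose a review path from a confidence score in [0, 1]."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= AUTO_APPROVE_MIN:
        return Route.AUTO_APPROVE
    if confidence >= FINAL_REVIEW_MIN:
        return Route.FINAL_REVIEW
    return Route.RESEARCH
```

Raising the two cutoffs sends more descriptions to human experts and lowers the risk of an incorrect mapping passing through unreviewed, so the values trade capacity against accuracy.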

Our client’s first data scientist had already developed a potential replacement for this tool using machine learning. The approach combined BERT, an open-source deep learning language model developed by Google, with a K-Nearest Neighbors (KNN) model to help identify sentences with similar meanings. It performed much better than the original tool, placing the correct term in the top-10 list 72% of the time. Centric’s Senior Data Scientist recommended an ensemble approach and developed two additional recommendation algorithms.
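The general idea of nearest-neighbor retrieval, and of the "correct term in the top-10 list" metric, can be sketched in a few lines. This is not the client’s implementation: a toy word-count vector stands in for a BERT sentence embedding, a brute-force cosine-similarity search stands in for the KNN model, and the function names and sample procedure descriptions are invented for illustration.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, standard_terms: list[str], k: int = 10) -> list[str]:
    """Return the k standard terms most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(standard_terms, key=lambda t: cosine(q, embed(t)), reverse=True)
    return ranked[:k]


def hit_rate(recommendations: list[list[str]], correct: list[str]) -> float:
    """Fraction of cases whose correct term appears in its recommendation list."""
    hits = sum(truth in recs for recs, truth in zip(recommendations, correct))
    return hits / len(correct)


# Invented example vocabulary and query.
terms = ["x-ray of chest", "mri of left knee", "ct scan of abdomen"]
best = top_k("chest x-ray two views", terms, k=1)
```

A real embedding model would also match descriptions that share meaning but not wording, which is the main reason to use BERT rather than word counts. With 350,000+ standard descriptions, a production system would also typically use an indexed neighbor search rather than comparing against every term.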

The Results: Higher Success Rates of Correct Mapping

When we combined the results of the three algorithms in an ensemble, we achieved the following results:

  1. The client could map 10% of new descriptions to a single recommendation with enough accuracy to pass through without expert review.
  2. The client could map 28% of new descriptions to a top-3 recommendation list with enough accuracy to be sent directly to the final review.
  3. The client could map the remaining 62% of new descriptions to a top-10 recommendation list where the correct procedure was in the list 74% of the time.
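The story does not describe how the three algorithms’ outputs were combined. One common way to merge several ranked recommendation lists is reciprocal rank fusion, sketched below as an assumption rather than as the method Centric used. The function name, the `damping` constant, and the placeholder term labels are all hypothetical.

```python
from collections import defaultdict


def fuse_rankings(
    rankings: list[list[str]], k: int = 10, damping: int = 60
) -> list[tuple[str, float]]:
    """Merge ranked lists with reciprocal rank fusion.

    Each list contributes 1 / (damping + rank) to every term it contains,
    so terms ranked highly by several algorithms rise to the top.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, term in enumerate(ranking, start=1):
            scores[term] += 1.0 / (damping + rank)
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ordered[:k]


# Placeholder outputs from three hypothetical recommenders.
fused = fuse_rankings(
    [["A", "B", "C"], ["B", "A", "D"], ["B", "C", "A"]],
    k=3,
)
fused_terms = [term for term, _ in fused]
```

Agreement between algorithms is also a natural input to a confidence score: a term ranked first by all three recommenders is a stronger candidate for skipping review than one that appears in only a single list.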

As an embedded member of our client’s team, Centric helped deliver several other technical items:

  • Used Amazon Web Services (AWS) to implement a scalable, message-driven service to produce BERT and KNN models
  • Wrote a custom extension to the scikit-learn Python package to add functionality to K-Nearest Neighbors models
  • Implemented a procedure to train a custom BERT model from scratch using AWS EC2 Spot Instances to minimize cost
  • Implemented a Python program to convert TensorFlow 2.0 models to PyTorch models
  • Proposed an architecture for a fully automated refresh of the BERT-driven KNN models

By the end of the engagement, Centric had improved the consistency of the recommended medical description mappings by delivering multiple improvements to the current tool. These improvements gave our client more confidence in the results while also increasing the number of mappings completed.