
Using AI to Capture Unstructured Medical Records Data
At a Glance
We used hyperautomation (the combination of AI, RPA, and machine learning) to collect and process high volumes of unstructured data with high accuracy and at much lower cost.
As a pharmaceutical company specializing in infusion, specialty pharmacy and telepharmacy, our client, CarepathRx, uses technology to transform how health systems, hospitals, and pharmacies take care of their patients. CarepathRx partners with these providers to deliver the most comprehensive and seamless pharmacy care in the industry.
Like all industries, pharmaceuticals face dramatic changes — and opportunities — because of artificial intelligence. AI accelerates the development of groundbreaking new medications and approaches, such as gene therapy.
In addition, AI will help speed the identification of human trial participants. Both factors mean that pharmaceutical companies will need to find new ways to speed up their operations so they can manage the influx of new medications and get patients the drugs they need as quickly as possible.
One barrier to improving operations has been collecting data from medical documents, such as hospital discharge notes. These documents are vital for accurate pharmacy care, but they come in various formats and contain complex information. Manually processing them is time-consuming and error-prone, leading to inefficiencies in patient care and increased operational costs.
To continue providing the best pharmaceutical care while improving accuracy, reducing costs, and preparing for the future, CarepathRx needed a modern way to capture data from its medical records.
Processing Unstructured Data With AI
The ‘70s called, and they want their faxes back… In all seriousness, the process of connecting people with healthcare providers, such as pharmacies, is still driven by large amounts of unstructured data. Automation opportunities presented with AI/ML solutions can bridge the gap until we get to a more integrated landscape.
Enter Centric: From Unstructured Data to Automated Processing
To kick off the project, CarepathRx and Centric Consulting’s Operational Excellence team began by exploring robotic process automation (RPA) to solve the company’s operational challenges. As we reviewed the company’s business case, we determined that combining RPA with emerging large language model (LLM) technology would provide a more robust solution that would meet our client’s near- and long-term goals.
The combination of RPA and AI, known as hyperautomation, joins RPA’s ability to automate simple, straightforward tasks with AI’s power to add intelligence to the equation. While RPA can automate the intake and routing of documents, AI can capture the data and transform it into a usable form.
The high volume of discharge notes that come into a pharmaceutical provider like CarepathRx qualifies them as big data — a large amount of always-growing information that is an important part of decision-making processes and efficiency improvements. When data scientists classify big data, they generally place it into three buckets:
- Structured: Easily accessible data, especially when well-maintained and stored in a database.
- Unstructured: Data that is hard to access because it is unorganized. For example, it may appear in different places on forms, be handwritten, or otherwise be difficult for a computer to capture and understand.
- Semi-structured: Like unstructured data, semi-structured data does not come in a neat, tabular format, but it may contain metadata that provides instructions about what it is.
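To make the three buckets concrete, here is a small Python illustration of the same kind of fact (a medication and its dose) in each form. The patient ID, field names, and note text are invented for the example and are not taken from CarepathRx data.

```python
import csv
import io
import json

# Structured: fixed columns, ready to load into a database table.
structured_csv = "patient_id,medication,dose_mg\n1001,vancomycin,500\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: no fixed table, but the keys act as metadata
# that tells a program what each value means.
semi_structured = json.loads(
    '{"patient_id": "1001", "meds": [{"name": "vancomycin", "dose": "500 mg"}]}'
)

# Unstructured: free text. Nothing tells a program where the dose is.
unstructured = (
    "Pt discharged in stable condition. "
    "Continue vancomycin 500 mg IV q12h per ID recs."
)

print(rows[0]["dose_mg"])                  # direct column lookup
print(semi_structured["meds"][0]["dose"])  # lookup by key path
print("dose_mg" in unstructured)           # no such label exists in free text
```

The first two lookups work because the data carries its own labels. The third shows why free text needs a different approach: the information is present, but there is no column or key to ask for.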
Hospital discharge notes fall into the category of unstructured data. To address the challenge of capturing this data from CarepathRx’s notes, we first built an RPA solution that captures the data from the PDF documents using optical character recognition tools.
Next, we used Python and the large language model (LLM) capabilities of Amazon Bedrock (Claude 2) to translate the optical character recognition (OCR) output into structured data that computers can work with, regardless of the documents’ varying formats, the location of information within forms, and highly detailed content.
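The general shape of this step can be sketched in Python. This is an illustration under stated assumptions, not the code used in the engagement: the field list, the prompt wording, and the `call_model` parameter are all hypothetical. `call_model` stands for any function that sends a prompt to an LLM and returns its text reply, so the sketch does not depend on a particular client library.

```python
import json
from typing import Callable

# Hypothetical field list; the real solution's fields are not documented here.
FIELDS = ["patient_name", "discharge_date", "medications"]


def build_prompt(ocr_text: str) -> str:
    """Ask for the fields by meaning, not by position on the form."""
    return (
        "Extract the following fields from the hospital discharge note below "
        f"and reply with a single JSON object with the keys {FIELDS}. "
        "Use null for any field that is not present.\n\n"
        f"<note>\n{ocr_text}\n</note>"
    )


def extract_fields(ocr_text: str, call_model: Callable[[str], str]) -> dict:
    """Send OCR text to a model and return a dict with a stable set of keys."""
    reply = call_model(build_prompt(ocr_text))
    # Models sometimes wrap JSON in extra prose, so keep only the outermost object.
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("model reply did not contain a JSON object")
    data = json.loads(reply[start:end + 1])
    # Keep only the expected keys so downstream code sees a consistent shape.
    return {field: data.get(field) for field in FIELDS}


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call, returning made-up sample data."""
    return (
        'Here is the result: {"patient_name": "Jane Doe", '
        '"discharge_date": "2024-01-05", "medications": ["vancomycin"]}'
    )


record = extract_fields("...OCR text of a discharge note...", fake_model)
print(record["medications"])
```

Passing the model in as a function keeps the parsing and validation logic testable without network access. In practice, `call_model` would wrap whatever LLM service is in use.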
Another of our solution’s standout features is its ability to segment text into manageable portions. As anyone who has tried to use LLMs for very large projects knows, these tools are limited in how much information they can process in one request. Text segmentation overcomes such LLM “context window” constraints, ensuring comprehensive and precise processing of the entire discharge note. It is a significant step forward in automating and refining the extraction of essential healthcare data.
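One simple way to segment text is to pack whole paragraphs into segments that stay under a size budget. The sketch below is a minimal illustration of that idea and may differ from the approach used in the engagement. It assumes a character budget for simplicity, whereas a production system would more likely count tokens. The function name `segment_text` and the `max_chars` parameter are hypothetical.

```python
import re


def segment_text(text: str, max_chars: int = 2000) -> list[str]:
    """Split text into segments of at most max_chars characters,
    keeping paragraphs together where they fit."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    # Treat blank lines as paragraph boundaries.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

    segments: list[str] = []
    current = ""
    for para in paragraphs:
        # Hard-split any single paragraph that is itself over budget.
        pieces = [para[i:i + max_chars] for i in range(0, len(para), max_chars)]
        for piece in pieces:
            candidate = f"{current}\n\n{piece}" if current else piece
            if len(candidate) <= max_chars:
                current = candidate       # piece still fits in this segment
            else:
                segments.append(current)  # close the full segment
                current = piece           # start a new one
    if current:
        segments.append(current)
    return segments


note = "A" * 50 + "\n\n" + "B" * 50 + "\n\n" + "C" * 120
segments = segment_text(note, max_chars=100)
print([len(s) for s in segments])
```

Each segment can then be sent to the model in its own request and the per-segment results combined afterward. Breaking on paragraph boundaries, rather than at arbitrary offsets, reduces the chance of cutting a medication or instruction in half.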
Our solution promises to revolutionize the way CarepathRx manages and uses essential patient information.
The Results: Near 100 Percent Accuracy With No Added Labor Costs
Currently, we are talking with CarepathRx about potential next steps to bring the solution to market, but the early results are impressive. The technology helped CarepathRx streamline hospital discharge note extraction and improve operational efficiency with near 100 percent accuracy, and our economical solution supports growth without adding labor costs.
Conclusion
Businesses have long recognized traditional RPA’s ability to speed the performance of repetitive tasks, but the addition of LLM technology can expand RPA’s applications into use cases that have more variability, such as hospital discharge notes.
By understanding the types of information that these notes contain, rather than simply the specific areas of the form that contain that information, hyperautomation solutions like ours can ingest and categorize data no matter where it appears. While our engagement only documents this capability in one area, the potential for other applications is very exciting.