Raw data in insurance needs processing, cleaning, and polishing. Only then can it be packaged. This is what we call data enrichment.
Effectively mining the data you already have can be a low-cost effort with a high-value return. We’ve all heard of the Titanic disaster of 1912. Many of us know about “Unsinkable” Molly Brown. Her actions saved lives on that cold April morning, and untold thousands since.
Fewer of us know about her husband, James Joseph Brown. He worked for the IBEX Mining Company. In 1893, the repeal of the Sherman Silver Purchase Act sent silver prices into free fall. The company needed a new strategy. Enter J.J. Brown’s ingenuity. He developed a new method to hold back loose sand. With it, the company was able to dig past the silver they had found in the “Little Jonny Mine” and reach an enormous vein of gold. J.J. Brown found a way to better mine the land he already owned.
Today, the insurance industry has mountains of data. Too many insurance companies lack the process to mine this mountain of data to its full potential. They have found a way to get the more accessible silver, but the gold buried under a mountain of loose sand remains elusive.
Just as in precious metal mining, data mining cannot effectively use the raw ore. The raw ore must be processed: the impurities are removed, and the valuable parts are then polished, packaged, and put on the market. Data mining is the same. Raw data needs processing, cleaning, and polishing, which means conforming it into recognizable and inter-relatable structures. Only then can it be packaged in a report and put on the “market.” We know this process as enrichment.
What is Enriched Data?
Many insurance companies have more than one source system. Over the years, they acquired other companies that use a different system for policy, claims, or billing. Many systems that have run their company for a decade or longer do not keep up with the insurance company’s pace of change.
New products, new processes, and poor performance under an ever-increasing load forced companies to migrate to new platforms. Even within the same systems, errors are identified and fixed, but the existing data still reflects some of these errors. The data is not the same across all systems, and within each system, there remain data anomalies.
How can you have a comprehensive report if the underlying data is in different structures with disparate codes and questionable quality? You must enrich the data.
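As a minimal sketch of what conforming disparate codes might look like, assume two hypothetical source systems that label the same line of business differently. The system names, codes, and field names below are illustrative assumptions, not taken from any real platform.

```python
# Hypothetical mappings from each source system's line-of-business code
# to one conformed code. System names and codes are illustrative only.
CODE_MAP = {
    "legacy_policy": {"PA": "PERSONAL_AUTO", "HO": "HOMEOWNERS"},
    "acquired_claims": {"01": "PERSONAL_AUTO", "02": "HOMEOWNERS"},
}


def conform(record):
    """Return a copy of the record with a conformed line-of-business code.

    Unmapped codes are flagged rather than guessed, so that a data
    governance team can review them.
    """
    system_map = CODE_MAP.get(record["source"], {})
    lob = system_map.get(record["lob_code"])
    enriched = dict(record)
    enriched["lob"] = lob if lob is not None else "UNMAPPED"
    enriched["needs_review"] = lob is None
    return enriched


# Made-up records from the two hypothetical systems.
records = [
    {"source": "legacy_policy", "lob_code": "PA", "premium": 1200.0},
    {"source": "acquired_claims", "lob_code": "01", "premium": 950.0},
    {"source": "acquired_claims", "lob_code": "99", "premium": 400.0},
]
enriched = [conform(r) for r in records]
```

The first two records end up with the same conformed code even though their source codes differ, and the third is set aside for review instead of silently polluting a report.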
The Value of Enrichment
There are many ways to discuss value. Let’s talk about the cost of not correctly enriching the data. An insurance company looked into agency performance. They set up criteria to measure which agencies brought in the most profitable business and which agencies cost the company money. They identified several agencies writing very few P&C policies, and in general, the loss ratios were very high.
They decided to put these agencies on a development path. There were some rather harsh adjustments recommended. Seems like a good use of data, right? Not so fast. Let’s close this loop. The company failed to realize that its data warehouse did not include its L&A book of business. Several of these “poorly” performing agencies were not actively marketing P&C insurance. They only sold P&C insurance to existing L&A clients as a service to keep the more lucrative L&A business. One of these “poorly” performing agencies was the largest, most profitable L&A agency the company had.
The lack of proper enrichment led this company to conclusions that were incorrect and put a very valuable relationship at risk.
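To make the point concrete, here is a small sketch with made-up numbers. The agency names and figures are hypothetical, and "losses divided by premium" is a deliberately simplified measure applied to both books for illustration only.

```python
from collections import defaultdict

# Hypothetical figures: (agency, book, earned premium, incurred losses)
rows = [
    ("Agency A", "P&C", 100_000.0, 90_000.0),
    ("Agency A", "L&A", 2_000_000.0, 600_000.0),
    ("Agency B", "P&C", 1_500_000.0, 900_000.0),
]


def loss_ratios(rows, books):
    """Losses divided by premium per agency, using only the given books."""
    premium = defaultdict(float)
    losses = defaultdict(float)
    for agency, book, prem, loss in rows:
        if book in books:
            premium[agency] += prem
            losses[agency] += loss
    return {agency: losses[agency] / premium[agency] for agency in premium}


# View 1: the warehouse only holds the P&C book.
pc_only = loss_ratios(rows, {"P&C"})

# View 2: the L&A book is included as well.
combined = loss_ratios(rows, {"P&C", "L&A"})
```

In the P&C-only view, the hypothetical Agency A looks like the worst performer. Once the L&A book is included, it looks far healthier than its P&C numbers alone suggest.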
The Cost of Enrichment
Enriched data is good. Enriching data is hard. In this blog series, we have discussed some of the risks and challenges. Data governance drives the enrichment process. Data governance is a load for the business side and the technology side. For every source system, every object, every attribute, and every measure there are multiple steps and many people working together to ensure accuracy and comprehension. This process is not cheap. Nor is it fast.
To ensure proper review and thought, data cannot become enriched overnight. Companies without a careful process might find their properly enriched data co-mingled with improperly enriched data, causing a complete loss of trust in the entire data set.
The Future is Hard to Predict
Where will tomorrow’s challenges come from? Unfortunately, there is no crystal ball. A downturn in the economy may turn your primary three-year thrust from expanding into new ventures toward lowering costs or tightening cash flow. While both are great goals, only one is the primary driver.
Because the future is hard to predict, data governance, which drives data enrichment and in turn formal reporting, must remain flexible. Also, the path to finding new ways to glean information from raw data is not a straight road. An insurance executive was talking the other day about how the idea that “credit scores could predict claim costs” was not universally accepted when first introduced. Many insurance executives, at that time, did not understand how a credit score relates to whether an incident would happen or how severe it would be.
But now we know that credit score is a very useful predictor. How many other ideas didn’t pan out? How many other ideas were not pursued because there was no supporting data? The future is unpredictable and undoubtedly full of unexpected twists and turns.
Find Balance with a Structured Data Lake
Let’s embrace the coming uncertainty. Let’s prepare for the inevitable change. There is a simple and easy solution. Let’s collect and co-locate all the data we can. We know this collection as a data lake. Much like a city builds a reservoir to hold water for consumption by its citizens at a future time, we can build a reservoir to hold data.
Some of this data we immediately pipe into our data warehouse and onto operational reports and analytics dashboards. The rest of the data stays in the reservoir until we find a proper use. Some analyst, someday, is going to develop a big, new idea. That future analyst needs data to drive conclusions. If we haven’t collected the data along the way, that analyst has to spend weeks, if not months, collecting the data. Much of the data is likely lost over time.
So, now that analyst requires years to collect enough history to test her hypothesis. With a full reservoir of historical data collected in our Data Lake, that analyst can write her algorithms, test her hypothesis, and if correct, bring the next big idea forward in a small fraction of the time.
An analyst finds the next big idea because we previously thought to build a Data Lake. This new idea receives validity using a combination of cleansed, enriched data and some new raw data that nobody thought was of much value.
Let’s close this loop. Let’s get these new data feeds, objects, or attributes over to our data governance team. Let’s figure out how much cleaning and polishing we need to do to promote this to our data warehouse and get this new idea democratized to everyone in our enterprise. Let’s allow necessity to help us figure out the right elements on which to spend our precious data governance and enrichment budget.
J.J. Brown used hay to help hold back the loose sand and found one of the largest gold strikes in Colorado history. Let’s use a co-located comprehensive Data-Lake strategy to prepare us to find our next big data treasure.