Dealing with insurance data? Apply these four concepts to get a single version of the facts, a single interpretation into truth, and confidence in your conclusions.
A common story. A tragic story. An avoidable story. An insurance company has recently marked retention as a Key Performance Indicator (KPI) for a new three-year thrust.
A C-level executive calls a kick-off meeting, and the agenda is set to discuss each department’s contribution. Department heads vigorously prepare, but the enthusiasm is quickly dampened. They discover that each of the ten department heads has a separate version of the current retention rate. There is no discussion about improving retention, and the meeting devolves into an argument over which number reflects the “truth.”
If you have worked around data warehouses for any length of time, I’m sure you have similar stories. It turns out that there is a simple, but not easy, solution. Stated simply, “Provide everyone with a single version of truth.” A simple goal, but even the definition is hard.
What is “truth”?
Facts are Not Truth, and Truth is Not Only Facts
A fact is a piece of information. That piece of information can be correct, or not. That piece of information can be shared by all, or not. The systems that load a data warehouse will usually have steps in place to conform and clean facts. But there are questions we should ask:
- Are these steps complete and correct?
- Were they designed by someone who had a holistic view of the fact?
- Does everyone have the same fact?
Truth is a little harder to define. Philosophy may try to convince us that there is no truth. I like to define truth by looking at an ancient Chinese slogan “shí shì qiú shì” meaning “Seek truth from facts.” A dictionary describes truth as a statement or principle that is generally considered to be true.
In a data warehouse, truth is an agreed-on interpretation of facts. A universally accepted truth must be built on accepted facts and an accepted formula or interpretation. We should ask the same questions of truth that we ask about facts. Get a single version of facts with a single interpretation into truth and have confidence in conclusions. Simple, right?
So, why do so many companies struggle to achieve this simple goal? Answers to these five questions usually provide clues:
- What facts are stored?
- What truths are available?
- How are the facts and truths stored?
- Where are the facts and truths stored?
- How are the facts and truths shared?
What is the solution? This is by no means an exhaustive list, but let’s look at four concepts today:
1. Data Governance of the Business, by the Business, for the Business
The only way to ensure that the appropriate facts and truths are complete, correct and relevant is through Data Governance. Data is one of the most valuable commodities a company has. Yet too often data is resource-starved.
I was recently working with an insurer to review their data warehouse architecture. This insurer had spent hundreds of thousands of dollars with one of the Big 4 consulting companies. The insurer had the right goal. They wanted to get their information into a trusted and usable format for self-service reporting and analytics. This insurer trusted this major consulting company to lead them in the right direction.
Only after delivery did the insurer realize that the “data warehouse” wasn’t built for the business to use but built to help the consulting company with its other work. The structure was too complex. There was no way for anyone on the business side to use the “data warehouse.”
Why did this happen? There are two sides to every data warehouse project: the business and technology. Just like a person needs both food and water to survive, a data warehouse project needs both the business and technology. Technology cannot deliver without clear direction from the business, and the business cannot get what it needs without technology. Only by working together can either succeed. The business must take a leadership role to have any chance of true success.
2. Data Modeling on the Cutting Edge of Business
Now that we know the facts and truths we are going to provide, we need to create a storage model.
Regardless of which of the many platforms you choose, data needs to speak the language of the business. The underlying structures of entities, attributes and measures should be modeled around business terminology and activities. The data belongs to the business, so the business should be able to understand it. There should be no requirement for a secret decoder ring.
A data warehouse will either provide value to the business or die. Why should a data warehouse design make it harder for the business to understand, harder for the business to use? Think of the time and money that could have been saved if the insurer had insisted that the data model be business-centric.
3. Spreadmarts Spread Chaos
A spreadmart is a data store (for example, an Excel workbook or an Access database) created and maintained by an individual or group that provides shadow data warehouse functions.
Let’s review the life of a spreadmart. A departmental analyst is trying to fulfill a request by someone in the department. The analyst cannot find a dataset that contains the required data. The existing stores are too narrow, too shallow, or not consolidated. The analyst still has to fulfill the request, and so compiles a desktop data store in Excel. That initial request becomes a part of the analyst’s monthly assignments. The analyst keeps updating the spreadsheet and producing reports. An auspicious beginning. However, at the same time, an analyst from a separate department receives a similar request and creates a similar spreadsheet.
The two spreadsheets may agree at first, but differences usually grow over time. And before long you have those department heads meeting that C-level executive with different retention numbers.
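To make the divergence concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Policy` fields, the five sample policies, and the two retention formulas are illustrative assumptions, not the insurer's data or an industry-standard definition. It shows how two analysts can start from identical facts and still report different retention rates, simply because each spreadsheet encodes a different interpretation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Policy:
    # Hypothetical fact fields, chosen only for illustration.
    policy_id: str
    up_for_renewal: bool      # term expired during the period
    renewed: bool             # customer renewed at expiry
    cancelled_midterm: bool   # cancelled before the term ended


# The same (made-up) facts are available to both departments.
POLICIES: List[Policy] = [
    Policy("P1", up_for_renewal=True, renewed=True, cancelled_midterm=False),
    Policy("P2", up_for_renewal=True, renewed=False, cancelled_midterm=False),
    Policy("P3", up_for_renewal=True, renewed=True, cancelled_midterm=False),
    Policy("P4", up_for_renewal=False, renewed=False, cancelled_midterm=True),
    Policy("P5", up_for_renewal=False, renewed=False, cancelled_midterm=False),
]


def retention_renewal_based(policies: List[Policy]) -> float:
    """Department A's assumed formula: renewed / up for renewal."""
    eligible = [p for p in policies if p.up_for_renewal]
    return sum(p.renewed for p in eligible) / len(eligible)


def retention_in_force_based(policies: List[Policy]) -> float:
    """Department B's assumed formula: still in force / in force at start."""
    still_in_force = [
        p for p in policies
        if not p.cancelled_midterm and (p.renewed or not p.up_for_renewal)
    ]
    return len(still_in_force) / len(policies)


if __name__ == "__main__":
    print(f"Department A: {retention_renewal_based(POLICIES):.1%}")
    print(f"Department B: {retention_in_force_based(POLICIES):.1%}")
```

With these sample facts, Department A reports two renewals out of three expiring policies, while Department B reports three surviving policies out of five. Neither analyst made an arithmetic mistake; they simply never agreed on the formula.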
4. Visualizations Say the Right Things to the Right People
There are four bases in baseball. It is quite an achievement to hit a triple and end up on third base. But if the inning ends and you’re still on third, the score does not change.
The same is true in data warehousing. Building a business-centric model on business-provided governance that properly cleanses and conforms data is quite an achievement. But if you stop there, the score does not change.
This is where reporting applications come in. Over the past 20 years, we have seen canned reports become self-service, self-service become analytics, analytics become machine learning. Each of these advancements was driven in part by some visualization application.
Delivering down the home stretch is where a project turns from a theoretical achievement to a practical one. The visualization application must provide the right data, at the right time, to the right people, without making a mistake.
Conclusion
Back to that story where everybody shows up at a meeting with their own versions of data. This could have been avoided if the company had instituted a business-driven data governance process.
The business simply needed to identify what measures it wanted to know, how those measures would be calculated, and who was responsible for providing the accurate numbers, and then deliver those numbers through a single, available repository.
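As a rough illustration of what such a repository might record, the sketch below keeps each measure's name, owner, and formula in one place. The `Measure` and `MeasureRegistry` classes, the owner name, and the fact keys are all hypothetical; a real implementation would more likely live in a governed metadata catalog or semantic layer than in a Python dictionary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Mapping


@dataclass(frozen=True)
class Measure:
    name: str                                      # what the business wants to know
    owner: str                                     # who is responsible for the number
    description: str                               # the agreed interpretation, in business terms
    formula: Callable[[Mapping[str, float]], float]  # how it is calculated


class MeasureRegistry:
    """Hypothetical single repository of agreed measure definitions."""

    def __init__(self) -> None:
        self._measures: Dict[str, Measure] = {}

    def register(self, measure: Measure) -> None:
        # Refuse a second definition so only one version of the truth exists.
        if measure.name in self._measures:
            raise ValueError(f"Measure {measure.name!r} is already defined")
        self._measures[measure.name] = measure

    def compute(self, name: str, facts: Mapping[str, float]) -> float:
        return self._measures[name].formula(facts)

    def owner_of(self, name: str) -> str:
        return self._measures[name].owner


registry = MeasureRegistry()
registry.register(
    Measure(
        name="retention_rate",
        owner="Underwriting",  # assumed owner, for illustration only
        description="Policies renewed divided by policies up for renewal",
        formula=lambda f: f["policies_renewed"] / f["policies_up_for_renewal"],
    )
)

# Made-up conformed facts; every department would read the same values.
facts = {"policies_up_for_renewal": 1000, "policies_renewed": 870}

if __name__ == "__main__":
    rate = registry.compute("retention_rate", facts)
    print(f"retention_rate = {rate:.1%} (owner: {registry.owner_of('retention_rate')})")
```

The design choice worth noting is the rejection of duplicate names: a second department cannot quietly register its own retention formula, so any disagreement surfaces as a governance conversation rather than as a surprise in the kick-off meeting.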