Machine learning changes how we interact with data. With the help of data scientists, you can apply the benefits of machine learning to make it part of your bigger data solution.
Developing a well-thought-out strategy to implement machine learning while using the expertise of a data scientist creates the best long-term solution for your business.
The Basics of Machine Learning
Machine learning is everywhere right now. No matter what business problem you search, it seems like machine learning is part of the answer.
Anyone tech-savvy can do it. Simply pop on to whatever tool you choose, follow a few tutorials, and next thing you know, you’ve created a model!
But what do you do with that model, and how do you make it part of your bigger data solution?
You need a data scientist.
Machine learning is an optimization tool. It looks for the best answer based on the data you provide, but it can only answer the question asked.
It doesn’t have the ability to ingest information and guess what you need to know to solve a problem. It can’t tell you that it might not have the right information, the correct answers, or even the right questions.
Putting Machine Learning Into Play
So how do you go about using a utility like machine learning? The tool requires solid data from your data warehouse, an understanding of business needs from your business analysts, and a strong knowledge of analytics from your data analysts. At the intersection of all of those is your data scientist.
A data scientist asks the tough questions, like:
- Is optimization the right solution for this problem?
- Is machine learning the right optimization tool to use?
- Are we asking the right questions?
- Are we providing the right data for these questions?
- Are we providing good data?
A data scientist works with stakeholders and technology architects to fit machine learning into the right place in your data picture. The process is more than just machine learning development.
Standalone machine learning development works with flat files exported from a data warehouse or other system at a given point in time, and it produces a one-time solution. Integrating machine learning into the architecture of your data solution lets you create a model that can be kept current and can deliver results at scheduled intervals or on demand.
A Case Study in Machine Learning
Suppose you want to know which employees are likely to leave. The simple way to use machine learning here is to export information on all your employees from your HR, payroll, and evaluation systems and combine the exports into one large file that is fed into the model.
The model spits out a prediction for each employee based on five features (in this case: job title, location, current pay, last pay raise percentage, and previous evaluation score). The prediction tells you whether the employee is likely to leave or not.
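As a rough illustration of that export-and-combine step, here is a minimal sketch in plain Python. The field names, the `employee_id` join key, and the sample records are all hypothetical; real HR, payroll, and evaluation exports will look different.

```python
# Hypothetical exports from three systems, joined on an assumed employee_id key.
hr_export = [
    {"employee_id": 1, "job_title": "Analyst", "location": "Site A"},
    {"employee_id": 2, "job_title": "Engineer", "location": "Site B"},
]
payroll_export = [
    {"employee_id": 1, "current_pay": 62000, "last_raise_pct": 2.0},
    {"employee_id": 2, "current_pay": 88000, "last_raise_pct": 4.5},
]
evaluation_export = [
    {"employee_id": 1, "previous_eval_score": 3.1},
    {"employee_id": 2, "previous_eval_score": 4.4},
]

# The five features named in the case study (column names are assumptions).
FEATURES = [
    "job_title",
    "location",
    "current_pay",
    "last_raise_pct",
    "previous_eval_score",
]


def build_flat_file(*exports):
    """Merge the exports into one row per employee, keeping only complete rows."""
    merged = {}
    for export in exports:
        for record in export:
            row = merged.setdefault(record["employee_id"], {})
            row.update(record)
    # Drop employees who are missing any of the five features.
    return [row for row in merged.values() if all(name in row for name in FEATURES)]


flat_file = build_flat_file(hr_export, payroll_export, evaluation_export)
```

Each row in `flat_file` is a snapshot from the moment of export, which is exactly why this approach yields a one-time answer.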
You form a team to target the employees likely to leave, and the team works to keep those employees.
A year goes by, and you are concerned about retention again. You repeat the process, but this time the prediction uses six features, and only two of them overlap with those used previously. A new team is required, and work must be redone to compensate for the changes in the results.
Additionally, you find that some of the employees were also on the previous list, and you must start working with them again to try to keep them from leaving. You probably lost ground with them in the year that went by.
The Perks of Using a Data Scientist
Let’s look at the case study through a different lens in which the machine learning piece is just part of the larger picture.
Instead of exporting the information into a static system and doing a stand-alone development project, let’s leverage data science to look at the bigger picture.
First, we know that optimization, and machine learning in particular, is a good fit for this type of problem. Next, are we asking the right question? The previous model told us only whether an employee was likely to leave or not.
A better question might be how likely an employee is to leave in the next week or month or year.
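One way to frame that question for a model is to give it one yes/no target per time horizon instead of a single "leaves or stays" label. The sketch below is only an assumption about how such targets could be built: the horizon lengths in days and the `horizon_labels` helper are illustrative, not a prescribed method.

```python
from datetime import date, timedelta

# Assumed horizon lengths in days; adjust to whatever windows matter to you.
HORIZONS = {"week": 7, "month": 30, "year": 365}


def horizon_labels(snapshot, departure):
    """For each horizon, report whether the employee left within that window.

    snapshot  -- the date the employee's data was captured
    departure -- the date the employee left, or None if still employed
    """
    labels = {}
    for name, days in HORIZONS.items():
        window_end = snapshot + timedelta(days=days)
        labels[name] = departure is not None and snapshot < departure <= window_end
    return labels


# An employee who left 19 days after the snapshot.
labels = horizon_labels(date(2024, 1, 1), date(2024, 1, 20))
```

Here `labels` marks the week horizon as false and the month and year horizons as true. A model trained on targets like these can answer "how soon" as well as "whether".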
Putting Machine Learning into Practice
To answer that question, we must reconsider the data the model receives. Is there a source for updatable, time-based data that can provide insight into the problem?
For example, instead of just the last evaluation result, the dataset could include every past evaluation report and then be extended with each new report as it is created.
Alternatively, perhaps there is a monthly metric we can add. Either would let you see trends in the problem areas that could not be identified before.
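To see why a history matters more than a single data point, consider a simple trend feature: the slope of an employee's evaluation scores over time. This is a minimal sketch that assumes evenly spaced evaluations and uses made-up scores; `score_trend` is a hypothetical helper, not part of any particular tool.

```python
def score_trend(history):
    """Least-squares slope of evaluation scores, in score change per period.

    history -- scores in chronological order, assumed evenly spaced
    """
    n = len(history)
    if n < 2:
        return 0.0  # A single score carries no trend information.
    periods = range(n)
    mean_x = sum(periods) / n
    mean_y = sum(history) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(periods, history))
    denominator = sum((x - mean_x) ** 2 for x in periods)
    return numerator / denominator


# Two hypothetical employees with the same latest score but opposite trends.
declining = score_trend([4.0, 3.5, 3.0])
improving = score_trend([2.0, 2.5, 3.0])
```

Both employees end on a score of 3.0, so a model that sees only the last evaluation cannot tell them apart. The trend feature separates them: `declining` is negative and `improving` is positive.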
A report can be created regularly using our new time-based model with the most current data to determine which employees are likely to leave in the next week, or month, or quarter.
This process allows your retention team to focus on the right employees at the right time, consistently.
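A recurring report of this kind could be as simple as the sketch below. It assumes a model has already produced a probability of leaving for each employee and horizon; the employee IDs, probabilities, and the 0.5 threshold are invented for illustration.

```python
from datetime import date


def retention_report(risk_by_employee, horizon, threshold=0.5, run_date=None):
    """List employees whose predicted risk for a horizon meets the threshold.

    risk_by_employee -- {employee: {horizon: probability of leaving}}
    """
    run_date = run_date or date.today()
    flagged = [
        (employee, risks[horizon])
        for employee, risks in risk_by_employee.items()
        if risks[horizon] >= threshold
    ]
    # Highest risk first, so the retention team knows where to start.
    flagged.sort(key=lambda pair: pair[1], reverse=True)
    return {
        "run_date": run_date.isoformat(),
        "horizon": horizon,
        "employees": flagged,
    }


# Hypothetical model output for three employees.
example_risks = {
    "E1": {"week": 0.05, "month": 0.55, "quarter": 0.80},
    "E2": {"week": 0.10, "month": 0.70, "quarter": 0.75},
    "E3": {"week": 0.01, "month": 0.10, "quarter": 0.20},
}
monthly = retention_report(example_risks, "month", run_date=date(2024, 1, 1))
```

Run on a schedule against current data, a function like this gives the retention team a fresh, prioritized list each period instead of a one-off export.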
By developing a long-term strategy to solve the problem, you won’t have to worry about redoing all the work every time that retention becomes an issue again.
This way, retention doesn’t become an issue because your data scientist helped you apply machine learning to its fullest, and you created a long-term solution that stays relevant.
Part of the Centric National Data and Analytics team, Tanya Kannon has immersed herself in the world of data for the last decade. She has worked in areas ranging from Test and Evaluation for Unmanned Aircraft to Budget Development for the Air Force to Routing Optimization dealing with Aircraft Refueling. Her current passion is Machine Learning, specifically Employee Retention and Attrition modeling.