7 Ways Machine Learning Projects Fail

Machine learning is transforming the world. However, not all machine learning projects succeed. Through our years of experience in this field, we’ve identified several common ways that machine learning projects fail. Understanding these problems – and why they occur – will help you better assess the viability of your next machine learning project, and, most importantly, align the expectations of your team with actual outcomes.

  1. Bad Data
  2. Picking the Wrong Goal: Explanation vs. Prediction
  3. Confusing Correlation for Causation
  4. Optimization Without Exploration
  5. Unanticipated Data Bias
  6. Not Defining “Done”
  7. Picking the Wrong Time Window

1. Bad Data

Good data is essential to machine learning.  Such data should be clean, available, and relevant.

Clean data means data which is complete (e.g., no missing dates), correct (audited for accuracy, and fixed/estimated where necessary), and consistent (has the same data format for all the times/events being considered).  There should be internal processes and tools, both manual and automatic, to test that the data is clean now and into the future.
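As a rough illustration of what an automatic cleanliness test might look like, the sketch below checks a small set of daily records for missing calendar days and for empty required fields. The record layout, the field names (`day`, `amount`), and the helper names are hypothetical and chosen only for this example.

```python
from datetime import date, timedelta

def missing_dates(dates):
    """Return the calendar days absent between the first and last observed date."""
    observed = set(dates)
    start, end = min(observed), max(observed)
    span = (end - start).days
    return [start + timedelta(days=i) for i in range(span + 1)
            if start + timedelta(days=i) not in observed]

def incomplete_rows(rows, required_fields):
    """Return indices of rows that lack a required field or hold None in it."""
    return [i for i, row in enumerate(rows)
            if any(row.get(field) is None for field in required_fields)]

# Hypothetical daily order records: January 3rd is missing and one amount is empty.
orders = [
    {"day": date(2024, 1, 1), "amount": 20.0},
    {"day": date(2024, 1, 2), "amount": None},
    {"day": date(2024, 1, 4), "amount": 35.5},
]

gaps = missing_dates([o["day"] for o in orders])
bad = incomplete_rows(orders, ["day", "amount"])
print("missing days:", gaps)
print("incomplete rows:", bad)
```

Checks like these can run on a schedule so that problems are caught as new data arrives, not only at the start of a project.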

Available data means data that can be quickly accessed for both ad-hoc exploration and large-scale model training. Data for exploration should be available in one place (a data lake, data warehouse, or similar data platform), and data scientists should be able to query it freely without worrying about interfering with live systems. This is generally accomplished by building or adopting a purpose-built analytics database, that is, an online analytical processing (OLAP) system.

Most importantly, the data being used for machine learning must be relevant. For example, if you are trying to use machine learning to understand customers, you need data points about each customer beyond just their name and email address. And if the goal of machine learning is to predict some sort of outcome, your model training data should include many varied examples of such an outcome.

If your data doesn’t meet the above requirements, your machine learning project will fail. Make sure you fully understand your data before scoping a machine learning project.


2. Picking the Wrong Goal: Explanation vs. Prediction

There is a fundamental tradeoff in machine learning models: a model can prioritize either explaining the world to people or predicting outcomes. The type of goal you choose will deeply impact the implementation process.

If the business goal of a machine learning project is to provide real-world insight, or if there are strict regulatory requirements governing how decisions are made, then it is more important for your model to explain what it is doing than to generate highly accurate predictions. This means your data transformations and your choice of model must be kept simple.

If your goal is to provide the best predictions, a different set of tools is available. For example, sophisticated transformations can be performed on the data being fed into the model in order to emphasize important differences over unimportant ones. You can also use complex models such as neural networks, which operate more as “black boxes.”

It is vital that a choice be made early on as to whether explanation or prediction is the goal of a particular project. A misunderstood goal can force your team to start a project over from scratch. It’s worth noting that ongoing research is being conducted on ways to get better explanations out of a predictive model, and get better predictions out of explanatory models.

3. Confusing Correlation for Causation

Correlation is not causation. It is vital to keep this fundamental mantra from Statistics 101 in mind when applying the outputs from machine learning. Machine learning finds correlations in data, but not direct causal relationships.

For example, a model may find that people who view the FAQ page on your website are more likely to purchase your product. However, that doesn’t mean you should immediately launch a massive marketing campaign with the goal of driving everyone directly to your FAQ page. It is more likely that a common cause, such as interest in your product, drives both behaviors. It is also possible that the information in the FAQs is not being presented adequately earlier in the customer journey, in which case you may want to try making the link to the FAQ page more prominent.

When making a strategic business decision based on the explanations derived from machine learning, it is vital to conduct an experiment to establish causation. In the above example, you should establish causation by giving a fraction of your customers a different experience (say, a more prominent link to the FAQ page) to see how it affects their purchase behavior. Here, the power of machine learning lies in finding hidden and unexpected correlations in the mountains of data you have available, and in inspiring experiments that lead to new and better understanding.
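One common way to read the result of such an experiment is a two-proportion z-test on the purchase rates of the two groups. The sketch below is a minimal version using only the standard library. The visitor and purchase counts are made up for illustration, and the function name is hypothetical.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of control (a) and treatment (b).

    Returns the z statistic and a two-sided p-value under the
    normal approximation with a pooled conversion rate.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Illustrative numbers only: 2,000 visitors per group,
# control sees the old page, treatment sees the prominent FAQ link.
z, p_value = two_proportion_z_test(conv_a=100, n_a=2000, conv_b=130, n_b=2000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value suggests the difference in purchase rates is unlikely to be chance alone. The normal approximation assumes reasonably large groups, and the sample size should be decided before the experiment starts.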


4. Optimization Without Exploration

When deploying the output of machine learning, it is important to build in the ability to continually validate and improve the model. Without that, your machine learning project won’t adapt to real-world changes, which effectively leads to “blind deployment.”

When building a model that automatically decides upon an action, such as a product recommendation engine, it is important to understand the value of not simply using the “best” model for your entire audience. Some portion of the audience should be shown a different set of recommended products so that you can explore other ways of recommending products. Without such a process, even the best model will eventually start making the wrong decisions. That’s because it will have stopped learning.
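One simple way to reserve part of the audience for exploration is an epsilon-greedy policy: most visitors get the recommender that currently looks best, and a small fraction get one chosen at random. The sketch below simulates this. The click rates, the value of epsilon, and all names are assumptions made for illustration, not a description of any particular production system.

```python
import random

def choose_recommender(mean_rewards, epsilon, rng):
    """Pick the best-looking recommender, or a random one with probability epsilon."""
    if rng.random() < epsilon:
        return rng.randrange(len(mean_rewards))  # explore
    return max(range(len(mean_rewards)), key=lambda i: mean_rewards[i])  # exploit

def simulate(true_click_rates, epsilon, rounds, seed=0):
    """Run the policy against simulated visitors and track observed click rates."""
    rng = random.Random(seed)
    counts = [0] * len(true_click_rates)
    means = [0.0] * len(true_click_rates)
    for _ in range(rounds):
        arm = choose_recommender(means, epsilon, rng)
        reward = 1.0 if rng.random() < true_click_rates[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running average reward for this arm.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means

# Hypothetical click rates for two recommenders; 10% of traffic explores.
counts, means = simulate([0.05, 0.08], epsilon=0.1, rounds=20000)
print("times shown:", counts)
print("observed click rates:", means)
```

Because every recommender keeps receiving some traffic, the observed click rates stay current, and a recommender that improves or degrades over time can be noticed.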

For models that provide explanations, enough variation should be retained in the data to continually validate those explanations – and to generate new insights. For example, if a prior machine learning project has found that there is an optimal way to market to your audience, a portion of the marketing budget should be set aside to try new strategies in order to ensure that the current strategy is still effective.

If you only optimize and never explore, the learning process stops, and the project can fail.

5. Unanticipated Data Bias

At its core, what a machine learning model does is find structure and patterns in data in an automated fashion. That automation is incredibly powerful, but it can also lead to unanticipated risks.

For example, in an often-cited (and possibly apocryphal) story, an early defense image recognition project aimed at distinguishing between friendly and enemy tanks showed a lot of promise in the lab, but ended up failing in the field. All the images of friendly tanks used to train the model were taken in the daytime; the model ended up detecting daytime vs. nighttime rather than friendly vs. enemy tanks. Because it was trained automatically, the model did not know that this was the wrong distinction to make.

An Amazon initiative to use machine learning for hiring recently failed because it was trained on biased data. The system screened resumes to find the best candidates to hire, but because it learned from historical hiring data that reflected past gender bias, it ended up downgrading female candidates.

To a machine learning algorithm, reality is represented by the data it consumes. It is vitally important to ensure that your data reflects the right reality as closely as possible, otherwise the entire project may fail to have any real-world use.
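A simple pre-training check can surface this kind of problem: cross-tabulate the label against metadata that should be irrelevant, and flag any metadata value under which only one label ever appears. The sketch below is one way to do that. The field names (`time_of_day`, `side`) and the sample records are hypothetical.

```python
from collections import Counter, defaultdict

def label_breakdown(examples, attribute, label):
    """Count how often each label occurs within each value of a metadata attribute."""
    table = defaultdict(Counter)
    for ex in examples:
        table[ex[attribute]][ex[label]] += 1
    return {value: dict(counts) for value, counts in table.items()}

def confounded_values(examples, attribute, label):
    """Return attribute values under which only a single label ever appears."""
    breakdown = label_breakdown(examples, attribute, label)
    return sorted(value for value, counts in breakdown.items() if len(counts) == 1)

# Hypothetical training set: every friendly image is daytime, every enemy image is night.
images = [
    {"time_of_day": "day", "side": "friendly"},
    {"time_of_day": "day", "side": "friendly"},
    {"time_of_day": "night", "side": "enemy"},
    {"time_of_day": "night", "side": "enemy"},
]

print(label_breakdown(images, "time_of_day", "side"))
print(confounded_values(images, "time_of_day", "side"))
```

If an attribute such as time of day separates the labels perfectly, the model can learn that attribute instead of the intended distinction, so the training set likely needs more varied examples.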

6. Not Defining “Done”

Data science and machine learning are never really “done.” But to be practical, every project must have a finish line. It’s always possible to improve a model’s predictions or to have more confidence in the explanations it gives. However, this requires additional human and computing resources with a real business cost. It is important to define clear guidelines so you know what constitutes too many or too few resources for a particular project.

There are hundreds of machine learning algorithms out there, with more being generated each day. There are even more implementations of those algorithms and ways of using them. If you don’t allow enough time to explore a few different models adequately, you may be missing out on the one that works best for your particular data and business case. On the other hand, you can’t iterate forever to try to improve a model by an extra fraction of a percent. Furthermore, sometimes data may actually represent random noise with no basis for making any predictions. For these reasons, it is important to place a limit on how many resources to throw at one project, as those resources could also be used elsewhere.

At Retina, we apply lessons from agile software development: hit “done” several times until we reach “done for now.” The idea is to start simple so that the project initially consumes few resources, then add complexity in steps, reviewing the model’s results after each step and deciding whether to continue based on how those results look. This lets you achieve quick wins while avoiding chasing phantom goals.

7. Picking the Wrong Time Window

Machine learning models use data to understand the world. However, because the world is always changing, you must properly account for the rate at which the world changes.

If a model is trained over too small a window of time, it will not know how to handle events which occur infrequently. For example, it may miss the effects of the Christmas holiday season on buying behavior.

However, if a model is trained over too large a window of time, it may try to understand the way the world was, not how it will be. For example, you might end up inadvertently modeling behavior from the Great Recession of 10 years ago.

Choosing an appropriate time window that captures the right level of change for your business is ultimately a judgement call. It should be made jointly by the machine learning team and the people who use the model’s outputs, informed by some exploratory analysis. Don’t wait until the end of the machine learning project to think about the window of time your model should consider.
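The exploratory analysis can be as simple as a backtest that compares candidate window lengths on the same held-out period. The sketch below uses a naive moving-average forecast as a stand-in for a real model. The monthly series, with an abrupt shift halfway through, and the function name are invented for illustration.

```python
def backtest_window(series, window, start):
    """Mean absolute error when each point from `start` onward is forecast
    as the average of the `window` points immediately before it."""
    errors = []
    for t in range(start, len(series)):
        history = series[t - window:t]
        forecast = sum(history) / window
        errors.append(abs(series[t] - forecast))
    return sum(errors) / len(errors)

# Hypothetical monthly sales: the market shifts abruptly after the first year.
sales = [100.0] * 12 + [200.0] * 12

# Evaluate both windows on the same months (the second year) for a fair comparison.
for window in (3, 12):
    error = backtest_window(sales, window, start=12)
    print(f"window={window:2d} months, mean absolute error={error:.1f}")
```

In this made-up series the short window adapts to the shift faster, so it has the lower error. On strongly seasonal data the result could be the reverse, which is why the comparison should be run on your own data.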

About Retina

At Retina, we obsess over increasing customer LTV (Lifetime Value) and reducing CAC (Customer Acquisition Cost) through data science.

As the cost of customer acquisition skyrockets, it is important to focus on LTV at the customer level. Companies that get their LTV-to-CAC ratio wrong are failing in an increasingly competitive environment (e.g., Blue Apron, Wayfair, Chef’d). Companies that get it right are achieving sustainable profit and growth (e.g., FAANG companies, Dollar Shave Club, Ring).

The Retina Platform computes predictive LTV and the early customer behavioral drivers of LTV, at the customer level, using next-gen machine learning algorithms (built on 30 years of academic research). Within a matter of days, Retina automatically builds audiences (Facebook, Google, LinkedIn, Snap) for your marketers to acquire new high-LTV customers and retain your existing high-value customers.

Interested in what we do? Go to https://staging-retinaai.kinsta.com/story for a more detailed presentation about what we do, how we work and how we can help your business grow to the next level. If you’d like to see a demo of our solutions, go to https://staging-retinaai.kinsta.com/schedule-demo/.