In this blog, we show you how to predict and control customer churn using machine learning in a data visualization tool.

Customer churn matters to every for-profit business (and even some non-profits) because lost customers mean a direct loss of revenue. To compound that loss, acquiring a new customer usually costs more than retaining a current one. In today’s world of instant news and social media, a company’s reputation can also be at risk when it loses customers.

Traditional Versus Advanced Analytics

Traditional analytics tools can help us see who churned and the associated revenue loss. While both are very important metrics, we are stuck looking backwards and guessing at how to reduce customer churn. When we introduce advanced analytics into the picture, we can start to look forward and answer questions like “who is at risk of churning?” and “why are customers churning?” Ultimately, the phase we want to reach is “prescriptive analytics,” i.e., “what can I do about customer churn before it happens?”

Example Scenario: Customer Churn for a Telecommunications Company

Customer churn is a unique challenge for B2C telcos because the target market is massive, consumers have several alternatives to choose from, and there is little difference in competitive offerings. In the following demo, we use a combination of traditional and advanced analytics to analyze a set of example telco data to understand customer attrition for that company and how they could control the churn.

Note: the following screenshots are pulled from a model in Qlik, but the same findings could be accomplished in other data visualization tools in the market. The focus here is not the technology, but rather that traditional and advanced analytics can be combined to answer very important business questions.

The image below is a pretty typical executive level dashboard showing customer churn KPIs. The waterfall chart on the left shows dollars associated with current or lost customers, and on the right we see customer counts.

Traditional Analytics: Last Month’s Customer Churn Summary

The dashboard quickly provides a grim story: we have more customers leaving our customer base than entering – both in raw counts and in dollars. At the top right, we also see that the value of the customers who are leaving is higher than new customers. So, not only is churn a problem, but so is the profile of customers that are leaving and entering the customer base.

At this point, those analyzing this data would typically go a level or two deeper and slice the data by various attributes and dimensions to understand what is causing these problems. Here’s a drill-down dashboard showing more detail.

Traditional Analytics: Last Month’s Customer Churn Detail

At the top of this image, distribution plots allow you to see a “spread” – those who have churned are represented in red, and those who have remained are in blue. The bigger the spread, the less churn; the smaller the spread, the more churn there is (indicative of something problematic).

Some things represented here: there is low churn among customers with 2-year contracts, and high churn among month-to-month contracts. However, we also see that the month-to-month category represents the highest count and value among customers represented on the graph. Given this information, we cannot discount month-to-month contracts; we need to dig deeper to understand the overall profile of those customers leaving versus staying.
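The slice-and-compare step described above can be sketched in a few lines of Python. This is only an illustration: the rows, contract labels, and the `churn_by_segment` helper are hypothetical and are not taken from the demo data. The sketch reports the customer count, total value, and churn rate for each contract type, which is the same trio of numbers the dashboard puts side by side.

```python
from collections import defaultdict

# Hypothetical sample rows: (contract_type, monthly_charge, churned)
customers = [
    ("Month-to-month", 70.0, True),
    ("Month-to-month", 85.0, False),
    ("Month-to-month", 90.0, True),
    ("Month-to-month", 55.0, False),
    ("One year", 60.0, False),
    ("One year", 75.0, True),
    ("Two year", 65.0, False),
    ("Two year", 80.0, False),
]


def churn_by_segment(rows):
    """Return {segment: (customer_count, total_value, churn_rate)}."""
    totals = defaultdict(lambda: [0, 0.0, 0])
    for segment, charge, churned in rows:
        stats = totals[segment]
        stats[0] += 1             # customer count
        stats[1] += charge        # total monthly value
        stats[2] += int(churned)  # number of churned customers
    return {
        segment: (count, value, lost / count)
        for segment, (count, value, lost) in totals.items()
    }


summary = churn_by_segment(customers)
for segment, (count, value, rate) in summary.items():
    print(f"{segment:15s} customers={count} value=${value:.2f} churn_rate={rate:.0%}")
```

In this made-up sample, month-to-month is both the largest segment and a high-churn one, which mirrors why the segment cannot simply be written off.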

This dashboard shows customers broken down by the type of internet service they use.

Traditional Analytics: Internet Service Type Detail

This shows that DSL customers are pretty loyal, but fiber optic customers are more likely to churn. Again, we cannot look at this data in a vacuum – this doesn’t mean that fiber optic customers are bad, because there are many other attributes at play among telecommunications customers.

This dashboard shows customers broken down by their payment type.

Traditional Analytics: Payment Type Breakdown

Customers on auto-pay and those who mail checks are loyal, but customers who pay via electronic check are much more likely to churn. Still, we know better than to attribute the churn problem to electronic check payments alone.

Despite all the important data points we get from our data visualization tools, we still need to understand how all customer attributes work in concert to influence churn. Even the most sophisticated data visualization tools cannot present the data in a way that lets us comprehend this, because people struggle to weigh many dimensions at the same time. Though we have unearthed some interesting information, we still have not met the goal of isolating why customers churn.

Introducing Machine Learning to the Data

If our brains can’t compute at the level needed to get ahead of customer churn, machine learning can help. Machine learning isn’t magical – all it’s doing is seeking patterns in the data provided. In this demo, we told the model that we want to see a Churn Confidence level for each customer. We defined this Churn Confidence number as somewhere between 0 and 1; the closer to 1, the more likely the machine predicts the customer will leave.

The Machine Learning Model Scores Data and Adds the Churn Confidence Number
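To make the idea of a score between 0 and 1 concrete, here is a minimal sketch that assumes a logistic regression trained by plain gradient descent. The post does not say which algorithm the demo used, so this is a stand-in technique, and the three yes/no features, the training rows, and the function names are all hypothetical.

```python
import math

# Hypothetical training rows:
# ([is_month_to_month, has_fiber_optic, pays_by_electronic_check], churned)
training = [
    ([1, 1, 1], 1),
    ([1, 1, 0], 1),
    ([1, 0, 1], 1),
    ([1, 0, 0], 0),
    ([0, 1, 0], 0),
    ([0, 0, 1], 0),
    ([0, 0, 0], 0),
    ([0, 1, 1], 0),
]


def sigmoid(z):
    """Squash any real number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-z))


def churn_confidence(weights, bias, features):
    """Score one customer; values closer to 1 mean churn looks more likely."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def fit(rows, epochs=2000, learning_rate=0.1):
    """Learn weights with stochastic gradient descent on the log loss."""
    weights = [0.0] * len(rows[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in rows:
            error = churn_confidence(weights, bias, features) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


weights, bias = fit(training)
risky = churn_confidence(weights, bias, [1, 1, 1])  # month-to-month, fiber, e-check
loyal = churn_confidence(weights, bias, [0, 0, 0])  # none of the three traits
print(f"risky profile: {risky:.2f}, loyal profile: {loyal:.2f}")
```

The point of the sketch is only that the model looks for patterns across several attributes at once and condenses them into a single number per customer.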

With this new data point in our tool, we can build new visualizations that help us look forward and predict churn with some level of confidence. An executive can also do what-if analysis with this information. We can set the churn confidence threshold higher or lower (depending on how optimistic we are feeling) to see how churn predictions affect numbers like overall customer counts and revenue.

Churn Confidence Thresholds Allow for What-if Analysis
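The threshold what-if analysis can be sketched as follows. The scored rows, the revenue figures, and the `what_if` helper are hypothetical; the only assumption carried over from the text is that each customer already has a Churn Confidence number between 0 and 1.

```python
# Hypothetical scored customers: (customer_id, churn_confidence, monthly_revenue)
scored = [
    ("C001", 0.92, 95.0),
    ("C002", 0.75, 80.0),
    ("C003", 0.55, 70.0),
    ("C004", 0.30, 60.0),
    ("C005", 0.10, 50.0),
]


def what_if(rows, threshold):
    """Treat every customer at or above the threshold as a predicted churner."""
    at_risk = [row for row in rows if row[1] >= threshold]
    revenue_at_risk = sum(row[2] for row in at_risk)
    total_revenue = sum(row[2] for row in rows)
    return {
        "predicted_churners": len(at_risk),
        "remaining_customers": len(rows) - len(at_risk),
        "revenue_at_risk": revenue_at_risk,
        "remaining_revenue": total_revenue - revenue_at_risk,
    }


optimistic = what_if(scored, threshold=0.9)   # only near-certain churners count
pessimistic = what_if(scored, threshold=0.5)  # anyone more likely than not counts
print("optimistic: ", optimistic)
print("pessimistic:", pessimistic)
```

Moving the threshold changes how many customers count as predicted churners, and with it the projected customer count and revenue.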

Going Further – What is the Profile of our Loyal vs Churning Customers?

Along with the Churn Confidence number, the model we applied to our data provides us with granular detail about customer attributes.

The Machine Learning Model Provides Attribute Detail For Each Customer

This is incredibly rich data, but it’s impossible to eyeball any patterns. The table is telling us that these are the most loyal customers, but it’s hard to identify why. So rather than having the machine learning model spit out granular level detail and score each customer line-by-line, we can ask it to identify patterns and create customer profile groupings for us.

Machine Learning Models Can Automatically Make Logical Groupings and Associations

Now, we have clear profiles of Most Likely to Churn and Most Loyal customers. With the data organized in this manner, the business can start to take action – Product Management can use this as they consider new product and package features, and Marketing will have the information needed to target the right audiences with appropriate messaging.
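As a rough illustration of turning line-by-line scores into profile groupings, the sketch below simply averages Churn Confidence for each combination of attributes and ranks the results. This is a plain aggregation used as a stand-in, not the automatic grouping the model in the demo performs, and the rows and the `profile_groups` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical scored rows:
# (contract, internet_service, payment_method, churn_confidence)
scored = [
    ("Month-to-month", "Fiber optic", "Electronic check", 0.91),
    ("Month-to-month", "Fiber optic", "Electronic check", 0.85),
    ("Month-to-month", "DSL", "Mailed check", 0.40),
    ("Two year", "DSL", "Automatic", 0.05),
    ("Two year", "DSL", "Automatic", 0.09),
    ("One year", "Fiber optic", "Automatic", 0.30),
]


def profile_groups(rows):
    """Rank attribute combinations by their average churn confidence."""
    buckets = defaultdict(list)
    for *profile, confidence in rows:
        buckets[tuple(profile)].append(confidence)
    ranked = [
        (profile, len(scores), sum(scores) / len(scores))
        for profile, scores in buckets.items()
    ]
    ranked.sort(key=lambda item: item[2], reverse=True)
    return ranked


groups = profile_groups(scored)
most_likely_to_churn = groups[0]
most_loyal = groups[-1]
for profile, count, average in groups:
    print(f"{' / '.join(profile):50s} customers={count} avg_confidence={average:.2f}")
```

A real grouping step would consider many more attributes than three, but the output has the same shape: a short list of profiles ordered from most likely to churn to most loyal.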

More Questions to Ask of the Data

In this example, we went down one path of questions we could ask of our data – what’s the profile of a loyal customer versus one likely to churn? But there are many more questions we can explore with machine learning. For example, with customer profile information at hand, the next question follows naturally: who is most likely to respond to our marketing campaigns, and via what channels? Other examples of questions we can ask of the data to predict and control customer churn could include:

  • What are my loyal customers doing with phone-only service?
  • How do I define a “high-value” customer? It may not be the customer paying more this month; it could be the customer paying the most over time. What is the balance between loyalty + margin + TLV?
  • What kind of month-to-month customers do we want?
  • What are the warning signs of a customer becoming at-risk?
  • What should Customer Salvage reps offer to customers trying to leave?
  • Which customers need proactive attention?
  • What behavioral changes have the least friction?
  • How sensitive are our customers to price?

All these questions could be explored via traditional dashboards, and you may even be able to uncover some answers after several hours, days, or weeks of grueling examination; but machine learning is required to answer everything in a reasonable amount of time. And as your machine learning models mature, so should your datasets. It’s important to keep augmenting your data so that you can answer more questions and get ahead of issues such as customer churn before they negatively impact the business.

Watch our full Customer Churn webinar

Matt Levy is a Managing Consultant at Analytics8. Practicing what we call “ethical data science,” Matt specializes in making sure that our customers avoid bias when building machine learning models so that their projects bring real value to their organization. Matt wrote his master’s capstone thesis on fantasy golf analysis, and is a consistent winner of A8 fantasy sports competitions.