Strategy · 9 min read
By Mark Ashworth · Founder, ChurnTools

AI Churn Prediction: How to Build a Model That Actually Works

Most churn prediction models get built, demoed, then abandoned. Here is how to build one that your team will actually use, including which features matter, what accuracy you need, and how to connect predictions to real interventions.

I've seen dozens of SaaS companies build churn prediction models. The pattern is almost always the same: a data science team spends 2-3 months building a model, presents impressive accuracy numbers in a deck, and then the model slowly dies in a Jupyter notebook nobody opens.

The problem is rarely the model. The problem is that nobody planned what happens after the prediction. This guide is about building the full system, not just the model.

Do you actually need a churn prediction model?

Honest answer: maybe not yet. If you have fewer than 1,000 customers and fewer than 200 churn events in your data, a simple rule-based approach will work just as well. "Customers who haven't logged in for 14 days and have an NPS below 7" is a perfectly good prediction for most early-stage companies.

You need an ML model when:

  • Simple rules miss too many churning customers (false negatives above 40%)
  • Simple rules flag too many healthy customers (false positives above 50%)
  • You have enough data to train on (200+ churn events, 12+ months of history)
  • You have intervention mechanisms ready to act on predictions (retention emails, save flows, CSM outreach)

If you don't have those interventions built yet, go build them first. Read the AI churn reduction guide for the full recommended sequence.

Which features actually predict churn?

After working with many datasets, here's what I've found consistently matters, ranked by predictive power:

Tier 1 (almost always predictive):

  • Login frequency trend over the last 30 days vs. the prior 30 days. A decline is more predictive than the absolute number.
  • Core feature usage depth. Not just "did they use it" but "how much did they use it relative to their own baseline."
  • Days since last meaningful action. Define "meaningful" for your product. For a project management tool, it might be creating a task. For analytics, it's running a report.
  • Support ticket sentiment and volume. A spike in negative-sentiment tickets is a strong signal.

Tier 2 (usually predictive):

  • Contract or billing changes (downgrades, removed seats, switched from annual to monthly)
  • NPS or CSAT score trends
  • Number of integrations connected (more integrations = higher switching cost = lower churn)
  • Time since last expansion or upgrade

Tier 3 (sometimes predictive, depends on your product):

  • Company firmographics (industry, size, funding stage)
  • Acquisition channel
  • Onboarding completion percentage
  • Number of active users on the account vs. total seats

The single most common mistake: using static snapshots instead of trends. A customer logging in 3 times this week isn't useful information on its own. A customer who logged in 15 times per week for six months and now logs in 3 times per week is about to churn. Always use rate-of-change features.
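One way to turn that into a feature is a ratio of recent activity to the customer's own baseline. A sketch, assuming you've already aggregated logins into weekly counts (the window size and the neutral-default for new customers are my choices, not gospel):

```python
def login_trend(weekly_logins: list[int], window: int = 4) -> float:
    """Ratio of recent activity to the customer's own historical baseline.
    weekly_logins is ordered oldest -> newest.
    ~1.0 = steady, < 1.0 = declining, near 0 = going dark."""
    recent = weekly_logins[-window:]
    baseline = weekly_logins[:-window]
    if not baseline or sum(baseline) == 0:
        return 1.0  # too new (or never active) to have a baseline; treat as neutral
    recent_rate = sum(recent) / len(recent)
    baseline_rate = sum(baseline) / len(baseline)
    return recent_rate / baseline_rate

# Six months at 15 logins/week, then four weeks at 3/week:
login_trend([15] * 22 + [3] * 4)   # 0.2 -- sharp decline
# Steady usage the whole time:
login_trend([10] * 26)             # 1.0 -- healthy
```

The same normalize-against-own-baseline pattern works for feature usage depth and any other Tier 1 signal.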

What accuracy do you actually need?

Forget accuracy percentage. It's misleading when your classes are imbalanced (which they always are with churn). If 5% of customers churn monthly, a model that predicts "no one will churn" is 95% accurate and completely useless.

The metrics that matter:

  • AUC-ROC of 0.75+ is the minimum for automated interventions (emails, in-app messages)
  • AUC-ROC of 0.80+ is good enough for most use cases
  • AUC-ROC of 0.85+ is excellent and hard to beat without massive datasets
  • Precision at the top decile: of the customers your model flags as highest risk, what % actually churn? This matters more than overall accuracy because you'll act on the highest-risk segment first. Aim for 40%+ precision in the top 10%.
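Precision at the top decile is simple enough to compute by hand. A sketch, assuming you have a predicted score and an observed churn outcome for each customer in a holdout set:

```python
def precision_at_top_decile(scores: list[float], churned: list[bool]) -> float:
    """Of the 10% of customers with the highest predicted churn risk,
    what fraction actually churned?"""
    ranked = sorted(zip(scores, churned), key=lambda pair: pair[0], reverse=True)
    k = max(1, len(ranked) // 10)        # size of the top decile
    top = ranked[:k]
    return sum(1 for _, did_churn in top if did_churn) / k

# 20 customers; the model's top-2 picks contain one real churner:
scores = [0.95, 0.90] + [0.10] * 18
churned = [True, False] + [False] * 18
precision_at_top_decile(scores, churned)   # 0.5
```

By the article's bar, 0.5 here would clear the 40% target; the same ranking logic extends to precision at any cutoff you act on.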

The right algorithm (it's simpler than you think)

Use gradient boosted trees. Specifically XGBoost or LightGBM. I've seen teams spend months experimenting with deep learning, random forests, and ensemble methods. Gradient boosted trees win for SaaS churn prediction roughly 90% of the time because:

  • They handle mixed feature types (numeric, categorical) without heavy preprocessing
  • They're interpretable (you can explain why a specific customer was flagged)
  • They work well with datasets of 1,000-100,000 customers
  • They train in minutes, not hours

Skip neural networks unless you have 100,000+ customers with rich behavioral data. The added complexity isn't worth the marginal accuracy gain for most SaaS companies.

How to connect predictions to interventions

This is where most projects fail. The model outputs a churn probability score. Now what?

Build a tiered response system:

  • Score 0.8-1.0 (critical risk): Immediate CSM outreach + personalized retention email + in-app message. This is your "save now or lose them" bucket.
  • Score 0.6-0.8 (high risk): Automated email sequence + health score alert to the CSM team. Proactive but not emergency.
  • Score 0.4-0.6 (moderate risk): Soft-touch automated engagement. Feature discovery emails, usage tips, "did you know" prompts.
  • Score below 0.4: No action needed. Don't waste intervention budget on healthy customers.

Review these thresholds monthly. As your model improves and your interventions get better, adjust the cutoffs.
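Wired into code, the tiering above is just a threshold ladder. A minimal sketch using the article's starting cutoffs (the tier labels are shorthand for the playbooks described above):

```python
def intervention_tier(score: float) -> str:
    """Map a churn probability to a response tier.
    Thresholds are starting points -- review them monthly."""
    if score >= 0.8:
        return "critical"   # CSM outreach + retention email + in-app message
    if score >= 0.6:
        return "high"       # automated email sequence + CSM health alert
    if score >= 0.4:
        return "moderate"   # soft-touch engagement: tips, feature discovery
    return "healthy"        # no action; save the intervention budget

intervention_tier(0.85)   # "critical"
intervention_tier(0.30)   # "healthy"
```

Keeping the cutoffs in one function like this makes the monthly review a one-line change instead of a hunt through automation rules.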

Building vs. buying

If you have a data scientist on staff and 12+ months of clean data, build it custom. You'll get better features and tighter integration with your product. Budget 3-4 weeks for the first version.

If you don't, use an off-the-shelf solution. Amplitude, Mixpanel, and several customer success platforms now offer built-in churn prediction. They're less customizable but you'll be up and running in days, not months.

Either way, the model is 30% of the work. The intervention system is 70%. Don't start with the model. Start with what you'll do when the model tells you a customer is at risk. Check our health score monitoring experiment and competitive evaluation detection experiment for specific playbooks.

For the full picture of where prediction fits in the AI retention stack, read the complete AI churn reduction guide.


Written by Mark Ashworth

Founder of ChurnTools. I spend my time studying how SaaS companies lose customers and building tools to help them stop. Previously worked in SaaS growth and retention across multiple B2B products.
