[Image: Customer churn prediction dashboard showing retention analytics and risk scores]
Revenue Optimization & Analytics

Churn Prediction with ML: Retain Your Best Customers

Cesar Adames

Build machine learning models to predict customer churn, identify at-risk accounts, and deploy proactive retention strategies that protect revenue.

#churn-prediction #customer-retention #machine-learning #predictive-analytics

The Churn Problem

Customer acquisition costs are 5-25x higher than retention costs, yet many organizations spend 80% of their resources acquiring new customers while high-value existing customers quietly churn. A 5% improvement in retention can increase profits by 25-95%, making churn prediction one of the highest-ROI applications of machine learning.

This guide provides a comprehensive framework for building, deploying, and operationalizing churn prediction models that identify at-risk customers early enough to intervene successfully.

Understanding Churn Types

Contractual vs Non-Contractual Churn

Contractual Churn (Subscription businesses)

  • Clear churn event (cancellation, non-renewal)
  • Observable: You know exactly when churn occurred
  • Examples: SaaS, telecom, gym memberships
  • Prediction window: 30, 60, or 90 days before contract renewal

Non-Contractual Churn (Transactional businesses)

  • Fuzzy churn definition (customer stops buying)
  • Must define: What constitutes “inactive”?
  • Examples: E-commerce, retail, restaurants
  • Prediction approach: Probability of next purchase

Voluntary vs Involuntary Churn

Voluntary Churn

  • Customer actively decides to leave
  • Driven by dissatisfaction, competition, price
  • Preventable through intervention
  • Focus of predictive modeling

Involuntary Churn

  • Payment failures, expired cards
  • Not a satisfaction issue
  • Preventable through payment recovery (dunning; sketched below)
  • Different intervention strategy
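
Involuntary churn deserves its own playbook. The sketch below flags accounts with recent failed payments for a dunning (payment retry) sequence; the payments DataFrame and its column names are illustrative assumptions, not any specific billing API.

import pandas as pd

def flag_dunning_candidates(payments, observation_date, lookback_days=14):
    """
    Flag accounts with recent failed payments for a dunning retry
    sequence. Assumes a payments table with customer_id, charge_date,
    and status columns; adapt to your billing export.
    """
    recent_failures = payments[
        (payments['charge_date'] >= observation_date - pd.Timedelta(days=lookback_days)) &
        (payments['status'] == 'failed')
    ]

    # One row per customer: failure count and most recent failure date
    candidates = (
        recent_failures.groupby('customer_id')
        .agg(failed_attempts=('status', 'size'),
             last_failure=('charge_date', 'max'))
        .sort_values('last_failure', ascending=False)
    )

    return candidates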

Building Churn Prediction Models

Phase 1: Define Churn (Week 1)

Contractual Business Definition

import pandas as pd

def define_contractual_churn(customer_data, observation_date):
    """
    Clear churn definition for subscription business
    """
    # Churn = cancellation within the 90 days before the observation date
    churned_customers = customer_data[
        (customer_data['subscription_status'] == 'cancelled') &
        (customer_data['cancellation_date'] <= observation_date) &
        (customer_data['cancellation_date'] >= observation_date - pd.Timedelta(days=90))
    ]

    return churned_customers

Non-Contractual Business Definition

def define_noncontractual_churn(transaction_data, observation_date, inactivity_threshold_days=90):
    """
    Define churn based on inactivity period
    """
    # Get last purchase date per customer
    last_purchase = transaction_data.groupby('customer_id')['purchase_date'].max()

    # Calculate days since last purchase
    days_since_purchase = (observation_date - last_purchase).dt.days

    # Define churned customers
    churned_customers = days_since_purchase[days_since_purchase > inactivity_threshold_days].index

    return churned_customers

Key Decisions

  • Prediction window: 30/60/90 days ahead?
  • Observation period: How much historical data to use?
  • Churn definition: When is a customer considered churned? (see the labeling sketch below)
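
These three decisions come together when you assemble the training set. The sketch below (reusing the customer_data columns from above, plus an assumed nullable cancellation_date) labels every customer active at the observation date by whether they cancel within the prediction window. The critical discipline: features may only use data from on or before the observation date, or the model will leak the future.

def build_labeled_snapshot(customer_data, observation_date, prediction_window_days=60):
    """
    Label customers active at observation_date as churned (1) if they
    cancel within the prediction window, else retained (0).
    """
    # Customers who had signed up and not yet cancelled as of the observation date
    active = customer_data[
        (customer_data['signup_date'] <= observation_date) &
        (customer_data['cancellation_date'].isna() |
         (customer_data['cancellation_date'] > observation_date))
    ].copy()

    # Label: did they cancel inside the prediction window?
    window_end = observation_date + pd.Timedelta(days=prediction_window_days)
    active['churned'] = (
        active['cancellation_date'].notna() &
        (active['cancellation_date'] <= window_end)
    ).astype(int)

    return active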

Phase 2: Feature Engineering (Weeks 2-3)

Behavioral Features

def engineer_churn_features(customer_data, transaction_data, support_data, observation_date):
    """
    Create comprehensive churn prediction features
    """
    features = {}

    # === RFM Features ===
    features['recency_days'] = (observation_date - transaction_data.groupby('customer_id')['date'].max()).dt.days
    features['frequency'] = transaction_data.groupby('customer_id').size()
    features['monetary'] = transaction_data.groupby('customer_id')['amount'].sum()

    # === Engagement Trends ===
    # Last 30 vs previous 30 days
    last_30_days = transaction_data[transaction_data['date'] >= observation_date - pd.Timedelta(days=30)]
    prev_30_days = transaction_data[
        (transaction_data['date'] >= observation_date - pd.Timedelta(days=60)) &
        (transaction_data['date'] < observation_date - pd.Timedelta(days=30))
    ]

    # Align both windows on the same customer index so missing activity counts as 0
    last_counts = last_30_days.groupby('customer_id').size()
    prev_counts = prev_30_days.groupby('customer_id').size()
    all_ids = last_counts.index.union(prev_counts.index)
    features['purchases_last_30d'] = last_counts.reindex(all_ids, fill_value=0)
    features['purchases_prev_30d'] = prev_counts.reindex(all_ids, fill_value=0)
    features['purchase_trend'] = features['purchases_last_30d'] / features['purchases_prev_30d'].replace(0, 1)

    # === Product Engagement ===
    features['unique_products'] = transaction_data.groupby('customer_id')['product_id'].nunique()
    features['product_diversity'] = features['unique_products'] / features['frequency']
    features['avg_order_value'] = features['monetary'] / features['frequency']

    # === Support Interactions (Strong churn signal) ===
    support_counts = support_data.groupby('customer_id').size()
    features['support_tickets_total'] = support_counts
    features['support_tickets_30d'] = support_data[
        support_data['created_date'] >= observation_date - pd.Timedelta(days=30)
    ].groupby('customer_id').size()

    # Complaint indicators
    features['has_complaint'] = (
        support_data.groupby('customer_id')['ticket_type']
        .apply(lambda x: int('complaint' in x.values))
    )

    # === Temporal Patterns ===
    features['days_as_customer'] = (observation_date - customer_data['signup_date']).dt.days
    features['expected_purchase_interval'] = (
        transaction_data.groupby('customer_id')['date']
        .apply(lambda x: x.sort_values().diff().dt.days.mean())
    )
    features['days_overdue'] = features['recency_days'] - features['expected_purchase_interval']

    # === Contract/Subscription Features (if applicable) ===
    features['subscription_tier'] = customer_data['subscription_tier']
    features['days_until_renewal'] = (customer_data['renewal_date'] - observation_date).dt.days
    features['lifetime_value'] = customer_data['lifetime_value']

    # === Payment Issues (Strong involuntary churn signal) ===
    features['failed_payments'] = customer_data['failed_payment_count']
    features['payment_method_updated'] = customer_data['payment_updated_recently'].astype(int)

    return pd.DataFrame(features)

Leading Indicators of Churn

  • Decreased login frequency (flagged in the sketch below)
  • Reduced feature usage
  • Increased support ticket volume
  • Negative sentiment in communications
  • Failed payment attempts
  • Competitors mentioned
  • Downgrade requests
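
Most of these indicators reduce to a before/after comparison over two windows. As one illustration, the sketch below flags customers whose logins dropped by half week over week; the login_events table is an assumed event log, not any particular analytics API.

def flag_login_decline(login_events, observation_date, drop_threshold=0.5):
    """
    Flag customers whose logins in the last 14 days fell below
    drop_threshold times their prior 14-day count. Assumes an event
    log with customer_id and login_date columns.
    """
    recent = login_events[
        login_events['login_date'] >= observation_date - pd.Timedelta(days=14)
    ].groupby('customer_id').size()

    prior = login_events[
        (login_events['login_date'] >= observation_date - pd.Timedelta(days=28)) &
        (login_events['login_date'] < observation_date - pd.Timedelta(days=14))
    ].groupby('customer_id').size()

    # Only customers with prior activity can decline; missing recent activity counts as 0
    ratio = recent.reindex(prior.index, fill_value=0) / prior

    return ratio[ratio < drop_threshold].index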

Phase 3: Model Development (Weeks 4-6)

Train-Test Split Strategy

import pandas as pd

# Time-based split (more realistic for churn)
def temporal_train_test_split(data, test_months=3):
    """
    Split based on time to simulate production scenario
    """
    cutoff_date = data['observation_date'].max() - pd.DateOffset(months=test_months)

    train = data[data['observation_date'] < cutoff_date]
    test = data[data['observation_date'] >= cutoff_date]

    return train, test

train_data, test_data = temporal_train_test_split(customer_features)

Handling Class Imbalance

Churn is typically rare (5-30% churn rate), creating imbalanced datasets:

import numpy as np
import xgboost as xgb
from imblearn.over_sampling import SMOTE
from sklearn.utils.class_weight import compute_class_weight

# Option 1: SMOTE (Synthetic Minority Over-sampling)
smote = SMOTE(sampling_strategy=0.5, random_state=42)
X_resampled, y_resampled = smote.fit_resample(X_train, y_train)

# Option 2: Class weights (preferred for large datasets)
class_weights = compute_class_weight('balanced', classes=np.unique(y_train), y=y_train)
class_weight_dict = dict(enumerate(class_weights))

# Use in model training
model = xgb.XGBClassifier(scale_pos_weight=class_weights[1]/class_weights[0])

Model Training & Evaluation

import numpy as np
import xgboost as xgb
from sklearn.metrics import roc_auc_score, classification_report, precision_recall_curve

# Train XGBoost (top performer for churn prediction)
model = xgb.XGBClassifier(
    n_estimators=500,
    max_depth=6,
    learning_rate=0.01,
    scale_pos_weight=10,  # Handle imbalance
    eval_metric='auc',
    early_stopping_rounds=50
)

model.fit(
    X_train, y_train,
    eval_set=[(X_test, y_test)],
    verbose=False
)

# Predict probabilities (more useful than binary predictions)
churn_probabilities = model.predict_proba(X_test)[:, 1]

# Evaluate
auc_score = roc_auc_score(y_test, churn_probabilities)
print(f"AUC-ROC: {auc_score:.3f}")

# Find optimal threshold (maximize F1 or a custom business metric)
precision, recall, thresholds = precision_recall_curve(y_test, churn_probabilities)
f1_scores = 2 * (precision * recall) / (precision + recall + 1e-12)  # avoid 0/0
optimal_threshold = thresholds[np.argmax(f1_scores[:-1])]  # last P/R point has no threshold

print(f"Optimal Threshold: {optimal_threshold:.3f}")

Model Comparison

from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression

models = {
    'XGBoost': xgb.XGBClassifier(scale_pos_weight=10),
    'Random Forest': RandomForestClassifier(class_weight='balanced'),
    'Gradient Boosting': GradientBoostingClassifier(),
    'Logistic Regression': LogisticRegression(class_weight='balanced')
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred_proba = model.predict_proba(X_test)[:, 1]
    auc = roc_auc_score(y_test, y_pred_proba)
    results[name] = auc

# Display results
for name, auc in sorted(results.items(), key=lambda x: x[1], reverse=True):
    print(f"{name}: {auc:.3f}")

Typical Performance

  • Good churn model: AUC 0.75-0.85
  • Excellent churn model: AUC 0.85-0.95
  • Below 0.75: Need better features or more data
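
AUC summarizes ranking quality across all thresholds, but retention teams act only on the top of the score distribution. A complementary sanity check is top-decile lift: how much more the highest-scored 10% of customers churn than the base rate. A quick sketch:

import numpy as np
import pandas as pd

def top_decile_lift(y_true, churn_probabilities):
    """
    Churn rate in the top 10% of scores divided by the overall churn
    rate. A lift of 3 means the top decile churns at 3x the base rate.
    """
    scored = pd.DataFrame({'actual': np.asarray(y_true), 'score': churn_probabilities})
    cutoff = scored['score'].quantile(0.9)
    top_decile = scored[scored['score'] >= cutoff]

    return top_decile['actual'].mean() / scored['actual'].mean()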

Phase 4: Intervention Strategy (Week 7)

Risk Segmentation

def segment_churn_risk(churn_probabilities, customer_value, customer_ids):
    """
    Segment customers for targeted intervention
    """
    """
    # Define risk tiers
    conditions = [
        (churn_probabilities >= 0.7),  # High risk
        (churn_probabilities >= 0.4) & (churn_probabilities < 0.7),  # Medium risk
        (churn_probabilities < 0.4)  # Low risk
    ]
    risk_tiers = ['High', 'Medium', 'Low']
    risk_level = np.select(conditions, risk_tiers)

    # Combine with customer value
    segments = pd.DataFrame({
        'customer_id': customer_ids,
        'churn_probability': churn_probabilities,
        'risk_level': risk_level,
        'customer_value': customer_value
    })

    # Prioritize high-value, high-risk customers
    segments['priority'] = segments.apply(
        lambda x: 'Critical' if x['risk_level'] == 'High' and x['customer_value'] > 10000
        else 'High' if x['risk_level'] in ['High', 'Medium'] and x['customer_value'] > 5000
        else 'Medium' if x['risk_level'] == 'High'
        else 'Low',
        axis=1
    )

    return segments

Retention Campaign Design

# Map risk segment to intervention
intervention_strategy = {
    'Critical': {
        'channel': 'Executive outreach',
        'offer': 'Custom retention package',
        'timeline': 'Immediate',
        'budget': 500  # per customer
    },
    'High': {
        'channel': 'Account manager call',
        'offer': '20% discount + 3 months free',
        'timeline': 'Within 48 hours',
        'budget': 200
    },
    'Medium': {
        'channel': 'Personalized email',
        'offer': '15% discount',
        'timeline': 'Within 1 week',
        'budget': 50
    },
    'Low': {
        'channel': 'Automated re-engagement',
        'offer': 'Feature highlight',
        'timeline': 'Monthly',
        'budget': 5
    }
}
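
To put the mapping to work, join the strategy table onto the scored segments from segment_churn_risk and total the planned spend per tier; a minimal sketch:

def plan_interventions(segments, intervention_strategy):
    """
    Attach channel, offer, timeline, and budget to each scored customer
    and summarize planned spend per priority tier.
    """
    plan = segments.copy()
    for field in ['channel', 'offer', 'timeline', 'budget']:
        plan[field] = plan['priority'].map(lambda tier: intervention_strategy[tier][field])

    # Sanity-check total campaign cost before launch
    budget_summary = plan.groupby('priority')['budget'].agg(['count', 'sum'])

    return plan, budget_summary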

Production Deployment

Real-Time Scoring Pipeline

from datetime import datetime

import joblib
from airflow import DAG
from airflow.operators.python import PythonOperator

# Daily churn scoring DAG
dag = DAG(
    'churn_scoring_daily',
    schedule_interval='@daily',
    start_date=datetime(2025, 1, 1),
    catchup=False  # score from today onward; don't backfill past days
)

def score_customers(**context):
    """
    Score all active customers for churn risk
    """
    # Load latest customer, transaction, and support data
    # (placeholder loaders, in the same spirit as get_active_customers)
    customers = get_active_customers()
    transactions = get_recent_transactions()
    support_tickets = get_recent_support_tickets()

    # Engineer features (same helper as in Phase 2)
    features = engineer_churn_features(
        customers, transactions, support_tickets,
        observation_date=datetime.now()
    )

    # Load model
    model = joblib.load('churn_model_v3.pkl')

    # Score
    churn_scores = model.predict_proba(features)[:, 1]

    # Segment and prioritize
    segments = segment_churn_risk(churn_scores, customers['lifetime_value'], customers['customer_id'])

    # Save to database
    save_churn_scores(segments)

    # Trigger interventions
    trigger_retention_campaigns(segments)

score_task = PythonOperator(
    task_id='score_customers',
    python_callable=score_customers,
    dag=dag
)

A/B Testing Framework

def retention_experiment(high_risk_customers, test_percentage=0.5):
    """
    A/B test retention intervention effectiveness
    """
    # Split high-risk customers
    treatment = high_risk_customers.sample(frac=test_percentage, random_state=42)
    control = high_risk_customers.drop(treatment.index)

    # Treatment: Send retention offer
    send_retention_campaign(treatment, offer='20% discount')

    # Control: No intervention

    # Track results after 30 days
    results = {
        'treatment_size': len(treatment),
        'control_size': len(control),
        'treatment_churn_rate': calculate_churn_rate(treatment, days=30),
        'control_churn_rate': calculate_churn_rate(control, days=30)
    }

    # Calculate lift
    results['churn_reduction'] = (
        (results['control_churn_rate'] - results['treatment_churn_rate']) /
        results['control_churn_rate']
    )

    return results
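
Before rolling the offer out, confirm that the measured difference in churn rates is unlikely to be noise. A two-proportion z-test (here via statsmodels) is a reasonable default for this design:

from statsmodels.stats.proportion import proportions_ztest

def churn_rate_significance(results):
    """
    Two-proportion z-test on treatment vs control churn rates,
    reconstructing churn counts from the retention_experiment() output.
    """
    churn_counts = [
        round(results['treatment_churn_rate'] * results['treatment_size']),
        round(results['control_churn_rate'] * results['control_size'])
    ]
    group_sizes = [results['treatment_size'], results['control_size']]

    z_stat, p_value = proportions_ztest(churn_counts, group_sizes)
    return z_stat, p_value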

Monitoring & Model Refresh

def monitor_model_performance():
    """
    Track model performance over time
    """
    # Get predictions from last 30 days
    recent_predictions = get_predictions(days=30)

    # Get actual churn outcomes (where available)
    actual_churn = get_actual_churn(recent_predictions['customer_id'])

    # Calculate metrics
    if len(actual_churn) >= 100:  # Minimum sample size
        auc = roc_auc_score(actual_churn['churned'], recent_predictions['churn_probability'])

        # Alert if performance degrades
        if auc < 0.70:  # Threshold
            alert_team(f"Churn model AUC degraded to {auc:.3f} - consider retraining")

        # Log metric
        log_metric('churn_model_auc', auc)

    # Check for data drift
    check_feature_drift(recent_predictions)
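
check_feature_drift above is a placeholder; one common implementation is the population stability index (PSI), which compares a recent feature or score distribution against the training baseline. A minimal sketch:

import numpy as np

def population_stability_index(expected, actual, bins=10):
    """
    PSI between a baseline (training) sample and a recent sample.
    Rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift,
    > 0.25 significant shift worth investigating.
    """
    # Decile edges from the baseline; unique() guards against duplicates on spiky data
    edges = np.unique(np.percentile(expected, np.linspace(0, 100, bins + 1)))

    # Clip both samples into the baseline range so nothing falls outside the bins
    expected_pct = np.histogram(np.clip(expected, edges[0], edges[-1]), bins=edges)[0] / len(expected)
    actual_pct = np.histogram(np.clip(actual, edges[0], edges[-1]), bins=edges)[0] / len(actual)

    # Floor the proportions to avoid log(0) and division by zero
    expected_pct = np.clip(expected_pct, 1e-6, None)
    actual_pct = np.clip(actual_pct, 1e-6, None)

    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))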

Business Impact Measurement

ROI Calculation

def calculate_retention_roi(intervention_results):
    """
    Calculate ROI of churn prediction + intervention
    """
    # Customers saved from churning
    customers_saved = (
        intervention_results['control_churn_rate'] -
        intervention_results['treatment_churn_rate']
    ) * intervention_results['treatment_size']

    # Revenue saved (assuming average customer value)
    avg_customer_ltv = 5000
    revenue_saved = customers_saved * avg_customer_ltv

    # Intervention costs
    total_cost = intervention_results['treatment_size'] * 200  # $200 per customer

    # ROI
    roi = (revenue_saved - total_cost) / total_cost

    return {
        'customers_saved': customers_saved,
        'revenue_saved': revenue_saved,
        'total_cost': total_cost,
        'roi': roi,
        'roi_percentage': f"{roi:.1%}"
    }

Example Results

  • Treatment group: 1,000 high-risk customers
  • Control churn rate: 40%
  • Treatment churn rate: 25%
  • Churn reduction: 37.5%
  • Customers saved: 150
  • Revenue saved: $750,000
  • Intervention cost: $200,000
  • ROI: 275%
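
Plugging these numbers into calculate_retention_roi from above reproduces the arithmetic:

example = calculate_retention_roi({
    'treatment_size': 1000,
    'control_churn_rate': 0.40,
    'treatment_churn_rate': 0.25
})

# customers_saved = (0.40 - 0.25) * 1000      = 150
# revenue_saved   = 150 * $5,000              = $750,000
# total_cost      = 1000 * $200               = $200,000
# roi             = ($750k - $200k) / $200k   = 2.75  ->  "275.0%"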

Advanced Techniques

Survival Analysis

from lifelines import CoxPHFitter

# Model time-to-churn (not just churn/no-churn)
def survival_analysis_churn(customer_data):
    """
    Predict when customers will churn, not just if
    """
    # Prepare survival data
    survival_data = pd.DataFrame({
        'duration': customer_data['days_as_customer'],
        'event': customer_data['churned'].astype(int),
        # Add features
        'recency': customer_data['recency_days'],
        'frequency': customer_data['frequency'],
        'support_tickets': customer_data['support_tickets_30d']
    })

    # Fit Cox Proportional Hazards model
    cph = CoxPHFitter()
    cph.fit(survival_data, duration_col='duration', event_col='event')

    # Predicted median time-to-churn per customer (in days), using the
    # same covariates the model was fit on
    median_survival_time = cph.predict_median(survival_data)

    return median_survival_time

Causal Inference

from econml.dml import CausalForestDML

# Estimate treatment effect of retention campaigns
def estimate_treatment_effect(historical_campaigns):
    """
    Understand which customers benefit most from intervention
    """
    # Features
    X = historical_campaigns[['recency', 'frequency', 'monetary', 'support_tickets']]

    # Treatment (received retention offer)
    T = historical_campaigns['received_offer']

    # Outcome (churned or not)
    Y = historical_campaigns['churned']

    # Estimate heterogeneous treatment effects
    causal_forest = CausalForestDML()
    causal_forest.fit(Y, T, X=X)

    # Predict who benefits most from treatment
    treatment_effects = causal_forest.effect(X)

    # Target customers with the largest expected benefit
    high_benefit_customers = X[treatment_effects < -0.2]  # churn probability cut by 20+ points

    return high_benefit_customers

Conclusion

Churn prediction transforms customer retention from reactive firefighting to proactive strategy. By identifying at-risk customers early, understanding the drivers of churn, and deploying targeted interventions, organizations can protect revenue, improve customer lifetime value, and optimize retention investment.

The key is building accurate models with strong predictive features, integrating predictions into operational workflows, and continuously measuring intervention effectiveness to optimize ROI.

Next Steps:

  1. Define your churn metric and prediction window
  2. Engineer behavioral features from customer data
  3. Build baseline churn model and establish performance benchmarks
  4. Design retention intervention strategy
  5. Deploy production scoring pipeline and measure business impact

Ready to Transform Your Business?

Let's discuss how our AI and technology solutions can drive revenue growth for your organization.