Feature Engineering for Revenue Models: Data Science Mastery
Master advanced feature engineering techniques that transform raw data into powerful predictors for revenue optimization and customer value models.
The Feature Engineering Advantage
In machine learning, the quality of features often matters more than the choice of algorithm. A simple linear model with well-engineered features can outperform a complex deep learning model trained on raw, unprocessed data. For revenue prediction models—churn forecasting, lifetime value estimation, price optimization—feature engineering is often the difference between mediocre and exceptional performance.
This guide explores advanced feature engineering techniques specifically designed for revenue optimization use cases, with practical examples and production-ready code.
Core Feature Engineering Principles
1. Domain Knowledge is Critical
Start with Business Understanding
- What drives customer behavior in your industry?
- What events signal high/low value customers?
- Which temporal patterns matter? (weekly, monthly, seasonal)
- What external factors influence revenue? (economy, competition, weather)
Revenue Model Feature Categories
- Behavioral: What customers do (purchases, engagement, support)
- Temporal: When they do it (recency, frequency, seasonality)
- Relational: Who they interact with (network effects, referrals)
- Contextual: Circumstances surrounding actions (device, location, channel)
- Predictive: Early signals of future behavior (leading indicators)
2. Feature Transformation Hierarchy
Level 1: Raw Features (Minimal Processing)
- Direct database columns
- Basic type conversions
- Missing value imputation
Level 2: Derived Features (Single Column Transformations)
- Logarithmic/exponential transforms
- Binning/discretization
- One-hot encoding
- Target encoding
Level 3: Aggregate Features (Multi-Row Operations)
- Counts, sums, averages over time windows
- Rolling statistics
- Cumulative metrics
Level 4: Interaction Features (Multi-Column Relationships)
- Ratios and percentages
- Cross-products
- Polynomial features
- Domain-specific combinations
Level 5: Advanced Features (Complex Derivations)
- Time-series decomposition
- Embeddings from sequences
- Graph-based features
- External data integration
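To make the first four levels concrete, here is a small pandas sketch; the table and column names (amount, signup_channel) are illustrative assumptions rather than a fixed schema:

import numpy as np
import pandas as pd

# Illustrative transaction table (columns are assumptions for this sketch)
df = pd.DataFrame({
    'customer_id': [1, 1, 2, 2, 2],
    'amount': [20.0, 35.0, 5.0, np.nan, 12.0],
    'signup_channel': ['ads', 'ads', 'organic', 'organic', 'referral'],
    'date': pd.to_datetime(['2024-01-03', '2024-02-10', '2024-01-15', '2024-02-01', '2024-03-20'])
})

# Level 1: raw features with basic imputation
df['amount'] = df['amount'].fillna(df['amount'].median())

# Level 2: single-column transforms
df['log_amount'] = np.log1p(df['amount'])
df['amount_bin'] = pd.qcut(df['amount'], q=2, labels=['low', 'high'])
df = pd.get_dummies(df, columns=['signup_channel'])

# Level 3: aggregates over multiple rows
agg = df.groupby('customer_id')['amount'].agg(total_spend='sum', order_count='count')

# Level 4: interactions between columns
agg['avg_order_value'] = agg['total_spend'] / agg['order_count']
print(agg)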
Revenue-Specific Feature Engineering Patterns
Pattern 1: RFM Enhancement
Beyond Basic RFM
Standard RFM (Recency, Frequency, Monetary) is just the starting point:
def engineer_advanced_rfm(transaction_data, reference_date):
    """
    Create advanced RFM features for customer revenue models.
    Expects one row per transaction with customer_id, amount, last_purchase_date,
    first_purchase_date and days_between_purchases columns.
    """
    by_customer = transaction_data.groupby('customer_id')
    features = {}

    # Basic RFM
    features['recency_days'] = (reference_date - by_customer['last_purchase_date'].max()).dt.days
    features['frequency'] = by_customer.size()
    features['monetary'] = by_customer['amount'].sum()

    # Advanced Recency
    features['days_since_first_purchase'] = (reference_date - by_customer['first_purchase_date'].min()).dt.days
    features['customer_age_days'] = features['days_since_first_purchase']
    features['recency_ratio'] = features['recency_days'] / features['customer_age_days']  # How recent vs. total tenure

    # Advanced Frequency
    features['purchase_rate'] = features['frequency'] / features['customer_age_days']  # Purchases per day
    features['purchase_interval_mean'] = by_customer['days_between_purchases'].mean()
    features['purchase_interval_std'] = by_customer['days_between_purchases'].std()
    features['purchase_regularity'] = features['purchase_interval_std'] / features['purchase_interval_mean']  # Coefficient of variation

    # Advanced Monetary
    features['average_order_value'] = features['monetary'] / features['frequency']
    features['monetary_trend'] = calculate_monetary_trend(transaction_data)  # Increasing/decreasing spend
    features['max_single_purchase'] = by_customer['amount'].max()
    features['monetary_concentration'] = features['max_single_purchase'] / features['monetary']

    return features
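The helper calculate_monetary_trend is referenced above but not defined. A minimal sketch of one plausible interpretation (the per-customer slope of order amounts over time) follows; the transaction_date column is an assumption:

import numpy as np

def calculate_monetary_trend(transaction_data):
    """
    Hypothetical helper: per-customer slope of order amount over purchase order.
    Positive values suggest growing spend, negative values shrinking spend.
    """
    def slope(amounts):
        if len(amounts) < 2:
            return 0.0
        x = np.arange(len(amounts))
        # The first coefficient of a degree-1 fit is the slope
        return float(np.polyfit(x, amounts.to_numpy(dtype=float), 1)[0])

    return (transaction_data
            .sort_values('transaction_date')
            .groupby('customer_id')['amount']
            .apply(slope))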
Temporal RFM Evolution
def temporal_rfm_features(transaction_data, window_days=(30, 90, 180, 365)):
    """
    Calculate RFM for multiple time windows to capture trends
    """
    features = {}

    for window in window_days:
        window_data = transaction_data[transaction_data['days_ago'] <= window]
        features[f'frequency_{window}d'] = window_data.groupby('customer_id').size()
        features[f'monetary_{window}d'] = window_data.groupby('customer_id')['amount'].sum()
        features[f'aov_{window}d'] = features[f'monetary_{window}d'] / features[f'frequency_{window}d']

    # Trend features (comparing windows)
    features['frequency_trend_90_to_30'] = features['frequency_30d'] / features['frequency_90d']
    features['monetary_acceleration'] = (features['monetary_30d'] / 30) / (features['monetary_365d'] / 365)

    return features
Pattern 2: Behavioral Sequences
Extracting Patterns from Event Sequences
def sequence_features(customer_events):
    """
    Engineer features from sequential customer behavior
    """
    features = {}

    # Session-based features
    features['avg_session_duration'] = customer_events.groupby('session_id')['duration'].mean()
    features['max_session_duration'] = customer_events.groupby('session_id')['duration'].max()
    features['total_sessions'] = customer_events['session_id'].nunique()

    # Event type patterns
    event_counts = customer_events['event_type'].value_counts()
    features['event_diversity'] = event_counts.size  # Number of unique event types
    features['event_entropy'] = calculate_entropy(event_counts)  # Distribution uniformity

    # Conversion funnel metrics
    features['browse_to_cart_ratio'] = event_counts.get('add_to_cart', 0) / max(event_counts.get('product_view', 1), 1)
    features['cart_to_purchase_ratio'] = event_counts.get('purchase', 0) / max(event_counts.get('add_to_cart', 1), 1)
    features['overall_conversion_rate'] = event_counts.get('purchase', 0) / max(event_counts.get('product_view', 1), 1)

    # Temporal event patterns
    features['avg_time_between_events'] = customer_events['timestamp'].diff().mean()
    observed_span = (customer_events['timestamp'].max() - customer_events['timestamp'].min()).total_seconds()
    features['event_velocity'] = len(customer_events) / max(observed_span, 1) * 3600  # Events per hour

    return features
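calculate_entropy is used above but not defined in the snippet. A minimal Shannon-entropy sketch over the event-type counts (my assumption of what the helper does):

import numpy as np

def calculate_entropy(counts):
    """
    Shannon entropy of an event-type distribution.
    counts: a pandas Series of counts per event type (e.g. from value_counts()).
    Higher values mean activity is spread evenly across event types.
    """
    probs = counts / counts.sum()
    probs = probs[probs > 0]  # Guard against log(0)
    return float(-(probs * np.log2(probs)).sum())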
N-Gram Features (Product Purchase Sequences)
from collections import Counter
def product_sequence_ngrams(purchase_history, n=2):
    """
    Create n-gram features from product purchase sequences
    """
    sequences = purchase_history.groupby('customer_id')['product_id'].apply(list)
    features = {}

    for customer_id, products in sequences.items():
        # Bigrams (pairs of consecutive products) when n=2
        bigrams = [tuple(products[i:i + n]) for i in range(len(products) - n + 1)]
        bigram_counts = Counter(bigrams)
        features[customer_id] = {
            'unique_bigrams': len(bigram_counts),
            'most_common_bigram_freq': bigram_counts.most_common(1)[0][1] if bigrams else 0,
            'bigram_diversity': len(bigram_counts) / max(len(bigrams), 1)
        }

    return features
Pattern 3: Trend and Momentum Features
Capturing Directional Changes
import numpy as np

def trend_momentum_features(time_series_data, customer_id_col='customer_id', metric_col='revenue', date_col='date'):
    """
    Calculate trend and momentum features
    """
    features = {}

    for customer_id in time_series_data[customer_id_col].unique():
        customer_data = time_series_data[time_series_data[customer_id_col] == customer_id].sort_values(date_col)

        # Linear trend over the customer's observation history
        x = np.arange(len(customer_data))
        y = customer_data[metric_col].values
        slope, intercept = np.polyfit(x, y, 1)

        features[customer_id] = {
            'trend_slope': slope,
            'trend_direction': 1 if slope > 0 else -1,
            # Momentum (rate of change over the last 30/90 observations)
            'momentum_30d': (customer_data[metric_col].iloc[-1] - customer_data[metric_col].iloc[-30]) / 30 if len(customer_data) >= 30 else 0,
            'momentum_90d': (customer_data[metric_col].iloc[-1] - customer_data[metric_col].iloc[-90]) / 90 if len(customer_data) >= 90 else 0,
            # Acceleration (change in momentum)
            'acceleration': calculate_second_derivative(customer_data[metric_col]),
            # Volatility
            'volatility_30d': customer_data[metric_col].rolling(30).std().iloc[-1],
            'coefficient_of_variation': customer_data[metric_col].std() / customer_data[metric_col].mean()
        }

    return features
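calculate_second_derivative is another referenced helper with no definition here. A plausible sketch (an assumption, not the article's canonical version) approximates it as the mean second difference of the metric series:

import numpy as np

def calculate_second_derivative(metric_series):
    """
    Hypothetical helper: discrete second derivative of a customer's metric series.
    Approximated as the mean of second-order differences; positive values suggest
    spend is accelerating, negative values that growth is slowing.
    """
    values = metric_series.to_numpy(dtype=float)
    if len(values) < 3:
        return 0.0
    return float(np.diff(values, n=2).mean())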
Pattern 4: Seasonality and Cyclicality
Encoding Temporal Patterns
import numpy as np

def seasonality_features(transaction_data):
    """
    Extract seasonal purchase patterns
    """
    features = {}

    # Cyclical encoding (preserves circular nature of time)
    transaction_data['month_sin'] = np.sin(2 * np.pi * transaction_data['month'] / 12)
    transaction_data['month_cos'] = np.cos(2 * np.pi * transaction_data['month'] / 12)
    transaction_data['day_of_week_sin'] = np.sin(2 * np.pi * transaction_data['day_of_week'] / 7)
    transaction_data['day_of_week_cos'] = np.cos(2 * np.pi * transaction_data['day_of_week'] / 7)

    # Seasonal purchase preferences
    seasonal_spend = transaction_data.groupby(['customer_id', 'quarter'])['amount'].sum().unstack(fill_value=0)
    features['strongest_quarter'] = seasonal_spend.idxmax(axis=1)
    features['seasonal_concentration'] = seasonal_spend.max(axis=1) / seasonal_spend.sum(axis=1)

    # Day-of-week patterns (assumes day_of_week uses 0 = Monday, so 5 and 6 are the weekend)
    dow_purchases = transaction_data.groupby(['customer_id', 'day_of_week']).size().unstack(fill_value=0)
    features['weekend_shopper'] = (dow_purchases[[5, 6]].sum(axis=1) / dow_purchases.sum(axis=1) > 0.5).astype(int)
    features['weekday_diversity'] = dow_purchases.astype(bool).sum(axis=1)  # How many different days they shop on

    return features
Pattern 5: Customer Lifecycle Features
Maturity and Lifecycle Stage
import numpy as np
import pandas as pd

def lifecycle_features(customer_data, reference_date):
    """
    Features representing customer lifecycle position
    """
    features = {}

    # Time-based lifecycle
    features['days_as_customer'] = (reference_date - customer_data['first_purchase_date']).dt.days
    features['lifecycle_stage'] = pd.cut(
        features['days_as_customer'],
        bins=[0, 30, 90, 365, float('inf')],
        labels=['new', 'growing', 'established', 'mature']
    )

    # Engagement-based lifecycle
    features['purchases_last_30d'] = count_purchases(customer_data, window_days=30)
    features['purchases_last_90d'] = count_purchases(customer_data, window_days=90)
    features['engagement_trend'] = features['purchases_last_30d'] / np.maximum(features['purchases_last_90d'] / 3, 0.1)

    # Lifecycle transitions
    features['days_since_last_purchase'] = (reference_date - customer_data['last_purchase_date']).dt.days
    features['expected_next_purchase'] = predict_next_purchase_days(customer_data)
    features['overdue_days'] = features['days_since_last_purchase'] - features['expected_next_purchase']
    features['is_at_risk'] = (features['overdue_days'] > 30).astype(int)

    # Value progression
    features['first_order_value'] = customer_data['first_order_amount']
    features['last_order_value'] = customer_data['last_order_amount']
    features['value_progression'] = features['last_order_value'] / features['first_order_value']

    return features
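count_purchases and predict_next_purchase_days are assumed helpers. Minimal sketches under one possible data layout, where each customer row carries a purchase_days_ago list and a purchase_interval_mean value (both hypothetical column names):

import numpy as np

def count_purchases(customer_data, window_days):
    """
    Hypothetical helper: purchases per customer within the last window_days,
    assuming a purchase_days_ago column holding a list of days-ago values per customer.
    """
    return customer_data['purchase_days_ago'].apply(
        lambda days: int(np.sum(np.asarray(days) <= window_days))
    )

def predict_next_purchase_days(customer_data):
    """
    Hypothetical helper: naive expectation of the gap until the next purchase,
    using each customer's historical mean interval between purchases.
    """
    return customer_data['purchase_interval_mean']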
Pattern 6: Cross-Feature Interactions
Revenue-Relevant Feature Combinations
import numpy as np

def interaction_features(customer_features):
    """
    Create meaningful feature interactions for revenue models
    """
    interactions = {}

    # Engagement x Value
    interactions['engagement_value_score'] = customer_features['frequency'] * customer_features['avg_order_value']

    # Recency x Frequency (engagement quality)
    interactions['rf_score'] = customer_features['frequency'] / np.log1p(customer_features['recency_days'])

    # Growth indicators
    interactions['growth_velocity'] = customer_features['frequency_trend'] * customer_features['monetary_trend']

    # Efficiency metrics
    interactions['value_per_session'] = customer_features['total_revenue'] / customer_features['total_sessions']
    interactions['conversion_efficiency'] = customer_features['conversion_rate'] * customer_features['avg_order_value']

    # Risk indicators
    interactions['churn_risk_score'] = (
        customer_features['recency_days'] * customer_features['purchase_interval_std'] /
        np.maximum(customer_features['frequency'], 1)
    )

    # Product affinity
    interactions['category_focus'] = customer_features['primary_category_purchases'] / customer_features['total_purchases']
    interactions['cross_category_buyer'] = (customer_features['unique_categories'] > 3).astype(int)

    return interactions
Advanced Techniques
Automated Feature Engineering
Featuretools Example
import featuretools as ft

# Create entity set
es = ft.EntitySet(id='customer_revenue')

# Add entities (tables)
es = es.add_dataframe(
    dataframe_name='customers',
    dataframe=customers_df,
    index='customer_id'
)
es = es.add_dataframe(
    dataframe_name='transactions',
    dataframe=transactions_df,
    index='transaction_id',
    time_index='transaction_date'
)

# Define relationship (parent table, parent key, child table, child key)
es = es.add_relationship('customers', 'customer_id', 'transactions', 'customer_id')

# Automated feature generation
feature_matrix, feature_defs = ft.dfs(
    entityset=es,
    target_dataframe_name='customers',
    agg_primitives=['sum', 'mean', 'max', 'min', 'std', 'count', 'trend'],
    trans_primitives=['month', 'weekday', 'is_weekend'],
    max_depth=2
)
Target Encoding
Encoding Categorical Features with Target Information
import numpy as np
from sklearn.model_selection import KFold

def target_encode(train_data, test_data, categorical_col, target_col, n_splits=5, smoothing=10):
    """
    Target encoding with cross-validation to prevent overfitting
    """
    # Calculate global mean
    global_mean = train_data[target_col].mean()

    # Create folds for train data
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=42)
    train_encoded = np.zeros(len(train_data))

    for train_idx, val_idx in kf.split(train_data):
        # Calculate mean target by category on the train fold
        category_means = train_data.iloc[train_idx].groupby(categorical_col)[target_col].mean()

        # Apply smoothing (regularization toward the global mean)
        category_counts = train_data.iloc[train_idx].groupby(categorical_col).size()
        smoothed_means = (category_means * category_counts + global_mean * smoothing) / (category_counts + smoothing)

        # Encode the validation fold
        train_encoded[val_idx] = train_data.iloc[val_idx][categorical_col].map(smoothed_means).fillna(global_mean)

    # Encode test data using the full train data
    category_means = train_data.groupby(categorical_col)[target_col].mean()
    category_counts = train_data.groupby(categorical_col).size()
    smoothed_means = (category_means * category_counts + global_mean * smoothing) / (category_counts + smoothing)
    test_encoded = test_data[categorical_col].map(smoothed_means).fillna(global_mean)

    return train_encoded, test_encoded
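A brief usage sketch (acquisition_channel and revenue_12m are placeholder column names, not part of the article's schema):

# Hypothetical usage: encode acquisition channel against 12-month revenue
train_df['channel_encoded'], test_df['channel_encoded'] = target_encode(
    train_data=train_df,
    test_data=test_df,
    categorical_col='acquisition_channel',
    target_col='revenue_12m',
    n_splits=5,
    smoothing=10
)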
Embedding Features
Neural Embeddings for Categorical Variables
import tensorflow as tf

def create_embeddings(categorical_data, embedding_dim=8):
    """
    Learn embeddings for high-cardinality categorical features
    """
    # Build embedding model
    num_categories = categorical_data.nunique()
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(input_dim=num_categories, output_dim=embedding_dim),
        tf.keras.layers.Flatten()
    ])

    # Train embeddings (on a supervised task)
    # ... training code ...

    # Extract learned embeddings
    embeddings = model.layers[0].get_weights()[0]
    return embeddings
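The training step above is intentionally elided. A sketch of one way to fill it in, assuming integer-encoded category ids and a numeric revenue target (not necessarily the article's setup): attach a small regression head, fit against revenue, then read the embedding weights back out.

import numpy as np
import tensorflow as tf

def train_category_embeddings(category_ids, revenue, num_categories, embedding_dim=8, epochs=5):
    """
    Sketch: learn category embeddings by predicting revenue from the category alone.
    category_ids: integer codes in [0, num_categories); revenue: numeric target.
    """
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(input_dim=num_categories, output_dim=embedding_dim),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(1)  # Regression head so the embedding is trained on a revenue signal
    ])
    model.compile(optimizer='adam', loss='mse')
    x = np.asarray(category_ids).reshape(-1, 1)
    y = np.asarray(revenue, dtype='float32')
    model.fit(x, y, epochs=epochs, verbose=0)

    # Each row of the embedding weight matrix is the learned vector for one category
    return model.layers[0].get_weights()[0]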
Feature Selection
Correlation-Based Selection
import numpy as np

def remove_correlated_features(feature_matrix, threshold=0.95):
    """
    Remove highly correlated features to reduce multicollinearity
    """
    corr_matrix = feature_matrix.corr().abs()

    # Select the upper triangle of the correlation matrix
    upper = corr_matrix.where(np.triu(np.ones(corr_matrix.shape), k=1).astype(bool))

    # Find features with correlation greater than the threshold
    to_drop = [column for column in upper.columns if any(upper[column] > threshold)]

    return feature_matrix.drop(columns=to_drop)
Feature Importance
import numpy as np
import xgboost as xgb
from sklearn.inspection import permutation_importance

def select_important_features(X, y, n_features=50):
    """
    Select top N features by importance
    """
    # Train model
    model = xgb.XGBRegressor(n_estimators=100, random_state=42)
    model.fit(X, y)

    # Get built-in feature importances
    importances = model.feature_importances_

    # Select the top N features
    indices = np.argsort(importances)[-n_features:]
    top_features = X.columns[indices]

    return top_features
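Built-in gain importances can favor high-cardinality or leaky features, so permutation_importance (imported above) is a useful model-agnostic cross-check. A minimal sketch, assuming an already-fitted model and a held-out X_val, y_val split:

import numpy as np
from sklearn.inspection import permutation_importance

def select_by_permutation(model, X_val, y_val, n_features=50):
    """
    Rank features by how much shuffling each one degrades validation score.
    Assumes model is already fit and X_val/y_val are a held-out split.
    """
    result = permutation_importance(model, X_val, y_val, n_repeats=5, random_state=42)
    indices = np.argsort(result.importances_mean)[-n_features:]
    return X_val.columns[indices]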
Production Considerations
Feature Store Architecture
class FeatureStore:
    """
    Simple feature store for online/offline feature serving
    """
    def __init__(self, online_db, offline_db):
        self.online_db = online_db    # Redis/DynamoDB for real-time lookups
        self.offline_db = offline_db  # Data warehouse for batch training

    def compute_and_store_features(self, customer_ids, features_to_compute):
        """
        Compute features and store them in both the online and offline stores
        """
        # Compute features (compute_features is assumed to be implemented elsewhere)
        features = self.compute_features(customer_ids, features_to_compute)

        # Store in the offline store (data warehouse)
        self.offline_db.insert(features)

        # Store in the online store (low-latency key-value)
        for customer_id, feature_values in features.items():
            self.online_db.set(f"customer:{customer_id}:features", feature_values, ttl=86400)

    def get_online_features(self, customer_id):
        """
        Retrieve features for real-time inference
        """
        return self.online_db.get(f"customer:{customer_id}:features")

    def get_offline_features(self, customer_ids, feature_names):
        """
        Retrieve features for batch training
        """
        return self.offline_db.query(customer_ids, feature_names)
Feature Documentation
feature_definitions = {
    'recency_days': {
        'description': 'Days since last purchase',
        'type': 'numeric',
        'range': '[0, inf)',
        'missing_strategy': 'Use max value in dataset + 1',
        'business_meaning': 'Customer engagement recency - lower is better'
    },
    'frequency': {
        'description': 'Total number of purchases',
        'type': 'numeric',
        'range': '[1, inf)',
        'missing_strategy': 'Not applicable (all customers have >= 1)',
        'business_meaning': 'Purchase frequency - higher indicates loyalty'
    },
    # ... all features documented
}
Conclusion
Feature engineering is where domain expertise meets data science technique. For revenue models, well-crafted features capture the nuances of customer behavior—engagement trends, lifecycle position, value progression—that directly predict future revenue outcomes.
The key is combining systematic feature generation frameworks (RFM, temporal aggregations, interactions) with deep business understanding to create predictive signals that generalize to new data.
Next Steps:
- Audit existing features and baseline model performance
- Implement RFM enhancement and temporal aggregations
- Add behavioral sequence and trend features
- Test feature importance and eliminate low-value features
- Build feature store for production serving