This project trains Random Forest and Gradient Boosting models to predict which OLA drivers will leave the platform, so that OLA can retain drivers before they churn.
OLA's gig economy model depends on driver supply. Driver churn is expensive: acquiring and onboarding new drivers costs far more than retaining existing ones. The goal is to build a model that identifies at-risk drivers early, so OLA's retention team can intervene before a driver leaves.
| Metric | Random Forest | Gradient Boosting | Winner |
|---|---|---|---|
| Accuracy | 81% | 84% | GBM ✓ |
| ROC-AUC | 0.844 | 0.856 | GBM ✓ |
| Churn Recall (Class 1) | 0.91 | 0.93 | GBM ✓ |
| Non-Churn Recall (Class 0) | 0.61 | 0.64 | GBM ✓ |
```python
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import classification_report, roc_auc_score

# Random Forest
rf = RandomForestClassifier(n_estimators=100, class_weight='balanced', random_state=42)
rf.fit(X_train_s, y_train)
print("RF ROC-AUC:", roc_auc_score(y_test, rf.predict_proba(X_test_s)[:, 1]))  # 0.844

# Gradient Boosting
gb = GradientBoostingClassifier(random_state=42)
gb.fit(X_train_s, y_train)
print("GBM ROC-AUC:", roc_auc_score(y_test, gb.predict_proba(X_test_s)[:, 1]))  # 0.856, GBM wins
```
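The snippet above prints only ROC-AUC. The table also reports accuracy and per-class recall. Below is a minimal sketch of how those could be computed with scikit-learn. It runs on synthetic data from `make_classification`, not the OLA dataset. The class balance, feature count, and the `metrics` dictionary are illustrative assumptions, so the numbers it prints will not match the table.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the driver features (assumption: NOT the OLA data).
# The class balance below is arbitrary and chosen only for illustration.
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.4, 0.6], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = GradientBoostingClassifier(random_state=42)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)              # hard labels at the default 0.5 threshold
y_prob = model.predict_proba(X_test)[:, 1]  # probability of class 1 (churn)

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, y_prob),
    # Recall for class 1: share of actual churners the model flags.
    "churn_recall": recall_score(y_test, y_pred, pos_label=1),
    # Recall for class 0: share of actual non-churners correctly left alone.
    "non_churn_recall": recall_score(y_test, y_pred, pos_label=0),
}

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting recall separately for each class shows the trade-off visible in the table: a model can catch most churners (high Class 1 recall) while still misclassifying a sizeable share of drivers who stay (lower Class 0 recall).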