Comparative Analysis of Machine Learning Algorithms in Customer Churn Prediction
Document Type
Poster Presentation
Publication Date
4-17-2026
Keywords
fsc2026
Abstract
Acquiring a new bank customer costs more than retaining an existing one, which makes churn prediction valuable. This project uses a dataset of bank customers (n = 10,000) with demographic, financial, and behavioral features. The machine learning models classify each customer into one of two categories: churn or no churn.
To accomplish this, we implement and compare two classification algorithms (Random Forest and Naive Bayes) alongside a stacked ensemble combining both. Random Forest is selected for its robustness to noisy data and strong performance on tabular datasets, while Naive Bayes serves as a probabilistic baseline that is computationally efficient and straightforward to interpret. Each model is trained on 80% of the data using repeated cross-validation, with downsampling to address the dataset’s class imbalance (roughly 80% of customers did not churn). Model performance is then evaluated on the remaining 20% using accuracy, kappa, sensitivity, and specificity, with careful attention to the business implications of each metric in the context of customer retention.
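The downsampling step and the four evaluation metrics named above can be illustrated with a minimal standard-library Python sketch. It is not the authors' implementation: the abstract does not say which software was used, and the function names `downsample` and `binary_metrics`, the 0/1 label encoding, the choice of churn as the positive class, and the reading of "kappa" as Cohen's kappa are all assumptions made for illustration.

```python
import random
from collections import Counter


def downsample(rows, labels, seed=0):
    """Randomly drop rows until every class has as many rows as the rarest class.

    Hypothetical helper: one common way to downsample, not necessarily
    the procedure used for the poster.
    """
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    n_min = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        for row in rng.sample(members, n_min):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, Cohen's kappa, sensitivity, and specificity for two classes.

    Assumption: `positive` marks the churn class.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth == positive:
            fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    p_chance = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # churners caught
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # non-churners kept
    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


# Toy data mirroring the roughly 80/20 imbalance described in the abstract.
labels = [0] * 80 + [1] * 20
rows = list(range(100))  # placeholder feature rows

balanced_rows, balanced_labels = downsample(rows, labels)
print(Counter(balanced_labels))  # 20 rows of each class

# A trivial model that always predicts "no churn".
always_stay = [0] * 100
print(binary_metrics(labels, always_stay))
# accuracy 0.8, kappa 0.0, sensitivity 0.0, specificity 1.0
```

The always-"no churn" model shows why accuracy alone can mislead on imbalanced data: it scores 80% accuracy while identifying no churners, which kappa and sensitivity expose.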
Publication Information
Scheuermann, Evan; Scheidelman, Jake; and Tamburino, Jack, "Comparative Analysis of Machine Learning Algorithms in Customer Churn Prediction" (2026). Fisher Showcase 2026. Paper 172.
https://fisherpub.sjf.edu/fsc2026/172
Comments
Poster presented at the 2026 Fisher Showcase, St. John Fisher University, April 17, 2026.