Comparative Analysis of Machine Learning Algorithms in Customer Churn Prediction

Document Type

Poster Presentation

Publication Date

4-17-2026

Keywords

fsc2026

Abstract

Acquiring a new bank customer costs more than retaining an existing one, which makes churn prediction highly valuable. This project uses a dataset of 10,000 bank customers with demographic, financial, and behavioral features. The goal is binary classification: predicting whether each customer will churn or not.
To accomplish this, we implement and compare two classification algorithms, Random Forest and Naive Bayes, alongside a stacked ensemble that combines both. Random Forest is selected for its robustness to noisy data and strong performance on tabular datasets, while Naive Bayes serves as a probabilistic baseline that is computationally efficient and straightforward to interpret. Each model is trained on 80% of the data using repeated cross-validation, with downsampling to address the dataset’s class imbalance (roughly 80% of customers did not churn). Model performance is then evaluated on the remaining 20% using accuracy, kappa, sensitivity, and specificity, with careful attention to the business implications of each metric in the context of customer retention.
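
The downsampling step and the four evaluation metrics named above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the abstract does not name the software used, so the helper names (`downsample`, `binary_metrics`), the `churn` label key, and the ten-customer toy data are all hypothetical, and "kappa" is assumed to mean Cohen's kappa. The toy labels mirror the roughly 80/20 class split only to show why accuracy alone can mislead on imbalanced data.

```python
import random
from collections import Counter


def downsample(rows, label_key="churn", seed=0):
    """Randomly drop rows from larger classes until all classes match the smallest."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    n_min = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, n_min))
    rng.shuffle(balanced)
    return balanced


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, Cohen's kappa, sensitivity, and specificity for binary labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    # Sensitivity: share of actual churners the model catches.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of actual non-churners the model leaves alone.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    # Agreement expected by chance, from the row and column totals.
    p_chance = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else float("nan")
    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


# Hypothetical toy data: 8 customers who stayed (0) and 2 who churned (1).
toy_rows = [{"id": i, "churn": 1 if i < 2 else 0} for i in range(10)]
balanced = downsample(toy_rows)
class_counts = Counter(row["churn"] for row in balanced)

# A trivial model that predicts "no churn" for every customer.
y_true = [row["churn"] for row in toy_rows]
y_pred = [0] * len(toy_rows)
metrics = binary_metrics(y_true, y_pred)
```

On this toy data the always-"no churn" model reaches 80% accuracy while catching none of the churners, and its kappa is zero because it agrees with the truth no more often than chance would. That is the motivation for balancing the classes during training and for reporting sensitivity and kappa alongside accuracy.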

Comments

Poster presented at the 2026 Fisher Showcase, St. John Fisher University, April 17, 2026.

