Cynthia Fonderson
November 15, 2022

Customer Attrition Prediction for a Telecom Company

Posted on November 15, 2022  •  3 minutes  • 460 words

Service providers across different industries use customer attrition analysis because the cost of retaining an existing customer is far less than acquiring a new one. In this project, I apply machine learning algorithms to predict a telecom company’s customer churn based on several factors, including tenue rate, gender and payment methods amongst others.

Exploratory Data Analysis

The dataset used in this project was curated from Kaggle, and the column descriptions are as follows:

Name Description
State U.S. state code (string)
Account length Account tenure in days (integer)
Area code U.S. area code (integer)
International plan Does the customer have an international subscription plan (string)
Voice mail plan Does the customer have a voice mail subscription plan (string)
No. vmail messages Number of voicemail messages on plan (integer)
Total day minutes Total number of minutes used during the day (double)
Total day calls Total number of calls made during the day (integer)
Total day charge: Total charge accrued during the day (double)
Total eve minutes Total number of minutes used in the evening (double)
Total eve calls Total number of calls made in the evening (integer)
Total eve charge Total charge accrued in the evening (double)
Total night minutes Total number of minutes used during at night (double)
Total night calls Total number of calls made at night (integer)
Total night charge Total charge accrued at night (double)
Total intl minutes Total number of international minutes used (double)
Total intl calls Total number of calls made (integer)
Total intl charge Total charge accrued from international transactions (double)
CS calls Number of customer service calls from the user
Churn Customer retension metric (bool)

Following data quality check and feature engineering, I explored pertinent relationships in the dataset. For instance, the relationship between churn and day and night charges. day night

Insight: In general, customers who churned had higher daily charges than customers who were retained, but the attrition distribution was similar when comparing night charges

Predicting churn using classification algorithms

Once I gained a deeper understanding of the dataset, I trained and evaluated five classifier algorithms to predict customer loss, including the

To visualize the performance of the five models trained in this project, I used ROC curves. The Random Forest and Naïve Bayes Classifiers outperformed the other algorithms in predicting customer churn, with accuracies of 90% and 82%, respectively.

roc curve

Lastly, I tried to optimized the Random Forest Classifier by tuning its hyperparameters and increased its prediction accuracy to 99%. According to this model, the most important factors that influence customer attrition at this company are the charges paid, number of calls places to customer service, subscription to an international plan and services (minutes, calls) during the day. night

Full Project

Contact me / Contactez-moi

I work on everything data / J'aime tout ce qui concerne la science des données.