DEV Community

Oussama Belhadi

Predicting Customer Churn with TensorFlow – A Beginner-Friendly Guide

Introduction

Customer churn is when customers leave a company. Predicting churn helps businesses retain valuable customers and increase revenue.

In this tutorial, I’ll show you how to use TensorFlow, pandas, and scikit-learn to build a neural network that predicts churn based on a real dataset.

You can find a working, ready-to-use example in my GitHub repository.

No heavy theory — just step-by-step coding, explanations, and visuals.

The dataset is a .csv file that holds the customer data, and we will use it to train our model. It is also available in the GitHub repository.

Step 1: Setting Up the Environment

We need these libraries:

pip install pandas numpy scikit-learn tensorflow matplotlib 
  • pandas → for data manipulation
  • numpy → for numeric computations
  • scikit-learn → preprocessing, scaling, train/test splitting
  • tensorflow → building neural networks
  • matplotlib → plotting results

Step 2: Load and Inspect the Dataset

Load the dataset with pandas:

import pandas as pd

df = pd.read_csv("customer_churn.csv")
df.head()

Tip: ⚠️ Always check your column names. Spaces or extra characters can break code later:

df.columns = df.columns.str.strip().str.replace(" ", "_") 
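To see what the column cleanup does, here is a minimal sketch on a tiny frame. The header names below are made up for illustration and are not taken from the real CSV.

```python
import pandas as pd

# Hypothetical header names with stray whitespace, for illustration only.
raw = pd.DataFrame(columns=[" Customer ID", "Total Charges ", "Churn"])

# Strip leading/trailing whitespace, then replace inner spaces with underscores.
cleaned_columns = raw.columns.str.strip().str.replace(" ", "_")

print(list(cleaned_columns))  # ['Customer_ID', 'Total_Charges', 'Churn']
```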

Step 3: Clean the Data

Convert numeric columns with potential issues:

df['Total_Charges'] = pd.to_numeric(df['Total_Charges'], errors='coerce') 
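As a rough illustration of what `errors='coerce'` does, the sketch below uses a hypothetical three-value series in which one entry is a blank string. Anything that cannot be parsed becomes NaN instead of raising an error, which is why missing rows need handling afterwards.

```python
import pandas as pd

# Hypothetical sample: one value is a blank string instead of a number.
charges = pd.Series(["29.85", " ", "1889.5"])

# errors='coerce' turns anything unparseable into NaN instead of raising.
numeric = pd.to_numeric(charges, errors="coerce")

n_missing = int(numeric.isna().sum())  # how many values could not be parsed
print(n_missing)                       # 1
print(numeric.dropna().tolist())       # [29.85, 1889.5]
```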

Drop rows with missing values and the irrelevant Customer_ID column:

df = df.dropna()
df.drop('Customer_ID', axis=1, inplace=True)

Step 4: Encode Categorical Variables

Neural networks cannot process text. Convert categories to numbers:

from sklearn.preprocessing import LabelEncoder

df['Churn'] = df['Churn'].map({'Yes': 1, 'No': 0})

cat_cols = df.select_dtypes(include='object').columns
le = LabelEncoder()
for col in cat_cols:
    df[col] = le.fit_transform(df[col])

Example: Male → 1, Female → 0. Similarly for other categories.
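If you want to look up which integer each category received, one option is to keep a separate fitted encoder per column. The sketch below is a variation on the loop above, not the code used in this tutorial; `df_demo` and `encoders` are hypothetical names, and the sample values are made up.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical mini-frame; the real dataset has more columns.
df_demo = pd.DataFrame({
    "Gender": ["Male", "Female", "Female"],
    "Contract": ["Two year", "Month-to-month", "One year"],
})

encoders = {}  # one fitted encoder per column, kept for later lookups
for col in df_demo.columns:
    enc = LabelEncoder()
    df_demo[col] = enc.fit_transform(df_demo[col])
    encoders[col] = enc

# classes_ holds the original labels in sorted order;
# a label's position in that list is its integer code.
for col, enc in encoders.items():
    print(col, enc.classes_.tolist())
```

Because LabelEncoder sorts labels alphabetically, Female comes before Male, which is where the Female → 0, Male → 1 mapping comes from.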

Step 5: Split Features and Target

Separate the input features (X) from the target (y):

X = df.drop('Churn', axis=1)
y = df['Churn']

Before scaling, each feature (column) has its own mean and standard deviation. Neural networks learn better when features are roughly in the same range.

Mean: the average value of the feature
Standard deviation: measures how spread out the values are

The formula for the mean (μ) of a feature with N values is:

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$$

The formula for the standard deviation (σ) is:

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2}$$

StandardScaler subtracts the mean and divides by the standard deviation:

$$z_i = \frac{x_i - \mu}{\sigma}$$

After scaling, each feature has mean ~0 and std ~1.

[Figure: normal distribution example]

Scale features (important for neural networks):

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
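To make the scaling step concrete, here is a small sketch that standardizes one feature by hand using only the Python standard library. The five values are hypothetical. It uses the population standard deviation (dividing by N), which is what StandardScaler uses as well.

```python
from statistics import fmean, pstdev

# Hypothetical monthly charges for five customers.
values = [20.0, 50.0, 80.0, 50.0, 100.0]

mu = fmean(values)       # mean of the feature
sigma = pstdev(values)   # population standard deviation (divides by N)

# z-score: subtract the mean, then divide by the standard deviation.
scaled = [(x - mu) / sigma for x in values]

print(mu)
print([round(z, 3) for z in scaled])
```

After this transformation the values have a mean of approximately 0 and a standard deviation of approximately 1, up to floating-point error.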

Split into train/test sets:

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=42
)
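One thing to be aware of: the steps above fit the scaler on the full dataset before splitting, so the test rows influence the mean and standard deviation used for scaling. A common alternative is to split first and fit the scaler on the training rows only. The sketch below shows that ordering on random stand-in data; the `*_demo`, `X_tr`, and `X_te` names are hypothetical and are not used elsewhere in this tutorial.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the real feature matrix and labels.
rng = np.random.default_rng(42)
X_demo = rng.normal(loc=50.0, scale=10.0, size=(100, 3))
y_demo = rng.integers(0, 2, size=100)

# Split first, so the test rows stay unseen during preprocessing.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

scaler_demo = StandardScaler()
X_tr_scaled = scaler_demo.fit_transform(X_tr)  # statistics come from training rows only
X_te_scaled = scaler_demo.transform(X_te)      # test rows reuse those statistics

print(X_tr_scaled.shape, X_te_scaled.shape)  # (80, 3) (20, 3)
```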

Step 6: Build and Train the Neural Network

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(32, activation='relu', input_shape=(X_train.shape[1],)),
    Dense(16, activation='relu'),
    Dense(1, activation='sigmoid')
])
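To get a feel for the size of this network, the sketch below counts its trainable parameters by hand. It assumes 19 input features, which matches the number of fields in the new-customer example in Step 8; your count will differ if your dataset has a different number of columns. `dense_params` is a hypothetical helper, not a Keras function.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

n_features = 19            # assumption: 19 input columns, as in Step 8
layer_sizes = [32, 16, 1]  # units in each Dense layer of the model above

total = 0
previous = n_features
for units in layer_sizes:
    total += dense_params(previous, units)
    previous = units  # this layer's output feeds the next layer

print(total)  # 640 + 528 + 17 = 1185
```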

Why these layers and activations?

[Figure: neural network layers]

  • Dense(32) and Dense(16) → number of neurons in each hidden layer. Experiment to see what works best.
  • ReLU activation → introduces non-linearity, helps the network learn complex patterns.
  • Sigmoid in output → outputs a probability between 0 and 1, perfect for binary classification.
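The two activations named above are simple functions. Here is a plain-Python sketch of both, for intuition only; Keras applies its own vectorized versions internally.

```python
import math

def relu(x: float) -> float:
    """Pass positive values through unchanged, clamp negatives to zero."""
    return max(0.0, x)

def sigmoid(x: float) -> float:
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

for value in (-2.0, 0.0, 3.0):
    print(value, relu(value), round(sigmoid(value), 3))
# -2.0 0.0 0.119
# 0.0 0.0 0.5
# 3.0 3.0 0.953
```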

Optimizer: Adam

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)
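The `binary_crossentropy` loss passed to `compile` can be written out by hand to see how it behaves. The sketch below is a plain-Python version for intuition, not the Keras implementation; the `eps` clipping value is an assumption chosen to keep `log()` away from zero.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; eps keeps log() away from 0."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip the probability into (0, 1)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Same two customers (one churned, one stayed), two sets of predictions.
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])

print(round(confident_right, 3), round(confident_wrong, 3))  # 0.105 2.303
```

Confident wrong predictions are penalized far more heavily than confident correct ones, which is what pushes the sigmoid output toward the true label during training.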

Why Adam?

  • Adaptive optimizer: adjusts learning rate automatically
  • Combines advantages of Momentum and RMSProp
  • Works well out-of-the-box for most problems

The other compile() arguments:

  • Loss function: binary_crossentropy → suitable for predicting 0/1 outcomes.
  • Metric: accuracy → how often the model predicts correctly.

Training

history = model.fit(
    X_train, y_train,
    validation_data=(X_test, y_test),
    epochs=20,
    batch_size=32
)

Epochs = 20 → model sees the dataset 20 times.

Batch size = 32 → updates weights every 32 samples.
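Together, epochs and batch size determine how many weight updates happen during training. The arithmetic is sketched below; the training-set size is a hypothetical number chosen for illustration, not the size of the real dataset.

```python
import math

n_train = 5634   # hypothetical training-set size, for illustration only
batch_size = 32
epochs = 20

# Each epoch processes every training row once; the last batch may be smaller.
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)  # 177 3540
```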

Step 7: Evaluate and Visualize

import matplotlib.pyplot as plt

plt.plot(history.history['accuracy'], label='Train')
plt.plot(history.history['val_accuracy'], label='Validation')
plt.title('Accuracy over Epochs')
plt.legend()
plt.show()

Output example:

[Figure: output chart example]

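Train vs Validation curves → check for overfitting/underfitting. If training accuracy keeps rising while validation accuracy stalls or drops, the model is overfitting. If both stay low, it is underfitting.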
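The same check can be done numerically instead of by eye. The sketch below uses made-up accuracies arranged like `history.history`; with a real run you would read the lists from `history.history` instead.

```python
# Hypothetical per-epoch accuracies, shaped like history.history from model.fit.
history_demo = {
    "accuracy":     [0.74, 0.78, 0.80, 0.82, 0.85, 0.88],
    "val_accuracy": [0.75, 0.78, 0.79, 0.80, 0.79, 0.78],
}

val = history_demo["val_accuracy"]
best_epoch = val.index(max(val)) + 1                 # epochs are numbered from 1
final_gap = history_demo["accuracy"][-1] - val[-1]   # train minus validation

print(f"best validation epoch: {best_epoch}, final train/val gap: {final_gap:.2f}")
# best validation epoch: 4, final train/val gap: 0.10
```

A gap that keeps widening after the best validation epoch is a typical sign of overfitting in these made-up numbers.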

Step 8: Predict Churn for a New Customer

import pandas as pd

# Values must be encoded the same way as the training data, and the
# columns must match the names and order of X used to fit the scaler.
new_customer = pd.DataFrame([{
    'Gender': 0,
    'Senior_Citizen': 0,
    'Partner': 1,
    'Dependents': 0,
    'tenure': 12,
    'Phone_Service': 1,
    'Multiple_Lines': 0,
    'Internet_Service': 0,
    'Online_Security': 2,
    'Online_Backup': 0,
    'Device_Protection': 1,
    'Tech_Support': 0,
    'Streaming_TV': 0,
    'Streaming_Movies': 1,
    'Contract': 0,
    'Paperless_Billing': 1,
    'Payment_Method': 2,
    'Monthly_Charges': 50.0,
    'Total_Charges': 500.0
}])

# Reuse the scaler fitted earlier; do not fit a new one on a single row.
new_customer_scaled = scaler.transform(new_customer)

churn_prob = model.predict(new_customer_scaled)[0][0]
churn_label = int(churn_prob > 0.5)

print(f"Churn Probability: {churn_prob:.2f}")
print(f"Churn Prediction: {churn_label} ({'Yes' if churn_label==1 else 'No'})")
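The 0.5 cutoff used above is a convention, not a requirement. The sketch below shows how the same probabilities map to different labels under a different threshold; `to_labels` is a hypothetical helper and the probabilities are made up.

```python
def to_labels(probabilities, threshold=0.5):
    """Map churn probabilities to 0/1 labels using a strict > comparison."""
    return [int(p > threshold) for p in probabilities]

probs = [0.12, 0.47, 0.50, 0.81]  # hypothetical model outputs

print(to_labels(probs))                 # [0, 0, 0, 1]
print(to_labels(probs, threshold=0.4))  # [0, 1, 1, 1]
```

A lower threshold flags more customers as likely to churn, which may be worth it if missing a churner costs more than contacting a loyal customer.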

Conclusion

You now have a complete pipeline to:

  • Clean and preprocess data
  • Train a neural network in TensorFlow
  • Evaluate model performance
  • Predict churn for new customers

This workflow is reusable for other tabular datasets and binary classification problems.
