EDA · Customer Profiling · Completed

Aerofit Treadmill
Customer Profiling

Exploratory data analysis to build customer profiles for each treadmill model — helping Aerofit's sales team recommend the right product to the right buyer.

Type: EDA & Probability Analysis
Domain: Fitness / Retail
Dataset: 180 customers · 9 features
Tools: Python · Pandas · Seaborn
Course: Scaler Academic Case Study
Customer Records: 180
Treadmill Models: 3
Best-Selling Model: KP281
Features Analysed: 9
01 — Business Problem

Who buys which treadmill?

Aerofit sells three treadmill models — KP281 (entry-level), KP481 (mid-range), and KP781 (premium). The fitness retail company wants to understand who buys what — so their sales team can make data-driven recommendations during purchase consultations.

🎯
The goal
Build customer profiles for each treadmill model using age, income, fitness level, gender, and usage frequency. Give the sales team a clear picture of the typical buyer for each product.
02 — Dataset

What we're working with

Feature       | Type            | Description
Product       | Categorical     | KP281 / KP481 / KP781
Age           | Numerical       | Customer age in years
Gender        | Categorical     | Male / Female
Education     | Numerical       | Years of education
MaritalStatus | Categorical     | Single / Partnered
Usage         | Numerical       | Planned weekly treadmill uses
Fitness       | Numerical (1-5) | Self-rated fitness level
Income        | Numerical       | Annual income in USD
Miles         | Numerical       | Expected weekly miles
03 — Methodology

How I built the profiles

01
Univariate Analysis
Distributions for age, income, fitness, usage, miles. Identified that income and miles are right-skewed with outliers.
histplot · boxplot
02
Bivariate Analysis by Product
Cross-tabulated each feature against product type. Used countplots, boxplots, and mean comparisons to isolate model-specific patterns.
groupby · crosstab · boxplot(hue)
03
Correlation Analysis
Built a correlation heatmap to understand feature relationships. Income–Education and Usage–Miles are the strongest correlations.
heatmap · pairplot
04
Marginal & Conditional Probability
Calculated P(Product) and P(Feature | Product) to build probabilistic customer profiles for each treadmill.
value_counts(normalize=True)
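The "right-skewed with outliers" observation from the univariate step can be checked numerically as well as visually. Below is a minimal sketch, assuming the same 1.5 × IQR fence that a default boxplot uses for its whiskers. The `iqr_outliers` helper and the `toy_income` values are illustrative stand-ins, not the Aerofit data.

```python
import pandas as pd


def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the entries lying outside the [Q1 - k*IQR, Q3 + k*IQR] fences."""
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return values[(values < lower) | (values > upper)]


# Toy incomes (made up for illustration): one large value on the right tail.
toy_income = pd.Series(
    [30000, 35000, 40000, 42000, 45000, 48000, 50000, 52000, 55000, 104000],
    name="Income",
)

outliers = iqr_outliers(toy_income)
skewness = toy_income.skew()  # positive values indicate a right skew

print("Skewness:", round(skewness, 2))
print("Outliers:", outliers.tolist())
```

On the real data the same helper could be applied to `df['Income']` and `df['Miles']` to list the records behind the boxplot's outlier points.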
Python — customer_profiling.py
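import pandas as pd

# df is assumed to hold the 180-row Aerofit dataset, loaded beforehand
# (e.g. with pd.read_csv); the source does not give the file path.

# Conditional probability: P(Gender | Product); each row sums to 1
pd.crosstab(df['Product'], df['Gender'], normalize='index')

# Mean stats per product
df.groupby('Product')[['Age','Income','Fitness','Usage','Miles']].mean()

# Output of the groupby (means, rounded for display):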
#          Age    Income  Fitness  Usage  Miles
# KP281   28.5   46,400    3.0    3.3    82
# KP481   28.9   48,900    3.1    3.5    87
# KP781   29.1   58,500    4.2    4.8   166
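For a sales conversation, the more useful direction is often P(Product | Feature): given what is known about the customer, which model is likely? The sketch below shows how the marginal and both conditional tables relate, using a made-up 10-row frame (`toy`) rather than the Aerofit data. It also confirms with Bayes' rule that the two crosstab orientations agree.

```python
import pandas as pd

# Toy data (invented for illustration only).
toy = pd.DataFrame({
    "Product": ["KP281"] * 4 + ["KP481"] * 3 + ["KP781"] * 3,
    "Gender": ["Male", "Female", "Female", "Male",   # KP281
               "Male", "Female", "Male",             # KP481
               "Male", "Male", "Female"],            # KP781
})

# Marginal probabilities: P(Product) and P(Gender)
p_product = toy["Product"].value_counts(normalize=True)
p_gender = toy["Gender"].value_counts(normalize=True)

# P(Gender | Product): rows sum to 1
p_gender_given_product = pd.crosstab(toy["Product"], toy["Gender"], normalize="index")

# P(Product | Gender): columns sum to 1
p_product_given_gender = pd.crosstab(toy["Product"], toy["Gender"], normalize="columns")

# Bayes' rule: P(KP781 | Male) = P(Male | KP781) * P(KP781) / P(Male)
bayes = (
    p_gender_given_product.loc["KP781", "Male"]
    * p_product["KP781"]
    / p_gender["Male"]
)

print(p_product_given_gender)
print("P(KP781 | Male) via Bayes:", round(bayes, 3))
```

The only change from the P(Gender | Product) table is `normalize="columns"` instead of `normalize="index"`.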
04 — Customer Profiles

The three buyer types

KP281
Entry Level · Most Popular
Age ~28 · Income ~$46K
Fitness: 3/5 · Miles: 82/wk
KP481
Mid Range · Balanced
Age ~29 · Income ~$49K
Fitness: 3.1/5 · Miles: 87/wk
KP781
Premium · Power Users
Age ~29 · Income ~$58K
Fitness: 4.2/5 · Miles: 166/wk
💰
Income is the clearest separator
KP781 buyers earn about 26% more than KP281 buyers on average ($58.5K vs $46.4K). Income is the strongest predictor of product tier.
🏃
Miles run per week is the key usage signal
KP781 buyers expect to cover 166 miles/week vs 82 for KP281, roughly double. Usage intensity is a critical profiling dimension.
Fitness level drives premium purchase
KP781 buyers rate themselves 4.2/5 vs 3.0/5 for KP281. Premium treadmills are bought by already-fit customers.
👫
Gender and marital status matter less
Males slightly prefer KP781. Partnered individuals lean toward mid-range. But these are weak signals compared to income and fitness.
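The comparisons above can be put on a common scale by expressing each KP781 mean as a percentage gap over the KP281 mean. This is a small sketch using only the group means reported in the methodology output. The `pct_gap` helper is a hypothetical name introduced here.

```python
# Group means as reported in the per-product summary (KP281 vs KP781).
reported_means = {
    "KP281": {"Age": 28.5, "Income": 46400, "Fitness": 3.0, "Usage": 3.3, "Miles": 82},
    "KP781": {"Age": 29.1, "Income": 58500, "Fitness": 4.2, "Usage": 4.8, "Miles": 166},
}


def pct_gap(base: float, other: float) -> float:
    """Relative difference of `other` over `base`, in percent."""
    return (other - base) / base * 100


gaps = {
    feature: pct_gap(reported_means["KP281"][feature], reported_means["KP781"][feature])
    for feature in reported_means["KP281"]
}

# Largest relative gap first.
for feature, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{feature:<8} {gap:+.1f}%")
```

On these means the gaps come out to roughly +102% for miles, +45% for usage, +40% for fitness, +26% for income, and +2% for age. A gap in means ignores the spread within each group, so it is a quick comparison rather than a measure of predictive strength.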
05 — Recommendations

What Aerofit should do with this

01
Build a recommendation system
Ask 3 questions at purchase: fitness level (1–5), weekly budget for exercise, and income range. In this dataset, these three show the clearest differences between the models.
02
Upsell KP281 buyers aggressively
KP281 buyers are lower-income and only slightly younger on average (28.5 vs 29.1 years), but if they become regular users, they're prime upgrade candidates in 12–18 months.
03
Target KP781 via fitness communities
Premium treadmill buyers already have high fitness ratings. Gym partnerships, running apps, and fitness influencers are the right channels to reach them.
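One way the three-question recommendation could work is a nearest-profile rule: compare the customer's answers to each model's mean profile and suggest the closest one. The sketch below is an illustration, not a fitted or validated model. It uses the reported group means, treats the weekly exercise question as expected weekly miles (an assumption), and scales each feature by its range across the three profiles. `PROFILES`, `RANGES`, and `recommend` are hypothetical names.

```python
# Mean (Fitness, Miles, Income) per model, taken from the reported group means.
PROFILES = {
    "KP281": (3.0, 82.0, 46_400.0),
    "KP481": (3.1, 87.0, 48_900.0),
    "KP781": (4.2, 166.0, 58_500.0),
}

# Scale each feature by its range across the three profiles so that
# income (tens of thousands) does not swamp fitness (1-5).
RANGES = tuple(
    max(p[i] for p in PROFILES.values()) - min(p[i] for p in PROFILES.values())
    for i in range(3)
)


def recommend(fitness: float, weekly_miles: float, income: float) -> str:
    """Return the model whose mean profile is closest to the customer's answers."""
    answers = (fitness, weekly_miles, income)

    def distance(profile: tuple) -> float:
        # Squared Euclidean distance on range-scaled features.
        return sum(((a - m) / r) ** 2 for a, m, r in zip(answers, profile, RANGES))

    return min(PROFILES, key=lambda model: distance(PROFILES[model]))


print(recommend(fitness=4, weekly_miles=160, income=60_000))
```

Because the KP281 and KP481 means sit close together, a rule like this separates the premium tier far more reliably than it separates the two lower tiers. A real system would need to be checked against the individual customer records.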
06 — Tech Stack
Python 3 · Pandas · Seaborn · Matplotlib · Crosstab Analysis · Google Colab