Hypothesis Testing · Statistical Analysis · Completed

Yulu Electric Cycle
Demand Drivers

Hypothesis testing to identify what factors significantly impact demand for Yulu's shared electric cycles in India — season, weather, working day, or temperature?

Type: Hypothesis Testing · ANOVA · Chi-Square
Domain: Micro-mobility / Transport
Dataset: 10,886 records · Bike sharing data
Tools: Python · SciPy · Seaborn
Course: Scaler Academic Case Study
10,886 Records
191 Avg Rides per Record
p<0.05 Season Effect Confirmed
5 Hypothesis Tests Run
01 — Business Problem

Why is Yulu losing revenue?

Yulu, India's leading micro-mobility provider, has seen a dip in revenues. The company needs to understand which external factors significantly drive demand for shared electric cycles — so it can plan fleet deployment, pricing, and maintenance around those factors.

The key questions: Does demand change across seasons? Does weather affect rides? Does holiday vs working day matter? Are season and weather correlated?

🔬
Hypothesis-first approach
This project doesn't just describe the data — it tests specific business hypotheses using statistical tests (t-test, ANOVA, Chi-Square) to distinguish genuine effects from random variation.
02 — Hypothesis Tests Conducted

What was tested

Hypothesis | Test Used | Result | Business Impact
Holiday vs Non-Holiday demand | 2-Sample T-Test | p=0.57, fail to reject H₀ | No significant holiday effect detected
Working Day vs Non-Working Day | 2-Sample T-Test | p=0.23, fail to reject H₀ | No significant working-day effect detected
Season vs Demand | One-Way ANOVA | p≈0, reject H₀ ✓ | Season significantly affects rides
Weather vs Demand | One-Way ANOVA | p≈0, reject H₀ ✓ | Weather significantly affects rides
Weather dependent on Season? | Chi-Square Test | p≈0, reject H₀ ✓ | Weather and season are associated (not independent)
03 — Methodology

Statistical approach

01
Data Preparation
Converted categorical columns (season, holiday, weather) from integer codes to category dtype. Parsed datetime, extracted hour, month, year, weekday features.
astype('category') · dt.hour · dt.weekday
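The preparation step can be sketched as follows. This is a minimal illustration on a tiny hypothetical sample, not the project's actual notebook; the `datetime` column name and the sample values are assumptions.

```python
import pandas as pd

# Tiny hypothetical sample standing in for the 10,886-row dataset.
# The 'datetime' column name and all values here are assumptions.
df = pd.DataFrame({
    "datetime": ["2011-01-01 00:00:00", "2011-06-15 08:00:00", "2012-12-19 17:00:00"],
    "season":   [1, 2, 4],
    "holiday":  [0, 0, 0],
    "weather":  [1, 2, 3],
    "count":    [16, 120, 340],
})

# Integer codes -> category dtype, so they are treated as labels, not magnitudes
for col in ["season", "holiday", "weather"]:
    df[col] = df[col].astype("category")

# Parse timestamps, then derive calendar features
df["datetime"] = pd.to_datetime(df["datetime"])
df["hour"] = df["datetime"].dt.hour
df["month"] = df["datetime"].dt.month
df["year"] = df["datetime"].dt.year
df["weekday"] = df["datetime"].dt.weekday  # Monday=0 ... Sunday=6
```

Comparisons such as `df['season'] == 1` still work after the conversion, because the categories keep their integer labels.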
02
Assumption Checking
Before running parametric tests, checked normality (histograms + Q-Q plots) and homogeneity of variance (Levene's test). With 10K+ records, the central limit theorem makes tests on group means reasonably robust to non-normal ride counts.
scipy.stats.levene · histplot
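A hedged sketch of these checks, using synthetic right-skewed data rather than the real ride counts (the gamma parameters, group sizes, and the 0.05 cutoff are assumptions chosen for illustration). `scipy.stats.probplot` returns the Q-Q coordinates without needing a plot.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic, right-skewed stand-ins for two groups of ride counts (not real data)
group_a = rng.gamma(shape=1.2, scale=160, size=2000)
group_b = rng.gamma(shape=1.2, scale=160, size=300)

# Levene's test: H0 = the groups have equal variances
lev_stat, lev_p = stats.levene(group_a, group_b)
equal_var = lev_p > 0.05  # assumed significance level

# Q-Q data against a normal distribution: theoretical vs ordered sample quantiles.
# r close to 1 means the points lie near a straight line (approximately normal).
(theoretical_q, ordered_vals), (slope, intercept, r) = stats.probplot(group_a, dist="norm")

# Sample skewness: positive values indicate a long right tail
skewness = stats.skew(group_a)
```

The `equal_var` flag can then be passed to `scipy.stats.ttest_ind(..., equal_var=equal_var)`, which switches to Welch's t-test when the variances differ.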
03
T-Tests for Binary Factors
Ran independent samples t-test comparing mean ride counts: holiday vs non-holiday, working vs non-working day.
scipy.stats.ttest_ind()
04
ANOVA for Multi-Level Factors
Ran one-way ANOVA for season (4 levels) and weather (4 levels). A large F-statistic means the variance between group means is large relative to the variance within groups.
scipy.stats.f_oneway()
05
Chi-Square for Dependency
Tested whether weather condition distribution depends on season — to understand if these two factors compound their effect on demand.
scipy.stats.chi2_contingency()
Python — hypothesis_tests.py
import pandas as pd
from scipy import stats

# df is the prepared bike-sharing DataFrame from the Data Preparation step

# T-test: holiday vs non-holiday mean ride counts
holiday    = df.loc[df['holiday'] == 1, 'count']
nonholiday = df.loc[df['holiday'] == 0, 'count']
t, p = stats.ttest_ind(holiday, nonholiday)
# p = 0.5737 → no significant difference

# One-way ANOVA: ride counts across the four season codes
groups = [df.loc[df['season'] == s, 'count'] for s in [1, 2, 3, 4]]
F, p = stats.f_oneway(*groups)
# F=236.9, p≈0 → season significantly affects demand ✓

# Chi-square test of independence: weather vs season
ct = pd.crosstab(df['season'], df['weather'])
chi2, p, dof, expected = stats.chi2_contingency(ct)
# p≈0 → weather IS dependent on season ✓
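
The listing above omits the working day comparison reported in the results table. A minimal sketch of that test on synthetic data follows; the `workingday` column name, the Poisson counts, and the 0.05 threshold are assumptions, so the p-value it produces says nothing about the real dataset.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(42)
n = 1000

# Synthetic stand-in data: 'workingday' is an assumed column name
df_demo = pd.DataFrame({
    "workingday": rng.integers(0, 2, size=n),  # 1 = working day, 0 = otherwise
    "count": rng.poisson(lam=190, size=n),     # same distribution for both groups
})

working = df_demo.loc[df_demo["workingday"] == 1, "count"]
nonworking = df_demo.loc[df_demo["workingday"] == 0, "count"]

# Independent two-sample t-test on mean ride counts
t_stat, p_value = stats.ttest_ind(working, nonworking)

# Decision rule at an assumed significance level of 0.05
decision = "reject H0" if p_value < 0.05 else "fail to reject H0"
```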
04 — Key Findings

What actually drives demand

🌤️
Season is the strongest demand driver
ANOVA F=236.9, p≈0. Summer and Spring see highest demand. Fleet expansion should be timed seasonally.
⛈️
Bad weather kills rides
ANOVA on weather: p≈0. Clear weather = most rides. Category 4 (heavy rain/storms) sees near-zero demand.
🏖️
Holidays don't significantly affect demand
The t-test (p=0.57) found no significant difference in mean ride counts between holidays and regular days, which is consistent with Yulu serving both commuter and leisure trips.
🌡️
Temperature sweet spot is ~25–30°C
Rides increase with temperature up to ~30°C then slightly drop. Extreme heat reduces cycling demand.
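One way to look for such a sweet spot is to bin temperature and compare mean rides per bin. The sketch below uses synthetic data built to peak near 28°C; the `temp` column name, the 5°C bin width, and the data itself are assumptions, not the project's results.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Synthetic demand curve that peaks near 28 °C (illustrative only)
temp = rng.uniform(0, 40, size=2000)
count = np.clip(250 - 0.5 * (temp - 28) ** 2 + rng.normal(0, 20, size=2000), 0, None)
df_demo = pd.DataFrame({"temp": temp, "count": count})

# 5 °C bins: [0, 5), [5, 10), ..., [35, 40)
bins = np.arange(0, 45, 5)
df_demo["temp_bin"] = pd.cut(df_demo["temp"], bins=bins, right=False)

# Mean rides per temperature bin, and the bin with the highest mean
mean_by_bin = df_demo.groupby("temp_bin", observed=True)["count"].mean()
peak_bin = mean_by_bin.idxmax()
```

`peak_bin` is a `pandas.Interval`, so its bounds are available as `peak_bin.left` and `peak_bin.right`.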
📅
Fleet planning insight
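Season and weather are associated (Chi-Square p≈0), so their effects on demand overlap rather than act independently. Plan fleet expansions for May–July (peak season + good weather). Use the confidence interval for mean demand per record, 188–195 rides, for baseline planning.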
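A confidence interval for mean demand like the one quoted above can be computed from the sample mean and its standard error. This is a sketch on synthetic counts; the 95% level and the gamma-distributed data are assumptions, since the write-up does not state how its interval was derived.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Synthetic right-skewed ride counts, same size as the dataset (not real data)
counts = rng.gamma(shape=1.1, scale=175, size=10_886)

mean = counts.mean()
sem = stats.sem(counts)  # standard error of the mean (sample std / sqrt(n))

# t-based interval for the mean at an assumed 95% confidence level
ci_low, ci_high = stats.t.interval(0.95, df=len(counts) - 1, loc=mean, scale=sem)
```

With this many records the interval is narrow even though individual counts vary widely, because the standard error shrinks with the square root of the sample size.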
05 — Tech Stack
Python 3 · Pandas · SciPy Stats · Seaborn · T-Test · ANOVA · Chi-Square