Upcoming · Machine Learning · Original Project

Predictive Maintenance Dashboard

A Streamlit app that predicts machine and equipment failure before it happens — showing failure probability, time-to-failure estimates, and which sensor triggered the alert. Built on the NASA CMAPSS dataset.

Status: Starting Soon
Type: ML · Manufacturing · Streamlit App
Dataset: NASA CMAPSS FD001 (free)
Stack: Python · Random Forest · Streamlit · Pandas
Domain: Predictive Maintenance · Industry 4.0
Category: Original Project
🔧
Upcoming Project
Development starting soon. This page documents the full plan, technical approach, and domain rationale before a single line of code is written.
NASA: CMAPSS Dataset
FD001: Turbofan Engine Data
RUL: Remaining Useful Life Target
Streamlit: Deployment Platform
01 — The Problem

Machines don't fail instantly. They give warnings.

In manufacturing, equipment failure is expensive — not just the repair cost, but production downtime, missed SLAs, and safety risks. Traditional maintenance is either reactive (fix it after it breaks) or scheduled (replace it every 6 months whether it needs it or not).

Predictive maintenance is the third way: use sensor data to predict when a machine is likely to fail, so maintenance happens exactly when needed — before the failure, not after.

🏭
Why this project fits my profile
I work in digital manufacturing and PLM — systems that manage how products are made and how factories operate. Predictive maintenance sits right at the intersection of manufacturing engineering and AI. This project demonstrates that I understand not just the data science, but the domain it's applied to.
02 — The Dataset — NASA CMAPSS

Simulated turbofan engine sensor data

The Commercial Modular Aero-Propulsion System Simulation (CMAPSS) dataset was published by NASA and is one of the most widely used benchmarks for predictive maintenance research. It simulates turbofan engine degradation under realistic operating conditions.

Feature | Description
unit_number | Engine ID; each engine has its own degradation trajectory
time_in_cycles | Operational cycle count; increases until failure
op_setting_1/2/3 | Operational settings that affect degradation rate
sensor_1 to sensor_21 | 21 sensor measurements per cycle (temperature, pressure, flow rate, etc.)
RUL (target) | Remaining Useful Life: cycles until failure. This is what we predict.
📊
FD001 subset
FD001 is the cleanest subset: one operating condition, one failure mode. Ideal for building the first version of the model before adding complexity.
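As a rough sketch of how the column layout in the table above could be applied at load time: the raw CMAPSS text files are whitespace-delimited with no header row, so the names have to be supplied by hand. The helper name `load_cmapss` and the two demo rows are illustrative assumptions, not part of the dataset or of any library.

```python
import io
import pandas as pd

# Column names follow the feature table; the raw files carry no header row.
SENSOR_COLS = [f"sensor_{i}" for i in range(1, 22)]
COLUMNS = (
    ["unit_number", "time_in_cycles"]
    + [f"op_setting_{i}" for i in range(1, 4)]
    + SENSOR_COLS
)

def load_cmapss(path_or_buffer):
    """Read a whitespace-delimited CMAPSS file into a DataFrame with named columns.

    Hypothetical helper: the name and signature are assumptions for this sketch.
    """
    return pd.read_csv(path_or_buffer, sep=r"\s+", header=None, names=COLUMNS)

# Two made-up rows (NOT real CMAPSS values) just to exercise the loader.
fake_rows = "\n".join(
    " ".join(["1", str(cycle)] + ["0.5"] * 24) for cycle in (1, 2)
)
demo = load_cmapss(io.StringIO(fake_rows))
print(demo.shape)
```

In the real project the same function would be pointed at the FD001 training file on disk instead of an in-memory buffer.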
03 — Technical Approach

How the model will work

01
Data Loading & Exploration
Load NASA CMAPSS FD001 train/test splits. Explore sensor distributions, identify constant-value sensors (can be dropped), and understand degradation patterns across engine lifecycles.
Pandas · Matplotlib · df.describe()
02
RUL Label Engineering
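The training data doesn't include RUL directly, so it must be computed. Each training engine runs until failure, so for each engine, RUL at cycle t = max_cycle - t. This converts the raw data into a supervised regression problem. Test trajectories stop before failure and their true RUL values ship in a separate file, so this formula applies to the training split only.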
max_cycle computation · label creation
03
Feature Engineering
Create rolling statistics (mean, std) over a window of cycles for each sensor. Captures degradation trend, not just point-in-time readings. Normalise per engine to remove individual variation.
Rolling windows · StandardScaler · groupby
04
Model Training — Random Forest
Train RandomForestRegressor to predict RUL. Also experiment with binary classification: will the engine fail within the next N cycles? The classification version is more actionable for the dashboard.
RandomForestRegressor · RandomForestClassifier
05
Streamlit Dashboard
Build an interactive Streamlit app where users can select an engine and see: current failure probability (%), estimated time-to-failure (cycles), which sensors are showing anomalous readings, and historical trend charts.
Streamlit · Plotly · real-time display
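Steps 01 and 03 above mention dropping constant-value sensors and normalising per engine. A minimal sketch of both, assuming a plain pandas groupby z-score rather than the StandardScaler listed in the plan (pandas uses the sample standard deviation, StandardScaler the population one, so the numbers differ slightly). The function names and the tiny demo frame are hypothetical.

```python
import numpy as np
import pandas as pd

def constant_columns(df, cols):
    """Return the columns in `cols` that hold a single value across all rows."""
    return [c for c in cols if df[c].nunique() <= 1]

def normalise_per_engine(df, cols, group_col="unit_number"):
    """Z-score each column within each engine; zero-variance groups become 0."""
    out = df.copy()
    grouped = out.groupby(group_col)[cols]
    mean = grouped.transform("mean")
    # Replace zero std with NaN so the division does not produce inf.
    std = grouped.transform("std").replace(0, np.nan)
    out[cols] = ((out[cols] - mean) / std).fillna(0.0)
    return out

# Synthetic stand-in data: two engines, one informative sensor, one flat sensor.
demo = pd.DataFrame({
    "unit_number": [1, 1, 1, 2, 2, 2],
    "sensor_a": [10.0, 11.0, 12.0, 20.0, 22.0, 24.0],
    "sensor_b": [5.0] * 6,
})
flat = constant_columns(demo, ["sensor_a", "sensor_b"])
scaled = normalise_per_engine(demo.drop(columns=flat), ["sensor_a"])
print(flat)
print(scaled["sensor_a"].round(2).tolist())
```

One thing to keep in mind: normalising with statistics from an engine's whole trajectory uses readings that would not yet exist at prediction time, so a deployed version may need to fit the statistics on early cycles or on the training engines only.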
Python — rul_engineering.py
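import pandas as pd

# CMAPSS files have no header row: 2 ID columns, 3 settings, 21 sensors
sensors = [f'sensor_{i}' for i in range(1, 22)]
columns = ['unit_number', 'time_in_cycles',
           'op_setting_1', 'op_setting_2', 'op_setting_3'] + sensors

# File path is an assumption; point it at wherever the FD001 training file lives
df = pd.read_csv('train_FD001.txt', sep=r'\s+', header=None, names=columns)

# Compute Remaining Useful Life for each engine
# (training split only: each training engine runs until failure)
def add_rul(df):
    max_cycle = df.groupby('unit_number')['time_in_cycles'].max()
    df = df.merge(max_cycle.rename('max_cycle'), on='unit_number')
    df['RUL'] = df['max_cycle'] - df['time_in_cycles']
    return df.drop(columns=['max_cycle'])

df = add_rul(df)

# Binary classification: will engine fail in next 30 cycles?
df['failure_soon'] = (df['RUL'] <= 30).astype(int)

# Rolling features per sensor per engine
for col in sensors:
    df[f'{col}_roll_mean'] = df.groupby('unit_number')[col]\
        .transform(lambda x: x.rolling(10, min_periods=1).mean())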
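To show where the `failure_soon` label and rolling features from the script above could go next, here is a hedged sketch of the classification variant from step 04. It runs on synthetic data with a single hypothetical feature (`sensor_x_roll_mean`), and it splits by engine with `GroupShuffleSplit`, which is my assumption about a sensible validation scheme rather than something the plan specifies.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)

# Synthetic stand-in for the engineered frame: 20 engines, 50 cycles each.
# One hypothetical feature drifts upward as the engine approaches failure.
frames = []
for unit in range(1, 21):
    cycles = np.arange(1, 51)
    rul = cycles.max() - cycles
    drift = cycles / 50 + rng.normal(0, 0.05, size=cycles.size)
    frames.append(pd.DataFrame({
        "unit_number": unit,
        "sensor_x_roll_mean": drift,
        "failure_soon": (rul <= 30).astype(int),
    }))
df = pd.concat(frames, ignore_index=True)

features = ["sensor_x_roll_mean"]
X, y, groups = df[features], df["failure_soon"], df["unit_number"]

# Split by engine so no engine contributes rows to both train and validation.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, val_idx = next(splitter.split(X, y, groups=groups))

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X.iloc[train_idx], y.iloc[train_idx])

# Column 1 of predict_proba is the probability of the positive class (1).
val_proba = model.predict_proba(X.iloc[val_idx])[:, 1]
print(val_proba[:5])
```

Splitting by engine matters because consecutive cycles from the same engine are highly correlated; a random row-level split would let the model see near-duplicates of its validation rows during training and overstate its accuracy.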
04 — The Streamlit Dashboard

What the app will show

🎛️
Engine Selector
Dropdown to select any engine from the test set. Dashboard updates in real time.
📊
Failure Probability Gauge
A visual gauge showing current failure probability (0–100%). Red zone above 70%.
⏱️
Time-to-Failure Estimate
Predicted remaining cycles until failure, converted to a human-readable time estimate.
🚨
Sensor Alert Panel
List of sensors currently reading outside normal ranges — ranked by deviation severity.
📈
Degradation Trend Charts
Line charts showing how key sensors have trended over the engine's lifecycle.
🏭
Domain Recommendation Layer
Based on failure probability, the app recommends: Continue, Schedule Maintenance, or Immediate Action.
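The Sensor Alert Panel card leaves "normal range" and "deviation severity" open. One possible reading, sketched below, treats an engine's first cycles as its healthy baseline and ranks sensors by the absolute z-score of the latest reading. The function name, the 20-cycle baseline, and the threshold of 3 are all assumptions for illustration, and the demo values are synthetic.

```python
import pandas as pd

def rank_sensor_alerts(engine_df, sensor_cols, baseline_cycles=20, z_threshold=3.0):
    """Rank sensors whose latest reading deviates most from the early-life baseline."""
    ordered = engine_df.sort_values("time_in_cycles")
    baseline = ordered.head(baseline_cycles)[sensor_cols]
    latest = ordered.iloc[-1][sensor_cols].astype(float)
    # Flat sensors have zero std; mark them NaN so they drop out of the ranking.
    std = baseline.std().replace(0, float("nan"))
    z = ((latest - baseline.mean()) / std).abs().dropna()
    return z[z > z_threshold].sort_values(ascending=False)

# Synthetic single-engine history: 20 stable cycles, then 5 degrading ones.
engine = pd.DataFrame({
    "time_in_cycles": range(1, 26),
    "sensor_a": [10.0, 12.0] * 10 + [13.0, 14.0, 16.0, 18.0, 20.0],
    "sensor_b": [1.0, 1.2] * 10 + [1.1] * 5,
    "sensor_c": [100.0, 102.0] * 10 + [103.0, 104.0, 105.0, 106.0, 108.0],
    "sensor_d": [5.0] * 25,
})
alerts = rank_sensor_alerts(engine, ["sensor_a", "sensor_b", "sensor_c", "sensor_d"])
print(alerts.round(2))
```

The returned Series is already sorted, so the dashboard could render it top-down as the alert list.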
⚙️
The domain-aware differentiator
Most ML projects stop at model accuracy. This dashboard adds a manufacturing domain layer — translating model outputs into maintenance decisions that an engineer on a factory floor can actually act on. That's the PLM+AI bridge I'm building.
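A minimal sketch of that domain layer, mapping a failure probability to the three recommendations named above. The 70% cut-off reuses the gauge's red zone; tying it to "Immediate Action" and the 40% scheduling threshold are placeholder assumptions that would need tuning with maintenance engineers.

```python
def recommend_action(failure_probability, schedule_at=0.40, act_at=0.70):
    """Map a failure probability in [0, 1] to one of the three dashboard actions.

    Thresholds are illustrative defaults, not validated maintenance policy.
    """
    if not 0.0 <= failure_probability <= 1.0:
        raise ValueError("failure_probability must be between 0 and 1")
    if failure_probability > act_at:
        return "Immediate Action"
    if failure_probability >= schedule_at:
        return "Schedule Maintenance"
    return "Continue"

for p in (0.15, 0.55, 0.85):
    print(f"{p:.0%} -> {recommend_action(p)}")
```

Keeping the thresholds as parameters means they can later be set from the classification threshold tuning rather than hard-coded.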
05 — Why This Project Matters

Industry demand + domain fit

Dimension | Why It Matters
Industry Demand | Predictive maintenance is one of the most commonly cited industrial AI use cases, and many manufacturers are investing in it
Domain Fit | Directly aligned with manufacturing and PLM background; this isn't generic data science, it's domain-specific ML
Portfolio Signal | Shows ability to take raw sensor data all the way to a deployed, interactive application
Technical Depth | Time series features, RUL computation, classification threshold tuning, Streamlit deployment
Recruiter Appeal | A live Streamlit app is far more impressive than a Jupyter notebook because it demonstrates full-stack data science
06 — Tech Stack
Python 3 · Pandas · Scikit-learn · Random Forest · Streamlit · Plotly · NASA CMAPSS · Rolling Features