FragileML — When Simpler Is Safer

A structured study of ML model fragility across 5 independent failure axes on an e-commerce dataset.

Central thesis: A model can be statistically better while being systemically less reliable.


Axes

| # | Axis | What varies | Frozen |
|---|------|-------------|--------|
| 1 | Capacity | Model tier (LR → DT → RF → XGB → LGB) | Everything else |
| 2 | Fidelity | Label noise / feature noise levels | Model, features, seed |
| 3 | Stability | Random seeds / bootstrap draws | Model, data, features |
| 4 | Representation | PCA variance / top-k features | Model, noise, seed |
| 5 | Temporal | Train/test time window | Model, features, noise |
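
To make the "what varies / frozen" idea concrete, here is a minimal sketch of the fidelity axis's label-noise step. It is not taken from the repository: the function name `flip_labels`, the binary-label assumption, and the noise rates are all illustrative. The point it shows is that only the noise level changes between runs while the seed stays fixed.

```python
import random


def flip_labels(labels, noise_rate, seed=0):
    """Return a copy of binary (0/1) labels with a fraction `noise_rate` flipped.

    Hypothetical helper for illustration; the repository's own noise code
    (under src/data/) may work differently.
    """
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must be in [0, 1]")
    rng = random.Random(seed)  # local RNG, so the seed is frozen per call
    n_flip = round(noise_rate * len(labels))
    flip_idx = set(rng.sample(range(len(labels)), n_flip))
    return [1 - y if i in flip_idx else y for i, y in enumerate(labels)]


# Vary only the noise rate; the data and the seed stay frozen.
clean = [0, 1] * 50
for rate in (0.0, 0.1, 0.3):
    noisy = flip_labels(clean, rate, seed=42)
    changed = sum(a != b for a, b in zip(clean, noisy))
    print(f"noise_rate={rate:.1f}: {changed} of {len(clean)} labels flipped")
```

Using a local `random.Random(seed)` rather than the global generator keeps the noise draw reproducible without disturbing any other seeded component.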

Setup

# 1. Clone
git clone https://dagshub.com/Y-R-A-V-R-5/FragileML.git
cd FragileML

# 2. Install dependencies
pip install -r requirements.txt

# 3. Add your dataset
#    Place dataset.csv at:  data/dataset.csv
#    (file is gitignored — not tracked in the repo)

# 4. (Optional) connect DagsHub for experiment tracking
dagshub login

Run Experiments

# Axis 1 — full model sweep (default)
python scripts/run_axis.py

# Axis 2 — label noise (default) or feature noise or both
python scripts/run_axis.py --axis fidelity --task C4
python scripts/run_axis.py --axis fidelity --fidelity-mode feature_noise
python scripts/run_axis.py --axis fidelity --fidelity-mode both --models LR RF XGB

# Axis 3 — bootstrap (default) or seed variation or both
python scripts/run_axis.py --axis stability --task C4
python scripts/run_axis.py --axis stability --stability-mode seed
python scripts/run_axis.py --axis stability --stability-mode both

# Axis 4 — PCA (default) or top-k or both
python scripts/run_axis.py --axis representation --task C4
python scripts/run_axis.py --axis representation --repr-mode topk
python scripts/run_axis.py --axis representation --repr-mode both

# Axis 5 — temporal drift
python scripts/run_axis.py --axis temporal --task C4

# See plan without running
python scripts/run_axis.py --dry-run
python scripts/run_axis.py --axis stability --dry-run

# Track to DagsHub
python scripts/run_axis.py --track
python scripts/run_axis.py --axis fidelity --fidelity-mode both --track
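
For intuition about what the stability axis's bootstrap mode measures, the sketch below resamples a fixed set of per-example correctness flags and reports the spread of accuracy across draws. It is an assumption-laden illustration, not the repository's implementation: `bootstrap_accuracy` is a hypothetical name, the data is a toy list, and the real runner may resample the training set and retrain instead.

```python
import random
import statistics


def bootstrap_accuracy(correct, n_draws=200, seed=0):
    """Resample per-example correctness flags with replacement and
    return the accuracy of each bootstrap draw.

    Hypothetical helper; it resamples evaluation results only and does
    not retrain any model.
    """
    rng = random.Random(seed)
    n = len(correct)
    scores = []
    for _ in range(n_draws):
        sample = rng.choices(correct, k=n)  # draw n items with replacement
        scores.append(sum(sample) / n)
    return scores


# Toy data: 1 = prediction was correct, 0 = prediction was wrong.
correct = [1] * 80 + [0] * 20
scores = bootstrap_accuracy(correct, n_draws=200, seed=42)
print(f"mean accuracy:    {statistics.mean(scores):.3f}")
print(f"std across draws: {statistics.stdev(scores):.3f}")
```

Comparing the standard deviation between two models, and not only their means, is one way to see how a model with a higher average score can still be the less reliable one.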

Model aliases

| Alias | Full name | Tier |
|-------|-----------|------|
| LR | Logistic Regression | 1 — linear |
| LinReg | Linear Regression | 1 — linear |
| DT | Decision Tree | 2 — single tree |
| RF | Random Forest | 3 — ensemble |
| XGB | XGBoost | 4 — boosting |
| LGB | LightGBM | 4 — boosting |
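
As a rough picture of how aliases passed to `--models` could be validated and ordered by tier, here is a small sketch. The dictionary simply mirrors the alias table above; `MODEL_ALIASES` and `resolve_models` are hypothetical names, and the real registry in config/models.yaml may be structured differently.

```python
# Hypothetical in-memory copy of the alias table: alias -> (full name, tier).
MODEL_ALIASES = {
    "LR": ("Logistic Regression", 1),
    "LinReg": ("Linear Regression", 1),
    "DT": ("Decision Tree", 2),
    "RF": ("Random Forest", 3),
    "XGB": ("XGBoost", 4),
    "LGB": ("LightGBM", 4),
}


def resolve_models(aliases):
    """Reject unknown aliases, then return the rest ordered by tier."""
    unknown = [a for a in aliases if a not in MODEL_ALIASES]
    if unknown:
        raise KeyError(f"unknown model alias(es): {', '.join(unknown)}")
    # sorted() is stable, so aliases in the same tier keep their input order.
    return sorted(aliases, key=lambda a: MODEL_ALIASES[a][1])


print(resolve_models(["XGB", "LR", "RF"]))  # ['LR', 'RF', 'XGB']
```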

Data

The dataset (data/dataset.csv, 244 MB) is gitignored and not tracked in the repository.

Place your own copy at data/dataset.csv before running any experiments.
The column schema is described in config/data.yaml.
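
Before launching a long sweep, it can help to confirm that the CSV is in place and has the columns you expect. The sketch below uses only the standard library. The helper names and the column names in the demo (`order_id`, `price`, `timestamp`) are placeholders, not the actual schema; take the real column list from config/data.yaml.

```python
import csv
import tempfile
from pathlib import Path


def read_header(csv_path):
    """Return the first row of a CSV file as a list of column names."""
    path = Path(csv_path)
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found; place your dataset there first")
    with path.open(newline="", encoding="utf-8") as f:
        return next(csv.reader(f), [])


def missing_columns(csv_path, expected):
    """Return the expected column names that are absent from the CSV header."""
    header = set(read_header(csv_path))
    return [c for c in expected if c not in header]


# Demo on a throwaway file so the sketch runs without the real dataset.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "dataset.csv"
    demo.write_text("order_id,price\n1,9.99\n", encoding="utf-8")
    print(missing_columns(demo, ["order_id", "price", "timestamp"]))  # ['timestamp']
```

For the real file, point `missing_columns` at data/dataset.csv; only the header row is read, so the check stays fast even at 244 MB.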


Project structure

FragileML/
├── config/
│   ├── data.yaml          # Dataset schema, tasks, preprocessing
│   ├── models.yaml        # Model registry (4 tiers, 6 models)
│   └── axes/              # One config per axis
├── src/
│   ├── axes/              # 5 axis runners + BaseAxis
│   ├── data/              # Loading, splitting, noise, representation
│   ├── metrics/           # Performance, stability, calibration, reporter
│   ├── models/            # Registry, evaluator
│   └── utils/             # Constants, logger, seed
├── scripts/
│   └── run_axis.py        # Main entry point
├── data/                  # Dataset goes here (gitignored)
└── artifacts/             # Experiment outputs (gitignored)

About

Examines model fragility across capacity, fidelity, stability, representation, and temporal axes. Isolated and multi-axis experiments reveal sensitivity to noise, drift, and randomness. Metrics prioritize stability and reliability over raw accuracy, highlighting when simpler models outperform complex ones in low-fidelity or shifting environments.
