Updated Feb 20, 2026

Probabilistic Stock Volatility Forecasting

Built a Gaussian Process pipeline that forecasts 5-day-ahead realized volatility from historical daily market data.

- Reduced forecast error (RMSE and MAE) by 23–28% relative to simple baseline methods across SPY, QQQ, and AAPL.
- Shipped a Streamlit dashboard for visualizing forecasts, uncertainty bands, and model performance.

Problem

The project forecasts 5-day-ahead realized volatility for SPY, QQQ, and AAPL while quantifying uncertainty for monitoring and decision support.

Approach

- Downloaded daily OHLCV data with yfinance and cached it locally.
- Engineered realized-volatility features including RV lags, return statistics, calendar cyclic features, and volume z-scores.
- Trained rolling-window exact Gaussian Processes with PyTorch and GPyTorch.
- Generated 50%, 90%, and 95% prediction intervals and surfaced anomaly/regime alerts in a Streamlit dashboard.
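
The feature steps in this list could be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the project's code. The realized-volatility definition (square root of summed squared daily log returns over the window), the lag set, the 20-day volume lookback, the 7-day weekday period, and every function name are assumptions made for the example.

```python
import math
from datetime import date
from statistics import mean, pstdev

HORIZON = 5  # forecast horizon in trading days, as stated in the write-up


def log_returns(closes):
    """Daily log returns from a list of closing prices."""
    return [math.log(b / a) for a, b in zip(closes, closes[1:])]


def realized_vol(returns, window=HORIZON):
    """Assumed RV definition: sqrt of summed squared returns per window.

    Entry i covers returns[i : i + window].
    """
    return [
        math.sqrt(sum(r * r for r in returns[i:i + window]))
        for i in range(len(returns) - window + 1)
    ]


def rv_lags(rv, lags=(1, 2, 5)):
    """Most recent RV values at the given lags (lag 1 is the latest entry).

    The lag set is hypothetical.
    """
    return [rv[-lag] for lag in lags]


def cyclic_weekday(day: date):
    """Encode the day of week as a point on the unit circle (Mon=0 ... Sun=6)."""
    angle = 2 * math.pi * day.weekday() / 7
    return math.sin(angle), math.cos(angle)


def volume_zscore(volumes, lookback=20):
    """Z-score of the latest volume against the preceding `lookback` days."""
    history = volumes[-lookback - 1:-1]
    spread = pstdev(history)
    return 0.0 if spread == 0 else (volumes[-1] - mean(history)) / spread
```

The sine/cosine pair keeps Sunday and Monday adjacent in feature space, which a raw 0 to 6 integer would not. `volume_zscore` needs at least two observations, since the latest value is excluded from its own reference window.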

Results

- The GP model outperformed persistence and EWMA baselines across all three tickers in the latest walk-forward snapshot.
- The dashboard provides both point forecasts and interval bands so forecast confidence is visible.
- Artifacts are versioned so deployment does not require retraining on startup.
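
For context on the two baselines named above, here is a minimal sketch of how persistence and EWMA forecasts are commonly defined. The write-up does not say how the project configures them, so applying the EWMA to the realized-volatility series itself, the smoothing weight `alpha`, and the function names are all assumptions.

```python
def persistence_forecast(rv_history):
    """Persistence baseline: the next value equals the last observed one."""
    return rv_history[-1]


def ewma_forecast(rv_history, alpha=0.2):
    """EWMA baseline: recursively smoothed level of past realized volatility.

    `alpha` is a hypothetical smoothing weight, not a value from the project.
    """
    level = rv_history[0]
    for value in rv_history[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


# Toy realized-volatility history (made-up numbers for illustration only).
history = [0.010, 0.012, 0.011, 0.015]
naive = persistence_forecast(history)
smoothed = ewma_forecast(history)
```

Both baselines use only past values of the target, so any model that adds features has to beat them to justify the extra complexity.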

Ticker	RMSE	Baseline RMSE	RMSE Reduction	MAE	Baseline MAE	MAE Reduction	90% Coverage	Avg 90% Width
SPY	0.0103	0.0140	26.5%	0.0062	0.0083	25.1%	0.9801	0.0439
QQQ	0.0117	0.0163	27.8%	0.0081	0.0105	23.2%	0.9682	0.0450
AAPL	0.0198	0.0258	23.2%	0.0133	0.0175	23.8%	0.8986	0.0527

Average reduction across SPY/QQQ/AAPL: RMSE 25.8% · MAE 24.0%.
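
The table columns could be computed along these lines. This is a sketch under stated assumptions: the intervals are taken to be central intervals of a Gaussian predictive distribution (which a GP with a Gaussian likelihood produces), and the function names are hypothetical. The project may compute its intervals differently, for example on a transformed target.

```python
import math
from statistics import NormalDist, mean


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(mean([(a - p) ** 2 for a, p in zip(actual, predicted)]))


def pct_reduction(model_error, baseline_error):
    """Relative error reduction versus the baseline, in percent."""
    return 100.0 * (1.0 - model_error / baseline_error)


def gaussian_interval(mu, sigma, level=0.90):
    """Central interval of a Gaussian predictive distribution."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return mu - z * sigma, mu + z * sigma


def coverage_and_width(actual, means, sigmas, level=0.90):
    """Fraction of actuals inside their interval, and the mean interval width."""
    bounds = [gaussian_interval(m, s, level) for m, s in zip(means, sigmas)]
    hits = [lo <= a <= hi for a, (lo, hi) in zip(actual, bounds)]
    widths = [hi - lo for lo, hi in bounds]
    return sum(hits) / len(hits), mean(widths)


# Averages of the reported per-ticker reductions (SPY, QQQ, AAPL).
rmse_cuts = [26.5, 27.8, 23.2]
mae_cuts = [25.1, 23.2, 23.8]
print(round(mean(rmse_cuts), 1), round(mean(mae_cuts), 1))  # 25.8 24.0
```

Recomputing a per-ticker reduction from the rounded table entries can differ in the last digit from the reported figure (for SPY, `pct_reduction(0.0103, 0.0140)` gives about 26.4), presumably because the reported values come from unrounded errors. Empirical coverage near the nominal 0.90 indicates well-calibrated intervals; values well above it suggest the intervals are wider than necessary.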

Limitations

- Exact GP training with rolling windows increases runtime and tuning cost.
- Committed artifacts simplify deployment but require periodic refresh to stay current with new market data.