Feaws: How I Built a Bitcoin Forecasting Engine from Scratch
Here's the technical breakdown of what I built, and why. I started working on Feaws in late 2025, not because I thought I could "beat the market" in any absolute sense, but because the existing landscape of crypto prediction tools fell into two camps that both frustrated me.
On one side, you had Twitter accounts and Telegram bots posting directional calls with no quantified uncertainty — just “BTC to $150K by March.” On the other, the academic literature offered rigorous volatility models that were well-calibrated for traditional equities but rarely adapted to the 24/7, jump-heavy, regime-switching reality of cryptocurrency markets.
What I wanted was something in between: a system that gives a directional signal — up or down — but wraps it in honest probabilistic bounds. If the model says "UP" with 62% confidence, I also want to see that the 90% confidence interval is [$58,000, $84,000], so I know how uncertain that call really is.
Feaws is the result of roughly 4 months of evenings and weekends. The full codebase — data pipelines, model training, and the Next.js dashboard — is open-source.
The Four-Layer Architecture
Feaws is organised into four composable layers. Each layer exposes a calibrate() method that trains exclusively on data available up to a cutoff date, enforcing a strict zero-lookahead constraint at the API level.
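The zero-lookahead contract can be enforced mechanically at the base-class level, so no individual layer can accidentally train on future data. This is a minimal sketch of that idea; the class and method names (`Layer`, `PricePoint`, `_fit`) are illustrative, not the actual Feaws API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PricePoint:
    day: date
    close: float


class Layer:
    """Base-class sketch: every layer calibrates only on data up to a cutoff."""

    def calibrate(self, history: list[PricePoint], cutoff: date) -> None:
        # Enforce zero-lookahead at the API boundary: anything dated after
        # the cutoff is dropped before the layer ever sees it.
        visible = [p for p in history if p.day <= cutoff]
        self._fit(visible)

    def _fit(self, visible: list[PricePoint]) -> None:
        raise NotImplementedError
```

Pushing the filter into the shared `calibrate()` path means a leaky subclass would have to go out of its way to cheat, which is exactly the failure mode a backtest needs to rule out.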
Layer 1: Volatility Engine (GARCH)
The volatility engine provides three complementary estimates of conditional variance, all calibrated on log-returns.
I fit a standard GARCH(1,1) specification using a Student-t innovation distribution and an AR(1) mean model. On the full training sample through February 2026, the estimated parameters are ω = 0.1448, α = 0.0716, β = 0.9284, giving a persistence α + β = 1.0000. The near-unit persistence is expected — Bitcoin volatility clusters heavily, and shocks decay slowly. The unconditional annualised volatility is 69.1%.
As a cross-check, I compute an exponentially weighted moving average with λ = 0.94 (the RiskMetrics convention). The resulting annualised volatility of 69.5% is close to the GARCH estimate, which is reassuring.
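The RiskMetrics recursion is simple enough to state in a few lines. This is a generic sketch (the seeding choice and 365-day annualisation are my assumptions, since crypto trades every day), not Feaws's exact implementation.

```python
import math


def ewma_volatility(returns: list[float], lam: float = 0.94,
                    periods_per_year: int = 365) -> float:
    """RiskMetrics EWMA: sigma2_t = lam * sigma2_{t-1} + (1 - lam) * r_{t-1}^2.

    Seeded with the first squared return; returns annualised volatility.
    """
    sigma2 = returns[0] ** 2
    for r in returns[1:]:
        sigma2 = lam * sigma2 + (1.0 - lam) * r ** 2
    return math.sqrt(sigma2 * periods_per_year)
```

With λ = 0.94, a daily observation's weight halves roughly every 11 days, so the estimate tracks recent volatility clusters much faster than a long-window sample standard deviation.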
The regime-conditional volatilities are where it gets interesting:
| Regime | Annualised Volatility |
|---|---|
| CRISIS | 126.1% |
| RISK_OFF | 62.8% |
| NEUTRAL | 68.4% |
| RISK_ON | 64.8% |
| EUPHORIA | 52.4% |
CRISIS volatility is roughly double that of the non-crisis regimes, consistent with the leverage effect in risk assets.
Layer 2: Drift Engine (ML + Factor Alpha)
This is where the directional signal comes from. It has three sub-components.
ML Signal: A Gradient Boosting Regressor (GBR) with 200 trees, max depth 3, learning rate 0.03, 70% subsample ratio, minimum 30 samples per leaf, and 70% feature subsampling — paired with a Ridge regressor (α = 100) in a 60/40 weighted blend. The target variable is the realised percentage return over the forecast horizon (30 or 90 days), shifted back by the horizon length to avoid any leakage.
The input feature vector comprises 30 engineered features drawn from a 28-asset universe — 15 crypto tokens, 2 commodities (gold, silver), 5 tech stocks (NVDA, AAPL, MSFT, GOOGL, AMZN), and 6 global indices (S&P 500, NASDAQ, Nifty 50, Nikkei 225, Hang Seng, DAX).
What surprised me most about the feature importance: the gold-BTC 90-day rolling correlation dominates at 23.8%, followed by halving cycle features that together account for 32% of total importance. Short-term BTC momentum (7- and 14-day returns) — which I had assumed would be the strongest signal — is notably absent from the top of the list. But on reflection it makes sense: short-term momentum is noisy and mean-reverting in crypto, whereas the macro and cycle features are more persistent and therefore more predictable over 30-day horizons.
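The GBR/Ridge pairing described above maps directly onto stock scikit-learn estimators. A sketch with the stated hyperparameters follows; the helper names are mine, and the fixed `random_state` is an assumption added for reproducibility.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Ridge


def make_drift_models():
    """200 trees, depth 3, lr 0.03, 70% row subsample, min 30 samples per
    leaf, 70% feature subsampling; Ridge with alpha = 100."""
    gbr = GradientBoostingRegressor(
        n_estimators=200, max_depth=3, learning_rate=0.03,
        subsample=0.7, min_samples_leaf=30, max_features=0.7,
        random_state=0,
    )
    ridge = Ridge(alpha=100.0)
    return gbr, ridge


def blended_drift(gbr, ridge, X):
    """60/40 weighted blend of the two fitted regressors."""
    return 0.6 * gbr.predict(X) + 0.4 * ridge.predict(X)
```

The heavy regularisation everywhere (shallow trees, slow learning rate, large Ridge α) reflects the low signal-to-noise ratio of return prediction: the models are deliberately biased toward underfitting.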
Factor Model: A simple CAPM-style regression on S&P 500, Gold, and NVDA captures the systematic excess return of Bitcoin relative to these three factors. This provides a structural, mean-reverting drift component that anchors the forecast when the ML signal is weak or noisy.
Signal Blending: The final drift is a weighted combination: μ_adj = w_ML · μ_ML + w_regime · μ_regime + w_factor · μ_factor, where the weights are defined per model configuration.
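In code the blend is one line; the only subtlety is keeping the drift on scale. The weight normalisation below is my own defensive choice, not necessarily how Feaws handles weights that don't sum to one.

```python
def blend_drift(mu_ml: float, mu_regime: float, mu_factor: float,
                w_ml: float, w_regime: float, w_factor: float) -> float:
    """mu_adj = w_ML * mu_ML + w_regime * mu_regime + w_factor * mu_factor.

    Weights are set per model configuration; dividing by their sum keeps
    the blended drift a proper weighted average even if they don't sum to 1.
    """
    total = w_ml + w_regime + w_factor
    return (w_ml * mu_ml + w_regime * mu_regime + w_factor * mu_factor) / total
```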
Layer 3: Monte Carlo Engine (Merton Jump Diffusion)
Price paths are generated by the Merton (1976) jump-diffusion extension to geometric Brownian motion. This is well-suited to crypto, where ±10% single-day moves are not unusual.
Each forecast generates 5,000 paths (10,000 for the annual forecast). The terminal distribution at horizon T is summarised by percentiles (P5, P10, P25, P50, P75, P90, P95), from which I construct nested confidence intervals.
A key design choice is alpha-risk separation: the Monte Carlo median is shifted to equal the ML point prediction, so the ML model determines where the distribution is centred (the alpha signal) while the stochastic simulation determines how wide it is (the risk model). This means the confidence intervals reflect genuine forecast uncertainty rather than model disagreement.
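A minimal simulator for Layers 3's two ideas — Merton jump diffusion plus the median re-centering — looks like this. The parameter values, function names, and the multiplicative form of the shift (which preserves positivity) are my assumptions; the drift compensation `lam * kappa` is the standard Merton correction so jumps don't bias the expected return.

```python
import numpy as np


def merton_terminal_prices(s0, mu, sigma, lam, jump_mu, jump_sigma,
                           T, n_steps, n_paths, seed=0):
    """Terminal prices under Merton jump diffusion:
    dS/S = (mu - lam*kappa) dt + sigma dW + (J - 1) dN, log J ~ N(jump_mu, jump_sigma^2).
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    kappa = np.exp(jump_mu + 0.5 * jump_sigma ** 2) - 1.0  # E[J - 1]
    log_s = np.full(n_paths, np.log(s0))
    for _ in range(n_steps):
        z = rng.standard_normal(n_paths)
        n_jumps = rng.poisson(lam * dt, n_paths)
        # Sum of n normal jump sizes ~ N(n * jump_mu, n * jump_sigma^2).
        jumps = n_jumps * jump_mu + np.sqrt(n_jumps) * jump_sigma * rng.standard_normal(n_paths)
        log_s += (mu - lam * kappa - 0.5 * sigma ** 2) * dt + sigma * np.sqrt(dt) * z + jumps
    return np.exp(log_s)


def recenter_on_alpha(terminal, ml_point):
    """Alpha-risk separation: scale the distribution so its median equals
    the ML point prediction; the spread (risk model) is left intact."""
    return terminal * (ml_point / np.median(terminal))
```

Because the shift is a positive scalar multiple, every percentile moves by the same factor, so the relative width of the confidence intervals is exactly what the risk model produced.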
Layer 4: Regime Classification (Hidden Markov Model)
A five-state Hidden Markov Model classifies the market into CRISIS, RISK_OFF, NEUTRAL, RISK_ON, and EUPHORIA. The states are determined by a composite scoring function that integrates BTC momentum (RSI, moving averages), cross-asset signals (S&P 500, VIX, gold), and on-chain metrics where available.
As of February 2026, we are approximately 22 months since the April 2024 halving — firmly in Phase 2 (Correction). The annualised forecast scenarios for the dashboard reflect this.
Walk-Forward Backtest Results
The backtest is the part of this project I am most careful about, because it is also the easiest part to get wrong. Every prediction is generated under a strict zero-lookahead protocol — the model trains exclusively on data available up to that point.
30-Day Horizon (best configurations):
| Configuration | Dir Acc % | CI90 % | CI95 % | Avg Err % | Score |
|---|---|---|---|---|---|
| ML + Factor | 70.0 | 96.7 | 96.7 | 12.1 | 85.6 |
| Balanced Hybrid | 70.0 | 93.3 | 96.7 | 12.2 | 84.7 |
| ML + Regime | 66.7 | 96.7 | 96.7 | 13.4 | 84.1 |
Every ML-based configuration outperforms the Historical Drift baseline (63.3%) on directional accuracy. The gap is not enormous — 70.0% vs. 63.3% — but 6.7 percentage points of directional edge, compounded over dozens of monthly decisions, translates to meaningful portfolio alpha.
The 90-day results are humbling. Directional accuracy drops to the 48–55% range for most configurations — barely better than a coin flip. This is not surprising: predicting the sign of a 3-month return in a market as volatile as Bitcoin is genuinely hard, and anyone claiming 70%+ accuracy at this horizon should be viewed with suspicion.
Risk Management
No forecasting model, however well-calibrated, can protect a portfolio from a 50% overnight crash in Bitcoin. The FTX collapse, the Luna/UST de-peg, and the March 2020 COVID crash all produced drawdowns that exceeded any reasonable confidence interval.
For this reason, Feaws enforces a model-independent hard stop-loss: if the portfolio drawdown from peak value exceeds 20%, all positions are liquidated regardless of the current forecast.
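The kill switch reduces to a peak-tracking check. This sketch uses my own function name and treats the threshold as a negative fractional drawdown; the real system presumably runs it against live portfolio equity.

```python
def hard_stop_triggered(equity_curve: list[float],
                        max_drawdown: float = -0.20) -> bool:
    """Model-independent stop: True once drawdown from the running peak
    reaches the threshold (e.g. -0.20 for a 20% drawdown)."""
    peak = equity_curve[0]
    for value in equity_curve:
        peak = max(peak, value)
        if (value - peak) / peak <= max_drawdown:
            return True
    return False
```

The point of keeping this outside the model is that it fires on realised losses, not forecasts, so a badly mis-calibrated model cannot talk the system out of cutting risk.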
What Feaws Cannot Do
I want to be honest about the limitations. The 90-day directional accuracy of 48–55% is essentially a coin flip with a slight edge. The 30-day accuracy of 70%, while strong, is measured over only 30 out-of-sample windows — not enough to establish statistical significance at conventional levels.
The feature set is entirely price-based and macro-based. On-chain data (MVRV, SOPR, exchange inflows/outflows, mining hash rate) and sentiment data (social media volume, funding rates, options skew) are not yet integrated. These could plausibly improve the directional signal.
Why Open Source
Building this system taught me — more than any textbook could — that the difficult part of quantitative finance is not the maths but the discipline: resisting the temptation to peek at future data during backtesting, reporting configurations that performed badly alongside those that performed well, and acknowledging that a 30-observation backtest is informative but not conclusive.
I hope the open-source release of Feaws contributes to a culture of transparency in crypto forecasting, where predictions come with error bars and backtest protocols are reproducible.
The full codebase, data pipelines, and dashboard are available at feaws.xyz.
Views are personal, not financial advice.