- The paper presents BC-ACI, a novel extension to adaptive conformal inference that corrects for persistent model bias in time series forecasting.
- It employs online exponentially-weighted bias estimation combined with dead-zone filtering to recenter prediction intervals and achieve up to a 32% reduction in Winkler scores under bias shifts.
- The method maintains asymptotic coverage guarantees with minimal computational overhead, making it especially useful for fixed, offline models in MLOps workflows.
Introduction and Motivation
Conformal prediction techniques provide distribution-free uncertainty quantification for regression and time series forecasting. Traditional split conformal prediction assumes exchangeability between calibration and test data, a condition usually violated under regime shifts in time series applications. Adaptive Conformal Inference (ACI) methods address this by adaptively tuning quantile thresholds online, maintaining proper marginal coverage under bounded nonstationarity. However, a central architectural limitation persists: ACI and its variants only adapt interval width via quantile threshold modulation, always centering intervals at the model's raw prediction. This is suboptimal in deployment scenarios where the base model (e.g., ridge regression trained once and not retrained online) develops persistent bias after a distribution shift.
The paper "Bias-Corrected Adaptive Conformal Inference for Multi-Horizon Time Series Forecasting" (2604.13253) introduces BC-ACI, a lightweight extension that augments ACI with online exponentially-weighted residual bias estimation and dead-zone filtering. BC-ACI computes and applies a per-horizon bias correction to nonconformity scores pre-quantile, yielding tightened, recentered intervals without inflating width, while preserving the original ACI's asymptotic coverage guarantee.
Methodology
Structural Limitation of Threshold-Only ACI
ACI's symmetric intervals are efficient only when the forecast they are centered on is unbiased. Under a persistent bias b, the standard approach must inflate both flanks of the prediction interval after the shift, incurring a Θ(|b|) width overhead, i.e. one that grows linearly with the bias magnitude. In contrast, an oracle that knows and corrects for the bias can maintain coverage with much narrower intervals.
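The width penalty can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it assumes Gaussian residual noise with unit variance, a hypothetical bias of 3.0, and a 90% coverage target, and compares the symmetric half-width needed when the interval is centered at the raw forecast against the half-width needed once the bias is removed.

```python
import numpy as np

def symmetric_half_width(residuals, coverage=0.9):
    """Smallest half-width w with |residual| <= w for a `coverage` share of samples."""
    return float(np.quantile(np.abs(residuals), coverage))

rng = np.random.default_rng(0)
noise = rng.normal(0.0, 1.0, size=100_000)  # assumed unbiased residual spread
bias = 3.0                                  # hypothetical post-shift bias
biased = noise + bias                       # residuals of a frozen, biased model

w_raw = symmetric_half_width(biased)            # interval centered at the raw forecast
w_oracle = symmetric_half_width(biased - bias)  # interval centered at the corrected forecast
print(f"raw centre: {w_raw:.2f}, oracle centre: {w_oracle:.2f}")
```

Under these assumptions the raw-centered half-width exceeds the oracle half-width by an amount comparable to the bias itself, which is the linear overhead described above.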
Figure 1: Prediction intervals around a mean shift at t=1000 (Ridge model, h=1). Top: Standard ACI widens symmetrically to maintain coverage. Bottom: BC-ACI detects the bias, shifts the interval centre, and achieves tighter intervals post-shift while preserving coverage.
Online Bias Estimation and Correction
BC-ACI maintains a per-horizon exponentially-weighted moving average (EWM) of signed residuals to estimate bias, $b_t^{(h)}$, and only applies correction when $|b_t^{(h)}|$ exceeds an adaptive, MAD-based dead zone threshold to prevent spurious recentering due to estimation noise. The method entails three modifications to standard ACI per horizon:
- EWM bias tracking from recent residuals,
- Dead-zone thresholding based on the buffer’s median absolute deviation (MAD),
- Pre-quantile (not post-hoc) translation of the nonconformity scores and recentering of intervals.
Intervals are thus constructed around corrected predictions, and quantiles are computed over bias-corrected calibration residuals.
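The three modifications can be sketched for a single horizon as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation: the class name `BiasCorrectedACI`, the parameter names and defaults (`gamma`, `lam`, `k`, `window`), the absolute-residual nonconformity score, and the simulated data are all hypothetical choices made for the example. Setting `k` to infinity disables the correction and reduces the sketch to a plain threshold-only ACI.

```python
from collections import deque
import numpy as np

class BiasCorrectedACI:
    """Single-horizon sketch: ACI plus EWM bias tracking and a MAD dead zone."""

    def __init__(self, alpha=0.1, gamma=0.01, lam=0.05, k=1.0, window=200):
        self.alpha = alpha      # target miscoverage
        self.alpha_t = alpha    # adaptive level updated online
        self.gamma = gamma      # ACI step size (assumed value)
        self.lam = lam          # EWM weight on the newest residual (assumed value)
        self.k = k              # dead-zone multiplier (assumed value)
        self.bias = 0.0         # EWM estimate of the signed residual
        self.residuals = deque(maxlen=window)

    def _correction(self):
        # Apply the bias estimate only if it escapes the k * MAD dead zone.
        if len(self.residuals) < 2:
            return 0.0
        r = np.asarray(self.residuals)
        mad = np.median(np.abs(r - np.median(r)))
        return self.bias if abs(self.bias) > self.k * mad else 0.0

    def interval(self, y_hat):
        c = self._correction()
        if not self.residuals or self.alpha_t <= 0.0:
            q = np.inf          # no data yet, or level demands full coverage
        elif self.alpha_t >= 1.0:
            q = 0.0
        else:
            # Pre-quantile correction: shift residuals before scoring them.
            scores = np.abs(np.asarray(self.residuals) - c)
            q = float(np.quantile(scores, 1.0 - self.alpha_t))
        return y_hat + c - q, y_hat + c + q

    def update(self, y_hat, y, lo, hi):
        err = 0.0 if lo <= y <= hi else 1.0
        self.alpha_t += self.gamma * (self.alpha - err)  # standard ACI update
        r = y - y_hat                                    # signed raw residual
        self.bias = (1.0 - self.lam) * self.bias + self.lam * r
        self.residuals.append(r)

def run(k, n=2000, shift_at=1000, seed=0):
    """Frozen forecaster (always predicts 0) on data with a mean shift of 3."""
    rng = np.random.default_rng(seed)
    model = BiasCorrectedACI(k=k)
    covered, widths = [], []
    for t in range(n):
        y_hat = 0.0
        y = rng.normal() + (3.0 if t >= shift_at else 0.0)
        lo, hi = model.interval(y_hat)
        covered.append(lo <= y <= hi)
        widths.append(hi - lo)
        model.update(y_hat, y, lo, hi)
    return np.mean(covered[-500:]), np.median(widths[-500:])

cov_bc, width_bc = run(k=1.0)       # bias correction enabled
cov_aci, width_aci = run(k=np.inf)  # dead zone never exceeded: plain ACI
print(f"BC-ACI: coverage {cov_bc:.2f}, median width {width_bc:.2f}")
print(f"ACI:    coverage {cov_aci:.2f}, median width {width_aci:.2f}")
```

In this toy setting both variants use the same coverage feedback, so the comparison isolates the effect of recentering: the corrected variant should need a noticeably narrower interval late in the post-shift period, once the residual buffer and the bias estimate have adapted.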
Multi-Horizon Design
In line with multi-horizon forecasting needs, BC-ACI maintains independent buffers, adaptive levels, and bias estimates for each forecast horizon. This approach allows tailoring the interval statistics to horizon-dependent error distributions and nonstationarities.
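One way to organize the per-horizon state is sketched below. The horizon values, the field names, and the delayed-feedback queue are assumptions made for illustration (the summary does not list the four horizons used); the point is only that each horizon keeps its own buffer, level, and bias estimate, and that an h-step forecast can be scored only once its target time arrives.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class HorizonState:
    alpha_t: float = 0.1   # adaptive level for this horizon
    bias: float = 0.0      # EWM bias estimate for this horizon
    residuals: deque = field(default_factory=lambda: deque(maxlen=200))
    pending: deque = field(default_factory=deque)  # (target_time, y_hat) awaiting truth

horizons = (1, 6, 12, 24)  # hypothetical horizons
states = {h: HorizonState() for h in horizons}

def record_forecast(t, forecasts):
    """Queue each h-step forecast issued at time t until its target time t + h."""
    for h, y_hat in forecasts.items():
        states[h].pending.append((t + h, y_hat))

def observe(t, y, lam=0.05):
    """When y_t arrives, score only the forecasts that targeted time t.

    Assumes observe is called once for every time step, in order.
    """
    for st in states.values():
        while st.pending and st.pending[0][0] == t:
            _, y_hat = st.pending.popleft()
            r = y - y_hat
            st.bias = (1.0 - lam) * st.bias + lam * r
            st.residuals.append(r)

record_forecast(t=0, forecasts={1: 1.0, 6: 2.0})
observe(t=1, y=1.5)  # only the 1-step forecast is due at t=1
```

After these two calls, only the horizon-1 state has received a residual; the horizon-6 forecast stays queued until its own target time.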
Theoretical Results
BC-ACI inherits the ACI coverage guarantee under the same mild, bounded-drift conditions. Theoretical results include:
- Width inefficiency of threshold-only ACI: The excessive width required to maintain coverage scales linearly in the bias for non-self-correcting models.
- Strict width reduction of BC-ACI under bias: When the EWM converges to the bias, BC-ACI recovers oracle width (equal to the spread of unbiased residuals).
- Preservation of ACI coverage: Since the indicator update is unaffected by interval centering, the asymptotic marginal coverage guarantee remains valid.
- Graceful recovery: On stationary, unbiased data, the dead-zone suppresses correction, so BC-ACI matches ACI with negligible (<0.2%) overhead.
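The coverage-preservation argument can be made concrete with the standard ACI recursion. The notation below ($\gamma$ for the step size, $\mathrm{err}_t$ for the miscoverage indicator, $\hat{C}_t$ for the interval at time $t$) is assumed here for illustration and may differ from the paper's.

```latex
\begin{align*}
\alpha_{t+1} &= \alpha_t + \gamma\,(\alpha - \mathrm{err}_t),
\qquad \mathrm{err}_t = \mathbf{1}\{\, y_t \notin \hat{C}_t \,\} \\
\alpha_{T+1} - \alpha_1 &= \gamma \sum_{t=1}^{T} (\alpha - \mathrm{err}_t)
\;\Longrightarrow\;
\frac{1}{T}\sum_{t=1}^{T} \mathrm{err}_t - \alpha
  = \frac{\alpha_1 - \alpha_{T+1}}{\gamma T}
\end{align*}
```

The telescoping step uses only the indicator $\mathrm{err}_t$ and the fact that $\alpha_t$ stays in a bounded range (which holds when the interval is unbounded for $\alpha_t \le 0$ and empty for $\alpha_t \ge 1$). Where the interval is centered never enters the argument, so the right-hand side vanishes as $T \to \infty$ with or without bias correction.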
Experimental Evaluation
Synthetic and Real-World Scenarios
The paper conducts evaluations on AR(1)-based synthetic datasets with explicit level, variance, and compound shifts, and three common real-world benchmarks (Electricity, Jena Weather, ETTh1) without injected shifts. Two base models are tested:
- Ridge regression (offline, non-self-correcting; develops bias post-shift)
- ARIMA (self-correcting; promptly eliminates bias)
Results are reported across four forecast horizons and an extensive set of random seeds, so that the comparisons are not driven by a single horizon or run.
Strong Numerical Results
Under mean or compound shifts, BC-ACI delivers marked interval width and overall interval quality improvements:
- Up to 32% relative reduction in Winkler score in the compound shift scenario
- No statistically significant harm on stable or volatility-only data, due to the dead-zone’s filtering of estimation noise
- Preservation of empirical coverage near 90% in all settings
The improvement is tightly coupled to the presence of persistent model bias post-shift, and is absent for ARIMA or other self-correcting models, consistent with theoretical expectations.
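For reference, the Winkler (interval) score rewards narrow intervals and penalizes misses in proportion to how far the observation falls outside. The sketch below uses the standard definition; the function name and the default `alpha=0.1` (matching the 90% coverage target) are choices made for this example.

```python
import numpy as np

def winkler_score(y, lower, upper, alpha=0.1):
    """Interval width plus a (2 / alpha)-scaled penalty for observations outside it."""
    y, lower, upper = map(np.asarray, (y, lower, upper))
    width = upper - lower
    below = np.clip(lower - y, 0.0, None)  # distance below the lower bound, else 0
    above = np.clip(y - upper, 0.0, None)  # distance above the upper bound, else 0
    return width + (2.0 / alpha) * (below + above)

inside = winkler_score(0.5, 0.0, 1.0)    # covered: score equals the width
outside = winkler_score(1.5, 0.0, 1.0)   # missed by 0.5: width + 20 * 0.5
```

Lower is better, so a ratio of BC-ACI to ACI scores below 1 indicates that recentering reduced width, misses, or both.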
Figure 2: Winkler score ratios (BC-ACI / ACI) across all datasets. Values below 1 (red bars, left) indicate BC-ACI improvement. Grey bars (right) are real datasets with no known distribution shift in the evaluation window.
BC-ACI’s bias estimator tracks the running mean after a distribution shift, activating correction only when $|b_t|$ escapes the dead zone, as visualized below.
Figure 3: Top: Online bias estimate $b_t$ (solid red) tracking the true running mean (dashed blue) on mean-shift data (Ridge, h=1). The grey band shows the dead-zone $|b_t| \leq k \cdot \mathrm{MAD}$. Pre-shift, $b_t$ stays inside the dead-zone; post-shift, it rapidly converges to the true running mean. Bottom: Raw residuals showing the shift in location.
Limitations and Model-Dependency
The method is a conditional improvement: it yields substantial benefit only when the underlying model cannot self-correct for persistent bias. It is neutral or slightly conservative otherwise; on real-world data without significant regime shifts, the method's overhead is minimal and not statistically significant.
Ablation studies demonstrate the efficacy of the dead-zone: without it, noise-induced corrections slightly inflate intervals even on stable data; with the MAD-based dead-zone, the overhead is reduced to near-zero. The EWM’s memory parameter balances bias-tracking lag against noise suppression.
Implications and Future Directions
BC-ACI addresses a key structural inefficiency in adaptive conformal methods for time series, enabling more efficient post-hoc calibration for fixed, offline models frequently encountered in MLOps workflows where retraining is infeasible or infrequent. For time series regimes prone to persistent distribution shifts (e.g., retail demand, industrial sensor drift, load forecasting), BC-ACI improves uncertainty quantification without compromising coverage.
The method is not a universal replacement for standard ACI, as its gains manifest exclusively in scenarios with model biases unaddressed by the underlying forecaster. Scalability to high-dimensional or very long-horizon settings is maintained, given the method’s minimal per-horizon computational overhead.
Directions for future research include:
- Integration with formal bias-detection hypothesis tests under explicit Type-I error control,
- Extension of the correction mechanism to handle scale (variance) shifts,
- Evaluation on real-world time series exhibiting labeled regime changes,
- Systematic combination with adaptive step-size schemes and quantile regression-based conformal methods.
Conclusion
BC-ACI introduces a principled, low-complexity bias correction mechanism for adaptive conformal prediction in multi-horizon time series forecasting. By shifting the interval center through online bias estimation and dead-zone filtering, BC-ACI eliminates the inefficient width inflation necessitated by threshold-only methods under bias, delivering significant width reduction and improved probabilistic interval quality without sacrificing marginal coverage. The approach stands as a targeted calibration enhancement applicable where persistent model bias is a deployment reality, exhibiting neutral or negligible effect otherwise, and represents a modular advancement in the conformal prediction literature for nonstationary time series (2604.13253).