Bias-Corrected Adaptive Conformal Inference for Multi-Horizon Time Series Forecasting

Published 14 Apr 2026 in cs.LG, stat.ME, and stat.ML | (2604.13253v1)

Abstract: Adaptive Conformal Inference (ACI) provides distribution-free prediction intervals with asymptotic coverage guarantees for time series under distribution shift. However, ACI only adapts the quantile threshold -- it cannot shift the interval center. When a base forecaster develops persistent bias after a regime change, ACI compensates by widening intervals symmetrically, producing unnecessarily conservative bands. We propose Bias-Corrected ACI (BC-ACI), which augments standard ACI with an online exponentially weighted moving average (EWM) estimate of forecast bias. BC-ACI corrects nonconformity scores before quantile computation and re-centers prediction intervals, addressing the root cause of miscalibration rather than its symptom. An adaptive dead-zone threshold suppresses corrections when estimated bias is indistinguishable from noise, ensuring no degradation on well-calibrated data. In controlled experiments across 688 runs spanning two base models, four synthetic regimes, and three real datasets, BC-ACI reduces Winkler interval scores by 13--17% under mean and compound distribution shifts (Wilcoxon p < 0.001) while maintaining equivalent performance on stationary data (ratio 1.002x). We provide finite-sample analysis showing that coverage guarantees degrade gracefully with bias estimation error.


Authors (3)

Summary

The paper presents BC-ACI, a novel extension to adaptive conformal inference that corrects for persistent model bias in time series forecasting.
It employs online exponentially-weighted bias estimation combined with dead-zone filtering to recenter prediction intervals and achieve up to a 32% reduction in Winkler scores under bias shifts.
The method maintains asymptotic coverage guarantees with minimal computational overhead, making it especially useful for fixed, offline models in MLOps workflows.

Introduction and Motivation

Conformal prediction techniques provide distribution-free uncertainty quantification for regression and time series forecasting. Traditional split conformal prediction assumes exchangeability between calibration and test data, a condition usually violated under regime shifts in time series applications. Adaptive Conformal Inference (ACI) methods address this by adaptively tuning quantile thresholds online, maintaining proper marginal coverage under bounded nonstationarity. However, a central architectural limitation persists: ACI and its variants only adapt interval width via quantile threshold modulation, always centering intervals at the model's raw prediction. This is suboptimal in deployment scenarios where the base model (e.g., ridge regression trained once and not retrained online) develops persistent bias after a distribution shift.

The paper "Bias-Corrected Adaptive Conformal Inference for Multi-Horizon Time Series Forecasting" (2604.13253) introduces BC-ACI, a lightweight extension that augments ACI with online exponentially-weighted residual bias estimation and dead-zone filtering. BC-ACI computes and applies a per-horizon bias correction to nonconformity scores pre-quantile, yielding tightened, recentered intervals without inflating width, while preserving the original ACI's asymptotic coverage guarantee.

Methodology

Structural Limitation of Threshold-Only ACI

ACI builds symmetric intervals centered at the raw forecast, which is efficient only while that forecast is unbiased. Once a persistent bias $b$ appears after a shift, the standard approach must inflate both flanks of the prediction interval to keep covering the target, incurring a $\Theta(|b|)$ width overhead. In contrast, an oracle that knows and corrects for the bias can maintain coverage with much narrower intervals.
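The width overhead can be illustrated numerically. The sketch below is not the paper's experiment: it assumes unit-variance Gaussian noise and a constant bias `b`, and compares the steady-state symmetric width needed around a biased forecast with the width an oracle would need after removing the bias. The function name `symmetric_width` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.1                                # target miscoverage (90% intervals)
noise = rng.normal(0.0, 1.0, 50_000)       # assumed unbiased residual noise

def symmetric_width(residuals, alpha):
    """Width of the symmetric interval [pred - q, pred + q] that covers
    a (1 - alpha) fraction of the given signed residuals."""
    q = np.quantile(np.abs(residuals), 1 - alpha)
    return 2 * q

oracle_width = symmetric_width(noise, alpha)   # bias known and removed
for b in (0.0, 1.0, 2.0, 4.0):
    biased_width = symmetric_width(noise + b, alpha)  # interval centered on the biased forecast
    print(f"b={b:.1f}  width={biased_width:.2f}  overhead={biased_width - oracle_width:.2f}")
```

Under these assumptions the overhead grows roughly linearly once `b` exceeds the noise scale, which is the behavior the $\Theta(|b|)$ statement describes.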

Figure 1: Prediction intervals around a mean shift at $t=1000$ (Ridge model, $h=1$). Top: Standard ACI widens symmetrically to maintain coverage. Bottom: BC-ACI detects the bias, shifts the interval centre, and achieves tighter intervals post-shift while preserving coverage.

Online Bias Estimation and Correction

BC-ACI maintains a per-horizon exponentially weighted moving average (EWM) of signed residuals as its bias estimate $b_t^{(h)}$. It applies the correction only when $|b_t^{(h)}|$ exceeds an adaptive, MAD-based dead-zone threshold, which prevents spurious recentering due to estimation noise. The method entails three modifications to standard ACI per horizon:

EWM bias tracking from recent residuals,
Dead-zone thresholding based on the buffer’s median absolute deviation (MAD),
Pre-quantile (not post-hoc) translation of the nonconformity scores and recentering of intervals.

Intervals are thus constructed around corrected predictions, and quantiles are computed over bias-corrected calibration residuals.
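The three modifications can be sketched for a single horizon as follows. This is a minimal illustration, not the authors' implementation: the class name, the parameter names (`gamma`, `beta`, `k`, `window`), their defaults, the exact EWM recursion, and the clipping of the quantile level are all assumptions made for the example.

```python
from collections import deque
import numpy as np

class BiasCorrectedACI:
    """Single-horizon sketch; names and defaults are illustrative assumptions."""

    def __init__(self, alpha=0.1, gamma=0.01, beta=0.05, k=1.0, window=200):
        self.alpha = alpha                     # target miscoverage rate
        self.alpha_t = alpha                   # adaptive level, updated online
        self.gamma = gamma                     # ACI step size
        self.beta = beta                       # EWM weight on the newest residual
        self.k = k                             # dead-zone multiplier on the MAD
        self.residuals = deque(maxlen=window)  # signed residuals y - y_hat
        self.bias = 0.0                        # EWM bias estimate

    def correction(self):
        """EWM bias estimate, or 0.0 while it sits inside the dead zone."""
        if len(self.residuals) < 2:
            return 0.0
        r = np.asarray(self.residuals)
        mad = np.median(np.abs(r - np.median(r)))
        return self.bias if abs(self.bias) > self.k * mad else 0.0

    def interval(self, y_hat):
        """Interval centered at the corrected forecast y_hat + c."""
        if not self.residuals:
            return -np.inf, np.inf
        c = self.correction()
        scores = np.abs(np.asarray(self.residuals) - c)  # corrected before the quantile
        level = min(max(1.0 - self.alpha_t, 0.0), 1.0)   # clipped for simplicity
        q = np.quantile(scores, level)
        return y_hat + c - q, y_hat + c + q

    def update(self, y_hat, y):
        """Score the current interval, then update level, buffer, and bias."""
        lo, hi = self.interval(y_hat)
        err = 0.0 if lo <= y <= hi else 1.0               # miscoverage indicator
        self.alpha_t += self.gamma * (self.alpha - err)   # standard ACI update
        r = y - y_hat
        self.residuals.append(r)
        self.bias = (1.0 - self.beta) * self.bias + self.beta * r
        return lo, hi, err


def simulate(k, shift=3.0, T=3000, t_shift=1000, seed=0):
    """Constant forecast of 0 on data whose mean jumps by `shift` at t_shift."""
    rng = np.random.default_rng(seed)
    model = BiasCorrectedACI(k=k)
    widths, errs = [], []
    for t in range(T):
        y = rng.normal() + (shift if t >= t_shift else 0.0)
        lo, hi, err = model.update(0.0, y)
        if t >= t_shift + 500:                 # evaluate after the transient
            widths.append(hi - lo)
            errs.append(err)
    return float(np.mean(widths)), 1.0 - float(np.mean(errs))

bc_width, bc_cov = simulate(k=1.0)              # bias correction active
aci_width, aci_cov = simulate(k=float("inf"))   # dead zone never exceeded: plain ACI
print(f"BC-ACI width={bc_width:.2f} coverage={bc_cov:.3f}")
print(f"ACI    width={aci_width:.2f} coverage={aci_cov:.3f}")
```

Setting `k` to infinity disables the correction, so the same class doubles as a threshold-only ACI baseline on this toy mean-shift stream. The forecaster here is a constant zero, standing in for any fixed model that cannot self-correct.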

Multi-Horizon Design

In line with multi-horizon forecasting needs, BC-ACI maintains independent buffers, adaptive levels, and bias estimates for each forecast horizon. This approach allows tailoring the interval statistics to horizon-dependent error distributions and nonstationarities.
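One simple way to organize this is a dictionary of independent state objects keyed by horizon. The sketch below is an assumption about layout only: the `HorizonState` fields, the horizon values, the buffer length, and the EWM weight `beta` are hypothetical and not taken from the paper.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class HorizonState:
    """State kept separately for one forecast horizon."""
    alpha_t: float = 0.1                                   # adaptive miscoverage level
    bias: float = 0.0                                      # EWM bias estimate
    residuals: deque = field(default_factory=lambda: deque(maxlen=200))

horizons = [1, 6, 12, 24]                                  # hypothetical horizons
states = {h: HorizonState() for h in horizons}

def record_residual(states, h, residual, beta=0.05):
    """Update only the state belonging to horizon h."""
    s = states[h]
    s.residuals.append(residual)
    s.bias = (1.0 - beta) * s.bias + beta * residual

record_residual(states, 1, 0.2)      # small one-step-ahead error
record_residual(states, 24, -1.5)    # larger error at the longest horizon
print({h: round(s.bias, 4) for h, s in states.items()})
```

Because nothing is shared between entries, a bias that emerges only at long horizons does not move the intervals at short horizons.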

Theoretical Results

BC-ACI inherits the ACI coverage guarantee under the same mild, bounded-drift conditions. Theoretical results include:

Width inefficiency of threshold-only ACI: The excessive width required to maintain coverage scales linearly in the bias for non-self-correcting models.
Strict width reduction of BC-ACI under bias: When the EWM converges to the bias, BC-ACI recovers oracle width (equal to the spread of unbiased residuals).
Preservation of ACI coverage: Since the indicator update is unaffected by interval centering, the asymptotic marginal coverage guarantee remains valid.
Graceful recovery: On stationary, unbiased data, the dead-zone suppresses correction, so BC-ACI matches ACI with negligible ($<0.2\%$) overhead.
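The coverage-preservation point follows from the usual ACI telescoping argument, restated here as a sketch; the paper's exact statement and constants may differ. Write $\hat{C}_t$ for the interval issued at time $t$, $\gamma$ for the step size, and $\alpha$ for the target miscoverage (these symbols are assumed notation). The update is

$$
\mathrm{err}_t = \mathbf{1}\{y_t \notin \hat{C}_t\}, \qquad \alpha_{t+1} = \alpha_t + \gamma\,(\alpha - \mathrm{err}_t).
$$

Summing over $t = 1, \dots, T$ and rearranging gives

$$
\frac{1}{T}\sum_{t=1}^{T} \mathrm{err}_t - \alpha = \frac{\alpha_1 - \alpha_{T+1}}{\gamma T}.
$$

If the interval is taken to be infinite whenever $\alpha_t < 0$ and empty whenever $\alpha_t > 1$, then $\alpha_t$ stays in $[-\gamma,\, 1+\gamma]$, so

$$
\left| \frac{1}{T}\sum_{t=1}^{T} \mathrm{err}_t - \alpha \right| \le \frac{\max(\alpha_1,\, 1-\alpha_1) + \gamma}{\gamma T} \longrightarrow 0 .
$$

Only the indicator $\mathrm{err}_t$ enters the argument, not the location of $\hat{C}_t$, which is why shifting the interval centre leaves the long-run coverage bound unchanged.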

Experimental Evaluation

Synthetic and Real-World Scenarios

The paper conducts evaluations on AR(1)-based synthetic datasets with explicit level, variance, and compound shifts, and three common real-world benchmarks (Electricity, Jena Weather, ETTh1) without injected shifts. Two base models are tested:

Ridge regression (offline, non-self-correcting; develops bias post-shift)
ARIMA (self-correcting; promptly eliminates bias)

Each configuration is evaluated at four forecast horizons and over an extensive set of random seeds, so the comparison does not hinge on a single run.

Strong Numerical Results

Under mean or compound shifts, BC-ACI delivers marked interval width and overall interval quality improvements:

Up to 32% relative reduction in Winkler score in the compound shift scenario
No statistically significant harm on stable or volatility-only data, due to the dead-zone’s filtering of estimation noise
Preservation of empirical coverage near 90% in all settings

The improvement is tightly coupled to the presence of persistent model bias post-shift, and is absent for ARIMA or other self-correcting models, consistent with theoretical expectations.
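For reference, the Winkler interval score behind these comparisons can be computed as below. This sketch uses the standard definition (interval width plus a $2/\alpha$ penalty per unit of miss); the paper may use a normalized or averaged variant, and the function name is hypothetical.

```python
import numpy as np

def winkler_score(y, lower, upper, alpha=0.1):
    """Per-observation Winkler score for (1 - alpha) intervals; lower is better."""
    y, lower, upper = (np.asarray(a, dtype=float) for a in (y, lower, upper))
    width = upper - lower
    below = np.maximum(lower - y, 0.0)     # miss distance when y < lower
    above = np.maximum(y - upper, 0.0)     # miss distance when y > upper
    return width + (2.0 / alpha) * (below + above)

# One covered point and one point missed by 2 units above the upper bound.
scores = winkler_score(y=[0.0, 3.0], lower=[-1.0, -1.0], upper=[1.0, 1.0], alpha=0.1)
print(scores)          # width 2 for the hit; 2 + 20 * 2 = 42 for the miss
```

A ratio of mean scores between two methods on the same series then gives the kind of BC-ACI / ACI comparison reported here, with values below 1 favoring the numerator.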

Figure 2: Winkler score ratios (BC-ACI / ACI) across all datasets. Values below 1 (red bars, left) indicate BC-ACI improvement. Grey bars (right) are real datasets with no known distribution shift in the evaluation window.

BC-ACI’s bias estimator tracks the running mean after a distribution shift, activating correction only when $|b_t|$ escapes the dead zone, as visualized below.

Figure 3: Top: Online bias estimate $b_t$ (solid red) tracking the true running mean (dashed blue) on mean-shift data (Ridge, $h=1$). The grey band shows the dead-zone $|b_t| \leq k \cdot \mathrm{MAD}$. Pre-shift, $b_t$ stays inside the dead-zone; post-shift (after $t=1000$), it rapidly converges to the shifted running mean. Bottom: Raw residuals showing the shift in location.

Limitations and Model-Dependency

The method is a conditional improvement: it yields substantial benefit only when the underlying model cannot self-correct for persistent bias. It is neutral or slightly conservative otherwise; on real-world data without significant regime shifts, the method's overhead is minimal and not statistically significant.

Ablation studies demonstrate the efficacy of the dead-zone: without it, noise-induced corrections slightly inflate intervals even on stable data; with the $k \cdot \mathrm{MAD}$ dead-zone, the overhead is reduced to near-zero. The EWM’s memory parameter balances bias-tracking lag against noise suppression.
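The lag-versus-noise trade-off can be made concrete for a generic EWM of the form $b_t = (1-\beta)\,b_{t-1} + \beta\, r_t$. Both this recursion and the name `beta` are assumptions for illustration, since the paper's parameterization is not reproduced here. Under i.i.d. residual noise with standard deviation `sigma`, the sketch computes how many steps the estimate needs to close 95% of a step change and how noisy it is at steady state.

```python
import math

def ewm_tradeoff(beta, sigma=1.0):
    """Return (steps to close 95% of a step change, steady-state std of the estimate)."""
    lag = math.log(0.05) / math.log(1.0 - beta)          # solves (1 - beta)**n = 0.05
    noise_std = sigma * math.sqrt(beta / (2.0 - beta))   # variance of the weighted sum of iid noise
    return lag, noise_std

for beta in (0.01, 0.05, 0.2):
    lag, noise_std = ewm_tradeoff(beta)
    print(f"beta={beta:.2f}  lag={lag:6.1f} steps  noise_std={noise_std:.3f}")
```

A larger weight on new residuals reacts faster after a shift but yields a noisier estimate, which is the situation the dead-zone is meant to guard against on stable data.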

Implications and Future Directions

BC-ACI addresses a key structural inefficiency in adaptive conformal methods for time series, enabling more efficient post-hoc calibration for fixed, offline models frequently encountered in MLOps workflows where retraining is infeasible or infrequent. For time series regimes prone to persistent distribution shifts (e.g., retail demand, industrial sensor drift, load forecasting), BC-ACI improves uncertainty quantification without compromising coverage.

The method is not a universal replacement for standard ACI, as its gains manifest exclusively in scenarios with model biases unaddressed by the underlying forecaster. Scalability to high-dimensional or very long-horizon settings is maintained, given the method’s minimal per-horizon computational overhead.

Directions for future research include:

Integration with formal bias-detection hypothesis tests under explicit Type-I error control,
Extension of the correction mechanism to handle scale (variance) shifts,
Evaluation on real-world time series exhibiting labeled regime changes,
Systematic combination with adaptive step-size schemes and quantile regression-based conformal methods.

Conclusion

BC-ACI introduces a principled, low-complexity bias correction mechanism for adaptive conformal prediction in multi-horizon time series forecasting. By shifting the interval center through online bias estimation and dead-zone filtering, BC-ACI eliminates the inefficient width inflation necessitated by threshold-only methods under bias, delivering significant width reduction and improved probabilistic interval quality without sacrificing marginal coverage. The approach stands as a targeted calibration enhancement applicable where persistent model bias is a deployment reality, exhibiting neutral or negligible effect otherwise, and represents a modular advancement in the conformal prediction literature for nonstationary time series (2604.13253).
