Probabilistic Circuits for Irregular Multivariate Time Series Forecasting

Published 30 Apr 2026 in cs.LG | (2604.27814v2)

Abstract: Joint probabilistic modeling is essential for forecasting irregular multivariate time series (IMTS) to accurately quantify uncertainty. Existing approaches often struggle to balance model expressivity with consistent marginalization, frequently leading to unreliable or contradictory forecasts. To address this, we propose CircuITS, a novel architecture for probabilistic IMTS forecasting based on probabilistic circuits. Our model is flexible in capturing intricate dependencies between time series channels while structurally guaranteeing valid joint distributions. Experiments on four real-world datasets demonstrate that CircuITS achieves superior joint and marginal density estimation compared to state-of-the-art baselines.

Summary

  • The paper introduces CircuITS, achieving both expressivity and strict marginalization consistency in forecasting irregular multivariate time series.
  • It uses conditional Deep Sigmoidal Flows and Gaussian copulas as leaf distributions within each channel, and alternating sum and product nodes across channels, to capture complex joint distributions without exponential parameter growth.
  • Empirical results on USHCN, PhysioNet'12, MIMIC-III, and MIMIC-IV show better joint and marginal density estimation than prior methods, with lower computational cost when queries are balanced across channels.

Probabilistic Circuits for Reliable Forecasting of Irregular Multivariate Time Series

Motivation and Problem Formulation

Forecasting irregular multivariate time series (IMTS) remains challenging because the data are heterogeneous, with sparse and asynchronous observations across multiple channels. Point forecasts alone are insufficient; probabilistic modeling is needed to capture the inherent randomness of most real-world systems.

Joint probabilistic forecasting aims to model the conditional density $p(\mathbf{y} \mid \mathcal{Q}, \mathcal{X})$ over arbitrary target queries $\mathcal{Q}$ given irregularly sampled history $\mathcal{X}$. This must satisfy two criteria:

  1. Flexible Joint Modeling: The model must accept variable-length inputs for both observations and queries, and accurately capture channel dependencies by outputting a valid joint distribution.
  2. Marginalization Consistency: The prediction for any subset of query variables must equal the result of marginalizing the joint density over the remaining variables, as required by the Kolmogorov extension theorem.
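The second criterion can be checked numerically on a toy model. The sketch below is not the paper's model: it uses a hypothetical two-component mixture of factorized Gaussians over two query variables, and all parameter values are made up for illustration. It compares the density returned for the smaller query {y1} with the joint density integrated over y2; a marginalization-consistent model makes the two agree.

```python
import math


def normal_pdf(y, mu, sigma):
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Hypothetical toy parameters: a mixture of two factorized Gaussians over (y1, y2).
WEIGHTS = [0.3, 0.7]
MEANS = [(-1.0, 2.0), (1.5, -0.5)]
SIGMAS = [(0.5, 1.0), (0.8, 0.6)]


def joint_density(y1, y2):
    """Density returned for the full query {y1, y2}."""
    return sum(
        w * normal_pdf(y1, m[0], s[0]) * normal_pdf(y2, m[1], s[1])
        for w, m, s in zip(WEIGHTS, MEANS, SIGMAS)
    )


def marginal_density_y1(y1):
    """Density returned for the smaller query {y1}: the y2 factor is dropped."""
    return sum(w * normal_pdf(y1, m[0], s[0]) for w, m, s in zip(WEIGHTS, MEANS, SIGMAS))


def integrate_out_y2(y1, lo=-12.0, hi=12.0, steps=4000):
    """Trapezoidal integration of the joint density over y2."""
    h = (hi - lo) / steps
    total = 0.5 * (joint_density(y1, lo) + joint_density(y1, hi))
    for i in range(1, steps):
        total += joint_density(y1, lo + i * h)
    return total * h


def consistency_gap(y1):
    """Absolute difference between the two routes to p(y1); ~0 if consistent."""
    return abs(integrate_out_y2(y1) - marginal_density_y1(y1))


if __name__ == "__main__":
    print(consistency_gap(0.3))
```

A model whose mixture weights or component parameters changed with the composition of the query would generally produce a nonzero gap here.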

Violating marginalization consistency, as observed in prior work such as ProFITi, yields unreliable forecasts that depend not only on the underlying data but also on how the query happens to be composed. For example, the forecast probability of rain may differ depending on whether snow or humidity is queried alongside it.

Limitations of Existing Methods

Several modeling paradigms for IMTS have been proposed. Early methods (NeuralFlows, GRU-ODE-Bayes, CRU) only estimated univariate marginals, missing inter-channel dependencies. Gaussian Process Regression (GPR) is consistent but highly restrictive due to Gaussian assumptions.

MOSES (Mixtures of Separable Flows) introduced marginalization consistency and expressive univariate modeling but suffers from exponential scaling for joint multimodal distributions. For $C$ channels exhibiting bifurcating behavior, the required number of mixture components is $2^C$ (Figure 1). Even given sufficient mixture capacity, optimization landscapes are nontrivial and lead to local minima that fail to represent the true multimodal joint.

Figure 1: Demonstrating MOSES's inability to learn a multivariate bifurcation with 4 independent channels. MOSES-K denotes K mixture components; CircuITS utilizes 2 circuit components per channel.
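The scaling gap behind Figure 1 can be reproduced with simple counting. The sketch below assumes the setting of the caption (independent channels, two modes each); the function names are illustrative and not taken from the paper. A flat mixture of separable components needs one component per joint mode, while a product over per-channel two-way sums needs only two leaves per channel.

```python
from itertools import product


def joint_modes(per_channel_modes):
    """Enumerate every joint mode of independent multimodal channels."""
    return list(product(*per_channel_modes))


def flat_mixture_components(num_channels, modes_per_channel=2):
    """Components a flat mixture of separable densities needs: one per joint mode."""
    return modes_per_channel ** num_channels


def factorized_circuit_leaves(num_channels, modes_per_channel=2):
    """Leaves in a product of per-channel sum nodes (independent-channel case)."""
    return modes_per_channel * num_channels


if __name__ == "__main__":
    channels = [(-1.0, 1.0)] * 4  # four channels, each with a low and a high branch
    print(len(joint_modes(channels)))    # joint modes to cover
    print(flat_mixture_components(4))    # flat mixture components
    print(factorized_circuit_leaves(4))  # circuit leaves
```

For four channels this gives 16 joint modes and 16 flat components against 8 circuit leaves. The count covers only the independent case; dependent channels require additional sum nodes above the products.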

CircuITS: Architecture and Marginalization Consistency

To circumvent the expressivity-consistency dilemma, CircuITS leverages the hierarchical structure of probabilistic circuits (PCs) defined by alternating sum (mixture) and product (independence) nodes:

  • Leaf Nodes: Represent the base distribution of each channel. Univariate marginals are modeled via conditional Deep Sigmoidal Flows (DSFs) and coupled via a Gaussian copula to capture intra-channel dependencies.
  • Sum/Product Aggregation: Recursively alternates products (factorization) and sums (mixture projection) across channels, enabling tractable modeling of joint and independent modalities without exponential parameter growth.

Crucially, circuit structure and mixture weights depend only on the observation history $\mathcal{X}$, not on the query $\mathcal{Q}$. The paper proves that this guarantees marginalization consistency for the conditional density estimator, including at the leaf level (the proof is given in the paper's appendix).
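A minimal sketch of the sum/product idea, under strong simplifications: each leaf here is a single Gaussian over one scalar per channel (the paper's leaves are DSF and copula distributions over several query times), and the weights are fixed constants standing in for values inferred from the history. The class names are hypothetical. Because the weights never look at the query, marginalizing a channel amounts to replacing its leaf by 1.

```python
import math


def log_normal_pdf(y, mu, sigma):
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


def logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


class Leaf:
    """Univariate Gaussian leaf for one channel (stand-in for a richer leaf)."""

    def __init__(self, channel, mu, sigma):
        self.channel, self.mu, self.sigma = channel, mu, sigma

    def log_density(self, query):
        if self.channel not in query:
            return 0.0  # channel not queried: the leaf integrates to 1
        return log_normal_pdf(query[self.channel], self.mu, self.sigma)


class Product:
    """Product node: children cover disjoint channels and are multiplied."""

    def __init__(self, children):
        self.children = children

    def log_density(self, query):
        return sum(child.log_density(query) for child in self.children)


class Sum:
    """Sum node: a mixture whose weights do not depend on the query."""

    def __init__(self, weights, children):
        assert abs(sum(weights) - 1.0) < 1e-9
        self.weights, self.children = weights, children

    def log_density(self, query):
        return logsumexp(
            [math.log(w) + c.log_density(query) for w, c in zip(self.weights, self.children)]
        )


def build_toy_circuit():
    """Two channels that bifurcate together: both low or both high."""
    low = Product([Leaf(0, -1.0, 0.3), Leaf(1, -2.0, 0.5)])
    high = Product([Leaf(0, 1.0, 0.3), Leaf(1, 2.0, 0.5)])
    return Sum([0.5, 0.5], [low, high])


if __name__ == "__main__":
    circuit = build_toy_circuit()
    print(circuit.log_density({0: 1.0, 1: 2.0}))  # joint query
    print(circuit.log_density({0: 1.0}))          # marginal query on channel 0
```

The same object answers joint and marginal queries, so the two cannot contradict each other.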

Encoder and Inference Mechanisms

A setwise encoder aggregates observation triplets $(t_n, c_n, y_n)$ for arbitrary numbers per channel using multi-head cross-attention with channel-specific embeddings, followed by inter-channel self-attention to capture dependencies. Forecasting query embeddings are formed by combining the target timestamp with the channel's global context.
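The set-wise aggregation step can be illustrated with a heavily reduced sketch. It assumes a single attention head, a hypothetical two-dimensional query vector per channel, and raw (t, y) pairs as both keys and values; the paper's encoder uses learned embeddings, multiple heads, and a subsequent inter-channel self-attention stage, none of which are reproduced here. The point is only that each channel may contribute any number of observations, including none.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def encode_channels(triplets, channel_queries):
    """Pool a variable-size set of (t, c, y) triplets into one vector per channel.

    channel_queries maps each channel id to a hypothetical query vector (q_t, q_y).
    Every channel id appearing in triplets must be a key of channel_queries.
    """
    by_channel = {c: [] for c in channel_queries}
    for t, c, y in triplets:
        by_channel[c].append((t, y))

    encoded = {}
    for c, observations in by_channel.items():
        if not observations:
            encoded[c] = (0.0, 0.0)  # channel without observations: masked to zeros
            continue
        q_t, q_y = channel_queries[c]
        scores = [q_t * t + q_y * y for t, y in observations]
        weights = softmax(scores)
        pooled_t = sum(w * t for w, (t, _) in zip(weights, observations))
        pooled_y = sum(w * y for w, (_, y) in zip(weights, observations))
        encoded[c] = (pooled_t, pooled_y)
    return encoded


if __name__ == "__main__":
    history = [(0.1, 0, 1.0), (0.5, 0, 3.0), (0.2, 1, -1.0)]
    queries = {0: (0.0, 0.0), 1: (1.0, 1.0), 2: (0.0, 0.0)}
    print(encode_channels(history, queries))
```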

Mixture weights for circuit sum nodes are inferred from the encoder's output via mask-preserving cross-attention, ensuring semantic alignment of parameters as required for structural consistency.

Leaf Distributions: Expressive Marginals and Copulas

Each channel's leaf distribution models the univariate marginals with a DSF, which parameterizes monotonic flows conditioned on context and gives the univariate base its flexibility. Dependencies among the queries within a channel are expressed via a Gaussian copula, with correlation matrices derived from query-dependent context embeddings. The copula structure ensures strict marginalization consistency for arbitrary query subsets within channels.
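Why a copula leaf stays consistent can be seen in a two-variable sketch. The assumptions are loud ones: a logistic distribution stands in for the DSF marginal (only because its CDF is available in closed form), the correlation `rho` is a fixed number rather than an output of context embeddings, and all parameter values are invented. Integrating the copula-coupled joint over the second variable should return the first variable's marginal unchanged.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def logistic_cdf(y, loc, scale):
    return 1.0 / (1.0 + math.exp(-(y - loc) / scale))


def logistic_pdf(y, loc, scale):
    u = logistic_cdf(y, loc, scale)
    return u * (1.0 - u) / scale


def gaussian_copula_density(u1, u2, rho, eps=1e-12):
    """Bivariate Gaussian copula density c(u1, u2; rho)."""
    u1 = min(max(u1, eps), 1.0 - eps)  # keep inv_cdf away from 0 and 1
    u2 = min(max(u2, eps), 1.0 - eps)
    z1, z2 = STD_NORMAL.inv_cdf(u1), STD_NORMAL.inv_cdf(u2)
    det = 1.0 - rho * rho
    exponent = -(rho * rho * (z1 * z1 + z2 * z2) - 2.0 * rho * z1 * z2) / (2.0 * det)
    return math.exp(exponent) / math.sqrt(det)


def joint_density(y1, y2, rho, marg1=(0.0, 1.0), marg2=(1.0, 0.7)):
    """Copula density times the two marginal densities (Sklar's theorem)."""
    u1, u2 = logistic_cdf(y1, *marg1), logistic_cdf(y2, *marg2)
    return gaussian_copula_density(u1, u2, rho) * logistic_pdf(y1, *marg1) * logistic_pdf(y2, *marg2)


def marginal_by_integration(y1, rho, lo=-25.0, hi=25.0, steps=4000):
    """Trapezoidal integration of the joint density over y2."""
    h = (hi - lo) / steps
    total = 0.5 * (joint_density(y1, lo, rho) + joint_density(y1, hi, rho))
    for i in range(1, steps):
        total += joint_density(y1, lo + i * h, rho)
    return total * h


if __name__ == "__main__":
    print(marginal_by_integration(0.4, rho=0.6), logistic_pdf(0.4, 0.0, 1.0))
```

With `rho = 0` the copula density is identically 1 and the joint reduces to the product of the marginals.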

Ancestral sampling from the circuit combines top-down traversal of latent mixture components with copula-based sampling and inversion of DSF to generate forecast samples.
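The three sampling stages can be sketched for a toy leaf with two query values. This is an illustration under assumptions, not the paper's sampler: the sum node has two hand-written components, a logistic quantile function replaces DSF inversion, and every parameter value is hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

# Hypothetical sum node with two children; each child is a copula-coupled leaf.
COMPONENTS = [
    {"weight": 0.5, "locs": (-2.0, -2.0), "scale": 0.1, "rho": 0.8},
    {"weight": 0.5, "locs": (2.0, 2.0), "scale": 0.1, "rho": -0.3},
]


def logistic_inv_cdf(u, loc, scale):
    """Closed-form quantile function, standing in for inverting a DSF."""
    return loc + scale * math.log(u / (1.0 - u))


def clamp(u, eps=1e-12):
    return min(max(u, eps), 1.0 - eps)


def sample_once(rng):
    # 1) Top-down traversal: pick one child of the sum node by its weight.
    comp = rng.choices(COMPONENTS, weights=[c["weight"] for c in COMPONENTS])[0]
    # 2) Copula sampling: correlated standard normals via a 2x2 Cholesky factor.
    e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    z1 = e1
    z2 = comp["rho"] * e1 + math.sqrt(1.0 - comp["rho"] ** 2) * e2
    # 3) Map to uniforms, then through each inverse marginal CDF.
    u1, u2 = clamp(STD_NORMAL.cdf(z1)), clamp(STD_NORMAL.cdf(z2))
    return (
        logistic_inv_cdf(u1, comp["locs"][0], comp["scale"]),
        logistic_inv_cdf(u2, comp["locs"][1], comp["scale"]),
    )


def sample_forecasts(num_samples, seed=0):
    rng = random.Random(seed)
    return [sample_once(rng) for _ in range(num_samples)]


if __name__ == "__main__":
    for y1, y2 in sample_forecasts(5):
        print(round(y1, 3), round(y2, 3))
```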

Computational Complexity

For $C$ channels with $N$ queries each, CircuITS reduces computational complexity from cubic in the total query size, $\mathcal{O}((CN)^3)$, to cubic in the per-channel query count, $\mathcal{O}(CN^3)$, a substantial saving when queries are balanced across channels. Parallelization relies on log-semiring scans, computed with the Hillis-Steele parallel prefix algorithm, for the matrix multiplications in probabilistic circuit inference.
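Both points can be made concrete in a few lines. The cost functions below count only the leading cubic term, and the scan is a sequential simulation of the Hillis-Steele inclusive prefix scan applied to scalars with log-sum-exp as the combining operator. How the paper applies the scan to matrices inside the circuit is not described here, so the scalar setting is an assumption made for illustration.

```python
import math


def dense_copula_cost(num_channels, queries_per_channel):
    """Leading cost of one copula over all C*N query points."""
    return (num_channels * queries_per_channel) ** 3


def per_channel_copula_cost(num_channels, queries_per_channel):
    """Leading cost of C separate copulas over N query points each."""
    return num_channels * queries_per_channel ** 3


def logaddexp(a, b):
    """log(exp(a) + exp(b)), the addition of the log semiring."""
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def hillis_steele_scan(values, op):
    """Inclusive prefix scan in ceil(log2(n)) rounds.

    Each round combines element i with element i - step; on parallel hardware
    all positions within a round are updated simultaneously.
    """
    out = list(values)
    step = 1
    while step < len(out):
        prev = out
        out = [prev[i] if i < step else op(prev[i - step], prev[i]) for i in range(len(prev))]
        step *= 2
    return out


if __name__ == "__main__":
    print(dense_copula_cost(4, 10), per_channel_copula_cost(4, 10))
    logs = [math.log(v) for v in (1.0, 2.0, 3.0, 4.0)]
    print([round(math.exp(v), 6) for v in hillis_steele_scan(logs, logaddexp)])
```

In the balanced case the ratio of the two costs is $C^2$, so four channels give a sixteen-fold reduction in the leading term.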

Empirical Evaluation

Experiments on USHCN, PhysioNet'12, MIMIC-III, and MIMIC-IV show that CircuITS achieves better normalized joint negative log-likelihood (njNLL) and marginal NLL (mNLL) than MOSES and ProFITi on both short-term and long-term forecasting tasks (see the main results table in the paper). CircuITS consistently models channel dependencies and multimodal joint distributions, while structurally ensuring valid marginalization.
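The exact metric definitions are given in the paper rather than in this summary. The sketch below assumes the common reading of the names: njNLL divides the joint negative log-likelihood of a query set by the number of query points, and mNLL averages the negative log-likelihoods of the individual univariate marginals. The function names are illustrative.

```python
def normalized_joint_nll(joint_log_density, num_query_points):
    """Assumed njNLL: joint negative log-likelihood per query point."""
    if num_query_points <= 0:
        raise ValueError("num_query_points must be positive")
    return -joint_log_density / num_query_points


def mean_marginal_nll(marginal_log_densities):
    """Assumed mNLL: average negative log-likelihood over univariate marginals."""
    if not marginal_log_densities:
        raise ValueError("need at least one marginal log-density")
    return -sum(marginal_log_densities) / len(marginal_log_densities)


if __name__ == "__main__":
    print(normalized_joint_nll(-12.0, 4))
    print(mean_marginal_nll([-1.0, -2.0, -3.0]))
```

Normalizing by the number of query points keeps scores comparable across samples whose query sets differ in size. Lower values are better for both metrics.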

In synthetic bifurcation experiments (Figure 1; Figure 2), CircuITS resolves multimodal latent structure with minimal components per channel where MOSES fails, evidencing the strength of recursive sum-product aggregation.

Figure 2: Comparing samples from ProFITi and CircuITS for the four channel bifurcation task.

Ablation studies confirm that the sum-product network structure, DSF leaves, and copula-based modeling each contribute significantly to performance. Channel ordering sensitivity is empirically negligible.

For large-scale datasets with imbalanced query distributions across channels, CircuITS maintains competitive computational profiles, though gains are attenuated in such extreme settings (see Figure 3).

Figure 3: Query distribution for MIMIC-III 12–36, highlighting channel-level variation in query targets.

Practical and Theoretical Implications

CircuITS resolves a longstanding tension in probabilistic IMTS modeling: it achieves both expressivity and strict marginalization consistency. This matters most in high-stakes domains such as healthcare and climate science, where reliable uncertainty quantification is essential and contradictory forecasts are unacceptable.

The theoretical framework extends tractable probabilistic circuit inference to the domain of irregular multivariate time series, allowing for future developments in scalable density estimation, interpretable forecasting, and integration with structured probabilistic programming.

Future Directions

Potential avenues include:

  • Incorporating learned channel permutation strategies to further optimize joint dependencies.
  • Extending copula mechanisms for richer intra-channel dependency structures or non-Gaussian correlations.
  • Adapting circuit architecture for irregular observation patterns beyond channel-level partitioning.

Open benchmarking opportunities exist with Transformer-based copula models and hypergraph neural networks, targeting further improvement in multivariate forecast expressivity, computational scaling, and interpretability.

Conclusion

CircuITS provides a framework for probabilistic forecasting of irregular multivariate time series that resolves the consistency-expressivity trade-off. Its empirical results and formal guarantees support reliable uncertainty quantification in complex temporal domains.
