- The paper introduces the ring toss code, which adapts rejection sampling to reach the lower bound in one-shot relative entropy coding.
- It proves that under bounded density conditions, the expected codeword length is F(X;Y) + log e bits, effectively closing the gap with mutual information bounds.
- The work establishes both non-asymptotic and asymptotic tight redundancy bounds, highlighting significant implications for stochastic channel simulation and data compression.
Optimality of Rejection Sampling for Relative Entropy Coding
Overview
This paper formalizes and resolves a longstanding question in coding theory and information theory related to the simulation of stochastic channels, specifically the fundamental rate limits of relative entropy coding. It advances the theory by demonstrating that classical rejection sampling, when coupled with their novel “ring toss code,” is optimal for single-use (one-shot) channel simulation in terms of a newly characterized measure, the “functional information.” Key contributions include closing the gap between classical mutual information bounds and what is fundamentally achievable, characterizing conditions in which redundancy vanishes or remains constant, and providing the tightest non-asymptotic and asymptotic bounds to date.
Relative Entropy Coding and Rate Measures
Relative entropy coding addresses the problem of simulating a conditional distribution PY∣X, given a source X∼PX, via a code that allows a receiver to generate a correct Y with potentially shared randomness. The classical lower bound for the expected codeword length is given by the mutual information I(X;Y). Poisson functional representations achieve this bound up to an additive logarithmic term, and prior work has shown that this gap is unavoidable for general channels.
Flamich et al. [flamich2025redundancy] introduced a refined lower bound termed the functional information F(X;Y), based on the channel simulation divergence, which is always at least as tight as the mutual information lower bound but was previously unachievable except for special cases.
Ring Toss Code and Main Result
The authors introduce the ring toss code, an encoding method adapted from rejection sampling, which utilizes an alternative encoding of the sample index to achieve an expected rate of F(X;Y)+loge bits for all channels satisfying a bounded density ratio condition. This result establishes that rejection sampling is rate-optimal for relative entropy coding in the sense of achieving the exact lower bound on expected codeword length up to an additive constant.
The central theorem states that:
- For random variables (X,Y)∼PX,Y and proposal distribution PY, with dPY∣X/dPY bounded by M, the ring toss code achieves X∼PX0 for common randomness X∼PX1.
- For singular channels (where X∼PX2 is a function only of X∼PX3 almost everywhere), X∼PX4, so constant redundancy is achieved over the mutual information bound.
Notably, the ring toss code does not rely on Poisson or greedy Poisson representations, but rather directly encodes the conditional distribution derived from the shared randomness and proposal samples.
Implications and Tight Numerical Bounds
The established rate immediately recovers and strengthens previously known asymptotic results:
- The classical lower bound gap of X∼PX5 bits is improved to X∼PX6 bits for general channels, and X∼PX7 bits (i.e., X∼PX8) for singular channels.
- For blocklength X∼PX9, the asymptotic redundancy per symbol is characterized, with the logarithmic gap coefficient being Y0 for singular channels and Y1 for nonsingular channels.
These provide the tightest known achievable bounds for both one-shot and asymptotic regimes and refine the connection between entropy coding rates, mutual information, and the newly formalized functional information.
Structural and Theoretical Insights
The paper identifies the channel simulation divergence (also known as the functional information divergence) as a fundamental quantity for measuring the cost of simulating stochastic channels in the presence of side or shared randomness. The width function, which underpins this divergence, links the geometry of probability densities with achievable code rates.
For singular channels, which include classical examples such as the binary erasure and continuous uniform additive noise channels, the ring toss code demonstrates constant redundancy. Lemma III.1 provides a full characterization of when redundancy vanishes (Y2) in terms of channel singularity.
Importantly, the ring toss code’s search-based interpretation—searching over Y3 instead of Y4—distinguishes it algorithmically from existing entropy codes and elucidates the relationship between the code structure and the underlying stochastic process.
Future Directions
Practical and theoretical extensions suggested include:
- Removing or relaxing the boundedness condition on the density ratio Y5 to handle unbounded or heavy-tailed distributions.
- Determining whether there exist broader channel families, beyond singular channels, for which Y6 is sub-logarithmic or negligible.
- Developing efficient computational methods for the exact evaluation or approximation of Y7 in high-dimensional or structured probabilistic settings.
Given its relation to channel simulation, lossless and lossy compression, and the strong functional representation lemma, this work lays groundwork for further advances in the optimality of stochastic codes across data compression, machine learning, and information theory.
Conclusion
This paper provides a fundamental advance in channel simulation and coding theory, proving that optimized rejection sampling—via the ring toss code—achieves the best possible rate for relative entropy coding in terms of the newly established functional information. The work resolves an important open problem, delivers the tightest non-asymptotic and asymptotic bounds for channel simulation redundancy, and sets a new theoretical standard for both the analysis and practical construction of entropy codes based on stochastic channel simulation (2604.23076).