Stationarity of P-RHPG Stage Gains (PDC Riccati Analogue)

Establish that the time-varying optimal stage gains produced by the Polytopic Receding-Horizon Policy Gradient (P-RHPG) backward recursion become approximately stationary as the horizon N tends to infinity. This would be the Parallel Distributed Compensation (PDC) analogue of Riccati convergence, and it would imply that the optimal finite-horizon integrated cost J_N^*(Q_N) converges to the integrated infinite-horizon optimum for any terminal cost Q_N ⪰ 0.

Background

The paper proves monotone convergence of the optimal integrated cost J_N^* in the zero-terminal-cost case (Q_N = 0) and establishes upper and lower bounds (a squeeze characterization) for general terminal costs Q_N ⪰ 0. Closing the gap between those bounds for general Q_N requires showing that the sequence of time-varying optimal stage gains in the backward sweep settles to a stationary policy, analogous to classical Riccati convergence in LQR.
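For readers who want to see the classical behavior this question is modeled on, the following is a minimal sketch of the single-model LQR case only, not of P-RHPG or the PDC setting. It runs the standard finite-horizon Riccati backward recursion from two different terminal costs and compares the resulting gains far from the terminal time. The double-integrator system, the weights, the horizon, and the function name riccati_backward_gains are all illustrative assumptions and do not come from the paper.

```python
import numpy as np


def riccati_backward_gains(A, B, Q, R, Q_N, N):
    """Finite-horizon LQR backward recursion (classical, single model).

    Returns (gains, P0), where gains[t] is the feedback gain at time t
    (u_t = -gains[t] @ x_t) and P0 is the cost-to-go matrix at time 0.
    """
    P = Q_N.copy()
    gains = []
    for _ in range(N):
        # K = (R + B' P B)^{-1} B' P A
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # P <- Q + A' P (A - B K), the Riccati difference equation
        P = Q + A.T @ P @ (A - B @ K)
        P = 0.5 * (P + P.T)  # keep P symmetric against round-off
        gains.append(K)
    gains.reverse()  # the loop ran from time N-1 down to 0
    return gains, P


# Illustrative (assumed) system: discretized double integrator, step 0.1.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])
N = 300

# Two different terminal costs, both positive semidefinite.
gains_zero, P_zero = riccati_backward_gains(A, B, Q, R, np.zeros((2, 2)), N)
gains_big, P_big = riccati_backward_gains(A, B, Q, R, 100.0 * np.eye(2), N)

# Far from the terminal time the gains stop depending on t and on Q_N.
print("max |K_0 - K_1|, Q_N = 0:     ", np.max(np.abs(gains_zero[0] - gains_zero[1])))
print("max |K_0(0) - K_0(100 I)|:    ", np.max(np.abs(gains_zero[0] - gains_big[0])))
# Near the terminal time they still differ.
print("max |K_{N-1}(0) - K_{N-1}(100 I)|:", np.max(np.abs(gains_zero[-1] - gains_big[-1])))
```

In this classical setting the early-time gains agree regardless of the terminal cost, which is the stationarity property that makes the finite-horizon cost converge for any Q_N ⪰ 0. The open question is whether a comparable statement holds for the P-RHPG stage gains.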

Because of polytopic cross-terms and the shared weighting functions in the PDC structure, a direct extension of Riccati-based arguments is nontrivial. The authors note that convergence for general terminal costs holds if and only if these stage-optimal gains become approximately stationary as N grows, identifying this as the central unresolved issue for proving universal convergence in the polytopic setting.

References

This limit holds if and only if the time-varying optimal stage gains become approximately stationary as N → ∞, the PDC analogue of Riccati convergence, which is the key open problem in the polytopic setting.

Receding-Horizon Policy Gradient for Polytopic Controller Synthesis (2603.29283 - Shakeri et al., 31 Mar 2026) in Remark (Convergence for general Q_N), Section 4.2