Convergence of the Alacaoglu–Wright single-loop stochastic quadratic penalty method under θ>1

Determine whether the momentum-based single-loop stochastic quadratic penalty method introduced by Alacaoglu and Wright (2024)—which applies a recursive momentum estimator to the entire quadratic penalty Q_p(x) = f(x) + (p/2)||c(x)||^2 while treating the penalty parameter p as part of the optimization variables—converges for the deterministically constrained stochastic nonconvex problem min_{x∈X} E[f(x, ξ)] subject to c(x)=0, under the error bound condition dist(0, ∇c(x)c(x) + N_X(x)) ≥ γ ||c(x)||^θ with θ>1 and the average smoothness condition E[||∇f(u, ξ) − ∇f(v, ξ)||^2] ≤ L^2 ||u − v||^2 for all u,v ∈ X.

Background

The paper studies deterministically constrained stochastic nonconvex optimization and proposes new single-loop variance-reduced methods that ensure nearly exact feasibility with certainty. In contrasting their methods with prior work, the authors discuss a single-loop stochastic quadratic penalty algorithm from Alacaoglu and Wright (2024) that applies a recursive momentum estimator directly to the full quadratic penalty function.

For the special case corresponding to θ=1 in the error bound condition, rates for the Alacaoglu–Wright algorithm are known. However, when the error bound exponent satisfies θ>1, the authors explicitly state that the convergence of this prior algorithm is not known, while their new Algorithm 1 does achieve convergence guarantees under the same broader range of θ≥1.

References

Furthermore, under Assumptions 1 and 2 with 0 > 1, the convergence of [1, Algorithm 2] remains unknown, while the sequence {xk} generated by Algorithm 1 satisfies |(x(k)|2 = Õ(k−τ), E[dist2 (0, ∇f(x(uk)) + ∇c(x(uk))λuk + Nx(x(uk)))] = Õ(k−τ) with τ = min{2/(0 + 2), 1/0} for some sequence {λk}.

Variance-reduced first-order methods for deterministically constrained stochastic nonconvex optimization with strong convergence guarantees  (2409.09906 - Lu et al., 2024) in Section 2, paragraph comparing Algorithm 1 with [1, Algorithm 2] (after Algorithm 1)