- The paper introduces a novel class of low-rank, short-recurrence solvers that extend GCR methods to tackle multiterm nonsymmetric matrix equations.
- It employs matrix-oriented recurrences and randomized subspace embeddings to maintain low-rank factorization, ensuring efficient convergence and memory savings.
- Numerical results on PDE benchmarks demonstrate significant reductions in iteration counts, memory usage, and CPU time compared to classical iterative approaches.
Low-Rank Short Recurrences for Nonsymmetric Linear Matrix Equations
Problem Context and Motivation
The paper addresses the numerical solution of large-scale multiterm matrix equations of the form ∑i=1p​Ai​XBi​=CDT, where Ai​∈RnA​×nA​, Bi​∈RnB​×nB​ are nonsymmetric and typically sparse, and C,D are tall, full-rank matrices with q≪nA​,nB​. Many applications, including parametric PDEs, stochastic Galerkin discretizations, PDE-constrained optimization, and imaging, naturally yield such structures. Classical iterative methods, when applied in vectorized form, are infeasible for large nA​,nB​ due to memory constraints. The paper proposes a novel class of iterative solvers that exploit low-rank structure and maintain short recurrences, formally extending Generalized Conjugate Residual (GCR)-type methods to matrix equation settings, particularly for nonsymmetric operators.
Methodological Contributions
Matrix-Oriented Short Recurrence Framework
The proposed methods generalize one-dimensional projection schemes from vector systems to matrix equations. The key idea is to maintain iterates, residuals, and update directions in low-rank factored form. The solution is advanced via short recurrences: for each k, the update takes the form
Xk+1​=Xk​+Vk(l)​αk​(Vk(r)​)T,
where Vk(l)​,Vk(r)​ are low-rank matrices and αk​ is a matrix coefficient determined via a local minimization problem that ensures (matrix) Petrov-Galerkin optimality in the residual norm.
Minimal Residual (MR) and Conjugate Residual (CR)-Type Schemes
- Subspace Minimal Residual (MR): At each iteration, the direction is the current residual factor, and the matrix Ai​∈RnA​×nA​0 is chosen to minimize the residual in Frobenius norm via a projected matrix equation. This generalizes the scalar step size selection of classical MR methods to matrix-valued coefficients.
- Subspace Conjugate Residual (CR)-Type: The update directions (matrices) are constructed with explicit matrix-valued orthogonality conditions, reminiscent of GCR methods. For efficiency and stability, the authors focus on a one-term recurrence analogous to Orthomin(1). The matrix coefficient Ai​∈RnA​×nA​1 ensures Ai​∈RnA​×nA​2-orthogonality between directions.
The resulting reduced matrix equations for Ai​∈RnA​×nA​3 and Ai​∈RnA​×nA​4 involve Ai​∈RnA​×nA​5 multiterm sums, and the right-hand sides are naturally expressed in low-dimensional projected forms, leveraging the inherent Kronecker structure.
Algorithmic Detail
All iterates and auxiliary quantities (solutions, residuals, directions) are maintained in low-rank factored form. Rank truncation is applied via (truncated) SVD, possibly using randomized sketching for large dimensions, to manage memory and preserve numerical stability. Memory costs scale with Ai​∈RnA​×nA​6, and the algorithms include robust rank adaptation and residual norm computation via sketching for extreme dimensions.
Randomization via Subspace Embeddings
To enable scalable low-rank truncation and residual norm calculation when Ai​∈RnA​×nA​7 or Ai​∈RnA​×nA​8 are large, the methods employ randomized subspace embeddings---specifically, randomized subsampled trigonometric transforms---to efficiently sketch matrices and estimate ranks and norms with high probability guarantees. The paper provides bounds on fidelity of norm approximation and outlines how to construct sketched SVDs for truncation.
Preconditioning
The solvers admit flexible preconditioning strategies. The most practical settings involve:
- One-term preconditioners: inversion by a single pair Ai​∈RnA​×nA​9.
- Two-term preconditioners: inversion using a Sylvester solver (typically ADI) on the sum of two leading terms.
The design permits efficient internal solves for projected preconditioners, often critical in practical PDE discretizations.
Theoretical Analysis
The convergence of the methods is analyzed via classical energy norm techniques and subspace projection arguments. For MR-type recurrences, a result generalizing Elman’s bound is proven: if the projected Kronecker operator is positive definite, the residual norm decreases by a predictable factor involving the minimal eigenvalue and norm of the operator. The effect of forced rank truncation is discussed via thick restarted projection theory, showing that the iterative process retains effective search subspaces after truncation, and convergence can be sustained if these subspaces continue to grow.
Numerical Results
Benchmark PDE: Convection-Diffusion
Strong results are reported for a multiterm matrix equation arising from finite difference discretization of a recirculating convection-diffusion problem. Compared against low-rank GMRES and classical BiCGStab applied to the Kronecker vector system, the proposed methods require significantly less memory, converge with far fewer iterations (almost independent of mesh size), and exhibit an order-of-magnitude speedup in CPU time, especially at low diffusion coefficients where classical approaches struggle.
Stochastic Galerkin Mixed Finite Element Problems
For the parameterized Darcy flow (SG-MFEM, both fast and slow decay KL scenarios), the proposed MR and CR-type recurrences efficiently solve systems with up to several billion unknowns:
- Iteration counts are independent of the number of parameters and discretization size.
- The rank of final iterates is tightly controlled and typically equal to chosen maxrank.
- When compared to standard MINRES with constraint preconditioning, the new methods achieve superior performance in memory and CPU time, with MINRES often failing due to resource limitations.
- For slowly decaying KL expansions (ill-conditioning, high-rank solutions), larger maxrank suffices, and randomized sketching enables tractable solves.
- Inner solves (for projected multiterm matrix equations) exhibit stable convergence under reasonable preconditioning, even as Bi​∈RnB​×nB​0 and rank increase.
The methodology demonstrates reliability, scalability, and practical robustness across a spectrum of challenging matrix equation settings.
Implications and Future Developments
This work provides a systematic framework for memory-efficient, low-rank iterative solvers for general multiterm, nonsymmetric matrix equations. Its significance spans major applications in stochastic PDEs, PDE-constrained optimization, high-dimensional inversion, and advanced model reduction. The incorporation of randomized sketching addresses fundamental scalability barriers.
Potential future directions include:
- Extending the framework to tensor equations and multilinear algebraic systems.
- Integration of more advanced preconditioning (e.g., Kronecker-structured approximate inverses).
- Adaptive rank selection via dynamic spectral gap estimation, leveraging recent randomized rank estimation theory.
- Application to broader classes of PDE discretizations, nonlinear matrix equations, and real-time data assimilation.
Conclusion
The paper establishes a novel class of low-rank, short-recurrence iterative methods for multiterm nonsymmetric matrix equations, rigorously generalizing projection-based solvers to matrix settings. Through careful structure preservation, efficient truncation, and scalable randomization, the approach achieves remarkable numerical efficiency and stability, enabling solutions to problems previously considered intractable due to dimensionality and complexity. The theoretical and experimental results validate the methods’ robustness across diverse scenarios, and the framework sets the stage for further methodological advances in large-scale numerical linear algebra and scientific computation (2605.01276).