A class of low-rank short recurrences for nonsymmetric linear matrix equations

Published 2 May 2026 in math.NA | (2605.01276v1)

Abstract: We propose a new class of short matrix recurrences for the solution of nonsymmetric linear equations of the type $\mathbf{A}_1\mathbf{X}\mathbf{B}_1+\ldots+\mathbf{A}_p\mathbf{X}\mathbf{B}_p=CD^T$. These iterative methods combine local subspace projection to speed up convergence with rank truncation strategies and randomization procedures to limit memory consumption. Computational experiments on a benchmark problem as well as a challenging discretized mixed formulation of a diffusion equation with random inputs illustrate the potential of the proposed methodology.

Abstract PDF Upgrade to Chat

Authors (3)

Summary

The paper introduces a novel class of low-rank, short-recurrence solvers that extend GCR methods to tackle multiterm nonsymmetric matrix equations.
It employs matrix-oriented recurrences and randomized subspace embeddings to maintain low-rank factorization, ensuring efficient convergence and memory savings.
Numerical results on PDE benchmarks demonstrate significant reductions in iteration counts, memory usage, and CPU time compared to classical iterative approaches.

Low-Rank Short Recurrences for Nonsymmetric Linear Matrix Equations

Problem Context and Motivation

The paper addresses the numerical solution of large-scale multiterm matrix equations of the form $\sum_{i=1}^p A_i X B_i = C D^T$ , where $A_i \in \mathbb{R}^{n_A \times n_A}$ , $B_i \in \mathbb{R}^{n_B \times n_B}$ are nonsymmetric and typically sparse, and $C, D$ are tall, full-rank matrices with $q \ll n_A, n_B$ . Many applications, including parametric PDEs, stochastic Galerkin discretizations, PDE-constrained optimization, and imaging, naturally yield such structures. Classical iterative methods, when applied in vectorized form, are infeasible for large $n_A, n_B$ due to memory constraints. The paper proposes a novel class of iterative solvers that exploit low-rank structure and maintain short recurrences, formally extending Generalized Conjugate Residual (GCR)-type methods to matrix equation settings, particularly for nonsymmetric operators.

Methodological Contributions

Matrix-Oriented Short Recurrence Framework

The proposed methods generalize one-dimensional projection schemes from vector systems to matrix equations. The key idea is to maintain iterates, residuals, and update directions in low-rank factored form. The solution is advanced via short recurrences: for each $k$ , the update takes the form

$X_{k+1} = X_k + V_k^{(l)} \alpha_k (V_k^{(r)})^T,$

where $V_k^{(l)}, V_k^{(r)}$ are low-rank matrices and $\alpha_k$ is a matrix coefficient determined via a local minimization problem that ensures (matrix) Petrov-Galerkin optimality in the residual norm.

Minimal Residual (MR) and Conjugate Residual (CR)-Type Schemes

Subspace Minimal Residual (MR): At each iteration, the direction is the current residual factor, and the matrix $A_i \in \mathbb{R}^{n_A \times n_A}$ 0 is chosen to minimize the residual in Frobenius norm via a projected matrix equation. This generalizes the scalar step size selection of classical MR methods to matrix-valued coefficients.
Subspace Conjugate Residual (CR)-Type: The update directions (matrices) are constructed with explicit matrix-valued orthogonality conditions, reminiscent of GCR methods. For efficiency and stability, the authors focus on a one-term recurrence analogous to Orthomin(1). The matrix coefficient $A_i \in \mathbb{R}^{n_A \times n_A}$ 1 ensures $A_i \in \mathbb{R}^{n_A \times n_A}$ 2-orthogonality between directions.

The resulting reduced matrix equations for $A_i \in \mathbb{R}^{n_A \times n_A}$ 3 and $A_i \in \mathbb{R}^{n_A \times n_A}$ 4 involve $A_i \in \mathbb{R}^{n_A \times n_A}$ 5 multiterm sums, and the right-hand sides are naturally expressed in low-dimensional projected forms, leveraging the inherent Kronecker structure.

Algorithmic Detail

All iterates and auxiliary quantities (solutions, residuals, directions) are maintained in low-rank factored form. Rank truncation is applied via (truncated) SVD, possibly using randomized sketching for large dimensions, to manage memory and preserve numerical stability. Memory costs scale with $A_i \in \mathbb{R}^{n_A \times n_A}$ 6, and the algorithms include robust rank adaptation and residual norm computation via sketching for extreme dimensions.

Randomization via Subspace Embeddings

To enable scalable low-rank truncation and residual norm calculation when $A_i \in \mathbb{R}^{n_A \times n_A}$ 7 or $A_i \in \mathbb{R}^{n_A \times n_A}$ 8 are large, the methods employ randomized subspace embeddings---specifically, randomized subsampled trigonometric transforms---to efficiently sketch matrices and estimate ranks and norms with high probability guarantees. The paper provides bounds on fidelity of norm approximation and outlines how to construct sketched SVDs for truncation.

Preconditioning

The solvers admit flexible preconditioning strategies. The most practical settings involve:

One-term preconditioners: inversion by a single pair $A_i \in \mathbb{R}^{n_A \times n_A}$ 9.
Two-term preconditioners: inversion using a Sylvester solver (typically ADI) on the sum of two leading terms. The design permits efficient internal solves for projected preconditioners, often critical in practical PDE discretizations.

Theoretical Analysis

The convergence of the methods is analyzed via classical energy norm techniques and subspace projection arguments. For MR-type recurrences, a result generalizing Elman’s bound is proven: if the projected Kronecker operator is positive definite, the residual norm decreases by a predictable factor involving the minimal eigenvalue and norm of the operator. The effect of forced rank truncation is discussed via thick restarted projection theory, showing that the iterative process retains effective search subspaces after truncation, and convergence can be sustained if these subspaces continue to grow.

Numerical Results

Benchmark PDE: Convection-Diffusion

Strong results are reported for a multiterm matrix equation arising from finite difference discretization of a recirculating convection-diffusion problem. Compared against low-rank GMRES and classical BiCGStab applied to the Kronecker vector system, the proposed methods require significantly less memory, converge with far fewer iterations (almost independent of mesh size), and exhibit an order-of-magnitude speedup in CPU time, especially at low diffusion coefficients where classical approaches struggle.

Stochastic Galerkin Mixed Finite Element Problems

For the parameterized Darcy flow (SG-MFEM, both fast and slow decay KL scenarios), the proposed MR and CR-type recurrences efficiently solve systems with up to several billion unknowns:

Iteration counts are independent of the number of parameters and discretization size.
The rank of final iterates is tightly controlled and typically equal to chosen maxrank.
When compared to standard MINRES with constraint preconditioning, the new methods achieve superior performance in memory and CPU time, with MINRES often failing due to resource limitations.
For slowly decaying KL expansions (ill-conditioning, high-rank solutions), larger maxrank suffices, and randomized sketching enables tractable solves.
Inner solves (for projected multiterm matrix equations) exhibit stable convergence under reasonable preconditioning, even as $B_i \in \mathbb{R}^{n_B \times n_B}$ 0 and rank increase. The methodology demonstrates reliability, scalability, and practical robustness across a spectrum of challenging matrix equation settings.

Implications and Future Developments

This work provides a systematic framework for memory-efficient, low-rank iterative solvers for general multiterm, nonsymmetric matrix equations. Its significance spans major applications in stochastic PDEs, PDE-constrained optimization, high-dimensional inversion, and advanced model reduction. The incorporation of randomized sketching addresses fundamental scalability barriers.

Potential future directions include:

Extending the framework to tensor equations and multilinear algebraic systems.
Integration of more advanced preconditioning (e.g., Kronecker-structured approximate inverses).
Adaptive rank selection via dynamic spectral gap estimation, leveraging recent randomized rank estimation theory.
Application to broader classes of PDE discretizations, nonlinear matrix equations, and real-time data assimilation.

Conclusion

The paper establishes a novel class of low-rank, short-recurrence iterative methods for multiterm nonsymmetric matrix equations, rigorously generalizing projection-based solvers to matrix settings. Through careful structure preservation, efficient truncation, and scalable randomization, the approach achieves remarkable numerical efficiency and stability, enabling solutions to problems previously considered intractable due to dimensionality and complexity. The theoretical and experimental results validate the methods’ robustness across diverse scenarios, and the framework sets the stage for further methodological advances in large-scale numerical linear algebra and scientific computation (2605.01276).

Markdown Report Issue