Low-Complexity Beamspace Channel Denoiser for mmWave Massive MIMO with Low-Resolution ADCs

Published 9 May 2026 in eess.SP and cs.AR | (2605.08855v1)

Abstract: In this paper, we propose a low-complexity beamspace channel denoising algorithm for millimeter-wave (mmWave) massive multi-input multi-output (MIMO) systems with low-resolution analog-to-digital converters (ADCs). The proposed method exploits the inherent sparsity of mmWave channels in the beamspace domain and formulates the denoising problem as a Bayesian binary hypothesis testing under a Bernoulli-complex Gaussian prior. To capture the distortion induced by low-resolution ADCs in a complexity-efficient manner, thermal noise and quantization noise are jointly modeled as a composite noise. Based on this modeling, a closed-form threshold value and a hard-thresholding-based denoising rule are derived to distinguish signal-dominant and noise-dominant components. The resulting algorithm avoids computationally intensive operations such as matrix inversion, iterative optimization, and parameter searching, and achieves near-linear computational complexity with respect to the number of antennas. Furthermore, a hardware-efficient very large-scale integration (VLSI) architecture is developed to enable practical deployment of the proposed algorithm, and is implemented on an AMD-Xilinx Kintex UltraScale+ KCU116 FPGA platform. The design incorporates hardware-aware simplifications and an efficient processing structure, leading to significantly lower latency and reduced hardware resource utilization compared to existing hardware implementations, along with sublinear scaling as the number of antennas increases. Extensive simulation results demonstrate that the proposed method achieves performance comparable to computationally intensive existing approaches while significantly reducing computational complexity.


Authors (3)

Summary

The paper introduces a Bayesian beamspace denoising framework that classifies each beam through a single-pass hard threshold derived via a Bernoulli-complex Gaussian model.
It employs hardware-friendly blind estimators for composite noise, channel power, and activity rate, achieving near-optimal MSE and BER with low-resolution ADCs.
The design features an efficient VLSI architecture on FPGA, reducing computational cost and latency for real-time mmWave massive MIMO processing.

Background and Motivation

Millimeter-wave (mmWave) massive MIMO architectures are a central enabler for high-throughput, low-latency wireless communications in next-generation networks. The signal propagation characteristics at mmWave frequencies induce highly sparse spatial channels, but the hardware cost and power consumption scale unfavorably with higher ADC resolution and increasing antenna counts. To address these constraints, low-resolution ADC architectures are widely considered, but quantization artifacts significantly harm conventional channel estimation algorithms. Existing approaches—including nonlinear quantization models, AQNM-based estimators, and deep learning-based methods—either entail high computational complexity, require iterative tuning, or are unsuitable for hardware implementation under tight resource and latency budgets.

Bayesian Beamspace Denoising with Composite Quantization Noise

The proposed framework casts mmWave beamspace channel denoising as a Bayesian binary hypothesis test under a Bernoulli-complex Gaussian prior. Because the beamspace channel is sparse, each beam can be classified as either signal-dominant or noise-dominant by an analytically derived hard-thresholding rule applied in a single pass. A distinguishing feature is the composite noise model: thermal noise and quantization noise (the latter nonlinear and signal-dependent in general) are jointly approximated as a single zero-mean complex Gaussian term. The approximation follows the additive quantization noise model (AQNM) and is checked empirically.

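After the beamspace transformation, the real and imaginary components of the quantization noise have empirical distributions that closely match the Gaussian assumption. Under this assumption, the squared magnitude of a noise-only beam is exponentially distributed, a property the blind noise estimator described below relies on.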

Figure 1: Distribution of normalized real and imaginary quantization noise in the beamspace domain approximated by a Gaussian.
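
The composite noise idea can be illustrated with the standard AQNM, in which a quantizer output is written as $y_q = \alpha y + n_q$ with $\alpha = 1 - \beta$. The sketch below is a minimal illustration under assumptions that are not taken from the paper: the distortion factors are the commonly tabulated values for Gaussian inputs, the quantization noise is treated as uncorrelated across antennas (so a unitary beamspace transform leaves its per-element variance unchanged), and the function names are hypothetical.

```python
import math

# Commonly tabulated AQNM distortion factors beta for b-bit scalar quantizers
# with Gaussian input (assumed values, not taken from the paper).
_BETA = {1: 0.3634, 2: 0.1175, 3: 0.03454, 4: 0.009497, 5: 0.002499}


def aqnm_beta(bits: int) -> float:
    """Distortion factor beta; the usual asymptotic formula is used above 5 bits."""
    if bits < 1:
        raise ValueError("bits must be >= 1")
    if bits in _BETA:
        return _BETA[bits]
    return math.pi * math.sqrt(3) / 2 * 2 ** (-2 * bits)


def composite_noise_variance(signal_power: float, thermal_var: float, bits: int) -> float:
    """Per-element variance of thermal plus quantization noise under AQNM.

    Model: y_q = alpha * y + n_q, alpha = 1 - beta,
    Var(n_q) = alpha * (1 - alpha) * E|y|^2, with E|y|^2 = signal + thermal.
    """
    beta = aqnm_beta(bits)
    alpha = 1.0 - beta
    scaled_thermal = alpha ** 2 * thermal_var
    quantization = alpha * beta * (signal_power + thermal_var)
    return scaled_thermal + quantization


if __name__ == "__main__":
    for b in (2, 3, 4, 8):
        print(b, composite_noise_variance(1.0, 0.1, b))
```

As the resolution grows, the quantization term vanishes and the composite variance approaches the thermal noise variance, which is consistent with the denoising gain shrinking in the high-resolution regime.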

The channel observation $\bar{\mathbf{h}}'$ is modeled as a mixture of two hypotheses: $\mathcal{H}_1$ (signal plus noise) and $\mathcal{H}_0$ (noise only). The prior probability of $\mathcal{H}_1$ is the activity rate $q$, the fraction of nonzero channel taps. The Bayesian decision rule yields a closed-form threshold on $|\bar{h}'_m|^2$ for classifying each element $m$, parameterized by the composite noise variance, the estimated signal-to-distortion-plus-noise ratio (SDNR), and $q$.
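
A threshold of this kind can be sketched from the textbook MAP test between two zero-mean complex Gaussians. The code below assumes $\mathcal{H}_0: h \sim \mathcal{CN}(0, D_0)$ and $\mathcal{H}_1: h \sim \mathcal{CN}(0, D_0(1+\rho))$, where $\rho$ is taken to be the SDNR of an active beam; that definition, the function names, and the choice to pass retained entries through unscaled are assumptions, and the paper's exact expression may differ.

```python
import math


def map_threshold(d0: float, rho: float, q: float) -> float:
    """Threshold on |h|^2 for the MAP test with P(H1) = q.

    Decide H1 when q * p1(h) > (1 - q) * p0(h), which rearranges to
    |h|^2 > d0 * (1 + rho) / rho * ln((1 - q) * (1 + rho) / q).
    """
    if not (d0 > 0 and rho > 0 and 0 < q < 1):
        raise ValueError("need d0 > 0, rho > 0, 0 < q < 1")
    tau = d0 * (1 + rho) / rho * math.log((1 - q) * (1 + rho) / q)
    return max(tau, 0.0)  # a negative value means every beam is kept


def hard_threshold(h, tau: float):
    """Keep entries whose squared magnitude exceeds tau; zero the rest."""
    return [x if abs(x) ** 2 > tau else 0j for x in h]


if __name__ == "__main__":
    tau = map_threshold(d0=1.0, rho=9.0, q=0.1)
    print(tau)
    print(hard_threshold([3 + 0j, 1 + 1j, 0.5j], tau))
```

The threshold rises when active beams are rarer (smaller $q$) or when the composite noise variance is larger, so fewer noise-only beams survive in exactly the conditions where they dominate.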

To ensure full autonomy and low-latency operation, the framework employs a suite of hardware-friendly blind estimators:

Robust Composite Noise Estimator: The estimator starts from a median-based variance estimate and refines it iteratively with a truncated mean, applying a bias correction derived from the properties of the exponential distribution. This makes the estimate robust to the heavy right tail contributed by signal-dominated components, so accuracy remains high even when sparsity is only moderate. Three iterations suffice for convergence in practical scenarios.
Channel Power and SDNR Estimator: Both statistics are computed from the empirical mean squared magnitude and the estimated composite noise variance, with clipping to prevent estimator instability.
Activity Rate Estimator: The activity rate is estimated by closed-form moment matching, avoiding enumeration or iterative search.

Figure 2: Iterative convergence of the blind composite noise estimator ($\widehat{D}_0$), normalized to ground truth, demonstrating efficient and accurate estimation for various SNRs and ADC resolutions.
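
One way the three blind estimators could fit together is sketched below. It assumes that noise-only squared magnitudes are exponentially distributed with mean $D_0$ and that active beams follow the Bernoulli-complex Gaussian model. The truncation multiple `c = 3`, the clipping bounds, and the specific second-moment matching are illustrative assumptions, not the paper's stated design.

```python
import math
import statistics


def trunc_factor(c: float) -> float:
    """E[x | x < c*mu] / mu for x ~ Exp(mean mu); used as the bias correction."""
    e = math.exp(-c)
    return (1.0 - (1.0 + c) * e) / (1.0 - e)


def estimate_noise(power, c: float = 3.0, iters: int = 3) -> float:
    """Blind estimate of the noise variance D0 from squared magnitudes."""
    d0 = statistics.median(power) / math.log(2)  # median of Exp(mu) is mu*ln(2)
    g = trunc_factor(c)
    for _ in range(iters):
        kept = [x for x in power if x < c * d0]  # drop the heavy right tail
        if not kept:
            break
        d0 = (sum(kept) / len(kept)) / g  # undo the bias caused by truncation
    return d0


def moment_match(m1: float, m2: float, d0: float, q_min: float = 1e-3, q_max: float = 0.5):
    """Closed-form (channel power, mean SDNR, activity rate q) from two moments.

    With active-beam variance S: m1 = d0 + q*S and m2 / 2 = d0^2 + 2*d0*q*S + q*S^2.
    """
    p_h = max(m1 - d0, 1e-12)                      # q*S, clipped away from zero
    spread = m2 / 2.0 - d0 ** 2 - 2.0 * d0 * p_h   # q*S^2
    q = p_h ** 2 / spread if spread > 0 else q_max
    q = min(max(q, q_min), q_max)                  # clip to a plausible range
    return p_h, p_h / d0, q


def blind_parameters(power):
    """Run all three estimators on a list of squared magnitudes."""
    d0 = estimate_noise(power)
    m1 = sum(power) / len(power)
    m2 = sum(x * x for x in power) / len(power)
    return (d0,) + moment_match(m1, m2, d0)
```

Every step is a single pass over the $M$ squared magnitudes apart from the median, which needs a sort or a selection. In this model, three refinement passes shrink an initial median bias of several percent to well under one percent.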

All steps incur computational cost linear in the number of antennas $M$, except for the unitary FFT/IFFT operations required for the beamspace transformation, which cost $O(M\log M)$.
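
The beamspace transformation itself can be illustrated with a small radix-2 FFT. The sketch assumes a uniform linear array, a power-of-two $M$, a single path whose spatial frequency lies exactly on the DFT grid, and a forward-FFT sign convention; none of these details come from the paper.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def to_beamspace(h):
    """Unitary DFT: the 1/sqrt(M) scaling preserves total energy."""
    scale = 1.0 / math.sqrt(len(h))
    return [scale * v for v in fft(h)]


if __name__ == "__main__":
    M = 64
    # Antenna-domain response of one on-grid path (beam index 5).
    h = [cmath.exp(2j * cmath.pi * 5 * m / M) for m in range(M)]
    beams = to_beamspace(h)
    strongest = max(range(M), key=lambda k: abs(beams[k]))
    print(strongest, abs(beams[strongest]))
```

All the energy of the on-grid path lands in a single beam, which is the sparsity the thresholding rule exploits. An off-grid path would leak into neighboring beams.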

Algorithmic Performance and Complexity

The proposed denoiser is directly compared with state-of-the-art model-based, optimization-based, and deep learning-based estimators across several metrics:

Channel MSE and Symbol BER: With 3-bit ADCs, the method matches or outperforms LS, Bussgang-LMMSE, and $\alpha$-BEACHES (another beamspace denoising approach). It incurs only negligible MSE/BER degradation compared to GL-QVBCE (variational Bayesian) and diffusion-model estimators while requiring far less computation. The improvement over LS grows as ADC precision decreases, indicating effective mitigation of quantization noise in hardware-constrained operation.

Figure 3: MSE and post-equalization BER as a function of SNR for the proposed denoiser and multiple baseline estimators with 3-bit ADCs.

Resolution Scaling Behavior: The gain achieved by the denoiser relative to the raw observation grows as ADC precision drops. At 2–3 bits, the SNR gain at a fixed BER reaches 3–5 dB, and it shrinks as the ADCs approach the high-resolution regime.

Figure 4: Performance gain (MSE and BER) of the proposed denoising algorithm over baseline as a function of SNR and ADC resolution.

Complexity: The algorithm avoids matrix inversion, iterative coordinate descent, and neural-network inference. Computational cost is near-linear in $M$, and measured run times of a Python prototype are 2–3 orders of magnitude below those of learning-based or iterative baselines. When an SNR prior is available, the auxiliary estimators can be bypassed, which reduces the complexity further.

VLSI Architecture and FPGA Implementation

A full VLSI processing chain is constructed, optimized for high-throughput, low-latency deployment. The architecture is modular:

Preprocessing (FFT, IFFT, magnitude computation)
Composite Noise Estimation: Systolic insertion-based sorting and cumulative prefix-sum units support the median and truncated-mean computations with minimal control logic; all divisions and multiplications are implemented as bit shifts or LUTs for fixed-point efficiency.

Figure 5: High-level VLSI architecture for the denoising chain; yellow-boxed auxiliary estimators can be bypassed when SNR is known.

Figure 6: Detailed architecture of the composite noise power estimator comprising sorting and truncated mean blocks.
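
The role of the sorting and prefix-sum blocks can be seen in a small behavioral model. This is a software stand-in under assumptions, not the paper's RTL: `insort` replaces one systolic insertion step, samples are integers representing fixed-point values, and integer division replaces the shift or LUT a hardware design would use.

```python
from bisect import bisect_left, insort


def sorted_prefix(samples):
    """Stream samples into a sorted buffer, then build running prefix sums."""
    buf = []
    for s in samples:
        insort(buf, s)  # software stand-in for one systolic insertion step
    prefix = [0]
    for v in buf:
        prefix.append(prefix[-1] + v)  # prefix[k] = sum of the k smallest values
    return buf, prefix


def median_of(buf):
    """Upper-middle element; avoids an averaging step for even lengths."""
    return buf[len(buf) // 2]


def truncated_mean(buf, prefix, limit):
    """Mean of all entries strictly below limit, using one lookup and one divide."""
    k = bisect_left(buf, limit)  # count of entries below the limit
    return prefix[k] // k if k else 0


if __name__ == "__main__":
    buf, prefix = sorted_prefix([7, 1, 40, 3, 5, 2, 90, 4])
    print(median_of(buf), truncated_mean(buf, prefix, 10))
```

Once the buffer is sorted, each refinement iteration needs only a comparison to find the cut index and a single prefix-sum read, so repeated truncated means do not require re-scanning the data.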

Figure 7: Auxiliary estimator units for channel power, SDNR, and activity rate, with direct-use of pipelined sum, division, and LUT-based nonlinear functions.

Figure 8: Threshold computation unit with piecewise linear/logarithmic approximation blocks highlighted in gray.
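
A logarithm in the threshold can be approximated in hardware with shifts and adds. The sketch below uses Mitchell's classic approximation, $\log_2(m \cdot 2^e) \approx e + (m - 1)$ for $1 \le m < 2$, as one example of a piecewise linear scheme. It is a swapped-in illustration: the paper's actual approximation blocks and word lengths are not specified here, and `frac_bits` is a hypothetical parameter.

```python
def log2_mitchell(x: int, frac_bits: int = 8) -> int:
    """Approximate log2(x) for a positive integer x.

    The result is fixed point with frac_bits fractional bits. The integer part
    is the position of the leading one; the fraction is the remaining bits
    read directly as a linear interpolation within the octave.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    e = x.bit_length() - 1      # position of the leading one
    mant = x - (1 << e)         # bits below the leading one
    if e >= frac_bits:
        frac = mant >> (e - frac_bits)   # truncate extra low-order bits
    else:
        frac = mant << (frac_bits - e)   # pad with zeros
    return (e << frac_bits) + frac


if __name__ == "__main__":
    for x in (1, 8, 12, 3072):
        print(x, log2_mitchell(x) / 256)
```

The approximation is exact at powers of two and underestimates in between, by at most about 0.086 before truncation. A design can reduce this with a small correction LUT if the threshold accuracy requires it.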

Figure 9: Flow of the denoising unit: buffer, comparator, and LUT-based scaling.

The design is implemented on an AMD-Xilinx Kintex UltraScale+ KCU116 FPGA. For the two antenna counts $M$ evaluated, the architecture uses less than 6% of the LUTs and 4% of the flip-flops while running at over 300 MHz. Latency is 2–4 $\mu$s per vector. Full-system processing for 16 users stays in the sub-millisecond range, substantially below mmWave channel coherence times.

Key hardware results:

Sublinear scaling in latency and resource use with respect to $M$.
35% further acceleration when SNR is known a priori, by omitting all auxiliary estimator logic.
Lower absolute latency and resource use than published FPGA implementations of iterative or learning-based estimators, including the only prior implementation that supports low-resolution ADC scenarios.

Implications and Future Directions

On the practical side, this framework offers a viable route for integrating massive MIMO signal processing into hardware platforms constrained by tight area and power budgets. This is particularly relevant for mmWave access points, small-cell relays, or edge compute platforms. The full pipeline matches the performance of high-complexity baselines, yet is implementable as fixed-point hard-wired logic with sub-millisecond latency.

Theoretically, the successful adoption of a Bernoulli-complex Gaussian surrogate prior and composite Gaussian modeling for quantization noise—validated by empirical distributional fits—supports the utility of simple statistical surrogates in hardware-efficient signal processing, provided their limitations are well characterized and mitigated by robust estimation.

Potential further work includes joint channel/data iterative schemes that preserve linear cost per iteration, improved second-order parameter estimation under non-Gaussian hardware regimes, and algorithm-hardware co-design for rapidly reconfigurable antenna arrays.

Conclusion

This work establishes that Bayesian hard-thresholding denoising in beamspace with robust blind parameter estimation provides a uniquely favorable tradeoff: near-optimal estimation accuracy at minimal computational and hardware cost in low-resolution mmWave MIMO systems. The architecture is deployable today in both FPGA and ASIC form, supporting real-time, power-efficient operation in future massive MIMO infrastructures (2605.08855).
