- The paper introduces a comprehensive thermal model for spintronic CiM systems using MRAM, revealing a linear temperature rise with increased active memory utilization.
- It demonstrates that SHE-based MRAM exhibits an order-of-magnitude lower temperature rise compared to STT-based arrays under similar operational workloads.
- The study emphasizes the importance of application scheduling, thermal throttling, and array sizing in managing heat dissipation and ensuring system reliability.
Quantitative Thermal Analysis of Spintronic Compute-in-Memory Architectures
Introduction
This paper presents the first comprehensive quantitative thermal characterization of compute-in-memory (CiM) systems implemented with spintronic non-volatile memories (NVM), focusing on the thermal implications of direct Boolean computation within dense MRAM arrays. The authors develop a detailed simulation-based thermal model, analyze representative workloads, and highlight the interplay between array utilization, technology parameters, and cooling capability. The characterization informs the deployment, reliability engineering, and architectural design of future high-density CiM systems.
Technical Approach and Modeling
The study targets in-memory computing executed directly in MRAM arrays, using both Spin-Transfer Torque (STT) and Spin Hall Effect (SHE) MRAM technologies as evaluation prototypes. The work models the physical and electrical behavior of in-MRAM logic gate operations, considering gate-dependent voltage application, current flow, and the resulting Joule heating. The granularity of the thermal model matches the MRAM cell grid: each node represents the center of a magnetic tunnel junction (MTJ), which captures localized power dissipation.
Steady-state thermal distributions are derived by numerically solving the linear system G T = P, where G encodes the material and geometric thermal conductance network, T is the vector of temperature rises above ambient, and P contains the benchmark-specific spatial power profile. The model incorporates realistic parameters for silicon die thickness, thermal interface materials (TIM), and package/heat sink conduction and convection to ambient.
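As a rough illustration of what solving G T = P on a cell grid involves, the following minimal sketch uses Gauss-Seidel iteration on a small 2D grid. It is not the authors' model: the uniform lateral and vertical conductances, the 4x4 grid, the single heated node, and all numeric values are hypothetical, and the function name `solve_grid_temperature` is invented for this example. A direct sparse solver would serve equally well.

```python
def solve_grid_temperature(power, g_lateral, g_vertical, tol=1e-12, max_iters=100_000):
    """Gauss-Seidel solve of G T = P on a 2D grid of thermal nodes.

    power      : list of rows; power dissipated at each node, in W
    g_lateral  : conductance between neighbouring nodes, in W/K (assumed uniform)
    g_vertical : conductance from each node to ambient, in W/K (assumed uniform)
    Returns the temperature rise above ambient at each node, in K.
    """
    rows, cols = len(power), len(power[0])
    temp = [[0.0] * cols for _ in range(rows)]
    for _ in range(max_iters):
        max_change = 0.0
        for r in range(rows):
            for c in range(cols):
                neighbours = [
                    temp[rr][cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols
                ]
                # One row of G T = P, solved for this node's own temperature:
                # (g_vertical + n * g_lateral) * T_i - g_lateral * sum(T_nbrs) = P_i
                diagonal = g_vertical + g_lateral * len(neighbours)
                updated = (power[r][c] + g_lateral * sum(neighbours)) / diagonal
                max_change = max(max_change, abs(updated - temp[r][c]))
                temp[r][c] = updated
        if max_change < tol:
            break
    return temp


# Hypothetical 4x4 array with a single active cell dissipating 100 uW.
power_map = [[0.0] * 4 for _ in range(4)]
power_map[1][1] = 1e-4
temperature_rise = solve_grid_temperature(power_map, g_lateral=2e-4, g_vertical=1e-4)
peak = max(max(row) for row in temperature_rise)
```

In this toy network all heat leaves through the vertical paths, so the sum of the node temperature rises multiplied by `g_vertical` equals the total injected power, which gives a quick sanity check on the solution.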
The evaluation covers both worst-case and representative application-level workloads: a power virus (INV), vector-matrix multiplication (VMUL), and Hopfield neural network (NN) computation, each with multiple gate-mix and parallelism strategies under varied utilization factors and array sizes.
Key Results and Analytical Insights
Power Density and Temperature Scalability
A primary finding is that peak array temperature depends approximately linearly on the fraction of active memory cells (array utilization) and approximately inversely on array area. For the most thermally stressing configurations (e.g., 100% utilization of a small STT-MRAM array with INV operations), temperature excursions can exceed 340°C, far surpassing process reliability thresholds (typically 125°C). Power density in these scenarios reaches over 570 W/cm², beyond the capacity of conventional forced-air or even advanced cooling solutions.
Conversely, increasing array size for a fixed problem results in lower power density and temperature rise, attributed to enhanced surface area for heat dissipation and more distributed heat generation. However, the authors observe that this trend may be superseded by higher-order effects as arrays scale further.
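Both trends in this subsection are what a first-order lumped model would predict, in which temperature rise is proportional to power density. The sketch below is such an approximation under stated assumptions, not the paper's grid model: the area-normalized thermal resistance, the power figure, the array area, and the function name are all hypothetical.

```python
import math


def peak_temperature_rise_k(utilization, full_power_w, array_area_cm2, r_area_k_cm2_per_w):
    """First-order lumped estimate of temperature rise above ambient, in K.

    utilization        : fraction of cells active, in [0, 1]
    full_power_w       : power dissipated at 100% utilization, in W (hypothetical)
    array_area_cm2     : area over which that power is spread, in cm^2
    r_area_k_cm2_per_w : area-normalized thermal resistance to ambient (hypothetical)
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be in [0, 1]")
    if array_area_cm2 <= 0.0:
        raise ValueError("array area must be positive")
    power_density = utilization * full_power_w / array_area_cm2  # W/cm^2
    return r_area_k_cm2_per_w * power_density


# Hypothetical operating point, chosen only to exercise the scaling.
FULL_POWER_W = 0.5
AREA_CM2 = 0.001
R_AREA = 0.5

base = peak_temperature_rise_k(0.5, FULL_POWER_W, AREA_CM2, R_AREA)
double_utilization = peak_temperature_rise_k(1.0, FULL_POWER_W, AREA_CM2, R_AREA)
double_area = peak_temperature_rise_k(0.5, FULL_POWER_W, 2 * AREA_CM2, R_AREA)
```

Under this model, doubling utilization doubles the temperature rise, while spreading the same dissipated power over twice the area halves it. The higher-order effects the authors mention for larger arrays are outside what a single lumped resistance can capture.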
Impact of MRAM Technology
A critical differentiation is observed between STT and SHE MRAM arrays: SHE-based CiM arrays exhibit an order of magnitude lower temperature rise under analogous workloads. This advantage is due to reduced critical current requirements for switching, higher effective cell resistance, and larger cell area in SHE, which collectively curtail both instantaneous and average power density.
Application Mapping and Parallelism
The results highlight the significance of not only gate operation power but also how application scheduling impacts thermal profiles. For instance, mapping Hopfield network updates to utilize parallelism across multiple arrays reduces both maximum temperature and execution time. Coarse-grained mapping strategies can balance array utilization and enable duty cycling or idle cycle injection for thermal throttling, trading off throughput for thermal safety.
Practical Constraints and Throttling
The analysis underscores scenarios where power densities of CiM operations exceed practical cooling limits. The paper quantitatively demonstrates the effectiveness of simple thermal throttling (idle injections), showing linear reduction in peak temperature with decreased duty cycle, though at the expense of performance. The study also examines the thermal resistance stack (TIM, die, heatsink), identifying that improvements to TIM properties or die thinning can yield further reductions in maximum operating temperature.
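The reported linear relationship between duty cycle and peak temperature suggests a simple way to size a throttling policy. The sketch below assumes that the throttling period is much shorter than the thermal time constant, so the steady-state rise scales with the time-averaged power. The 125°C limit is the typical threshold quoted earlier; the ambient temperature, the unthrottled rise, and both function names are hypothetical.

```python
import math


def max_duty_cycle(full_duty_rise_k, ambient_c, limit_c):
    """Largest fraction of active cycles that keeps the peak at or below limit_c.

    Assumes temperature rise is proportional to duty cycle.
    """
    if full_duty_rise_k <= 0.0:
        raise ValueError("unthrottled temperature rise must be positive")
    headroom_k = limit_c - ambient_c
    if headroom_k <= 0.0:
        return 0.0  # ambient already at or above the limit
    return min(1.0, headroom_k / full_duty_rise_k)


def throttled_peak_c(duty_cycle, full_duty_rise_k, ambient_c):
    """Peak temperature in deg C when only duty_cycle of the cycles are active."""
    return ambient_c + duty_cycle * full_duty_rise_k


# Hypothetical case: a 300 K unthrottled rise and 25 deg C ambient.
duty = max_duty_cycle(full_duty_rise_k=300.0, ambient_c=25.0, limit_c=125.0)
peak_c = throttled_peak_c(duty, full_duty_rise_k=300.0, ambient_c=25.0)
relative_throughput = duty  # idle cycles do no work, so throughput scales with duty
```

The same ratio appears as the throughput penalty, which makes the trade-off explicit: in this model every degree of thermal headroom recovered by idle injection costs a proportional share of performance.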
Implications for Future CiM System Design
This study delivers actionable quantitative guidance for MRAM-based CiM design. The insights into temperature scaling with utilization and array sizing directly inform architectural and physical design constraint setting. The demonstrated order-of-magnitude reduction in thermal stress when transitioning from STT to SHE MRAM emphasizes the importance of technology selection in next-generation CiM deployments.
On the reliability front, the results suggest that without co-design of workload and thermal management, spanning both architecture-level measures (throttling, mapping, utilization capping) and system-level measures (cooling enhancements, technology selection), CiM arrays are at substantial risk of exceeding thermal and thus reliability bounds. The risk is greatest for dense or highly utilized arrays running workloads that trigger worst-case gate activity.
The study also positions thermal characterization as an essential first-order metric alongside functional performance and energy efficiency for evaluating CiM system merit. Future research directions suggested by this analysis include exploration of advanced multi-array load balancing, dynamic thermal-aware scheduling, and innovative heat extraction strategies tailored to CiM physical layouts.
Conclusion
This work establishes, through detailed simulation and analysis, that thermal management is a primary system constraint for MRAM-based compute-in-memory, with power density governed by utilization, array size, and MRAM cell technology. The results indicate that SHE MRAM is preferable for practical CiM instantiations due to significantly lower heat generation, and that system architects must proactively deploy CiM-specific throttling protocols and enhanced cooling methods to remain within safe operational margins. The findings are foundational for the realization of reliable, high-performance, and thermally viable spintronic CiM systems, and set the stage for future research into holistic, thermally aware CiM paradigms.