Explaining JumpReLU sensitivity and dead latents in SynthSAEBench-16k

Identify and characterize the mechanisms that make JumpReLU sparse autoencoders (SAEs) finicky to train and prone to substantial numbers of dead latents on the SynthSAEBench-16k synthetic benchmark, including their apparent sensitivity to initialization and to the choice of auxiliary loss. The goal is to enable principled mitigation strategies.

Background

The authors observe that JumpReLU SAEs are particularly prone to dead latents on SynthSAEBench-16k and appear highly sensitive to initial conditions, such as the initial JumpReLU threshold and decoder norm. They report that adjusting these settings and replacing the standard auxiliary loss with a TopK-style auxiliary loss reduces dead latents.
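
To make the role of the initial threshold concrete, here is a minimal NumPy sketch. It is a toy illustration, not the authors' setup or the SynthSAEBench-16k benchmark. It assumes the usual JumpReLU form, in which a pre-activation passes through unchanged when it exceeds a threshold and is zeroed otherwise. The data, the sizes, the unit-norm random encoder, and the helper names `jumprelu` and `dead_latent_fraction` are all hypothetical. Training dynamics, the decoder norm, and the auxiliary loss are not modeled; the sketch only measures how many latents never fire at initialization.

```python
import numpy as np


def jumprelu(pre_acts, threshold):
    """Keep a pre-activation unchanged if it exceeds the threshold, else output 0."""
    return np.where(pre_acts > threshold, pre_acts, 0.0)


def dead_latent_fraction(acts):
    """Fraction of latents (columns) that never fire on any sample (row)."""
    fired = (acts > 0).any(axis=0)
    return float(1.0 - fired.mean())


rng = np.random.default_rng(0)
n_samples, d_in, n_latents = 2048, 64, 512  # hypothetical toy sizes

# Hypothetical unit-norm inputs and unit-norm encoder directions, so every
# pre-activation is a cosine similarity in [-1, 1].
x = rng.normal(size=(n_samples, d_in))
x /= np.linalg.norm(x, axis=1, keepdims=True)
W_enc = rng.normal(size=(d_in, n_latents))
W_enc /= np.linalg.norm(W_enc, axis=0, keepdims=True)
pre_acts = x @ W_enc

# Sweep the initial threshold and record how many latents are dead at init.
results = {}
for init_threshold in (0.0, 0.1, 0.5, 1.1):
    acts = jumprelu(pre_acts, init_threshold)
    results[init_threshold] = dead_latent_fraction(acts)
    print(f"threshold={init_threshold:.1f}  dead fraction={results[init_threshold]:.3f}")
```

In this toy setting the dead fraction can only grow as the threshold rises, because a higher threshold admits a subset of the activations admitted by a lower one. A threshold above the largest possible pre-activation leaves every latent dead from the first step. Whether a similar effect contributes to the behavior reported on SynthSAEBench-16k is exactly the open question.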

Despite these practical fixes, the underlying reasons for JumpReLU’s instability and sensitivity remain unclear, motivating a precise investigation of the causes to inform design improvements and more robust training procedures.

References

We find that JumpReLU in particular is quite finicky in SynthSAEBench-16k, for reasons we are not completely sure of. It appears that JumpReLU SAEs are very sensitive to initial conditions. Understanding what exactly is causing this would be a valuable direction for future work.

SynthSAEBench: Evaluating Sparse Autoencoders on Scalable Realistic Synthetic Data (2602.14687 - Chanin et al., 16 Feb 2026) in Appendix, Section "Dead latents in SynthSAEBench"