Cross-architecture transferability of stability-filtered steering vectors

Determine whether steering vectors constructed using stability filtering and content-subspace projection, extracted from a Qwen-architecture 1.5B model, transfer to language models from more distant architecture families without re-extraction.

Background

The paper extracts steering vectors from DeepSeek-R1-Distill-Qwen-1.5B and demonstrates that these vectors improve two additional Qwen-architecture 1.5B models (Nemotron-Research-Reasoning-1.5B and DeepScaleR-1.5B-Preview) without re-extraction.

Because the tested models share an architecture family and similar training data, the authors caution that it remains unknown whether such transferability holds for more distant model families.
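
One practical reason the question is open can be sketched in code. The snippet below assumes the common additive form of activation steering, in which a scaled vector is added to each token's hidden state. This is an assumption for illustration, not necessarily the paper's exact procedure, and the helper names `apply_steering` and `can_transfer_directly` are hypothetical. It shows that a vector extracted from one model can only be reused as-is when the target model's hidden size matches. Matching width is necessary but not sufficient, since the same coordinates need not encode the same features in a different model family.

```python
from typing import List, Sequence


def apply_steering(hidden: Sequence[Sequence[float]],
                   vector: Sequence[float],
                   alpha: float = 1.0) -> List[List[float]]:
    """Add alpha * vector to every token's hidden state.

    Hypothetical helper: assumes plain additive steering, which may
    differ from the procedure used in the paper.
    """
    steered = []
    for state in hidden:
        if len(state) != len(vector):
            # A vector extracted at one hidden size cannot be added
            # to activations of a different width without re-extraction
            # or some learned mapping.
            raise ValueError(
                f"hidden size {len(state)} does not match "
                f"vector size {len(vector)}"
            )
        steered.append([h + alpha * v for h, v in zip(state, vector)])
    return steered


def can_transfer_directly(source_hidden_size: int,
                          target_hidden_size: int) -> bool:
    """Necessary (not sufficient) condition for reuse without re-extraction."""
    return source_hidden_size == target_hidden_size
```

In this sketch, the transfers reported in the paper correspond to the case where the check passes because the models share an architecture family; a model from a more distant family may fail the check outright, or pass it while representing the relevant behavior along different directions.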

References

"Whether this extends to more distant model families remains an open question."

Zhuang et al., Reliable Control-Point Selection for Steering Reasoning in Large Language Models (arXiv 2604.02113, 2 Apr 2026), Section 4 (Experiments), Main Results, cross-model transfer paragraph.
