Cross-architecture transferability of stability-filtered steering vectors
Determine whether steering vectors built with stability filtering and content-subspace projection, and extracted from a 1.5B-parameter Qwen-architecture model, transfer without re-extraction to language models from more distant architecture families.
References
Whether this extends to more distant model families remains an open question.
— Reliable Control-Point Selection for Steering Reasoning in Large Language Models
(2604.02113 - Zhuang et al., 2 Apr 2026) in Section 4 (Experiments), Main Results — Cross-model transfer paragraph