Explain unexpected cross-family congruence between specific Pythia and GPT-2 models on IOI
Determine why CONGRUITY representation-similarity scores are unexpectedly elevated between Pythia-160M and GPT-2 medium, and between Pythia-410M and GPT-2 small, on the indirect-object identification (IOI) task. In particular, assess whether similarities in name-mover head representations account for the effect.
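As a rough starting point for the name-mover hypothesis, one could compare head-level activations from the two models on a shared set of IOI prompts using a generic similarity measure. The sketch below uses linear centered kernel alignment (CKA), which is a swapped-in stand-in and not the CONGRUITY score, whose definition is not given here. The function names and the random placeholder activations are hypothetical; real inputs would be per-prompt outputs of candidate name-mover heads, one row per prompt.

```python
import math
import random


def center_columns(m):
    """Subtract each column's mean so every feature has zero mean across prompts."""
    n = len(m)
    means = [sum(col) / n for col in zip(*m)]
    return [[v - mu for v, mu in zip(row, means)] for row in m]


def cross_frobenius_sq(a, b):
    """Squared Frobenius norm of a^T b for two row-aligned matrices."""
    total = 0.0
    for col_a in zip(*a):
        for col_b in zip(*b):
            dot = sum(u * v for u, v in zip(col_a, col_b))
            total += dot * dot
    return total


def linear_cka(x, y):
    """Linear CKA between two activation matrices with one row per prompt.

    The matrices may have different widths (e.g. different head dimensions),
    but must cover the same prompts in the same order.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of rows (prompts)")
    xc, yc = center_columns(x), center_columns(y)
    numerator = cross_frobenius_sq(xc, yc)
    denominator = math.sqrt(cross_frobenius_sq(xc, xc) * cross_frobenius_sq(yc, yc))
    return numerator / denominator


# Placeholder data only (assumption): random numbers standing in for the
# per-prompt outputs of one head from each model on the same 32 prompts.
rng = random.Random(0)
acts_a = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(32)]
acts_b = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(32)]

print(linear_cka(acts_a, acts_a))  # self-similarity
print(linear_cka(acts_a, acts_b))  # similarity between unrelated activations
```

Linear CKA is invariant to orthogonal transformations and isotropic scaling of either representation, so it can compare heads whose dimensions or bases differ across model families. A high score under this measure would only be suggestive; it would not by itself establish that name-mover heads cause the elevated CONGRUITY scores.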
References
It is unclear why Pythia-160M and 410M show increased congruence with GPT2-medium and small, respectively; perhaps, subtle similarities in name-mover heads representations could explain this (Tigges et al. 2024, Section 3.2).
— Tracking Equivalent Mechanistic Interpretations Across Neural Networks
(2603.30002 - Sun et al., 31 Mar 2026) in Section 3.2: Reduction to Simpler Models