Unidentified cause of SCALAR-TRXL failure on Eat Plant
Identify why the SCALAR-TRXL variant achieves near-zero success on the Eat Plant task while SCALAR-FC, trained with the identical procedure, achieves approximately 90% success, and determine whether an interaction between the Transformer-XL architecture and the credit assignment demands of long-wait survival tasks is responsible for the failure.
References
SCALAR-TRXL fails on Eat Plant. Despite identical training procedures, SCALAR-TRXL achieves near-zero success on Eat Plant while SCALAR-FC achieves approximately 90%. We hypothesize this reflects an interaction between the transformer architecture and the specific credit assignment challenges of long-wait survival tasks, but have not isolated the cause.
— SCALAR: Learning and Composing Skills through LLM Guided Symbolic Planning and Deep RL Grounding
(2603.09036 - Zabounidis et al., 10 Mar 2026) in Appendix, Section "Limitations and Future Work"