Gated Coordination for Efficient Multi-Agent Collaboration in Minecraft Game

Published 21 Apr 2026 in cs.MA | (2604.18975v1)

Abstract: In long-horizon open-world multi-agent systems, existing methods often treat local anomalies as automatic triggers for communication. This default design introduces coordination noise, interrupts local execution, and overuses public interaction in cases that could be resolved locally. To address this issue, we propose a partitioned information architecture for MLLM agents that explicitly separates private execution states from public coordination states. Building on this design, we introduce two key mechanisms. First, we develop an event-triggered working memory based on system-verified outcomes to maintain compact and low-noise local state representations. Second, we propose a cost-sensitive gated escalation mechanism that determines whether cross-region communication should be initiated by jointly considering node criticality, local recovery cost, and downstream task impact. In this way, communication is transformed from a default reaction into a selective decision. Experiments conducted on long-term construction tasks in open environments demonstrate that, compared to baseline models based on strong communication and planned structures, the introduction of gated communication and a partitioned information architecture results in superior performance in terms of blueprint completion quality and execution chain length. It also improves local self-recovery, reduces ineffective escalations, and increases the utility of public communication.


Authors (7)

Summary

The paper introduces a novel gated coordination mechanism that segregates private and public states to minimize unnecessary global communication.
It employs a three-tiered escalation strategy combining heuristic rules, cost-sensitive scoring, and bounded LLM adjudication to enhance task success and efficiency.
Empirical evaluations in Minecraft environments reveal significant improvements in task success rates and communication effectiveness over baseline methods.

Gated Coordination for Efficient Multi-Agent Collaboration in Minecraft

Problem Formulation and Motivation

The paper addresses the inefficiency and instability inherent in contemporary multi-agent MLLM systems, particularly under long-horizon, open-world task regimes exemplified by Minecraft construction. Prior approaches predominantly operate under a "communication-first" paradigm, equating frequent cross-agent interaction with better coordination. This leads to execution interruptions, global state pollution, and deadlock due to unnecessary or premature communication. Such systems lack a robust adjudication mechanism to determine when local issues truly necessitate costly global coordination, frequently conflating the ability to communicate with the necessity to do so.

Partitioned Information Architecture

To remedy these deficiencies, the authors propose a partitioned information architecture that explicitly segregates each agent’s private execution state from its public coordination state. Local execution is maintained via a compact, deterministic working memory, updated only through verified outcomes rather than free-form LLM summarization, minimizing context pollution and hallucination risks. Public coordination channels are strictly protocolized—only state-changing signals are broadcast—ensuring global communication is selectively initiated and tightly scoped.
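The separation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names (`WorkingMemory`, `PublicChannel`, `VerifiedEvent`), the `Signal` vocabulary, and the example targets are all hypothetical, chosen only to show private state that changes solely on system-verified events and a public channel that accepts only a closed set of state-changing signals.

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    """System-verified result of an action (hypothetical categories)."""
    SUCCESS = "success"
    FAILURE = "failure"


@dataclass(frozen=True)
class VerifiedEvent:
    """An outcome reported by the environment, not summarized by the LLM."""
    step: int
    action: str
    target: str
    outcome: Outcome


@dataclass
class WorkingMemory:
    """Private execution state: compact and deterministically updated."""
    completed: set = field(default_factory=set)
    failures: dict = field(default_factory=dict)  # target -> failure count

    def apply(self, event: VerifiedEvent) -> None:
        # Only verified events reach this method; free-form text never does.
        if event.outcome is Outcome.SUCCESS:
            self.completed.add(event.target)
            self.failures.pop(event.target, None)
        else:
            self.failures[event.target] = self.failures.get(event.target, 0) + 1


class Signal(Enum):
    """Closed vocabulary of state-changing public signals (assumed)."""
    REGION_DONE = "region_done"
    RESOURCE_REQUEST = "resource_request"
    BLOCKED = "blocked"


@dataclass
class PublicChannel:
    """Public coordination state: accepts protocol signals only."""
    log: list = field(default_factory=list)

    def broadcast(self, sender: str, signal: Signal, payload: str) -> None:
        # Reject anything outside the protocol, e.g. free-form chat strings.
        if not isinstance(signal, Signal):
            raise TypeError("only protocol signals may be broadcast")
        self.log.append((sender, signal, payload))


memory = WorkingMemory()
channel = PublicChannel()
memory.apply(VerifiedEvent(1, "place_block", "wall_north", Outcome.FAILURE))
memory.apply(VerifiedEvent(2, "place_block", "wall_north", Outcome.SUCCESS))
memory.apply(VerifiedEvent(3, "place_block", "roof", Outcome.FAILURE))
channel.broadcast("agent_a", Signal.REGION_DONE, "wall_north")
```

In this sketch the failed and then repaired `wall_north` placement never reaches the public channel; only the resulting state change does, which is the kind of scoping the architecture is meant to enforce.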

Figure 1: Partitioned information architecture separates private execution memory from public coordination, enforced by a gated escalation mechanism.

Gated Escalation Mechanism

Agent communication is governed by a three-tiered gating policy. When an agent detects a structural anomaly (e.g., missing materials or a blocked dependency), a hierarchical decision process evaluates it through three tiers:

Heuristic Rules: Absorb deterministic cases locally.
Cost-Sensitive Escalation Scoring: Quantifies urgency and collaborative advantage using features such as node criticality, coordination benefit, downstream impact, local recoverability, and coordination history penalty.
Bounded Gray-Zone LLM Adjudicator: Handles ambiguous trade-offs via a strictly structured input/output protocol, avoiding full-context drift.

The gating mechanism is highly asymmetric, defaulting to local resolution unless collaboration is overwhelmingly advantageous. Failed coordination triggers mandatory cooldowns and deterministic local fallback, preventing deadlock.
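A rough Python sketch of such a gate follows. The five features are the ones named above, but the linear form of the score, the weights, both thresholds, the cooldown length, and the anomaly kinds are assumptions made for illustration; the summary does not report the authors' values. The adjudicator is a plain callable standing in for the bounded LLM call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Anomaly:
    kind: str                    # e.g. "missing_material"
    node_criticality: float      # all features assumed normalized to [0, 1]
    coordination_benefit: float
    downstream_impact: float
    local_recoverability: float
    history_penalty: float


# Hypothetical constants; none of these values come from the paper.
LOCALLY_ABSORBED_KINDS = {"tool_broke", "path_blocked"}
LOCAL_BELOW = 0.35       # below this score, resolve locally
ESCALATE_ABOVE = 0.75    # set high so the gate is biased toward local handling
COOLDOWN_STEPS = 20


def escalation_score(a: Anomaly) -> float:
    """Assumed linear score: benefits of coordinating minus reasons to stay local."""
    return (0.3 * a.node_criticality
            + 0.3 * a.coordination_benefit
            + 0.3 * a.downstream_impact
            - 0.3 * a.local_recoverability
            - 0.2 * a.history_penalty)


class EscalationGate:
    def __init__(self, adjudicator: Callable[[Anomaly, float], bool]):
        self.adjudicator = adjudicator   # stand-in for the bounded LLM
        self.cooldown_until = 0

    def decide(self, anomaly: Anomaly, step: int) -> str:
        if step < self.cooldown_until:
            return "local"               # cooldown forces local fallback
        if anomaly.kind in LOCALLY_ABSORBED_KINDS:
            return "local"               # tier 1: heuristic rules
        score = escalation_score(anomaly)
        if score < LOCAL_BELOW:
            return "local"               # tier 2: clearly not worth escalating
        if score > ESCALATE_ABOVE:
            return "escalate"            # tier 2: clearly worth escalating
        # Tier 3: gray zone, decided from structured input only.
        return "escalate" if self.adjudicator(anomaly, score) else "local"

    def report_failed_coordination(self, step: int) -> None:
        self.cooldown_until = step + COOLDOWN_STEPS


gate = EscalationGate(adjudicator=lambda anomaly, score: False)
critical = Anomaly("missing_material", 1.0, 0.9, 0.9, 0.0, 0.0)
minor = Anomaly("missing_material", 0.2, 0.2, 0.1, 0.9, 0.0)
ambiguous = Anomaly("missing_material", 0.8, 0.6, 0.6, 0.3, 0.0)
decisions = [gate.decide(a, step=0) for a in (critical, minor, ambiguous)]
```

With the conservative stub adjudicator, only the `critical` anomaly escalates. Calling `gate.report_failed_coordination(step)` afterwards makes every decision return `"local"` until the cooldown expires, which mirrors the fallback behavior described above.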

Figure 2: Gated Collaborative Escalation Policy, integrating heuristics, scoring, and bounded LLM adjudication for selective coordination.

Experimental Evaluation

Experiments are conducted on MindCraft and VillagerBench platforms, including custom splits that necessitate genuine coordination via resource partitioning and dependency bottlenecks. The framework is benchmarked against FlatComm (free-form communication) and centralized DAG-planning baselines.

Key metrics include:

Task Success Rate (TSR)
Completion Steps (CS)
Local Resolution Rate (LRR)
Unnecessary Escalation Rate (UER)
Effective Communication Rate (ECR)
Recovery Success Rate (RSR)

The framework raises TSR and lowers CS relative to the baseline in all four settings:

Setting	Baseline TSR	Ours TSR	Baseline CS	Ours CS
MindCraft Std	31.1%	35.7%	396	294
MindCraft Custom	21.5%	32.8%	184	134
Villager Std	36.45%	42.76%	103	76
Villager Custom	22.04%	34.56%	145	92

Coordination-quality metrics also improve, indicating greater local autonomy, selectivity, and communication efficacy (e.g., LRR up to 89.7%, UER as low as 11.3%).
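To make the size of the TSR and CS changes easier to compare across settings, the short Python sketch below derives absolute and relative differences from the four reported rows. The numbers are copied from the results table; the function name `summarize` and the output field names are illustrative only.

```python
# (setting, baseline TSR %, ours TSR %, baseline CS, ours CS), from the table
RESULTS = [
    ("MindCraft Std", 31.1, 35.7, 396, 294),
    ("MindCraft Custom", 21.5, 32.8, 184, 134),
    ("Villager Std", 36.45, 42.76, 103, 76),
    ("Villager Custom", 22.04, 34.56, 145, 92),
]


def summarize(rows):
    """Return per-setting TSR gains and CS reductions."""
    summary = []
    for setting, tsr_base, tsr_ours, cs_base, cs_ours in rows:
        summary.append({
            "setting": setting,
            # Absolute TSR gain in percentage points.
            "tsr_gain_pp": round(tsr_ours - tsr_base, 2),
            # TSR gain relative to the baseline TSR, in percent.
            "tsr_gain_rel_pct": round(100 * (tsr_ours - tsr_base) / tsr_base, 1),
            # Reduction in completion steps relative to the baseline, in percent.
            "cs_reduction_pct": round(100 * (cs_base - cs_ours) / cs_base, 1),
        })
    return summary


summary = summarize(RESULTS)
for row in summary:
    print(row)
```

Computed this way, the TSR gain is 4.6 and 6.31 percentage points on the standard splits versus 11.3 and 12.52 on the custom splits, and completion steps fall by roughly 26% to 37%. The larger gains on the custom splits are consistent with those splits being designed to require genuine coordination.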

Ablation and Mechanistic Analysis

Ablation studies indicate that both partitioning and multi-tiered gating are needed for the best coordination results. Removing partitioning increases message volume and lowers TSR. Using only heuristic rules produces suboptimal ECR and TSR; adding cost-sensitive scoring sharply improves selectivity and performance. The full configuration, which adds the bounded LLM adjudicator, yields the highest TSR (32.8% on MindCraft Custom) with minimal communication overhead.

Qualitative case studies reveal that the framework prevents unnecessary interruptions, enables cost-sensitive escalation, efficiently routes failed coordination to deterministic fallback, and avoids deadlocks.

Figure 3: Comparison of baseline (free-form communication) versus gated escalation decision dynamics in resource-constrained scenarios.

Theoretical and Practical Implications

This work provides evidence that the efficiency and robustness of multi-agent LLM systems are not maximized by increasing communication volume but by integrating rigorously governed interaction boundaries. The partitioned architecture fundamentally reshapes coordination dynamics, enabling agents to absorb minor anomalies autonomously and escalate only when a net collaborative advantage is established. The protocolized channel transforms communication into discrete, algorithmic state-change events, reducing noise, hallucination, and global disruption.

According to the authors, the calibrated gating parameters transfer across domains, and bounded LLM adjudication resolves ambiguous cases without incurring significant token overhead. The approach is presented as generalizable, preserving its efficiency under different planning backbones (flat versus DAG).

Future Directions

The selective, cost-sensitive communication regime outlined can be extended to broader domains in embodied AI, including robotics, simulation, and real-time collaborative planning. Leveraging further advances in structured memory, protocol design, and hybrid determinism/semantics could reinforce cross-domain transfer and enable robust scaling as agent teams and task complexity increase. Adaptive parameterization, learning-based gating, and integration with hierarchical/multi-modal memory architectures are promising avenues.

Conclusion

The paper establishes that collaborative efficacy in long-horizon MLLM-driven multi-agent systems depends critically on selective communication, controlled by a partitioned information architecture and multi-tiered gating mechanisms. Rather than reflexively resorting to global coordination, agents are empowered to autonomously adjudicate trade-offs between local recovery and collaborative escalation, leading to superior robustness, efficiency, and task completion in resource-constrained, dependency-intensive environments.
