Wider or Deeper? Adaptive Branching for Smarter LLM Reasoning

This presentation explores how large language models can reason more effectively during inference by dynamically choosing between generating new solutions and refining existing ones. The paper introduces Adaptive Branching Monte Carlo Tree Search (AB-MCTS), a method that uses external feedback to guide whether the model should explore wider or dig deeper at each step. Through experiments on competitive programming and machine learning tasks, the authors demonstrate that this adaptive approach consistently outperforms traditional repeated sampling and standard tree search methods, especially as computational budgets increase.
Script
When a language model faces a hard reasoning problem, should it generate more diverse solutions or refine the ones it already has? This fundamental tradeoff shapes how we scale inference-time computation, and the answer turns out to be: it depends.
That tension exposes a key limitation in how inference-time compute is spent today. Repeated sampling lets a model generate many independent answers, but it ignores feedback that could guide the refinement of those answers.
The paper closes this gap by letting feedback decide when to sample a new answer and when to refine an existing one.
At the heart of their approach is Adaptive Branching Monte Carlo Tree Search, which makes a dynamic choice at every step. Using Bayesian posterior updates, the method decides whether to generate entirely new responses or refine existing ones based on accumulated feedback.
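To make the wider-or-deeper decision concrete, here is a minimal sketch in Python. It rests on assumptions that are not stated in this script: feedback scores lie between 0 and 1, each action keeps a Beta posterior, and the choice is made by Thompson sampling, meaning we draw one sample from each posterior and take the larger. The class name BranchChooser and its methods are hypothetical and chosen only for illustration.

```python
import random


class BranchChooser:
    """Hypothetical helper: Beta posteriors over two actions, 'wider' and 'deeper'."""

    def __init__(self, prior_alpha=1.0, prior_beta=1.0, seed=None):
        self.rng = random.Random(seed)
        # Assumption: both actions start from the same Beta prior.
        self.params = {
            "wider": [prior_alpha, prior_beta],
            "deeper": [prior_alpha, prior_beta],
        }

    def choose(self):
        # Thompson sampling: draw once from each posterior, pick the larger draw.
        draws = {
            action: self.rng.betavariate(alpha, beta)
            for action, (alpha, beta) in self.params.items()
        }
        return max(draws, key=draws.get)

    def update(self, action, score):
        # Assumption: score is external feedback in [0, 1], used as a
        # fractional success count for the action that produced it.
        self.params[action][0] += score
        self.params[action][1] += 1.0 - score

    def posterior_mean(self, action):
        alpha, beta = self.params[action]
        return alpha / (alpha + beta)


# Usage: refinement has been paying off, fresh generation has not.
chooser = BranchChooser(seed=0)
for _ in range(20):
    chooser.update("deeper", 0.9)
    chooser.update("wider", 0.2)
print(chooser.posterior_mean("deeper"), chooser.posterior_mean("wider"))
print(chooser.choose())
```

Because the choice is sampled rather than fixed, an action with weak feedback still gets picked occasionally, which keeps some exploration alive even when one side looks clearly better.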
This creates two distinct node types within the search tree. Generation nodes explore by creating fresh solutions, tapping into the model's broad output distribution, while continuation nodes exploit promising candidates through iterative refinement.
The algorithm follows a classic tree search pattern with an adaptive twist. After selecting a node, it expands using either direct generation or refinement, then propagates scores upward to inform future decisions.
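The select, expand, and propagate loop can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the authors' implementation: solutions are plain numbers, scores lie between 0 and 1, each node holds Beta beliefs for going wider or deeper, and descent always follows the best-scoring child. The names Node, search, toy_generate, toy_refine, and toy_score are hypothetical stand-ins for the model calls and the external feedback.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    solution: Optional[float]  # None marks the root, which holds no answer yet
    score: float = 0.0         # external feedback, assumed to lie in [0, 1]
    children: List["Node"] = field(default_factory=list)
    # Beta(alpha, beta) beliefs about the payoff of each move at this node.
    wider: List[float] = field(default_factory=lambda: [1.0, 1.0])
    deeper: List[float] = field(default_factory=lambda: [1.0, 1.0])


def search(generate, refine, score, budget, seed=0):
    rng = random.Random(seed)
    root = Node(solution=None)
    best = None
    for _ in range(budget):
        # 1. Selection: keep descending while a sampled "deeper" belief wins.
        node, path = root, []
        while node.children and (
            rng.betavariate(*node.deeper) > rng.betavariate(*node.wider)
        ):
            path.append((node, "deeper"))
            # Simplification: follow the best-scoring child greedily.
            node = max(node.children, key=lambda child: child.score)
        path.append((node, "wider"))
        # 2. Expansion: a fresh answer at the root, a refinement elsewhere.
        if node.solution is None:
            candidate = generate(rng)
        else:
            candidate = refine(node.solution, rng)
        child = Node(solution=candidate, score=score(candidate))
        node.children.append(child)
        # 3. Backpropagation: credit every decision on the path with the score.
        for visited, move in path:
            belief = visited.wider if move == "wider" else visited.deeper
            belief[0] += child.score
            belief[1] += 1.0 - child.score
        if best is None or child.score > best.score:
            best = child
    return root, best


# Toy stand-ins: the hidden target is 7.0 and the score falls off with distance.
def toy_generate(rng):
    return rng.uniform(0.0, 10.0)


def toy_refine(solution, rng):
    return solution + rng.uniform(-0.5, 0.5)


def toy_score(solution):
    return max(0.0, 1.0 - abs(solution - 7.0) / 10.0)


root, best = search(toy_generate, toy_refine, toy_score, budget=50)
print(best.solution, best.score)
```

Each iteration adds exactly one node, so the budget maps directly onto the number of model calls. In a real setting the generate and refine callables would wrap LLM requests, and the score would come from something like test-case results.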
The authors tested their approach on challenging real-world benchmarks in competitive programming and machine learning engineering.
Across both GPT-4o and DeepSeek-V3, AB-MCTS consistently outperforms repeated sampling and standard tree search, and the performance gap widens as the computational budget increases. The adaptive decision-making lets the method allocate resources more effectively, generating new solutions when exploration matters and refining candidates when exploitation pays off.
That said, the approach has important constraints. Performance hinges on having reliable external feedback, and tuning the Bayesian priors adds complexity. Future work could develop more robust scoring methods and explore applications in domains beyond direct answer generation.
The adaptive branching framework shows us that the question is not whether to go wider or deeper, but when to do each. Visit EmergentMind.com to explore more cutting-edge research on scaling inference-time computation.