Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search

This presentation explores how large language models can reason better during inference by strategically choosing between generating new solutions and refining existing ones. The authors introduce Adaptive Branching Monte Carlo Tree Search, a method that uses external feedback to guide whether to explore the vast space of possible answers or exploit promising directions. Through experiments on competitive programming and machine learning tasks, they demonstrate that this adaptive approach consistently outperforms traditional repeated sampling and standard tree search methods, especially when computational budgets increase.
Script
When a language model encounters a hard problem, should it generate dozens of different answers or focus on improving its best attempts? This question sits at the heart of making our AI systems reason more effectively.
Building on that tension, researchers have found that simply throwing more compute at inference time helps, but today's repeated sampling methods miss a crucial opportunity. They generate answers independently, never using feedback to guide refinement, leaving substantial reasoning capacity untapped.
The authors introduce a solution that fundamentally rethinks this exploration-exploitation tradeoff.
At each decision point, Adaptive Branching Monte Carlo Tree Search chooses dynamically. The method creates GEN nodes to explore entirely new solution paths or CONT nodes to refine existing answers, using Bayesian posterior updates to formalize which strategy will likely yield the best results given current feedback.
Connecting these exploration and exploitation paths, the algorithm operates in cycles. It selects promising nodes, expands them either by generating fresh content or refining existing solutions, then propagates scores back up the tree to inform future decisions, creating a feedback loop that guides the search intelligently.
These performance curves tell a compelling story. On competitive programming problems from Code Contest, the adaptive branching method consistently solves more problems than repeated sampling or standard tree search, and this advantage grows as the generation budget increases. The pattern holds across both GPT-4o and DeepSeek-V3, demonstrating that strategic refinement beats pure exploration when you have computational resources to invest.
Moving to the empirical evidence, the authors validated their approach across two demanding domains. The experiments reveal that adaptive branching excels precisely where you'd hope it would, extracting more value from additional computation and demonstrating that the vast output space of language models pairs powerfully with feedback-driven refinement.
That said, the approach faces real constraints. The method's effectiveness hinges on having a reliable score evaluator, and tuning the Bayesian framework introduces complexity that practitioners must navigate. The authors acknowledge these challenges and point toward developing more robust evaluation mechanisms as critical next steps.
Stepping back, this work illuminates a fundamental principle for making language models smarter at inference time. By formalizing when to explore and when to refine, the authors provide a blueprint for extracting maximum reasoning performance from computational budgets, suggesting that the future of capable AI systems may lie not just in larger models but in more strategic thinking during deployment.
Adaptive branching reveals that sometimes the best way forward is knowing when to dig deeper rather than cast wider. Visit EmergentMind.com to explore more cutting-edge research transforming how AI systems reason.