MIT researchers have developed EnCompass, a framework that automatically adds search and backtracking capabilities to AI agent programs. In one case study it cut the code needed to add search by about 82 percent, and it improved accuracy by 15 to 40 percent.
When artificial intelligence tools become your daily assistants, you want them to be reliable, efficient, and adaptable. That's exactly what MIT researchers have achieved with EnCompass, a new framework that transforms how AI agents handle complex tasks by automatically managing search and backtracking when large language models make mistakes.

The Problem with Current AI Agents
AI agents are semi-autonomous software systems that call on AI at specific points to solve problems and complete tasks. They're particularly effective when built on large language models (LLMs), which are powerful, efficient, and adaptable. However, LLMs make mistakes, and handling those mistakes falls to the programmer, who must implement backtracking logic and parallel execution strategies by hand.
Consider a software company trying to modernize its codebase by translating it from one programming language to another. You might build a system that uses an LLM to translate the codebase one file at a time, testing each file as you go. But what happens when the LLM makes an error? You need the agent to backtrack and make another attempt, incorporating lessons from previous mistakes. This backtracking logic can require thousands of lines of additional code—often as much effort as implementing the original agent.
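To make the burden concrete, here is a minimal sketch of the simplest hand-written version: retrying a single file and feeding earlier failures back into the next attempt. The function names, the feedback format, and the fake translator and test runner are all hypothetical stand-ins for illustration, not code from the EnCompass project. Note how the retry logic is already tangled into the translation loop, even before any backtracking across files or parallel attempts.

```python
from typing import Callable


def translate_with_retries(
    files: list[str],
    translate: Callable[[str, list[str]], str],
    passes_tests: Callable[[str, str], bool],
    max_attempts: int = 3,
) -> dict[str, str]:
    """Translate files one at a time, retrying a file when its tests fail."""
    results: dict[str, str] = {}
    for name in files:
        feedback: list[str] = []  # lessons from failed attempts on this file
        for attempt in range(1, max_attempts + 1):
            candidate = translate(name, feedback)
            if passes_tests(name, candidate):
                results[name] = candidate
                break
            feedback.append(f"attempt {attempt} failed: {candidate}")
        else:
            # Every attempt failed; a fuller agent would backtrack further here.
            raise RuntimeError(f"no passing translation for {name}")
    return results


# Hypothetical stand-in for an LLM: it only succeeds after seeing one failure.
def fake_translate(name: str, feedback: list[str]) -> str:
    return f"{name}:ok" if feedback else f"{name}:bad"


# Hypothetical stand-in for a test suite.
def fake_tests(name: str, candidate: str) -> bool:
    return candidate.endswith(":ok")


translated = translate_with_retries(["A.java", "B.java"], fake_translate, fake_tests)
print(translated)
```

Each file here needs two attempts: the first fails its test, and the second succeeds because the failure was passed back as feedback.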
How EnCompass Works
EnCompass solves this problem by separating the search strategy from the underlying workflow of an AI agent. When you run your program with EnCompass, it automatically backtracks if LLMs make mistakes and can clone the program runtime to make multiple attempts in parallel, searching for the best solution.
The framework works by allowing programmers to annotate specific operations—such as calls to an LLM—where results may vary. These annotations are called "branchpoints." If you imagine your agent program as generating a single plot line of a story, then adding branchpoints turns it into a choose-your-own-adventure story game, where branchpoints are locations where the plot branches into multiple future plot lines.
You can then specify the strategy that EnCompass uses to navigate that story game, searching for the best possible ending. This can include launching parallel threads of execution or backtracking to a previous branchpoint when you get stuck in a dead end. Users can plug-and-play common search strategies provided by EnCompass out of the box, or define their own custom strategy.
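The idea of separating the workflow from the search strategy can be illustrated with a small toy. This is not EnCompass's API: the names `branch`, `DeadEnd`, `run_with_path`, `no_search`, and `depth_first` are invented for this sketch, and it explores alternatives by re-running the program with recorded choices rather than by cloning the program runtime as the article describes. The workflow is written once, and the two strategies are swapped without touching it.

```python
from typing import Callable, Optional, Sequence


class DeadEnd(Exception):
    """Raised by a workflow when the current path cannot succeed."""


def workflow(branch: Callable[[Sequence[str]], str]) -> str:
    # Each branch(...) call marks a point where the outcome may vary,
    # standing in for an LLM call that could return different answers.
    first = branch(["a", "b"])
    second = branch(["x", "y"])
    if (first, second) != ("b", "x"):  # pretend only this combination passes
        raise DeadEnd()
    return first + second


def run_with_path(program, path: list[int]):
    """Run `program`, forcing the option indices in `path`, then option 0."""
    trace: list[tuple[int, int]] = []  # (chosen index, number of options)

    def branch(options: Sequence[str]) -> str:
        index = path[len(trace)] if len(trace) < len(path) else 0
        trace.append((index, len(options)))
        return options[index]

    try:
        return program(branch), trace
    except DeadEnd:
        return None, trace


def no_search(program) -> Optional[str]:
    """Strategy 1: a single attempt, always taking the first option."""
    result, _ = run_with_path(program, [])
    return result


def depth_first(program) -> Optional[str]:
    """Strategy 2: on a dead end, backtrack to the deepest untried option."""
    stack: list[list[int]] = [[]]
    while stack:
        path = stack.pop()
        result, trace = run_with_path(program, path)
        if result is not None:
            return result
        # Schedule untried siblings at every branchpoint this run discovered.
        for depth in range(len(path), len(trace)):
            prefix = [index for index, _ in trace[:depth]]
            for alt in range(trace[depth][1] - 1, 0, -1):
                stack.append(prefix + [alt])
    return None


print(no_search(workflow))    # the single attempt hits a dead end
print(depth_first(workflow))  # backtracking finds the passing combination
```

In story-game terms, `workflow` is the plot with two branching pages, and each strategy is a different way of reading the book: once straight through, or systematically returning to the last untried branch.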
Real-World Impact and Results
The coding efficiency gains are substantial. When researchers applied EnCompass to an agent that translates a repository of code from Java to Python, implementing search with EnCompass required 348 fewer lines of code—about 82 percent less than implementing it by hand. The framework drastically cut down how much programmers needed to add to their agent programs to add search, helping them experiment with different strategies to find the one that performs the best.
In testing, the researchers found that the best strategy was a two-level beam search algorithm. It raised accuracy by 15 to 40 percent across five different repositories when given a search budget of 16 times the number of LLM calls made by the agent without search.
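For readers unfamiliar with beam search, the sketch below shows a plain single-level version on a toy problem, together with the kind of budget accounting the results refer to. It is a simplified stand-in, not the two-level algorithm the researchers used. The `expand` and `score` functions, the target string, and the count of `expand` calls as a proxy for LLM calls are all assumptions made for illustration.

```python
from typing import Callable, Sequence

TARGET = "1011"  # hypothetical "correct answer" the toy search is looking for


def expand(state: str) -> Sequence[str]:
    """Stand-in for one LLM call that proposes two continuations."""
    return [state + "0", state + "1"]


def score(state: str) -> int:
    """Count the positions where the partial answer matches the target."""
    return sum(a == b for a, b in zip(state, TARGET))


def beam_search(
    start: str,
    expand_fn: Callable[[str], Sequence[str]],
    score_fn: Callable[[str], float],
    beam_width: int,
    steps: int,
) -> tuple[str, int]:
    """Keep the `beam_width` best partial solutions after each step."""
    beam = [start]
    calls = 0  # number of expand calls, used here as the search budget
    for _ in range(steps):
        candidates: list[str] = []
        for state in beam:
            candidates.extend(expand_fn(state))
            calls += 1
        if not candidates:
            break
        candidates.sort(key=score_fn, reverse=True)
        beam = candidates[:beam_width]
    return beam[0], calls


# Width 1 behaves like the agent without search: one path, no alternatives.
baseline_answer, baseline_calls = beam_search("", expand, score, beam_width=1, steps=4)
best_answer, search_calls = beam_search("", expand, score, beam_width=2, steps=4)

print(baseline_answer, baseline_calls)
print(best_answer, search_calls)
print(search_calls / baseline_calls)  # budget as a multiple of the baseline
```

This toy is too easy to show an accuracy gain, since both runs reach the target. It only shows how a wider beam spends more calls, and how that cost can be expressed as a multiple of the calls made without search.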
Beyond Code Translation
While the initial demonstrations focused on code translation, the potential applications are much broader. In the future, EnCompass could enable agents to tackle large-scale tasks including managing massive code libraries, designing and carrying out science experiments, and creating blueprints for rockets and other hardware.
The framework targets agents where a program specifies the steps of the high-level workflow. The current iteration is less applicable to agents that are entirely controlled by an LLM, where the LLM itself decides everything on the fly. In those cases, there's less need for a tool like EnCompass that modifies how a program executes with search and backtracking.
The Research Team and Future Directions
Lead author Zhening Li '25, MEng '25, an MIT electrical engineering and computer science (EECS) PhD student and CSAIL researcher, says, "With EnCompass, we've separated the search strategy from the underlying workflow of an AI agent. Our framework lets programmers easily experiment with different search strategies to find the one that makes the AI agent perform the best."
Co-author Armando Solar-Lezama, an MIT professor of EECS and CSAIL principal investigator, notes, "As LLMs become a more integral part of everyday software, it becomes more important to understand how to efficiently build software that leverages their strengths and works around their limitations. EnCompass is an important step in that direction."
The research team, which includes Caltech Professor Yisong Yue and Asari AI CEO Stephan Zheng, plans to extend EnCompass to more general search frameworks for AI agents. They also plan to test their system on more complex tasks to refine it for real-world uses, including at companies.
Industry Recognition
The framework has already garnered attention from experts in the field. Carnegie Mellon University Professor Yiming Yang, who wasn't involved in the research, notes that "EnCompass arrives at a timely moment, as AI-driven agents and search-based techniques are beginning to reshape workflows in software engineering. By cleanly separating an agent's programming logic from its inference-time search strategy, the framework offers a principled way to explore how structured search can enhance code generation, translation, and analysis."
This work was presented at the Conference on Neural Information Processing Systems (NeurIPS) in December and was supported by Asari AI. As AI agents become increasingly integral to software development and other professional workflows, tools like EnCompass that make them more reliable and efficient will become essential building blocks for the next generation of intelligent systems.
