Developers bloat agent files and waste coding-tool tokens

Brazilian researchers found clutter in agent instruction files across 91 of 100 open-source projects and urged maintainers to move lint rules and task-specific guidance out of the main context file.

Brazilian software engineering researchers found agent configuration problems in 91 of 100 popular open-source projects that use AGENTS.md or CLAUDE.md files. The researchers give maintainers a cleanup list for files that coding agents read before they edit code.

Helio Victor F. dos Santos, Vitor Costa, Joao Eduardo Montandon, Luciana Lourdes Silva, and Marco Tulio Valente posted the preprint "Configuration Smells in AGENTS.md Files: Common Mistakes in Configuring Coding Agents" on June 14. They mined about 532,000 files, then examined 100 popular repositories that committed one of the two agent instruction files.

Developers use AGENTS.md and CLAUDE.md to tell tools such as Claude Code and Codex how a repository works. A good file gives an agent test commands and ownership boundaries. It also spells out project conventions that developers need to state in prose.

A team bloats the file when it forces the model to carry guidance that a script, linter, or human reviewer can handle with more precision. The researchers borrowed the software term code smell for that failure mode. They cataloged six configuration smells and found that many teams mixed several in the same file.

Lint leakage topped the list. The researchers found it in 62% of the files. A team creates lint leakage when it tells the agent to follow rules that tools such as ESLint and Prettier enforce. The agent spends tokens reading a rule that the repository can verify through commands.
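A maintainer could screen a file for this pattern with a short script. The sketch below is a rough heuristic, not the researchers' detection method: the keyword list is a hypothetical starting point, and the function name `find_lint_leakage` is invented for illustration. It flags lines that read like formatter or linter rules so a human can decide whether a tool already enforces them.

```python
import re

# Hypothetical keyword patterns; a real project would tune these to its own linters.
LINT_PATTERNS = [
    r"\bindent(ation)?\b",
    r"\bsemicolons?\b",
    r"\b(single|double) quotes\b",
    r"\btrailing (commas?|whitespace)\b",
    r"\bline length\b",
    r"\bunused (imports?|variables?)\b",
]
LINT_RE = re.compile("|".join(LINT_PATTERNS), re.IGNORECASE)


def find_lint_leakage(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that look like formatter or linter rules."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if LINT_RE.search(line):
            hits.append((number, line.strip()))
    return hits


# Hypothetical file contents used only to exercise the function.
sample = """\
# AGENTS.md
Run `npm test` before opening a pull request.
Always use single quotes and 2-space indentation.
Never leave unused imports in a file.
"""

for number, line in find_lint_leakage(sample):
    print(f"line {number}: {line}")
```

A keyword match is only a prompt for review. A flagged line may still belong in the file if no tool in the repository checks that rule.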

Context bloat followed at 42%. Developers create context bloat when they load long histories or broad process notes into each agent session. The model has to carry weak signals beside the instructions it needs for the current edit.

Skill leakage appeared in 35% of the files. A team creates skill leakage when it keeps instructions for a seldom-used tool or workflow in the main file. The authors argue that maintainers should move those instructions into task-specific skill files or docs that the agent can read when the task requires them.

The researchers also identify blind references and old setup debris. Developers create a blind reference when they send the agent to another document or URL without saying which task requires it. Developers create init fossilization when they leave stale scaffolding instructions in place after the project changes.

The researchers apply the label conflicting instructions to files that give the agent incompatible commands, such as telling it to avoid a tool in one section and requiring that tool in another. Users feel these problems once a coding agent changes a file. An agent that misses the right test command can ship a regression into a pull request.

An agent that follows stale architecture notes can add code in the wrong package. A developer then spends the review finding the instruction bug and the code bug. Teams pay for that clutter each time an agent loads the file.
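The recurring cost can be approximated with simple arithmetic. The sketch below assumes roughly four characters per token, a common rule of thumb and not a figure from either study; real tokenizers vary by model and content. The function names and the sample file are hypothetical.

```python
CHARS_PER_TOKEN = 4  # assumed rule of thumb, not a measured tokenizer ratio


def estimate_tokens(text: str) -> int:
    """Approximate the token count of a context file from its character length."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def recurring_cost(text: str, sessions: int) -> int:
    """Approximate tokens spent re-reading the file across a number of agent sessions."""
    return estimate_tokens(text) * sessions


# Hypothetical 50-line file, used only to show the calculation.
agents_md = "Run `make test` before committing.\n" * 50
print(estimate_tokens(agents_md), "tokens per session (rough)")
print(recurring_cost(agents_md, sessions=200), "tokens over 200 sessions (rough)")
```

The estimate covers only the cost of reading the file. It says nothing about the review time a wrong instruction consumes.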

The Minas Gerais researchers join ETH Zurich researchers in warning that repository context can cost more than it returns. Thibaud Gloaguen, Niels Mündler, Mark Müller, Veselin Raychev, and Martin Vechev posted "Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents?" in February. They found that context files increased inference cost by more than 20%, with mixed performance results.

Gloaguen's team reached the same practical advice: maintainers should write minimal requirements and leave verifiable rules to tools. Team leads can treat AGENTS.md like executable infrastructure. A useful file should answer two questions: which commands should the agent run, and which project constraints require human explanation?

Test commands and permission limits earn space. Package boundaries earn space when agents cross modules. General advice about clean code belongs in engineering docs or team training.

Engineers should remove anything that another system enforces. Put formatting rules in the formatter config. Put static checks in CI. Engineers should point the agent to the command when the agent must run it.

Maintainers need a review habit. A scheduled check can catch stale setup notes, dead URLs, and contradictions before agents absorb them. Pull requests that touch AGENTS.md should receive the same scrutiny as CI config because the file changes how an automated contributor behaves.
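Part of that check can run offline. The following sketch assumes Markdown-style links and a narrow set of phrasings ("always use", "do not use", and similar) around a backticked tool name. Both function names and the regular expressions are hypothetical, and a real file would need broader patterns. It reports relative links that point at missing files and tools that the file both forbids and requires. It skips remote URLs, which would need a network request to verify.

```python
import re
from pathlib import Path

# Markdown links of the form [label](target); bare URLs are ignored in this sketch.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Hypothetical phrasing for prohibitions and requirements around a backticked tool name.
AVOID_RE = re.compile(r"\b(?:never|avoid|do not|don't) use `([^`]+)`", re.IGNORECASE)
REQUIRE_RE = re.compile(r"\b(?:always|must) use `([^`]+)`", re.IGNORECASE)
# Anything starting with a URL scheme such as https: or mailto: is treated as remote.
SCHEME_RE = re.compile(r"^[a-z][a-z0-9+.-]*:", re.IGNORECASE)


def missing_local_links(text: str, repo_root: Path) -> list[str]:
    """Return relative link targets that do not exist under repo_root."""
    missing = []
    for target in LINK_RE.findall(text):
        if SCHEME_RE.match(target) or target.startswith("#"):
            continue  # remote URLs and in-page anchors are out of scope here
        path = target.split("#", 1)[0]  # drop any anchor suffix before checking the file
        if not (repo_root / path).exists():
            missing.append(target)
    return missing


def contradictory_tools(text: str) -> set[str]:
    """Return tool names that the file both forbids and requires."""
    avoided = {name.lower() for name in AVOID_RE.findall(text)}
    required = {name.lower() for name in REQUIRE_RE.findall(text)}
    return avoided & required


# Hypothetical file contents used only to exercise the two checks.
sample = """\
See [release steps](docs/release.md) when cutting a release.
Always use `yarn` to install dependencies.
Do not use `yarn`; the project moved to pnpm.
"""

print(missing_local_links(sample, Path(".")))
print(contradictory_tools(sample))
```

A team could run a script like this in CI on any pull request that touches AGENTS.md, and treat each finding as a question for a reviewer, not an automatic failure.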

A short file can cause harm if it gives the wrong instruction. The study argues for precise configs that tell the agent the facts it cannot infer and the commands it must run.

Security teams should care as agent use spreads. Agents can edit authentication code and deployment scripts. An agent can follow a stale instruction into sensitive code, then leave reviewers to catch the mismatch after it has produced convincing code.

The fix requires discipline rather than a new framework. Maintainers should put durable project facts in AGENTS.md. They should put tool manuals and edge-case workflows somewhere else, then review the file when the repo changes.
