150,000 Lines of Vibe Coded Elixir: The Good, The Bad, and The Ugly


AI & ML Reporter

A practical case study from BoothIQ, a trade show badge scanner company that uses AI to write 100% of its Elixir codebase. The analysis reveals that Elixir's small, terse, and immutable nature makes it exceptionally AI-friendly, but the AI still struggles with architectural decisions, defensive coding patterns, and understanding concurrent test isolation. The productivity gains are substantial, but human oversight remains critical for codebase coherence.

BoothIQ, a universal badge scanner for trade shows, has built a production Elixir codebase of 150,000 lines written entirely by AI. This isn't a theoretical experiment—it's a live system handling real-world event data. The team's experience offers a concrete look at what works, what doesn't, and where the boundaries of current AI coding assistants still lie.


The Good: Elixir's Structure Plays to AI's Strengths

Elixir is Small: It Gets It Right the First Time

Elixir's design philosophy works in AI's favor. The language has a small surface area: few operators, a compact standard library, and limited control flow patterns. Unlike .NET or Java, where functional and object-oriented paradigms compete for space, Elixir has one clear way to solve most problems. This matters because AI agents struggle with architectural decisions. When there's only one idiomatic approach, the model doesn't waste context tokens debating between OOP patterns and functional composition.

This advantage compounds when adding AI to an existing codebase. In languages with decades of paradigm shifts—like JavaScript's evolution from callback hell to promises to async/await—AI models try to match inconsistent legacy code. Elixir's consistency means the AI encounters fewer conflicting patterns, reducing the cognitive load required to produce coherent code.
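That single idiomatic path shows up even in branching. Where other languages offer exceptions, null checks, and early returns, most Elixir control flow reduces to a `case` over tagged tuples — a minimal sketch (module and function names are hypothetical):

```
defmodule BadgeIntake do
  # Most branching in idiomatic Elixir is a `case` over tagged tuples:
  # there is one obvious shape, so the model has little to debate.
  def handle(raw) do
    case parse(raw) do
      {:ok, badge} -> {:saved, badge}
      {:error, reason} -> {:skipped, reason}
    end
  end

  # Toy parser standing in for real badge parsing.
  defp parse(""), do: {:error, :empty}
  defp parse(raw), do: {:ok, %{id: raw}}
end

BadgeIntake.handle("ABC-123")  #=> {:saved, %{id: "ABC-123"}}
BadgeIntake.handle("")         #=> {:skipped, :empty}
```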

Elixir is Terse: Longer Sessions, Fewer Compactions

Small language design and syntactic terseness are related but distinct advantages. Elixir achieves both. While Go is small conceptually, its explicit error handling and verbose syntax burn tokens quickly. Elixir's pattern matching, pipe operator, and lack of braces or semicolons mean more functionality per token.

This directly impacts AI coding sessions. Context windows remain a hard constraint, and Elixir's token efficiency allows longer working sessions before the model needs to summarize and forget earlier context ("compactions"). When building the React Native version of their app, the team hit compactions constantly—JavaScript's token-heavy nature made it difficult to maintain context across iterations. Even in Elixir's LiveView templates, where HTML and HEEx markup add token weight, the experience is less painful than JavaScript-heavy work.
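As a rough illustration of that token efficiency, a hypothetical scan-deduplication step fits in a short pipeline — no braces, semicolons, or explicit error plumbing:

```
defmodule Scans do
  # Keep the unique email of every successful scan. Pattern matching in
  # the filter head and the pipe operator keep the whole transformation
  # compact compared with an imperative loop.
  def emails(scans) do
    scans
    |> Enum.filter(fn %{status: status} -> status == :ok end)
    |> Enum.map(& &1.email)
    |> Enum.uniq()
  end
end

Scans.emails([
  %{status: :ok, email: "a@example.com"},
  %{status: :error, email: nil},
  %{status: :ok, email: "a@example.com"}
])
#=> ["a@example.com"]
```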

Tidewave: Longer Unassisted Runs

Tidewave, an Elixir-specific AI context tool, significantly improves the AI's ability to work autonomously. It gives the agent direct access to running application logs, database queries, Ecto schemas, and package documentation. This reduces hallucinations and enables longer unassisted coding sessions. The AI can validate its own assumptions without human intervention, checking logs and database state to confirm its changes work as expected.

Immutability: Fewer Decisions, Less Code

Elixir's immutable data structures eliminate a major source of AI complexity. In mutable languages, a variable can change after a function call, forcing the AI to track three problems simultaneously: the feature implementation, potential mutation side effects, and the evolving state of the data itself. This leads to defensive code—excessive validation checks and if-statements that wouldn't exist in an immutable context.

In Elixir, data is what it is. It doesn't change. The AI writes cleaner, more direct code because it doesn't need to guard against unexpected mutations. This reduces the decision space and results in fewer lines of code for the same functionality.
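The point is easy to see in a few lines: an "update" returns a new value, and the caller's data cannot change underneath it, so there is nothing to re-validate afterwards (names are hypothetical):

```
attendee = %{name: "Ada", scanned: false}

# "Updating" returns a new map; the original is untouched.
mark_scanned = fn a -> Map.put(a, :scanned, true) end
updated = mark_scanned.(attendee)

attendee.scanned  #=> false
updated.scanned   #=> true
```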

Frontend: Higher Quality, Less Time

For UI work, AI demonstrates clear advantages. High-level prompts like "give the top section more padding" are executed faster and often better than manual coding. The AI excels at modifying or moving large chunks of page structure, implementing mobile-first views, and making design decisions. The quality floor rises significantly—there's no hiding behind "I'm not a designer" anymore.

Git Worktrees: Parallel Development

The team uses three git worktrees to work on multiple features simultaneously: one primary feature, a secondary feature, and a third for quick fixes or experiments. This parallelization maximizes productivity, though context switching becomes the bottleneck beyond three concurrent worktrees.
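A sketch of that setup with stock git commands (branch and directory names are hypothetical) — each worktree is an independent checkout sharing one object store:

```shell
# From the main checkout, add two sibling worktrees, each on its own branch.
git worktree add ../boothiq-feature-b feature-b   # secondary feature
git worktree add -b hotfix ../boothiq-hotfix      # quick fixes/experiments

# List all checkouts backed by this repository.
git worktree list
```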

The Bad: Where AI Still Needs Human Guidance

AI Can't Organize: Architecture Is Still on You

While AI excels at generating lines of code, it performs poorly at architectural decisions. It defaults to creating new files unnecessarily, repeats existing code, and introduces inconsistencies. This leads to the "mess" commonly described in vibe-coded projects as they grow. Human oversight remains essential for structural decisions—where code should live, how modules should interact, and when to refactor versus add new files.

Trained on Imperative: It Writes Defensive Code

Most AI models are trained on imperative languages like Ruby, Python, JavaScript, and C#. Elixir's syntax resembles Ruby, so Claude often writes Ruby-style Elixir: if/then/else chains, defensive nil-checking, and early returns that don't align with functional programming principles.

Elixir encourages assertive pattern matching and letting processes crash—knowing the supervisor will restart them in a clean state. This paradigm is foreign to models trained on defensive imperative code. The situation improves as the codebase grows and the AI encounters more assertive patterns, but it still defaults to defensive style. Regular correction and strict enforcement of idiomatic Elixir are necessary.
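The difference looks roughly like this — the first function is the Ruby-flavored style the model defaults to, the second the assertive pattern it has to be steered toward (names are hypothetical):

```
defmodule Attendees do
  # Defensive, imperative-flavored style the model tends to produce:
  # nil checks and branches for states that "shouldn't" occur.
  def email_defensive(user) do
    if user != nil and user.email != nil do
      {:ok, user.email}
    else
      {:error, :missing_email}
    end
  end

  # Assertive, idiomatic style: match the expected shape and let any
  # other input crash, trusting the supervisor to restart cleanly.
  def email!(%{email: email}) when is_binary(email), do: email
end

Attendees.email!(%{email: "a@example.com"})  #=> "a@example.com"
```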

Git Operations: Keep It Out of Context

Every git operation—checking status, writing commit messages, describing PRs—consumes valuable context window space. These messages go stale quickly; a commit message from 20 minutes ago becomes worthless after three more changes. When babysitting a feature, the team commits manually instead—version control stays cheap and fast and doesn't burn context.

Claude Code's "checkpoints" feature offers internal version control without explicit commits, which is better than having the AI manage git directly. It protects vibe coders without consuming context or creating stale metadata.

The Ugly: Concurrency and Test Isolation Break the Model

OTP and Async: It Chases Ghosts

Claude is essentially useless for debugging OTP, Task, or asynchronous issues. It doesn't understand how processes, the actor model, and GenServers interact. When attempting to introspect the running system, it feeds itself bad data and gets thoroughly lost. It can course-correct when a human points out the error, but on its own, it chases ghosts—following false leads and making incorrect assumptions about system state.

Ecto Sandbox: It Chases Red Herrings

In Elixir tests, each test runs in an isolated database transaction that rolls back at completion. Tests run asynchronously without interfering with each other—no test data persists. Claude doesn't understand this isolation.

When a test fails, Claude queries the development database via Tidewave, finds nothing, and concludes there's a data problem. It has attempted to seed the test database to make tests pass—a fundamentally wrong approach. When two tests insert or query the same schema, Claude doesn't grasp transaction isolation and incorrectly recommends disabling async tests entirely. This behavior is manageable once you recognize the pattern, but it represents a significant gap in the AI's understanding of concurrent systems.
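For reference, the isolation Claude trips over comes from the standard Ecto SQL sandbox wiring, which looks roughly like this (app, repo, and schema names are hypothetical):

```
# config/test.exs — every connection is wrapped in a transaction that
# rolls back when the owning test exits.
config :booth_iq, BoothIQ.Repo, pool: Ecto.Adapters.SQL.Sandbox

# test/test_helper.exs
Ecto.Adapters.SQL.Sandbox.mode(BoothIQ.Repo, :manual)

# In a test module: each async test checks out its own connection, so
# rows it inserts are invisible to other tests and to the dev database.
defmodule BoothIQ.ScanTest do
  use ExUnit.Case, async: true

  setup do
    :ok = Ecto.Adapters.SQL.Sandbox.checkout(BoothIQ.Repo)
  end

  test "inserts are private to this test and rolled back afterwards" do
    {:ok, _scan} = BoothIQ.Repo.insert(%BoothIQ.Scan{badge_id: "b-1"})
    assert BoothIQ.Repo.aggregate(BoothIQ.Scan, :count) == 1
  end
end
```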

Bottom Line: Productivity Gains with Managed Friction

Writing 100% of the code with AI has been a massive productivity win. The friction points are real but manageable—they don't interfere significantly with day-to-day work. The most critical success factor is maintaining a consistent, coherent codebase architecture. Without human architectural oversight, the codebase quickly devolves into spaghetti code.

The team's goal for the year is to automate themselves out of a job: giving the AI control over the entire software development lifecycle, from problem statement to fully tested, working PRs that only require a quick human review before merging and deployment. The current state shows this is ambitious but increasingly plausible, provided the human stays in the loop for structural decisions and understands the AI's limitations around concurrency and system-level debugging.

For teams considering similar approaches, Elixir's design makes it uniquely suitable for AI-assisted development, but success requires accepting that AI is a powerful coding assistant, not an autonomous software engineer. The human must remain the architect, the quality gate, and the debugger for complex system interactions.
