Vibe Coding: When the Code Becomes Invisible
#Security

Backend Reporter

Vibe coding lets users build applications by prompting LLMs and never looking at the generated code. The practice speeds up throwaway projects but introduces serious maintainability, correctness, and security risks. This article breaks down the technique, compares it to agentic programming, and evaluates the trade‑offs for scalability, consistency, and API design.

Published 21 May 2026 – by Martin Fowler
Topic: Generative AI, software engineering, security


The problem: rapid prototyping without a code view

Developers have been using large language models (LLMs) to write code for a few years now, but most still open the diff, skim the changes, and decide whether to merge. Vibe coding pushes the process one step further: a user describes a feature, accepts the LLM’s suggestion, and never inspects the resulting source. The term was coined by Andrej Karpathy in early 2025 and has since spread on social platforms.

At first glance the workflow looks attractive:

  1. Prompt – “Add a dark‑mode toggle to the sidebar.”
  2. Run – The LLM (e.g., Cursor Composer with Sonnet) returns a patch.
  3. Accept – Click Accept All without reading the diff.
  4. Iterate – When an error appears, copy‑paste the error message back into the prompt.

For a weekend hackathon or a personal dashboard, this can shave hours off the development cycle. The user never needs to learn syntax, type a line of code, or manage a build system.

Why it matters: hidden costs at scale

The convenience comes with hidden costs that become apparent when the software moves beyond a single‑user sandbox.

1. Maintainability collapses

When code is never examined, it quickly devolves into a tangled web of autogenerated snippets. Each new request adds another layer of indirection, and the resulting repository often resembles a massive “spaghetti bowl”. Even an LLM struggles to refactor such code because the model has no reliable notion of the project's architectural intent.

Rule of thumb: well‑structured code is easier for both humans and LLMs to evolve. The moment you stop treating the code as a first‑class artifact, you lose that advantage.

2. Correctness becomes probabilistic

LLMs are non‑deterministic: the same prompt can yield different implementations on successive runs. They also hallucinate. A model may call APIs that look plausible but do not exist, or misread a requirement and produce code that runs yet does the wrong thing. Because the user never reviews the diff, these defects go unnoticed and accumulate.
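One cheap guard against hallucinated APIs is to check that every dotted name the generated code relies on actually resolves in the target environment. The following is a minimal sketch, not a complete verifier: the helper name `api_exists` is hypothetical, and it assumes the referenced modules are safe to import, since importing executes module code.

```python
import importlib


def api_exists(dotted_name: str) -> bool:
    """Return True if a name like 'module.attr' resolves in this environment."""
    parts = dotted_name.split(".")
    # Try the longest importable module prefix first, then walk the
    # remaining parts as attributes of that module.
    for split in range(len(parts), 0, -1):
        module_name = ".".join(parts[:split])
        try:
            obj = importlib.import_module(module_name)
        except ImportError:
            continue
        for attr in parts[split:]:
            if not hasattr(obj, attr):
                return False
            obj = getattr(obj, attr)
        return True
    return False


if __name__ == "__main__":
    # json.loads exists; json.parse is a plausible-looking name that does not.
    for name in ("json.loads", "json.parse", "os.path.join"):
        print(name, api_exists(name))
```

A check like this only confirms that a name exists. It says nothing about whether the call is used correctly, so it complements tests rather than replacing them.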

3. Security surface expands dramatically

An LLM‑generated project often includes:

  • Hard‑coded secrets that were copied from example snippets.
  • Over‑permissive CORS or authentication settings.
  • Dependencies with known vulnerabilities, because the model picks a version without checking it against CVE databases.

When the author does not audit the code, these issues remain undiscovered until an external attacker exploits them. The combination of poor quality, hidden secrets, and unchecked dependencies makes vibe‑coded apps a prime target.
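To make the first two items concrete, here is a rough sketch of a scanner that flags hard-coded secrets and wildcard CORS origins in the added lines of a unified diff. The two regular expressions are illustrative assumptions, far narrower than the rule sets of real scanners, and the function name `scan_added_lines` is hypothetical.

```python
import re

# Illustrative patterns only; dedicated scanners ship far larger rule sets.
RULES = {
    "hard-coded secret": re.compile(
        r"""(api[_-]?key|secret|password|token)\w*\s*[:=]\s*["'][^"']{8,}["']""",
        re.IGNORECASE,
    ),
    "wildcard CORS origin": re.compile(
        r"""access-control-allow-origin["']?\s*[:=,]\s*["']\*["']""",
        re.IGNORECASE,
    ),
}


def scan_added_lines(diff_text: str) -> list[tuple[int, str, str]]:
    """Return (diff line number, rule label, line content) for each finding."""
    findings = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # Inspect added lines only, and skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for label, pattern in RULES.items():
            if pattern.search(line):
                findings.append((number, label, line[1:].strip()))
    return findings


if __name__ == "__main__":
    sample = "\n".join([
        "+++ b/app.py",
        '+API_KEY = "sk_live_1234567890"',
        '+headers = {"Access-Control-Allow-Origin": "*"}',
        '+password = os.environ["DB_PASSWORD"]',
    ])
    for finding in scan_added_lines(sample):
        print(finding)
```

The last sample line reads its value from the environment and is deliberately not flagged. Pattern matching of this kind produces both false positives and misses, so it is a first filter rather than an audit.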

Solution approach: disciplined “visible” LLM assistance

Instead of discarding the code entirely, a pragmatic middle ground is to treat the LLM as a code‑assistant that remains under human supervision. The workflow looks like this:

  1. Prompt for a small, self‑contained change.
  2. Review the diff (even a quick glance at the affected files). Use tools like git diff --stat to see the scope.
  3. Run static analysis (e.g., eslint, bandit, or go vet).
  4. Run a dependency and security scanner such as GitHub Dependabot or Snyk.
  5. Merge only after automated checks pass.

This pattern preserves the speed advantage of LLMs while re‑introducing a safety net. It also aligns with established API design practices: each LLM‑generated change should expose a well‑defined contract (e.g., a new REST endpoint or GraphQL mutation) that can be validated with integration tests.
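The "merge only after automated checks pass" step can be sketched as a small gate script. This is an assumption-laden illustration rather than a real CI configuration: `run_checks` and `safe_to_merge` are hypothetical names, and the demo uses stand-in commands so that it runs on any machine with Python.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    output: str


def run_checks(checks: dict[str, list[str]]) -> list[CheckResult]:
    """Run each command; a check passes only if it exits with status 0."""
    results = []
    for name, command in checks.items():
        try:
            proc = subprocess.run(command, capture_output=True, text=True)
            results.append(
                CheckResult(name, proc.returncode == 0, proc.stdout + proc.stderr)
            )
        except FileNotFoundError:
            # A missing tool counts as a failure, never as a silent pass.
            results.append(CheckResult(name, False, f"{command[0]} is not installed"))
    return results


def safe_to_merge(results: list[CheckResult]) -> bool:
    return all(result.passed for result in results)


if __name__ == "__main__":
    # Stand-in commands so the sketch runs anywhere. In a real pipeline these
    # would be tools such as ["git", "diff", "--stat"] or ["bandit", "-r", "."].
    demo = {
        "always passes": [sys.executable, "-c", "print('ok')"],
        "always fails": [sys.executable, "-c", "raise SystemExit(1)"],
    }
    results = run_checks(demo)
    for result in results:
        print(f"{result.name}: {'pass' if result.passed else 'FAIL'}")
    print("merge allowed:", safe_to_merge(results))
```

Treating a missing tool as a failure is a deliberate choice here: a gate that passes when the scanner is absent gives the same false confidence as not reviewing the diff at all.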

Trade‑offs and scalability implications

Aspect | Pure Vibe Coding | Assisted LLM Workflow
Speed | Highest: no review step. | Slightly slower: diff review + CI.
Scalability | Poor: code quality degrades with size; future changes become costly. | Good: automated checks keep repo health; large teams can share the same CI pipeline.
Consistency model | Implicit: relies on the LLM’s stochastic output. | Explicit: tests and type systems enforce the expected behavior.
Security | High risk: secrets and vulnerable deps stay hidden. | Lower risk: scanners catch most issues before merge.
Developer skill requirement | Near zero for simple prototypes. | Still low for basic use, but requires understanding of CI pipelines and test suites.

When a project needs to scale, for example a SaaS product serving thousands of users, the assisted workflow is the minimum viable option. The consistency model shifts from “trust the model” to “trust the test suite”.

API patterns that survive vibe coding

Even a vibe coder can benefit from stable API boundaries. If every LLM request targets a single well‑documented endpoint, the rest of the system can remain insulated from the model’s whims.

  • Command‑style prompts: POST /generate‑patch with a JSON payload describing the desired change.
  • Versioned contracts: Include a schemaVersion field so the LLM knows which data model to target.
  • Repeatable operations: Design prompts, and pin model settings where the tool allows it, so that the same input tends to produce the same output, reducing nondeterminism.
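A minimal sketch of the versioned contract idea: validate the JSON payload before it ever reaches the model. Everything except the `schemaVersion` field name is an assumption made for illustration, including the `description` field, the `PatchRequest` type, and the set of supported versions.

```python
import json
from dataclasses import dataclass

# Hypothetical: the schema versions this service knows how to handle.
SUPPORTED_SCHEMA_VERSIONS = {1}


@dataclass(frozen=True)
class PatchRequest:
    schema_version: int
    description: str


def parse_patch_request(raw: str) -> PatchRequest:
    """Parse and validate a request body; raise ValueError on a bad contract."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")
    version = data.get("schemaVersion")
    if not isinstance(version, int) or version not in SUPPORTED_SCHEMA_VERSIONS:
        raise ValueError(f"unsupported schemaVersion: {version!r}")
    description = data.get("description")
    if not isinstance(description, str) or not description.strip():
        raise ValueError("description must be a non-empty string")
    return PatchRequest(version, description.strip())


if __name__ == "__main__":
    body = '{"schemaVersion": 1, "description": "Add a dark-mode toggle to the sidebar."}'
    print(parse_patch_request(body))
```

Rejecting unknown versions up front keeps a stale client from silently targeting the wrong data model.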

These patterns make it possible to automate regression testing: a CI job can replay the same prompt against the latest codebase and compare the diff against a stored golden file. If the diff diverges, the pipeline flags a potential regression.
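The golden-file comparison might look like the sketch below, which uses Python's standard `difflib`. The function names are hypothetical, and the normalization step (dropping git `index` lines and trailing whitespace) is one assumption about which differences are noise. Because model output is non-deterministic, an exact match may prove too strict in practice.

```python
import difflib


def normalize(diff_text: str) -> list[str]:
    """Drop details that change between runs without changing the patch."""
    return [
        line.rstrip()
        for line in diff_text.splitlines()
        if not line.startswith("index ")  # git blob hashes vary per commit
    ]


def diverges_from_golden(new_diff: str, golden_diff: str) -> list[str]:
    """Return a unified diff of golden vs. replayed patch; empty means no drift."""
    return list(
        difflib.unified_diff(
            normalize(golden_diff),
            normalize(new_diff),
            fromfile="golden",
            tofile="replayed",
            lineterm="",
        )
    )


if __name__ == "__main__":
    golden = "--- a/app.py\n+++ b/app.py\n@@ -1 +1 @@\n-x = 1\n+x = 2\n"
    replayed = golden.replace("+x = 2", "+x = 3")
    drift = diverges_from_golden(replayed, golden)
    print("\n".join(drift) if drift else "no drift")
```

A CI job would fail the build when the returned list is non-empty and leave it to a human to decide whether the golden file or the new patch is wrong.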

When to accept vibe coding, when to reject it

Scenario | Recommended approach
Personal script, one‑off data transformation | Pure vibe coding is acceptable; run the script locally and discard after use.
Prototype for a pitch deck, limited audience (≤5 users) | Assisted workflow: quick review, basic linting, no heavy security scanning.
Internal tool that handles non‑sensitive data but will be maintained for months | Assisted workflow with full test coverage and dependency scanning.
Public‑facing service, handling authentication or PII | Reject vibe coding; use conventional development practices with code reviews.
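The scenarios above can be read as a decision procedure that checks the riskiest property first. The sketch below encodes one such reading. The `Project` fields and the `recommend` function are hypothetical, and treating "public‑facing" and "handles authentication or PII" as independent triggers is an interpretation, not something the scenarios state outright.

```python
from dataclasses import dataclass


@dataclass
class Project:
    public_facing: bool = False
    handles_auth_or_pii: bool = False
    maintained_for_months: bool = False
    shared_with_others: bool = False


def recommend(project: Project) -> str:
    """Map project traits to an approach, most restrictive rule first."""
    if project.public_facing or project.handles_auth_or_pii:
        return "conventional development with code reviews"
    if project.maintained_for_months:
        return "assisted workflow with full test coverage and dependency scanning"
    if project.shared_with_others:
        return "assisted workflow with quick review and basic linting"
    return "pure vibe coding"


if __name__ == "__main__":
    print(recommend(Project()))  # personal one-off script
    print(recommend(Project(shared_with_others=True)))  # pitch-deck prototype
    print(recommend(Project(maintained_for_months=True)))  # internal tool
    print(recommend(Project(public_facing=True, handles_auth_or_pii=True)))
```

Ordering the checks from most to least restrictive means a project that matches several rows always gets the stricter recommendation.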

Looking ahead: how model improvements could shift the balance

If future LLMs achieve deterministic generation (same prompt → identical output) and formal verification of generated code, the risk profile of vibe coding could improve. Today's coding models, OpenAI’s Codex among them, still rely on the developer to validate the result.

Until such guarantees exist, the safest path is to keep the code visible, treat the LLM as an assistant, and let automated tools enforce the invariants that humans no longer wish to check manually.


Martin Fowler is a software architect and author who writes about the practical implications of emerging technologies. Follow his thoughts on martinfowler.com.
