Why “Vibe Coding” Falls Short of Real Engineering

AI & ML Reporter

LLM‑generated snippets can spin up a login page in minutes, but without the systematic decisions that engineers make—defining invariants, handling edge cases, and modeling failure modes—those snippets quickly become fragile. This article breaks down what engineering adds beyond raw code, shows where AI‑generated output typically fails, and offers a practical workflow for integrating LLM assistance without sacrificing system integrity.

What the hype promises

In a handful of seconds, a large language model can produce a Flask app that lets users sign up and log in. The output looks complete: routes, a SQLite schema, and a few HTML templates. For a demo, that is impressive.

What engineering actually delivers

Before the first line of Python is typed, engineers answer a set of questions that shape the whole system:

| Phase | Typical questions |
| --- | --- |
| Problem framing | Who are the users? What are the business goals? |
| Requirements engineering | Which invariants must hold? What are the edge cases? |
| System modelling | How does data flow? What state transitions exist? |
| Architectural design | Where are the boundaries? What are the failure modes? |
| Non‑functional specs | What latency, reliability, and security levels are required? |
| Risk identification | Which dependencies could break the system? |
| Interface design | What contracts do components expose? |
| Planning | How is work broken into deliverable units? |

These steps happen before any code appears. They are the decisions that keep a system coherent when it meets real traffic, regulatory scrutiny, or evolving feature sets.

The engineering work that “vibe coding” skips

| Decision area | Engineer’s answer | LLM output |
| --- | --- | --- |
| Invariants | Email addresses must be unique across the user table. | No check for uniqueness; duplicate rows are possible. |
| Identity rules | A user is identified by a UUID, not by mutable fields. | Uses email as the sole identifier. |
| Constraints | Passwords must meet complexity requirements. | Generates a plain‑text password field without validation. |
| Failure modes | Account lockout after repeated failed logins. | No handling of brute‑force attempts. |
| Coupling & sequencing | Registration must precede email verification. | Sends a welcome email before the account is persisted. |
| State transitions | Account can be active, suspended, or deleted. | Only a single boolean active flag. |
| Interfaces & contracts | API returns standardized error objects. | Returns raw exception messages. |
| Boundaries | System never stores passwords in clear text. | Stores raw passwords in the SQLite DB. |
| Error handling | All database errors are logged and mapped to user‑friendly responses. | Uncaught exceptions crash the Flask process. |

Missing any of these decisions can turn a working demo into a production nightmare.
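To make one of these decisions concrete, here is a minimal sketch of the state-transition row: an explicit account state instead of a single boolean flag. The three states come from the table; the transition rules (suspension is reversible, deletion is terminal) and the names `AccountState`, `ALLOWED_TRANSITIONS`, and `transition` are assumptions made for illustration.

```python
from enum import Enum


class AccountState(Enum):
    ACTIVE = "active"
    SUSPENDED = "suspended"
    DELETED = "deleted"


# Assumed rules (not specified in the article): a suspended account can be
# reactivated, and a deleted account can never change state again.
ALLOWED_TRANSITIONS = {
    AccountState.ACTIVE: {AccountState.SUSPENDED, AccountState.DELETED},
    AccountState.SUSPENDED: {AccountState.ACTIVE, AccountState.DELETED},
    AccountState.DELETED: set(),
}


def transition(current: AccountState, target: AccountState) -> AccountState:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(
            f"cannot move account from {current.value} to {target.value}"
        )
    return target
```

A boolean `active` flag cannot distinguish a suspended account from a deleted one, so it cannot express a rule like "deleted accounts stay deleted". Writing the table down forces that decision to be made explicitly.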

A concrete example: missing uniqueness

Prompt: “Add user accounts to a website so people can log in.”
Suppose the model returns a Flask app whose users table has an email column with no UNIQUE constraint. If two sign-ups use the same address, the database stores both rows. Later, a password‑reset request that filters by email matches both accounts, so one person's reset can change or lock the other's credentials. That is a security problem and, depending on jurisdiction, possibly a privacy-compliance one. The problem surfaces only once real users arrive, not when the code is first run.
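The fix is small once the invariant is stated. The following is a sketch using only the standard library's `sqlite3` module, not the app any model actually generated. The column names, the `register` helper, and the choice to lowercase emails before storing them are all assumptions.

```python
import sqlite3
import uuid


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute(
        """
        CREATE TABLE users (
            id TEXT PRIMARY KEY,          -- UUID, so identity is not the email
            email TEXT NOT NULL UNIQUE,   -- the invariant lives in the schema
            password_hash TEXT NOT NULL
        )
        """
    )


def register(conn: sqlite3.Connection, email: str, password_hash: str) -> bool:
    """Insert a user; return False if the database rejects the row.

    Simplification: any integrity violation is treated as "email taken".
    """
    normalized = email.strip().lower()  # assumed normalization rule
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users (id, email, password_hash) VALUES (?, ?, ?)",
                (str(uuid.uuid4()), normalized, password_hash),
            )
    except sqlite3.IntegrityError:
        return False
    return True


conn = sqlite3.connect(":memory:")
create_schema(conn)
first = register(conn, "ada@example.com", "hash-1")
second = register(conn, "Ada@Example.com", "hash-2")  # same address, new casing
count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(first, second, count)  # True False 1
```

Because the constraint is enforced by the database rather than by a check in application code, two concurrent sign-ups cannot both succeed.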

Why AI‑generated code stalls in production

  1. Hidden assumptions – The model assumes a “happy path” where inputs are well‑formed and unique.
  2. No domain model – It does not build an internal representation of entities, relationships, or business rules.
  3. No risk analysis – Threats such as injection attacks, race conditions, or data loss are never considered.
  4. No contract enforcement – Generated APIs lack versioning, schema validation, or clear error semantics.
  5. No performance budgeting – The code may make a blocking DB call on every request, which becomes a bottleneck under load.

When these gaps are later filled by hand, developers spend time retrofitting checks, rewriting data models, and adding layers of validation—exactly the work that should have been done up front.
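As one illustration of what that retrofitting looks like, the sketch below addresses the contract gap in item 4: exceptions are mapped to a fixed error shape instead of leaking raw messages. The error codes, the response shape, and the `error_response` name are hypothetical. It also assumes that `ValueError` is raised only by validation code with messages that are safe to show to users.

```python
import logging
import sqlite3

logger = logging.getLogger("accounts")


def error_response(exc: Exception) -> tuple[dict, int]:
    """Map an exception to a (JSON-ready body, HTTP status) pair."""
    if isinstance(exc, sqlite3.IntegrityError):
        # Do not echo the driver message; it names tables and columns.
        body = {"error": {"code": "conflict",
                          "message": "The request conflicts with existing data."}}
        return body, 409
    if isinstance(exc, ValueError):
        # Assumption: validation errors carry user-safe messages.
        return {"error": {"code": "invalid_input", "message": str(exc)}}, 400
    # Anything else is logged with its traceback and hidden from the client.
    logger.error("unhandled error", exc_info=exc)
    body = {"error": {"code": "internal_error",
                      "message": "Something went wrong. Please try again."}}
    return body, 500
```

In a Flask app, a function like this would typically be called from a registered error handler, so every route returns the same error shape.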

A practical workflow that combines LLM speed with engineering rigor

  1. Define invariants and constraints in a short checklist before you ask the model to write code. Capture them in a Markdown file or a ticket.
  2. Prompt with explicit requirements – include the invariants, identity rules, and failure‑mode expectations in the prompt. Example: “Generate a Flask registration endpoint that enforces unique email addresses, stores passwords with bcrypt, and returns JSON error objects on validation failure.”
  3. Review the generated schema – verify that database constraints (e.g., UNIQUE, NOT NULL) match the checklist.
  4. Add automated tests that target the edge cases you listed. Tests become the safety net for anything the model missed.
  5. Integrate static analysis (e.g., bandit for security, mypy for type safety) into the CI pipeline.
  6. Iterate – if the model fails to satisfy a requirement, refine the prompt or write the missing piece manually.
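Step 3 can be partly automated. The sketch below compares a SQLite schema against the checklist using SQLite's `PRAGMA table_info`, `PRAGMA index_list`, and `PRAGMA index_info`. The `schema_problems` helper is hypothetical. It only recognizes single-column unique indexes and expects a trusted table name, since the name is interpolated into the PRAGMA statement.

```python
import sqlite3


def schema_problems(conn: sqlite3.Connection, table: str,
                    unique: set[str], not_null: set[str]) -> list[str]:
    """Return human-readable mismatches between the schema and the checklist."""
    problems = []

    # table_info rows: (cid, name, type, notnull, dflt_value, pk)
    columns = {row[1]: row
               for row in conn.execute(f'PRAGMA table_info("{table}")').fetchall()}
    for name in sorted(not_null):
        if name not in columns:
            problems.append(f"{table}.{name} is missing")
        elif not columns[name][3]:
            problems.append(f"{table}.{name} allows NULL")

    # index_list rows start with (seq, name, unique, ...)
    unique_columns = set()
    for row in conn.execute(f'PRAGMA index_list("{table}")').fetchall():
        index_name, is_unique = row[1], row[2]
        # index_info rows: (seqno, cid, name)
        cols = [r[2] for r in
                conn.execute(f'PRAGMA index_info("{index_name}")').fetchall()]
        if is_unique and len(cols) == 1:
            unique_columns.add(cols[0])
    for name in sorted(unique):
        if name not in unique_columns:
            problems.append(f"{table}.{name} has no UNIQUE constraint")

    return problems


# A schema like the one in the uniqueness example: email is unconstrained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id TEXT PRIMARY KEY, email TEXT, "
    "password_hash TEXT NOT NULL)"
)
problems = schema_problems(conn, "users",
                           unique={"email"},
                           not_null={"email", "password_hash"})
print(problems)
# ['users.email allows NULL', 'users.email has no UNIQUE constraint']
```

Run as a test in CI, a check like this turns the checklist from step 1 into something that fails the build when generated code drifts from it.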

By treating the LLM as a code‑completion tool rather than a substitute for design, teams keep the speed advantage while preserving system integrity.
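The "never store passwords in clear text" boundary is a good example of a piece worth reviewing by hand whatever the model produces. The example prompt asks for bcrypt, which is a third-party package. To stay self-contained, this sketch swaps in PBKDF2 from the standard library's `hashlib` instead. The iteration count, the storage format, and the function names are assumptions, not recommendations from the article.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 600_000  # assumed work factor; tune for your hardware


def hash_password(password: str) -> str:
    """Return a salted PBKDF2-HMAC-SHA256 hash in a self-describing format."""
    salt = secrets.token_bytes(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return f"pbkdf2_sha256${ITERATIONS}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _scheme, iterations, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))
```

Storing the iteration count alongside the hash means the work factor can be raised later without invalidating existing passwords.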

Bottom line

LLM‑generated snippets are great for prototypes, learning, or scaffolding. They do not replace the systematic decisions that make a system safe, maintainable, and scalable. The real value of AI in software lies in augmenting engineers—handling boilerplate, suggesting patterns, and surfacing alternatives—while the engineer remains responsible for defining invariants, constraints, and failure handling.


Further reading

  • The Big AI Gains Come From Teams, Not Individuals – explains how collaborative workflows amplify productivity.
  • Agents Cannot Maintain Systems – a look at why autonomous agents still need human oversight.
  • Latency Is Architectural – discusses how performance considerations belong in the design phase, not after code is written.

If you found this analysis useful, consider subscribing to the Phroneses newsletter for more deep dives into AI‑assisted engineering.
