Artificial Intelligence Coding Is Turning Into Vibe Working: What Still Breaks
#AI

Backend Reporter
4 min read

AI-assisted coding has evolved into "vibe working," where agents execute multi-step tasks based on high-level intent. While this shift enables new productivity, it exposes critical reliability gaps that demand robust backend systems.

The transition from AI-assisted coding to autonomous task execution represents a fundamental shift in human-computer interaction. What began as intelligent autocomplete has evolved into "vibe working"—a paradigm where users specify objectives in natural language, then iterate on outputs generated by AI agents. This pattern now extends beyond code generation into document creation, data analysis, and operational workflows.

The Rise of Vibe Working

Vibe working operationalizes the concept of "vibe coding," in which developers prioritize rapid prototyping through AI-generated code before refinement. As detailed in IBM's vibe coding overview, this approach values experimentation over initial optimization. Microsoft has expanded this pattern into general productivity with Agent Mode in Microsoft 365 Copilot, enabling users to generate complex documents and analyses through iterative prompting.

Three technical advancements enable this shift:

  1. Improved reasoning capabilities in large language models
  2. Extended context windows (128K+ tokens)
  3. Agent frameworks supporting multi-step planning

The Critical Trade-off: Effort vs. Systemic Risk

Vibe working trades manual creation time for operational risk management. When humans produce work, errors remain localized. When agents handle multi-step workflows—pulling data, updating documents, triggering actions—failures become systemic:

  • Provenance ambiguity: Difficulty tracing decision paths
  • Error propagation: Small mistakes cascade across connected systems
  • Expanded blast radius: Single failures impact downstream processes

The NIST AI Risk Management Framework provides essential scaffolding here, emphasizing measurable controls for accountability across the AI lifecycle. At the application layer, the OWASP Top 10 for LLM Applications details specific vulnerabilities like prompt injection and insecure plugin execution that become critical in agentic systems.

When Agents Succeed (and Fail)

Agent effectiveness correlates with constraint clarity:

High-success scenarios

  • Generating UI boilerplate
  • Drafting specifications from outlines
  • Summarizing known documents
  • Creating initial dashboard views

High-risk scenarios

  • Payroll processing
  • Account deletions
  • Production configuration changes
  • Permission modifications

The threshold emerges with user scale: beyond roughly 100-1,000 active users, systems require the following safeguards (see the sketch after this list):

  • Action audit trails
  • Rate limiting
  • Idempotent retries
  • State reproducibility
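
As a concrete illustration, here is a minimal sketch of idempotent execution paired with an action audit trail. The names (`executeOnce`, `auditLog`) and the in-memory stores are hypothetical stand-ins for durable storage, not a specific product's API:

```typescript
// Minimal sketch: executing an agent action exactly once, with an audit record.
// `auditLog` and `completedActions` are in-memory stand-ins for durable storage.

type ActionRecord = {
  idempotencyKey: string;
  action: string;
  actor: string; // which agent or user initiated the action
  timestamp: string;
  outcome: "applied" | "skipped-duplicate";
};

const auditLog: ActionRecord[] = [];
const completedActions = new Set<string>();

function executeOnce(idempotencyKey: string, action: string, actor: string, run: () => void): void {
  const timestamp = new Date().toISOString();
  // A retry with the same key is a no-op, so retry cascades cannot double-apply.
  if (completedActions.has(idempotencyKey)) {
    auditLog.push({ idempotencyKey, action, actor, timestamp, outcome: "skipped-duplicate" });
    return;
  }
  run();
  completedActions.add(idempotencyKey);
  auditLog.push({ idempotencyKey, action, actor, timestamp, outcome: "applied" });
}

// Usage: the agent retries after a timeout, but the refund is applied only once.
executeOnce("refund-order-1042", "issue-refund", "support-agent", () => console.log("refund applied"));
executeOnce("refund-order-1042", "issue-refund", "support-agent", () => console.log("refund applied"));
console.log(auditLog.map((r) => r.outcome)); // ["applied", "skipped-duplicate"]
```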

Architectural Shifts for Agentic Systems

Traditional AI coding focused on model training and inference pipelines. Vibe working demands infrastructure for agent supervision:

| Component | Purpose | Failure consequence |
| --- | --- | --- |
| Identity model | Prevents privilege escalation | Agents act with wrong permissions |
| State management | Enables resumable workflows | Lost context between steps |
| Artifact storage | Manages files, logs, and outputs | Data loss or inconsistency |
| Background execution | Handles long-running tasks | Timeouts and partial failures |
| Real-time monitoring | Provides execution visibility | Unreproducible agent behavior |
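
To make the state-management row concrete, here is a minimal sketch of resumable workflow state, assuming step results are checkpointed outside the model. `loadState` and `saveState` are hypothetical placeholders for a database:

```typescript
// Minimal sketch of resumable workflow state: each step's result is persisted
// outside the model, so a crashed run resumes from the last completed step.
// `loadState`/`saveState` are stand-ins for a real database.

type WorkflowState = {
  workflowId: string;
  completedSteps: Record<string, unknown>; // step name -> persisted result
};

const store = new Map<string, WorkflowState>();
const loadState = (id: string): WorkflowState =>
  store.get(id) ?? { workflowId: id, completedSteps: {} };
const saveState = (state: WorkflowState) => store.set(state.workflowId, state);

function runStep<T>(state: WorkflowState, step: string, work: () => T): T {
  if (step in state.completedSteps) {
    return state.completedSteps[step] as T; // already done: reuse, don't redo
  }
  const result = work();
  state.completedSteps[step] = result;
  saveState(state); // checkpoint after every step
  return result;
}

// Usage: if the process dies after "fetch-data", a restart skips straight to "write-report".
const state = loadState("wf-7");
const data = runStep(state, "fetch-data", () => ({ rows: 120 }));
runStep(state, "write-report", () => `report over ${data.rows} rows`);
```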

The Copilot Dichotomy

Confusion persists between two similarly named tools.

The distinction isn't about capability but environment: GitHub Copilot integrates with development workflows, while Microsoft Copilot automates business artifact creation.

Beyond Context: The Memory Imperative

Many prototypes misuse context windows as makeshift memory—a fatal error in production systems. Context represents the current input; memory encompasses:

  • Historical interactions
  • User-specific data
  • Execution state
  • Audit trails

For example, a support agent requires (see the sketch after this list):

  • Conversation history
  • Attachment storage
  • Escalation records
  • User consent tracking
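
Here is a minimal sketch of that separation, with durable memory stored outside the model and the context window assembled as a view over it. The shapes and names (`SupportMemory`, `buildContext`) are illustrative, not a specific framework's API:

```typescript
// Minimal sketch separating durable memory from the per-request context window.
// The shapes and names (`SupportMemory`, `buildContext`) are illustrative.

type SupportMemory = {
  userId: string;
  conversationHistory: { role: "user" | "agent"; text: string }[];
  attachments: string[]; // references to stored files, not inline content
  escalations: string[]; // ticket IDs escalated to humans
  consentGrantedAt?: string; // user consent tracking
};

const memoryStore = new Map<string, SupportMemory>();

// The context is assembled per request as a *view* over durable memory,
// so it can be truncated to fit the window without losing history.
function buildContext(userId: string, maxTurns: number): string {
  const memory = memoryStore.get(userId);
  if (!memory) return "";
  return memory.conversationHistory
    .slice(-maxTurns) // only the most recent turns enter the window
    .map((turn) => `${turn.role}: ${turn.text}`)
    .join("\n");
}

// Usage: memory keeps everything; the context holds only the last two turns.
memoryStore.set("u-9", {
  userId: "u-9",
  conversationHistory: [
    { role: "user", text: "My export failed." },
    { role: "agent", text: "Which project?" },
    { role: "user", text: "Project Alpha." },
  ],
  attachments: [],
  escalations: [],
});
console.log(buildContext("u-9", 2));
```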

Production Pathway

Transitioning from prototype to production requires incremental hardening:

  1. Boundary definition: Explicitly separate drafting, suggestion, and execution capabilities
  2. Identity foundation: Implement authentication before user growth
  3. State persistence: Store agent decisions as structured data outside LLMs
  4. Background processing: Convert long tasks into monitored jobs with retries (sketched below)
  5. Scale planning: Design for 10x user growth before launch
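
A minimal sketch of step 4, converting a long agent task into a monitored background job with bounded retries. A production version would use a durable queue and status store; this in-process version only illustrates the shape:

```typescript
// Minimal sketch of converting a long agent task into a monitored background
// job with bounded retries. A real system would use a durable queue and
// status store; this in-process version only illustrates the shape.

type JobStatus = "queued" | "running" | "succeeded" | "failed";

async function runJob(
  jobId: string,
  task: () => Promise<void>,
  maxAttempts = 3,
  onStatus: (id: string, status: JobStatus, attempt: number) => void = console.log
): Promise<void> {
  onStatus(jobId, "queued", 0);
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    onStatus(jobId, "running", attempt);
    try {
      await task();
      onStatus(jobId, "succeeded", attempt);
      return;
    } catch {
      // Exponential backoff between attempts bounds retry cascades.
      await new Promise((resolve) => setTimeout(resolve, 2 ** attempt * 100));
    }
  }
  onStatus(jobId, "failed", maxAttempts); // surface the failure instead of hanging
}

// Usage: a transient failure succeeds on the second attempt.
let calls = 0;
runJob("job-42", async () => {
  if (++calls < 2) throw new Error("transient upstream error");
});
```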

Resources like SashiDo's Getting Started Guide provide implementation patterns for these transitions.

Cost Predictability

Agentic workflows introduce unique cost risks:

  • Retry cascades
  • Storage bloat from generated artifacts
  • Integration call chains

Quota enforcement and infrastructure visibility prevent bill shocks. Transparent pricing models like SashiDo's pricing with explicit resource limits help teams scale predictably.
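
A minimal sketch of per-tenant quota enforcement follows; the limits and in-memory counters are illustrative placeholders for persisted usage accounting, not any provider's billing API:

```typescript
// Minimal sketch of per-tenant quota enforcement. The limits and in-memory
// counters are illustrative placeholders for persisted usage accounting.

type Quota = { llmCallsPerDay: number; storageBytes: number };

const quotas = new Map<string, Quota>([
  ["tenant-a", { llmCallsPerDay: 1000, storageBytes: 5_000_000_000 }],
]);
const usage = new Map<string, { llmCalls: number; storageBytes: number }>();

function chargeLlmCall(tenantId: string): boolean {
  const quota = quotas.get(tenantId);
  const used = usage.get(tenantId) ?? { llmCalls: 0, storageBytes: 0 };
  if (!quota || used.llmCalls >= quota.llmCallsPerDay) {
    return false; // reject before spending money, not after the invoice arrives
  }
  used.llmCalls += 1;
  usage.set(tenantId, used);
  return true;
}

// Usage: an agent's retry loop must check the quota before every model call.
if (!chargeLlmCall("tenant-a")) {
  throw new Error("Quota exceeded: halting agent instead of cascading retries");
}
```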

Reliability Patterns > Prompt Engineering

System design supersedes prompt optimization for production readiness:

  • Action isolation: Separate "draft" and "send" operations (sketched after this list)
  • Intent logging: Record agent objectives alongside outcomes
  • File governance: Treat outputs as versioned artifacts
  • Retry design: Implement idempotent retry mechanisms
  • Realtime updates: Provide progress streams during execution
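
A minimal sketch combining the first two patterns, action isolation and intent logging: the agent can only create drafts, and a separate, explicitly approved call performs the side effect. All names (`draftEmail`, `sendEmail`, `intentLog`) are illustrative:

```typescript
// Minimal sketch of action isolation with intent logging: the agent can only
// create drafts, and a separate, explicitly approved call performs the send.
// All names (`draftEmail`, `sendEmail`, `intentLog`) are illustrative.

type DraftEmail = { id: string; to: string; body: string };

const intentLog: string[] = [];
const drafts = new Map<string, DraftEmail>();

function draftEmail(id: string, to: string, body: string, intent: string): DraftEmail {
  intentLog.push(`draft ${id}: ${intent}`); // record *why*, not just what
  const draft = { id, to, body };
  drafts.set(id, draft);
  return draft; // drafting has no external side effects
}

function sendEmail(id: string, approvedBy: string): void {
  const draft = drafts.get(id);
  if (!draft) throw new Error(`no draft ${id}`);
  intentLog.push(`send ${id}: approved by ${approvedBy}`);
  console.log(`sending to ${draft.to}`); // the only place a side effect happens
}

// Usage: the agent drafts freely; sending requires a human or policy gate.
draftEmail("e-1", "user@example.com", "Your refund is on the way.", "respond to ticket 88");
sendEmail("e-1", "ops-reviewer");
```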

These patterns prevent systemic failures when agents interact with real-world systems.

Tool Selection Principles

For indie developers and startups, prioritize:

  • Integrated infrastructure: Avoid stitching together authentication, storage, and compute services
  • Migration paths: Ensure data portability
  • Cost transparency: Predict expenses at scale
  • Recovery capabilities: Built-in backup/restore features

Comparisons like SashiDo vs Supabase help evaluate tradeoffs between specialized and general-purpose backends.

Language Considerations

While Python dominates AI research, production systems often combine:

  • Python for model interaction
  • JavaScript/TypeScript for application logic
  • Infrastructure-as-code for orchestration

The critical factor is API design cleanliness, not language dogma.
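
A minimal sketch of such a boundary: application logic depends on a narrow, typed interface, so the model layer (a Python inference service, a hosted API, or a local stub) can be swapped without touching callers. `ModelClient` and `stubModel` are illustrative, not a real library's API:

```typescript
// Minimal sketch of a clean API boundary: application logic depends on a
// narrow, typed interface, so the model layer (a Python inference service,
// a hosted API, or a local stub) can swap without touching callers.
// `ModelClient` and `stubModel` are illustrative, not a real library API.

interface ModelClient {
  complete(prompt: string): Promise<string>;
}

// A stub; a real implementation might call a Python service over HTTP.
const stubModel: ModelClient = {
  complete: async (prompt) => `summary of: ${prompt.slice(0, 40)}...`,
};

async function summarizeTicket(model: ModelClient, ticketText: string): Promise<string> {
  // Application logic sees only the interface, never provider-specific details.
  return model.complete(`Summarize this support ticket:\n${ticketText}`);
}

summarizeTicket(stubModel, "Customer reports login failures since Tuesday.").then(console.log);
```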

Conclusion

Vibe working represents a permanent shift in human-AI collaboration. Its success depends not on prompt engineering brilliance but on unglamorous foundations: auditable state management, strict permission boundaries, and predictable infrastructure. Builders who implement these systemic safeguards will ship agentic applications; others will remain trapped in prototype purgatory.

Further Reading:

  • Heroku: Heroku's platform exemplifies managed infrastructure that can support AI agent workflows
