Software architecture is entering an 'oil and water' moment where deterministic systems must coexist with non-deterministic AI behavior, and traditional guardrails no longer apply automatically.
The Oil and Water Moment in AI Architecture
The Fundamental Tension
Have you ever tried mixing oil and water? You can shake the jar hard enough to make it appear unified. For a moment, it blends. Then it separates.
This chemistry captures where software architecture stands today.
For decades, our systems were fundamentally deterministic. Given the same input, the system produced the same output. Even during the shift to cloud native architectures, that assumption remained intact. Microservices, containers, distributed systems and elastic infrastructure increased complexity but did not alter the procedural nature of execution.
The old and the new could coexist because both followed deterministic rules.
Artificial intelligence alters that foundation. We are now integrating probabilistic, non-deterministic systems into deterministic environments. These systems generate responses, infer intent, select tools dynamically and adapt based on contextual signals. Their outputs may vary even when inputs remain similar. The underlying execution path may not be explicitly defined in code.
The tension is not about model performance. It is about architectural assumptions.
The central argument of this article is this: AI architecture is not about tools. It is about intent under non-determinism.
In my recent book, Dear Software and AI Architect, I examine this shift in depth. This article introduces one core idea from that work: the architectural implications of mixing deterministic systems with probabilistic intelligence.
Where Traditional Guardrails Fall Short
Guardrails in deterministic systems are explicit and static. For instance:
- Input validation enforces structure
- Access controls regulate permissions
- Rate limits constrain usage
- API contracts define boundaries
- Workflow engines encode business rules
These controls assume predictable execution paths and predefined branches.
AI-enabled systems, particularly those using agents and tool orchestration, operate differently. A single user request may trigger retrieval across multiple knowledge sources. The model may synthesize context from structured and unstructured inputs. An agent may select one of several tools. Tool outputs may be evaluated and used to generate further actions. The system evolves through composition rather than linear progression.
Even when each component is individually constrained, the composition of components may not be fully anticipated. An agent might chain tools in a sequence that no engineer explicitly designed. Context retrieved for one task may influence a subsequent decision in unintended ways. Policy constraints applied at the component level may not capture risks that emerge at the system level.
This does not represent failure in the traditional sense. The system behaves within probabilistic boundaries. However, it may operate outside architectural expectations. That is what is meant by the oil and water moment.
What Is Changing Structurally
The current discourse around artificial intelligence often suggests that everything is changing. That is inaccurate. Precision matters. Certain structural dimensions are evolving rapidly, while others remain foundational.
One major change is the expansion of the decision surface. In deterministic systems, decision trees are encoded directly in logic. Engineers can trace branches through code paths and reason about execution states. In AI systems, decision boundaries are distributed across model parameters, prompt structures, retrieval scope and policy constraints. The system behaviour is influenced by statistical inference rather than explicit conditionals. This makes it more difficult to enumerate all possible execution states in advance.
Architects must therefore design for new classes of risk. Prompt injection attacks attempt to override model instructions. Context integrity must be protected to prevent contamination across retrieval sessions. Tool misuse must be constrained to avoid unintended side effects. Failure fallback strategies must be defined when probabilistic outputs exceed acceptable variance. Evaluation loops must be incorporated to assess whether generated responses align with architectural intent.
Observability also changes in character. Traditional observability focuses on latency, error rates, throughput and resource utilisation. These metrics remain necessary but insufficient. AI-native systems require behavioural observability. Engineers and architects must be able to trace prompt lineage, identify context sources, analyse output variance, detect drift and calibrate confidence levels. The system must be observable not only in terms of availability, but also in terms of decision quality.
Governance shifts from being primarily a design time activity to a continuous runtime discipline. Policies must adapt to evolving model behaviour. Output classification layers must evaluate generated content before external exposure. Escalation triggers must activate when risk thresholds are crossed. Model version management must allow rollback without destabilising dependent systems. Prompt updates may be required in response to emerging edge cases. Governance becomes adaptive rather than static.
What Remains Foundational
While structural complexity increases, foundational principles endure. Systems thinking becomes more important, not less. Dependencies across data pipelines, model inference layers, agent orchestration frameworks and legacy integration points create feedback loops that amplify small design flaws. Architects who reason in isolated components will struggle. Those who understand system level interactions will remain effective.
Technical communication gains strategic significance. Probabilistic systems introduce ambiguity. Stakeholders must understand uncertainty budgets, acceptable risk envelopes and behavioural variance. Architects must translate model behaviour into business implications and compliance considerations. In my book, I describe this capability as Architectural Transcoding, the disciplined practice of converting technical ambiguity into organisational clarity.
Learning discipline also remains essential. Tools evolve rapidly. Model capabilities improve. Frameworks are rebranded and extended. However, foundational competencies such as pattern recognition, contextual reasoning, ethical evaluation and trade-off analysis retain long-term value. Architects who anchor themselves in principles adapt more effectively than those who pursue tooling alone.
The Architect's V-Impact Canvas
To operate under non-determinism, architectural intent must be explicit and structured. The AI Architect V-Impact Canvas provides a framework for aligning intelligent systems with purpose and accountability. It consists of three interdependent layers:
Architectural Intent
The Architectural Intent layer defines non-negotiable principles, acceptable behavioural variance and ethical constraints. It articulates why the system exists and what boundaries must never be crossed. In AI systems, intent must be translated into prompt architecture, agent objective framing, policy definitions and evaluation criteria. Agents optimise toward defined objectives. If intent is vague, optimisation may produce undesirable outcomes. Intent functions as a stabilising centre.
Design Governance
The Design Governance layer addresses autonomy trade-offs. The more autonomy granted to an agent, the more precisely architectural intent must be specified. Agentic systems often require broader contextual access, cross-system visibility and persistent memory. These features enhance performance but increase privacy exposure and compliance complexity. Architects must design context scoping boundaries, memory segmentation strategies, data minimisation mechanisms, runtime output filters and escalation pathways. Governance becomes an architectural capability rather than a procedural checklist.
Impact and Value
The Impact and Value layer ensures that intelligence produces measurable outcomes. AI experimentation can generate impressive demonstrations, but sustainable architecture requires economic discipline. Architects must evaluate measurable improvements in decision quality, operational stability implications, cost per inference relative to business gain and long-term trust impact. AI economics are dynamic. Inference cost, retrieval infrastructure, model lifecycle management and compliance oversight alter financial profiles. Impact must be quantified rather than assumed.
Token and Context Economics
In the era when high-level programming languages emerged, the dominant economic lever was hardware. Servers, infrastructure procurement and supply chains shaped architectural decisions. Later, cloud computing shifted that burden to operational expenditure, as cloud providers took responsibility for infrastructure and scaling.
In the AI era, however, a new architectural lever has emerged that directly shapes impact and value: token and context engineering.
These concepts may appear simple at first glance, yet today they sit at the centre of how LLMs, SLMs and other frontier models operate. For example, models such as GPT-4 operate within a finite context window: a maximum amount of text they can process at one time. Modern GPT-4 Turbo variants support roughly 128,000 tokens of context. A useful approximation is that one token is about four characters, or roughly three-quarters of a word, so 128,000 tokens correspond to roughly 90,000-100,000 words of total input and output combined.
This capacity is shared across the entire interaction: system prompts, user questions, retrieved documents, tool outputs and the model's response must all fit within that limit.
Consider a simple architectural example. Suppose a RAG system sends the following to the model:
- System instructions: ~1,000 tokens
- Chat history: ~2,000 tokens
- Retrieved documents: 6 chunks × 1,500 tokens = 9,000 tokens
- User question: ~200 tokens
This already consumes roughly 12,200 tokens of context before the model even begins generating an answer. If the response itself requires another 1,000 tokens, the total context becomes 13,200 tokens for a single query.
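The arithmetic above can be laid out directly. The figures are the article's illustrative estimates, and the four-characters-per-token rule is only an approximation:

```python
CONTEXT_WINDOW = 128_000  # approximate GPT-4 Turbo context window, in tokens

# Illustrative token budget for a single RAG query (estimates from the text)
budget = {
    "system_instructions": 1_000,
    "chat_history": 2_000,
    "retrieved_documents": 6 * 1_500,  # 6 chunks x 1,500 tokens each
    "user_question": 200,
}

prompt_tokens = sum(budget.values())  # tokens consumed before generation
total_tokens = prompt_tokens + 1_000  # plus a ~1,000-token response

print(prompt_tokens)  # 12200
print(total_tokens)   # 13200
print(f"{total_tokens / CONTEXT_WINDOW:.1%} of the window used")  # 10.3%
```

A single query uses only about a tenth of the window here, but multi-turn history, larger chunks, or agent tool outputs consume the remainder quickly.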
Now imagine scaling this across thousands of requests per hour. Token usage directly translates into cost, latency and model behaviour. If architects overload the context with irrelevant retrieval chunks or poorly structured prompts, then the important information may be diluted and the model may respond inaccurately despite technically having the "right data".
The implication is that AI architecture introduces a new optimisation discipline. Just as cloud architects learned to manage compute utilisation and storage efficiency, AI architects must now design systems that carefully shape context: retrieving fewer but more relevant documents, summarising prior interactions, and structuring prompts to maximise signal within limited token budgets.
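One way to make that discipline concrete is a budget-aware retrieval step: rank candidate chunks by relevance and keep only those that fit the token budget. The keyword-overlap scoring function below is a placeholder assumption; a real system would use embedding similarity:

```python
def select_chunks(chunks, budget_tokens, score):
    """Greedily pack the most relevant chunks into a token budget."""
    selected, used = [], 0
    for text, tokens in sorted(chunks, key=lambda c: score(c[0]), reverse=True):
        if used + tokens <= budget_tokens:
            selected.append(text)
            used += tokens
    return selected, used

# Toy relevance score: keyword overlap with the query (an assumption, not an API)
def keyword_score(text, query="refund policy"):
    return sum(word in text.lower() for word in query.split())

chunks = [
    ("Our refund policy allows returns within 30 days.", 1_500),
    ("Company history and founding story.", 1_500),
    ("Refund requests require an order number.", 1_500),
    ("Office locations and opening hours.", 1_500),
]

# Allow only two chunks' worth of tokens instead of all four
selected, used = select_chunks(chunks, budget_tokens=3_000, score=keyword_score)
```

Compared with sending all four chunks, this halves the retrieval cost for this query while keeping the two passages that actually answer it, which is the "fewer but more relevant documents" trade-off in miniature.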
Now imagine designing all of this for agents, where every tool call and intermediate result adds its own context overhead. That is another level of complexity entirely.
The principles of architecture remain unchanged, but new abstractions introduce new impact and value levers. Token and context engineering are therefore not minor implementation details; they are emerging as central mechanisms through which architects influence reliability, cost efficiency and business outcomes in AI-driven systems.
The Architect's V-Impact Canvas aligns intent, governance and measurable value into a coherent discipline. The complete framework and applied case discussions are explored in greater depth in my book. This article introduces the conceptual foundation.
The Tool Obsession Risk
The industry is moving quickly. New models, orchestration frameworks and evaluation toolkits appear regularly. Teams may feel pressure to adopt each latest release to remain relevant. However, tool adoption without architectural maturity increases fragility. Frequent model switching can introduce behavioural inconsistency. Rapid framework integration can create governance gaps. Feature expansion without intent clarity can increase technical debt.
Tools accelerate capability. They do not provide stability. Stability emerges from explicit intent and disciplined governance.
The Legacy Integration Question
Most enterprises operate within legacy environments. Deterministic workflows, static compliance models and procedural approval chains were not designed for adaptive intelligence. When AI capabilities are layered on top of these systems without re-evaluating underlying assumptions, friction emerges. AI recommendations may exceed what legacy systems can execute. Agents may lack access to full historical context. Compliance controls may conflict with adaptive behaviour.
Architects and technology leaders must decide whether to modernise architectural intent to accommodate intelligence or constrain AI capabilities to fit legacy boundaries. This decision is strategic rather than technical, and it brings us back to the earlier question in this particular context: are we trying to mix oil and water here? If the answer is yes, step back.
From Insight to Impact
Mixing deterministic systems with probabilistic intelligence is not optional. It is already underway. Without a stabilising mechanism, separation, and the operational friction that comes with it, is inevitable. Intent serves as that mechanism. Clear principles, explicit objectives, adaptive governance and measurable impact anchor non-deterministic systems within accountable boundaries.
Artificial intelligence does not remove architectural responsibility. It intensifies it.
Architects are entering a new design reality where deterministic software systems must coexist with non-deterministic intelligence. Understanding how to anchor this shift in intent, governance and architectural fundamentals will define the next generation of AI architects.
As I wrote: "Architects must lead with clarity, build architecture with context and thrive in an AI-driven engineering world". The tools will evolve. The models will improve. Intent is what will hold the system together.
What do you think?
About the Author
Shweta Vohra is a Lead Architect, author of Decoding Platform Engineering Patterns and Dear Software & AI Architect, and an inventor committed to bringing clarity where there's noise and wisdom where there's hype. With two decades in technology and partnerships across 50+ global customers, she blends architecture, platform innovation, and human-centred design to drive meaningful change. Her expertise spans cloud-native architecture, platform strategy, and architectural excellence. She advances industry thought leadership through InfoQ podcasts, reports, and global forums, exploring the future of cloud, DevOps, architectural evolution, and responsible AI.