API Contract Drift Is a Distributed Systems Problem, and Nobody Builds for the Dev-Time Feedback Loop
#Backend

Backend Reporter
7 min read

A former Fynd infrastructure engineer is shipping two small open source tools, api-watch and proto-lock, that target the exact failure modes that bite teams running gRPC and OpenAPI across many consumers: contracts that silently diverge from implementations, and protobuf field numbers that shift underneath you. The interesting part isn't the tools. It's where they sit in the workflow.

Most contract validation tooling lives in the wrong place. It runs in CI, or it runs as a centralized schema registry that someone has to remember to update, or it runs as a linter that checks the spec file in isolation without ever looking at what the server actually returns. All of these catch drift after it has already been committed, sometimes after it has already shipped. A developer named Barath Nadar, who spent four years at ecommerce platform Fynd working on gRPC migrations, GraphQL federation, and marketplace integrations across Myntra, Amazon, Flipkart, Nykaa, and Tata Cliq, is building two tools that move the feedback loop earlier. They are worth looking at not because they are large, but because they target a specific and underserved point in the development cycle.

The problem: contracts and implementations are two independent sources of truth

An API contract, whether it is an OpenAPI document or a protobuf definition, is a promise. The implementation is a separate artifact that is supposed to honor that promise. Nothing structurally binds them. You can change the handler that serializes a response without touching the spec, and the spec will keep claiming the old shape. You can rename a field, make a nullable value required, or change an integer to a string, and no compiler will stop you, because the spec is just a YAML file sitting in a different directory.

This is fundamentally a consistency problem, and it has the same shape as the consistency problems you see in data systems. You have two replicas of a logical truth, the contract and the code, and writes go to them independently. Without a reconciliation mechanism, they diverge. The divergence is invisible until a reader, in this case a downstream consumer, observes the inconsistency. In a service with one consumer and one author, drift is annoying. In a marketplace integration layer where dozens of internal teams and external partners depend on your response shapes, drift propagates as production incidents, and the blast radius is whoever happened to parse the field you quietly changed.

Nadar's description of how this played out at Fynd is familiar to anyone who has worked at the infrastructure layer. You build an endpoint, you test it manually, it works, you ship it. Three weeks later someone notices the response does not match the OpenAPI spec, or a consumer breaks in production. The engineers were not careless. There was simply no feedback loop during development that would have surfaced the mismatch while it was cheap to fix.

The api-watch approach: validate live traffic against the spec, locally

api-watch takes a different tack from CI-based contract testing. It is a local CLI proxy. You point your HTTP client at it instead of directly at your dev server, and it sits in the middle, capturing every request and response that flows through. It validates that traffic against your OpenAPI spec in real time, and when you stop it, it generates a report.
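To make the validation step concrete, here is a minimal sketch of checking one captured response body against its declared schema. It is not api-watch's actual code: the function name `find_drift` and the sample schema are hypothetical, and the checker handles only the `type`, `required`, and `properties` keywords.

```python
# Hypothetical, simplified checker. It understands only "type", "required",
# and "properties" from an OpenAPI-style schema object.
JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def find_drift(value, schema, path="$"):
    """Return a list of readable mismatches between a value and its schema."""
    problems = []
    expected = schema.get("type")
    if expected is not None:
        py_type = JSON_TYPES[expected]
        # bool is a subclass of int in Python, so reject it for numeric types.
        bool_as_number = isinstance(value, bool) and expected in ("integer", "number")
        if not isinstance(value, py_type) or bool_as_number:
            problems.append(f"{path}: expected {expected}, got {type(value).__name__}")
            return problems
    if isinstance(value, dict):
        for name in schema.get("required", []):
            if name not in value:
                problems.append(f"{path}.{name}: required field missing")
        for name, sub_schema in schema.get("properties", {}).items():
            if name in value:
                problems.extend(find_drift(value[name], sub_schema, f"{path}.{name}"))
    return problems


# What the spec declares for the response body (illustrative).
declared = {
    "type": "object",
    "required": ["id", "price"],
    "properties": {"id": {"type": "integer"}, "price": {"type": "number"}},
}
# What the dev server actually returned: the handler now emits id as a string.
observed = {"id": "42", "price": 19.99}

drift = find_drift(observed, declared)
print(drift)  # ['$.id: expected integer, got str']
```

A real proxy would also have to match the request path and status code to the right schema in the spec and resolve `$ref` pointers, which this sketch leaves out.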

The report covers three things: which endpoints you actually called that are not documented in the spec; which responses came back with a shape that did not match the declared schema; and, usefully, auto-generated OpenAPI stubs for the undocumented endpoints, which you can paste straight into the spec. The design choice that makes this practical is the absence of ceremony. No config file, no CI pipeline to wire up, no code changes to the server. The proxy observes real traffic rather than asking you to write contract tests by hand.
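Stub generation of this kind can be approximated by inferring a schema from one observed JSON body. The sketch below is an assumption about how such a step might work, not the tool's implementation; `infer_schema` and `stub_for` are hypothetical names, and a single sample cannot tell which fields are optional or nullable.

```python
import json


def infer_schema(value):
    """Infer a rough OpenAPI-style schema from a single JSON value."""
    # bool must be tested before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if isinstance(value, str):
        return {"type": "string"}
    if isinstance(value, list):
        schema = {"type": "array"}
        if value:
            # Assume a homogeneous array and look at the first element only.
            schema["items"] = infer_schema(value[0])
        return schema
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {key: infer_schema(item) for key, item in value.items()},
        }
    return {}  # null or unknown: leave unconstrained for a human to fill in


def stub_for(method, path, status, body):
    """Wrap the inferred schema in the nesting an OpenAPI 'paths' entry uses."""
    return {
        path: {
            method.lower(): {
                "responses": {
                    str(status): {
                        "description": "TODO: describe this response",
                        "content": {"application/json": {"schema": infer_schema(body)}},
                    }
                }
            }
        }
    }


stub = stub_for("GET", "/orders/{id}", 200, {"id": 7, "tags": ["gift"], "paid": True})
print(json.dumps(stub, indent=2))
```

The output is a starting point to review and edit, since one response says nothing about required fields, enums, or error shapes.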

There is a meaningful architectural distinction here. Schema validation that works off your test suite only ever sees the requests your tests generate. A proxy that captures your actual development traffic sees what you really exercised while building the feature, including the exploratory calls you made in Postman or curl. That is a wider net, and it is captured passively, which means the cost of using it is close to zero. The trade-off is that it only validates what you happen to exercise. It will not tell you about an endpoint you never called, the way a static analysis of the spec against route definitions might. The two approaches are complementary: passive runtime observation catches drift in the paths you touch, static checks catch the paths you forget.
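The difference in coverage between the two approaches comes down to set arithmetic. In this sketch the three sets are made-up data, and it assumes observed calls have already been normalized to the spec's path templates.

```python
# Illustrative data: (method, path template) pairs from three sources.
spec_paths = {("GET", "/orders"), ("POST", "/orders"), ("GET", "/orders/{id}")}
registered_routes = spec_paths | {("DELETE", "/orders/{id}"), ("GET", "/health")}
observed_calls = {("GET", "/orders"), ("DELETE", "/orders/{id}")}

# A passive proxy can report undocumented endpoints only if they were called.
seen_by_proxy = observed_calls - spec_paths

# A static comparison of routes against the spec also finds the ones never called.
found_only_statically = (registered_routes - spec_paths) - observed_calls

# Documented endpoints that this session never exercised went unvalidated.
not_validated = spec_paths - observed_calls

print(sorted(seen_by_proxy))          # [('DELETE', '/orders/{id}')]
print(sorted(found_only_statically))  # [('GET', '/health')]
print(sorted(not_validated))          # [('GET', '/orders/{id}'), ('POST', '/orders')]
```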

proto-lock: field numbers are mutable state, and that is the bug

The second tool comes out of a sharper, more specific failure. During the gRPC migration at Fynd, the team generated protobuf definitions from OpenAPI specs, and hit a production incident caused by protobuf field numbers shifting.

If you have not worked with protobuf wire encoding, here is why this is dangerous. In protobuf, the field number, not the field name, is what gets written on the wire. A message encodes field 3 with a varint tag derived from the number 3 and the field's wire type, and the decoder uses that number to know which field it is reading. Field names exist only in the schema for human readability; they never appear in the binary encoding. This is a large part of what makes protobuf forward and backward compatible: as far as the binary format is concerned, you can rename a field freely because the name is not transmitted. But the corollary is the trap. If field number 3 means user_id in the producer and field number 3 means account_id in the consumer, the bytes decode cleanly into the wrong field. There is no error. There is no type mismatch if both are integers. The data is simply, silently wrong.
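The silent misread is easy to reproduce by hand. The sketch below hand-rolls just enough of the protobuf binary encoding to handle varint fields (wire type 0) instead of using the protobuf library; the helper names and the two schema mappings are illustrative.

```python
def encode_varint(n):
    """Encode a non-negative int as a base-128 varint, low 7 bits first."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_varint_field(field_number, value):
    tag = (field_number << 3) | 0  # low 3 bits hold the wire type; 0 = varint
    return encode_varint(tag) + encode_varint(value)


def decode_varint(data, pos):
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_message(data, names_by_number):
    """Decode varint fields, labelling each with the reader's own schema."""
    fields, pos = {}, 0
    while pos < len(data):
        tag, pos = decode_varint(data, pos)
        field_number, wire_type = tag >> 3, tag & 0x7
        if wire_type != 0:
            raise ValueError("this sketch only handles varint fields")
        value, pos = decode_varint(data, pos)
        fields[names_by_number.get(field_number, f"unknown_{field_number}")] = value
    return fields


payload = encode_varint_field(3, 1001)  # producer writes user_id = 1001
print(payload.hex())                    # 18e907: no field name anywhere

producer_view = decode_message(payload, {3: "user_id"})
consumer_view = decode_message(payload, {3: "account_id"})
print(producer_view)  # {'user_id': 1001}
print(consumer_view)  # {'account_id': 1001}
```

Both decodes succeed. The only thing that differs is the label each side attaches to field 3, which is why nothing fails loudly.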

Now add code generation. If you are generating .proto files from another source like an OpenAPI spec, the field numbering is decided by the generator. Reorder a property in the source, regenerate, and the generator may assign different numbers than it did last time. The schema looks correct. It compiles. And every previously serialized message is now misread, and every consumer built against the old numbering decodes garbage. This is the protobuf equivalent of a struct memory layout changing underneath two binaries that were compiled separately.
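One way a generator could behave like this is by numbering fields in source order. That is an assumption for illustration, not a description of the generator used at Fynd, and `assign_by_position` is a hypothetical name.

```python
def assign_by_position(property_names):
    """Naive numbering: the Nth property in the source gets field number N."""
    return {name: index for index, name in enumerate(property_names, start=1)}


before = assign_by_position(["id", "email", "user_id"])
# Someone reorders two properties in the source spec; nothing else changes.
after = assign_by_position(["id", "user_id", "email"])

shifted = {
    name: (before[name], after[name])
    for name in before
    if before[name] != after[name]
}
print(shifted)  # {'email': (2, 3), 'user_id': (3, 2)}
```

The same field names and types survive the regeneration, so a review of the generated .proto would have to compare numbers line by line to notice that two of them swapped.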

proto-lock treats field numbers the way package-lock.json treats dependency versions. The lockfile pins the assignment. Once field user_id is bound to number 3, that binding is recorded and enforced across regenerations, so the generator cannot reassign it on a whim. New fields get new numbers, retired fields keep their numbers reserved, and the wire contract stays stable even as the human-facing schema evolves. It is a small idea executed against a real failure mode, which is usually where the best infrastructure tools come from.
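The pinning behavior described above can be sketched in a few lines. The lock structure here (a `fields` map plus a `reserved` list) and the function `assign_with_lock` are hypothetical and are not proto-lock's actual file format or API.

```python
def assign_with_lock(property_names, lock):
    """Assign field numbers, honoring bindings already recorded in the lock.

    lock is a dict like {"fields": {name: number}, "reserved": [numbers]}.
    Returns the updated lock; the input is not modified.
    """
    fields = dict(lock.get("fields", {}))
    reserved = set(lock.get("reserved", []))

    # Retire fields that left the source, but never free their numbers.
    for name in list(fields):
        if name not in property_names:
            reserved.add(fields.pop(name))

    # New fields always get numbers above everything ever used.
    used = set(fields.values()) | reserved
    next_number = max(used, default=0) + 1
    for name in property_names:
        if name not in fields:
            fields[name] = next_number
            next_number += 1

    return {"fields": fields, "reserved": sorted(reserved)}


lock_v1 = assign_with_lock(["id", "email", "user_id"], {})
# Next revision: properties reordered, email removed, phone added.
lock_v2 = assign_with_lock(["phone", "user_id", "id"], lock_v1)

print(lock_v1)  # {'fields': {'id': 1, 'email': 2, 'user_id': 3}, 'reserved': []}
print(lock_v2)  # {'fields': {'id': 1, 'user_id': 3, 'phone': 4}, 'reserved': [2]}
```

Reordering no longer matters because numbers are looked up by name, and the retired number 2 can be emitted as a `reserved` statement in the generated .proto. A real tool would also need to persist the lock to disk and skip the 19000 to 19999 range that protobuf reserves for its own use.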

Why dev-time tooling is the right layer

Both tools share a thesis worth stating directly: the cheapest place to catch a contract violation is the moment a developer creates it, on their own machine, before it enters version control. Every layer after that, code review, CI, staging, production, makes the fix more expensive and the blast radius larger. This is the same reasoning that pushed type checking from runtime into the compiler and pushed linting from the build server into the editor. Move the check left, shrink the feedback loop, and the class of error stops reaching the places where it costs real money.

The limitation is one of scope, and it is worth being honest about. A local proxy that validates observed traffic does not give you the guarantees of a consumer-driven contract testing framework like Pact, where consumers publish their expectations and the provider's CI verifies against them. It does not replace a schema registry that enforces compatibility at publish time for a streaming platform. What it does is fill the gap those heavier systems leave open, the gap between writing code and committing it, where today most teams have nothing but manual testing and hope. For a single developer iterating on an endpoint, a zero-config proxy that flags drift the moment it appears is a better fit than standing up a contract testing pipeline.

The broader pattern these tools point at is that contract management is a distributed consistency problem masquerading as a tooling gap. The spec, the implementation, and every consumer are replicas of an agreement, and without explicit reconciliation they drift apart at the speed of development. Most organizations solve this with process, code review discipline, and the institutional memory of whoever owns the schema. Process does not scale across dozens of teams and external partners. Tooling that makes the invariant cheap to check, and checks it where the change is made, scales better. That is the bet behind api-watch and proto-lock, and it is the right bet.

Both projects are open source and early. If you run gRPC generated from OpenAPI, or you maintain APIs with consumers you do not control, they are worth a look, if only as a reminder that the plumbing layer has more unsolved problems than the feature layer gets credit for.
