REST, GraphQL, and gRPC Are Architecture Decisions, Not API Fashion Choices

API style determines how systems scale, fail, cache, evolve, and absorb client diversity over years of production use.

Problem

Choosing between REST, GraphQL, and gRPC is often framed as a developer experience preference. That framing is too small. API style affects cacheability, schema evolution, client independence, operational debugging, rate limiting, request fan-out, and how expensive it becomes to keep old integrations alive.

REST remains the default public API shape because it maps cleanly to HTTP resources, status codes, caches, proxies, and common tooling. GraphQL solves a real client problem by letting consumers ask for exactly the data they need, but it moves complexity into query planning, authorization, caching, and cost control. gRPC is excellent for internal service communication where strongly typed contracts, binary payloads, and streaming matter more than human readability.

The practical lesson is not that one API style wins. Mature systems usually use more than one. Public APIs often favor REST or GraphQL. Internal services often favor gRPC. API gateways and Backend for Frontend layers sit at the boundary to absorb authentication, routing, rate limits, observability, and client-specific aggregation.

Solution Approach

A useful way to choose is to start from failure modes rather than syntax.

REST works best when the domain can be expressed as stable resources. A user, invoice, order, repository, payment method, or deployment can be modeled as a URL. HTTP verbs carry intent: GET reads, POST creates, PUT replaces, PATCH modifies, and DELETE removes. This gives clients, proxies, logs, browsers, CDNs, and monitoring systems a shared language.

That shared language is REST's advantage. A GET /products/456 response can use Cache-Control, ETag, conditional requests, and CDN caching without special machinery. A 404 means the resource was not found. A 409 means the requested state conflicts with existing state. A 429 means the caller crossed a rate limit. These are boring conventions, and boring conventions are a gift in distributed systems because every custom rule becomes another place where production behavior can drift.
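As a rough illustration of how little machinery conditional requests need, the sketch below handles a GET for a product and answers 304 Not Modified when the client's If-None-Match header matches the current ETag. The handler name, the product fields, and the max-age value are assumptions made for the example, and real If-None-Match handling also has to cope with lists of tags and the * wildcard.

```python
import hashlib
import json
from typing import Optional

def make_etag(body: bytes) -> str:
    """Derive an ETag from the response body (one common approach, not the only one)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def handle_get_product(product: dict, if_none_match: Optional[str]):
    """Return (status, headers, body) for a GET on a hypothetical product resource."""
    body = json.dumps(product, sort_keys=True).encode("utf-8")
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "public, max-age=60"}
    if if_none_match == etag:
        # The client's cached copy is still valid, so no body is sent.
        return 304, headers, b""
    return 200, headers, body

product = {"id": 456, "name": "Widget", "price_cents": 1299}
status, headers, body = handle_get_product(product, None)
status2, _, body2 = handle_get_product(product, headers["ETag"])
print(status, status2, len(body2))  # 200 304 0
```

Because the ETag is derived from the body, any change to the product produces a new tag and the next conditional request gets a full 200 response again.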

The REST architectural style, described by Roy Fielding, is not just JSON over HTTP. It is a set of constraints around statelessness, cacheability, a uniform interface, and a layered system. Most production APIs do not implement every REST constraint perfectly, but the parts they do use are operationally valuable.

GraphQL starts from a different pain. A mobile screen may need a user's name, avatar, the first ten posts, and three comments per post. A REST API might require several round trips, or it might return a large user object filled with fields the client does not need. The first problem is under-fetching and the second is over-fetching. Both become visible when latency, bandwidth, and battery life matter.

GraphQL lets the client describe the response shape. A query can request only name, posts.title, posts.createdAt, and comments.author.name. The server exposes a typed schema, and clients select fields from that schema. This is a strong model for products with multiple clients: web, iOS, Android, admin tools, partner dashboards, and internal workflows.

The cost is that GraphQL makes one HTTP request look simple while hiding potentially large server-side work. A naive resolver tree can trigger the classic N+1 problem: fetch one user, then fetch posts, then fetch comments for each post, then fetch authors for each comment. Without batching and request planning, one clean query can become hundreds of database calls. Tools such as DataLoader exist because this failure mode is common.
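The difference between per-item resolution and batched resolution can be shown with a small counting sketch. It illustrates the idea behind DataLoader-style batching rather than DataLoader's actual API; the CountingDB class and the sample data are hypothetical.

```python
# Hypothetical in-memory data standing in for database tables.
AUTHORS = {10: "Ada", 11: "Grace", 12: "Linus"}
COMMENTS = [{"id": i, "author_id": 10 + i % 3} for i in range(30)]

class CountingDB:
    """Fake database that counts how many queries it receives."""

    def __init__(self):
        self.calls = 0

    def get_author(self, author_id):
        self.calls += 1
        return AUTHORS[author_id]

    def get_authors(self, author_ids):
        # One query for many keys, e.g. WHERE id IN (...).
        self.calls += 1
        return {a: AUTHORS[a] for a in author_ids}

def resolve_naive(db, comments):
    # One query per comment: the N+1 pattern.
    return [db.get_author(c["author_id"]) for c in comments]

def resolve_batched(db, comments):
    # Collect the keys first, then issue a single query.
    ids = {c["author_id"] for c in comments}
    by_id = db.get_authors(ids)
    return [by_id[c["author_id"]] for c in comments]

naive_db, batched_db = CountingDB(), CountingDB()
same = resolve_naive(naive_db, COMMENTS) == resolve_batched(batched_db, COMMENTS)
print(same, naive_db.calls, batched_db.calls)  # True 30 1
```

Both resolvers return the same authors; only the number of round trips to the data store differs, and that gap grows with every nested level of the query.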

gRPC optimizes a different boundary: service-to-service calls inside a controlled environment. APIs are defined in Protocol Buffers files, and client and server code are generated from those contracts. Payloads are compact binary messages rather than verbose JSON. HTTP/2 gives multiplexing and streaming. Unary calls, server streaming, client streaming, and bidirectional streaming are built into the model.
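To make the payload-size point concrete, the sketch below hand-encodes a single integer field using the Protocol Buffers varint wire format and compares it with a JSON equivalent. It is a teaching aid, not a replacement for generated code; the field name amount_cents and field number 1 are assumptions chosen for the example.

```python
import json

def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F          # take the low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # set the continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_uint_field(field_number: int, value: int) -> bytes:
    """Encode one varint field: a key (field number and wire type), then the value."""
    key = (field_number << 3) | 0    # wire type 0 means varint
    return encode_varint(key) + encode_varint(value)

binary = encode_uint_field(1, 150)
text = json.dumps({"amount_cents": 150}).encode("utf-8")
print(binary.hex(), len(binary), len(text))  # 089601 3 21
```

The binary form carries no field name at all, only the field number, which is why both sides need the shared contract to interpret it.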

That makes gRPC a strong fit for internal systems where services are written in several languages and call each other frequently. A payment service, risk service, ledger service, notification service, and identity service can share typed contracts without hand-maintained HTTP clients. When request volume is high, binary encoding and generated clients reduce overhead. When real-time event streams matter, gRPC streaming avoids awkward polling patterns.

The trade-off is visibility and reach. gRPC payloads are not pleasant to inspect with curl. Browser support requires gRPC-Web or a translation layer. Public API consumers may find REST easier to adopt because they can test requests from a terminal, inspect JSON, and use ordinary HTTP tooling.

API Gateways And BFF Layers

Once an API is used by real clients, the protocol choice is only part of the design. The boundary needs policy.

An API gateway centralizes concerns that should not be reimplemented in every service: authentication, authorization, TLS termination, request routing, rate limiting, logging, metrics, circuit breaking, and sometimes response transformation. Common options include Kong, Envoy, AWS API Gateway, Traefik, and Apigee.

This centralization matters because duplicated policy drifts. One service forgets to enforce a stricter tenant check. Another logs sensitive payloads. A third interprets rate limits differently. Gateways are not magic, but they give teams a single enforcement point for cross-cutting behavior.
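Rate limiting is a good example of policy that benefits from a single implementation. The token bucket below is a minimal sketch of one common algorithm, not a description of how any particular gateway product works; the class name, the capacity, and the refill rate are assumptions. Time is passed in explicitly so the behavior is deterministic.

```python
class TokenBucket:
    """Allow bursts up to `capacity`, refilling at a steady rate."""

    def __init__(self, capacity: float, refill_per_second: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = capacity
        self.updated_at = now

    def allow(self, now: float) -> bool:
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = max(self.updated_at, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

def gateway_status(bucket: TokenBucket, now: float) -> int:
    """Map the limiter decision to the HTTP status the caller would see."""
    return 200 if bucket.allow(now) else 429

bucket = TokenBucket(capacity=2, refill_per_second=1.0)
statuses = [gateway_status(bucket, t) for t in (0.0, 0.0, 0.0, 1.0)]
print(statuses)  # [200, 200, 429, 200]
```

In practice a gateway would keep one bucket per API key or tenant, usually in shared storage so that every gateway instance sees the same counts.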

The Backend for Frontend pattern solves a different issue. A mobile app, desktop web app, and partner integration rarely want the same response shape. A single general-purpose API can become bloated as it tries to satisfy every client. A BFF gives each client type its own boundary layer: mobile BFF, web BFF, partner BFF. Each BFF can aggregate internal services, shape payloads, and enforce client-specific policies without forcing one API to carry every concern.

This pattern is especially useful when internal services use gRPC but external clients use REST or GraphQL. The BFF or gateway translates between external ergonomics and internal efficiency.
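A minimal sketch of the idea: two BFF functions call the same internal services but return different shapes. The fetch_user and fetch_orders functions are hypothetical stand-ins for internal calls (for example generated gRPC stubs), and all field names are invented for the example.

```python
def fetch_user(user_id: int) -> dict:
    """Hypothetical stand-in for an internal identity-service call."""
    return {
        "id": user_id,
        "name": "Ada",
        "email": "ada@example.com",
        "avatar_url": "https://example.com/avatars/ada.png",
        "locale": "en-GB",
    }

def fetch_orders(user_id: int) -> list:
    """Hypothetical stand-in for an internal order-service call."""
    return [
        {"id": 1, "status": "shipped", "total_cents": 1299, "items": ["sku-1", "sku-2"]},
        {"id": 2, "status": "pending", "total_cents": 450, "items": ["sku-3"]},
        {"id": 3, "status": "delivered", "total_cents": 9900, "items": ["sku-4"]},
    ]

def mobile_home(user_id: int) -> dict:
    """Mobile BFF: small payload, only what the home screen renders."""
    user, orders = fetch_user(user_id), fetch_orders(user_id)
    return {
        "name": user["name"],
        "avatar_url": user["avatar_url"],
        "recent_orders": [{"id": o["id"], "status": o["status"]} for o in orders[:2]],
    }

def web_home(user_id: int) -> dict:
    """Web BFF: richer payload for a larger screen."""
    user, orders = fetch_user(user_id), fetch_orders(user_id)
    return {
        "user": {k: user[k] for k in ("name", "email", "locale")},
        "orders": orders,
        "order_count": len(orders),
    }

print(mobile_home(7))
print(web_home(7)["order_count"])  # 3
```

Neither internal service had to change to support the second client, which is the point of putting the shaping logic at the boundary.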

Consistency And Versioning

API design also affects consistency guarantees. REST resources often expose a snapshot of server state at a point in time. GraphQL can assemble one response from multiple backing systems, which raises questions about whether the returned data is internally consistent. gRPC calls between services may participate in workflows where each service owns its own data, so callers must handle retries, idempotency, and partial failure.
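Retries are only safe when the callee can recognize a repeated request. The sketch below uses an idempotency key for that purpose: the server does the work, the response is lost, the client retries, and the second attempt returns the stored result instead of charging twice. PaymentService, FlakyTransport, and the key format are all hypothetical, and a real service would persist keys durably and expire them.

```python
class PaymentService:
    """Remembers results by idempotency key so retries do not repeat side effects."""

    def __init__(self):
        self._results = {}
        self.charges_made = 0

    def charge(self, idempotency_key: str, amount_cents: int) -> dict:
        if idempotency_key in self._results:
            # Repeat of a request we already handled: return the original result.
            return self._results[idempotency_key]
        self.charges_made += 1
        result = {"charge_id": f"ch_{self.charges_made}", "amount_cents": amount_cents}
        self._results[idempotency_key] = result
        return result

class FlakyTransport:
    """Drops the first response after the server has already done the work."""

    def __init__(self, service: PaymentService, drop_first: int = 1):
        self.service = service
        self.drops_left = drop_first

    def charge(self, key: str, amount_cents: int) -> dict:
        result = self.service.charge(key, amount_cents)
        if self.drops_left > 0:
            self.drops_left -= 1
            raise TimeoutError("response lost")
        return result

def charge_with_retry(transport, key: str, amount_cents: int, attempts: int = 3) -> dict:
    for _ in range(attempts):
        try:
            return transport.charge(key, amount_cents)
        except TimeoutError:
            continue  # Safe to retry because the key makes the call idempotent.
    raise RuntimeError("gave up after retries")

service = PaymentService()
result = charge_with_retry(FlakyTransport(service), "order-42-attempt", 1299)
print(result["charge_id"], service.charges_made)  # ch_1 1
```

Without the key, the same retry loop would have created two charges for one order.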

Distributed systems do not fail as one unit. They fail in slices. A user service may be healthy while a recommendation service times out. A GraphQL query that asks for both needs a policy: fail the whole request, return partial data, or return nulls for the failed fields alongside errors. A REST endpoint that aggregates several services faces the same issue, but GraphQL makes partial data a first-class response pattern.
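The two ends of that policy choice can be sketched with a resolver that either fails fast or nulls out the failed field and records an error. The response shape follows the data and errors layout used by GraphQL responses; the fetcher functions and the fail_fast flag are assumptions for the example.

```python
def fetch_user() -> dict:
    return {"name": "Ada"}

def fetch_recommendations() -> list:
    # Simulates the unhealthy slice of the system.
    raise TimeoutError("recommendation service timed out")

def resolve(fetchers: dict, fail_fast: bool) -> dict:
    """Resolve each top-level field, applying the chosen failure policy."""
    data, errors = {}, []
    for field, fetch in fetchers.items():
        try:
            data[field] = fetch()
        except Exception as exc:  # broad on purpose in this sketch
            error = {"message": str(exc), "path": [field]}
            if fail_fast:
                return {"data": None, "errors": [error]}
            data[field] = None
            errors.append(error)
    response = {"data": data}
    if errors:
        response["errors"] = errors
    return response

FETCHERS = {"user": fetch_user, "recommendations": fetch_recommendations}
strict = resolve(FETCHERS, fail_fast=True)
lenient = resolve(FETCHERS, fail_fast=False)
print(strict["data"])   # None
print(lenient["data"])  # {'user': {'name': 'Ada'}, 'recommendations': None}
```

Whichever policy is chosen, clients need to know it in advance, because a null that means "failed" and a null that means "no value" look identical unless the errors list is checked.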

Versioning is where API decisions compound. Removing a field, changing a type, renaming an error code, or tightening validation can break clients that were written years earlier. Public APIs need a compatibility strategy before the first breaking change is proposed, because by the time one is urgent, the client base is already out of your control.

URL versioning, such as /api/v1/users/123, is blunt but easy to route and debug. Header versioning keeps URLs clean but hides behavior from casual inspection. Query parameter versioning is easy to test but can complicate cache behavior. Stripe's API versioning docs describe a model built on dated API versions, in which accounts remain pinned to older behavior until they upgrade. That kind of compatibility is expensive, but breaking payments integrations is usually more expensive.
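One way to picture date-pinned versions is a list of dated changes, each with a function that converts the newest response shape back to what older clients expect. This is a sketch of the general idea under invented assumptions, not Stripe's implementation; the dates, field names, and downgrade functions are all hypothetical.

```python
from datetime import date

def downgrade_total(resource: dict) -> dict:
    """Undo the hypothetical rename of `total` to `total_cents`."""
    out = dict(resource)
    out["total"] = out.pop("total_cents")
    return out

def downgrade_customer(resource: dict) -> dict:
    """Undo the hypothetical nesting of `customer_name` under `customer`."""
    out = dict(resource)
    out["customer_name"] = out.pop("customer")["name"]
    return out

# Oldest first: (date the change shipped, how to undo it for older clients).
CHANGES = [
    (date(2023, 1, 15), downgrade_total),
    (date(2024, 6, 1), downgrade_customer),
]

def render_for_version(latest: dict, pinned: date) -> dict:
    """Walk back from the newest shape, undoing every change newer than the pin."""
    out = dict(latest)
    for introduced, downgrade in reversed(CHANGES):
        if pinned < introduced:
            out = downgrade(out)
    return out

latest = {"id": 1, "total_cents": 1299, "customer": {"name": "Ada"}}
print(render_for_version(latest, date(2022, 1, 1)))
# {'id': 1, 'total': 1299, 'customer_name': 'Ada'}
print(render_for_version(latest, date(2023, 6, 1)))
# {'id': 1, 'total_cents': 1299, 'customer_name': 'Ada'}
```

The cost mentioned above is visible even here: every breaking change adds a permanent entry to the list, and each entry has to keep working for as long as any account is pinned behind it.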

Trade-Offs

REST is the best default when the API is public, resource-oriented, cache-sensitive, and stable. It works well for CRUD, public developer platforms, webhooks, admin APIs, and systems where HTTP semantics are useful. Its weakness appears when clients need many related objects in one screen or when each client wants a different shape.

GraphQL is a strong choice when client diversity dominates. It gives frontend teams flexibility and reduces waste over slow networks. It also creates new operational work: query depth limits, cost analysis, persisted queries, resolver batching, field-level authorization, schema governance, and caching strategy. A GraphQL API without those controls can let one expensive query put pressure on the database.
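Of those controls, a depth limit is the simplest to sketch. The example below measures the nesting depth of a selection and rejects it when it exceeds a limit. It assumes the query has already been parsed into nested dictionaries; real servers work on the parsed query AST, and the limit of 4 is an arbitrary choice for the example.

```python
def selection_depth(selection: dict) -> int:
    """Return the deepest nesting level in a selection; a leaf field counts as 1."""
    if not selection:
        return 0
    return 1 + max(selection_depth(child) for child in selection.values())

def enforce_depth(selection: dict, max_depth: int) -> None:
    """Reject a query before any resolver runs if it nests too deeply."""
    depth = selection_depth(selection)
    if depth > max_depth:
        raise ValueError(f"query depth {depth} exceeds limit {max_depth}")

# Hypothetical parsed form of: user { name posts { title comments { author { name } } } }
query = {
    "user": {
        "name": {},
        "posts": {
            "title": {},
            "comments": {"author": {"name": {}}},
        },
    }
}

print(selection_depth(query))  # 5
try:
    enforce_depth(query, max_depth=4)
    rejected = False
except ValueError as exc:
    rejected = True
    print(exc)  # query depth 5 exceeds limit 4
```

Depth alone does not capture cost, since a shallow query can still request a very large list, which is why depth limits are usually paired with cost analysis or persisted queries.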

gRPC is the right tool when service calls are internal, high volume, strongly typed, and latency-sensitive. It is particularly effective in microservice systems where generated clients reduce contract drift. Its weaknesses are public accessibility and manual debugging. Binary protocols are efficient, but when production is failing, humans still need observability that explains what happened.

The most defensible architecture often combines them:

REST for public, stable resource APIs and webhooks.
GraphQL for product surfaces with many client-specific data needs.
gRPC for internal service calls where schema contracts and streaming matter.
API gateways for shared boundary policy.
BFF layers when different client types need materially different APIs.

The deeper rule is to design for change. APIs are promises made under uncertainty. Every endpoint, field, status code, and error response becomes part of a contract once clients depend on it. A distributed systems engineer learns to respect those contracts because the failure rarely shows up in the deploy that introduced the change. It shows up months later, in an integration nobody remembered, owned by a team that no longer exists, carrying traffic that still matters.