The AKS Book Is Out: Every Decision That Will Save You a Cluster Rebuild

Cloud Reporter

The AKS Book by Richard Hooper covers critical Kubernetes decisions that are hard to change later, from networking to identity, with practical guidance for production deployments.

I've been running AKS clusters in production for a long time. I've seen the same pattern play out over and over: someone stands up a cluster, it works great in dev, and then six months later they're rebuilding the whole thing because a networking decision made on day one turned out to be permanent and wrong. I've been that person. I've also been the one called in to explain to a room full of frustrated engineers why the cluster has to be rebuilt.

That experience is exactly why I wrote The AKS Book.

Why This Book Needed to Exist

Microsoft's AKS documentation is good. The official docs cover how to create a cluster, how to configure features, and how to use every dial and knob available. But documentation tells you what to do. It rarely tells you which decisions you can't change later, which trade-offs will wake you up at 3 AM, or which choices seem minor today but define how your cluster operates for its entire lifetime.

Most AKS tutorials follow the same path: deploy a sample app, watch it scale, call it a day. Then you take that knowledge into a real production environment and immediately run into decisions the tutorial never mentioned.

Azure CNI vs kubenet vs Azure CNI Overlay. Workload Identity vs managed identity at the cluster level. Availability zones that have to be configured at cluster creation and can't be added later without a full rebuild. Node pool sizing when you don't yet know your workload patterns.

These decisions happen early, often before you fully understand their consequences. I wanted to write the book I wish had existed when I was making those choices for the first time.

What the Book Actually Covers

This is not a tutorial book. It is a decision guide. Every chapter is built around the choices that matter most when running AKS in production: the ones that are hard to change, expensive to get wrong, or likely to cause problems at the worst possible moment.

The book's 17 chapters span the full lifecycle of AKS:

  • Cluster setup decisions. The choices you make before you even hit Create
  • Networking. CNI options, NAT Gateway, VNet integration, and the routing decisions that are nearly impossible to unpick later
  • Identity and access. Workload Identity, managed identities, RBAC, and why getting this wrong affects every application you deploy
  • Node pools and VM sizing. System pools, user pools, Spot VMs, and how to size for workloads you haven't fully defined yet
  • Cost. Because every decision has a bill attached to it
  • Autoscaling. Cluster Autoscaler, KEDA, and the traps people fall into chasing cost savings
  • Storage and stateful workloads. When persistent storage makes sense and when it doesn't
  • Security. Not just hardening, but the decisions that make hardening possible
  • Traffic management. Ingress controllers, Gateway API, and how to route without painting yourself into a corner
  • Observability. What metrics actually matter during an incident versus what just generates noise
  • Multi-region strategies, backup and recovery, capacity planning, and more

Each chapter follows a consistent pattern: here's what matters, here are the trade-offs, and here is the decision I would make and why. Not "it depends" without context. Actual recommendations with actual reasoning.

Brendan Burns Wrote the Foreword

I am not going to downplay how much this means to me. Brendan Burns is the co-creator of Kubernetes and a Corporate Vice President and Technical Fellow at Microsoft Azure. He has been involved with AKS since the beginning.

Having him write the foreword for this book is genuinely one of the highlights of my career. In it, he talks about how Kubernetes remains foundational even as AI reshapes the industry. How AKS is powering healthcare, finance, retail, and manufacturing. How the same open-source ecosystem that built Kubernetes is now driving innovation in AI through projects like Ray, Kubeflow, and Triton.

And he makes the case that understanding how to run Kubernetes well on Azure enables you to do all of it, whether you're modernising existing workloads or building the next generation of AI applications.

Having someone with Brendan's history and depth write that context genuinely sets the tone for what this book is trying to be. It is a serious, practical resource for people running real workloads.

Thank You to the Technical Reviewers

No book gets better without people willing to push back on it. I want to give a proper thank you to three reviewers who made this book significantly better:

Matt Boyd, Senior Cloud Consultant with over 15 years of experience in Microsoft technologies and Azure architecture. Matt caught things I had simply stopped noticing after reading the same sections too many times, and challenged assumptions I hadn't realised I was making.

Wesley Haakman, Principal Azure Architect at Intercept and Microsoft Azure MVP. Wesley and I co-wrote Azure Containers Explained, so he already knows exactly how I write and where my blind spots are. That history made his review sharper and more direct than most, and the book is better for it.

Luke Murray, ISM Service Lead and Microsoft Azure MVP. Luke has been sharing practical guidance through his blog at luke.geek.nz for years and brought that same practitioner lens to his review. His feedback on real-world usability shaped how I explained some of the harder trade-offs.

These three gave up their time to read drafts, flag errors, and share their own experiences. This book is better because of all three of them.

What I Hope This Book Does for You

I hope it saves you from the rebuild. That is genuinely what I want. If you pick this book up before creating your first production AKS cluster, the chapters on networking, identity, and node sizing alone should prevent the most expensive day-one mistakes. The decisions covered in the first five chapters are the ones that, if you get them wrong, require a full cluster rebuild to fix.

If you are already running AKS in production, I hope you find answers to the operational questions that never quite had a clear answer in the docs. The chapters on autoscaling, observability, storage, and multi-region are built around the questions I see asked repeatedly in the community, the ones where every answer comes with "it depends" and not enough context to act on.

And if you are responsible for explaining AKS decisions to stakeholders, I hope the decision framework in each chapter gives you language for those conversations. "Here is what we chose, here is why, and here is what we traded off to get there" is a much better position than "Kubernetes is complicated."
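To make that framing concrete, here is a minimal sketch of how a single decision could be written down in the "what we chose, why, and what we traded off" shape. The `DecisionRecord` class, its field names, and the example values are all hypothetical and are not taken from the book; treat this as one possible way to capture the idea, not the book's own format.

```python
from dataclasses import dataclass, field


@dataclass
class DecisionRecord:
    """Hypothetical record of one AKS design decision, for stakeholder summaries."""
    topic: str
    chosen: str
    reason: str
    traded_off: list[str] = field(default_factory=list)

    def summary(self) -> str:
        """Render the 'what, why, and what we traded off' statement."""
        trade = "; ".join(self.traded_off) if self.traded_off else "nothing significant"
        return (
            f"{self.topic}: we chose {self.chosen} because {self.reason}. "
            f"We traded off: {trade}."
        )


# Example values are placeholders, not recommendations from the book.
record = DecisionRecord(
    topic="Network plugin",
    chosen="option A",
    reason="it fits our IP address plan",
    traded_off=["flexibility to switch later"],
)
print(record.summary())
```

Keeping the trade-offs as an explicit list means an empty list stands out in review, which is usually a sign the decision has not been examined closely enough.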

This Book Will Keep Getting Updated

AKS is not a static product. Features ship constantly. Defaults change. Things that were true when I started writing this were sometimes out of date by the time I finished. That is the reality of writing about a managed cloud service.

My plan is to release an updated edition annually. The core decision framework does not change much, but specific recommendations, default values, and which features are GA versus preview absolutely do. I want this to be a book you can return to each year and find it still accurate and still useful.

If you spot something that is wrong, outdated, or missing, please reach out. The contact details are in the book. Every piece of feedback improves the next edition, and I would rather know about errors than have them persist.

A Note on Azure Containers Explained

Before The AKS Book, Wesley and I co-wrote Azure Containers Explained, which covers the broader Azure container ecosystem, including Azure Container Instances, Azure Container Apps, Azure Kubernetes Service, and how to choose between them. If you are earlier in your container journey and still deciding which Azure service fits your workload, that book is a good starting point before going deep on AKS specifically.

The AKS Book picks up where that foundation ends. It assumes you have chosen AKS and now need to make it work in production.

Get the Book

The AKS Book is available on Amazon in Kindle and paperback. It covers every major decision you will face running AKS in production, and it is written from years of production experience rather than from reading documentation.

If you have been looking for something that goes deeper than tutorials but is more practical than academic, this is the book I wrote for you.

Kindle: available on Amazon (link coming soon)

Paperback: available on Amazon (link coming soon)

If you read it and have thoughts, corrections, or war stories of your own, drop a comment below or find me on LinkedIn and X as @Pixel_Robots. I would love to hear what you think.
