Migrating from Legacy Virtual Nodes to Next-Gen ACI on AKS

Microsoft's Virtual Nodes on Azure Container Instances (ACI) is being rebuilt from the ground up, offering enhanced features and requiring migration from the legacy managed add-on to a new Helm-based deployment model.

Microsoft is rebuilding Virtual Nodes on ACI, introducing a new generation of the service with enhanced features and a different deployment model. The change represents a significant shift for Azure Kubernetes Service (AKS) users who have relied on Virtual Nodes to burst workloads onto ACI without managing additional VM nodes.
What Changed
The legacy Virtual Nodes managed add-on is being replaced with a Helm-based deployment model. The new generation adds substantial capabilities: VNet peering support, outbound traffic control with network security groups, init containers, host aliases, arguments for exec sessions in ACI, persistent volumes, container hooks, confidential containers, and ACI standby pools. Planned enhancements include support for ACR image pull via service principal, Kubernetes network policies, IPv6, Windows containers, and port forwarding.
Deployment Model Comparison
Unlike the legacy add-on, which AKS managed directly, the new Virtual Nodes on ACI is deployed and managed via Helm charts. This change provides more flexibility but shifts responsibility for the deployment to the operator. The new version also requires AKS clusters running Azure CNI networking (Kubenet is not supported) and is incompatible with API server authorized IP ranges because of its subnet delegation requirements.
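The prerequisites above can be sketched as a cluster creation command. The resource group, cluster name, and region below are placeholders, not values from the announcement:

```shell
# Hypothetical names; substitute your own resource group, cluster, and region.
az group create --name myResourceGroup --location eastus

# The new Virtual Nodes on ACI requires Azure CNI networking
# (Kubenet-based clusters are not supported).
az aks create \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --network-plugin azure \
  --node-count 1 \
  --generate-ssh-keys
```

Note that enabling API server authorized IP ranges on such a cluster would conflict with the subnet delegation the new Virtual Nodes requires.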
Business Impact
Organizations using Virtual Nodes will need to plan migration carefully, as direct migration is not supported. The process involves disabling the legacy add-on, recreating subnets with specific configurations, and installing the new Helm chart. Each deployment reserves 3 vCPUs and 12 GiB of memory on an AKS cluster VM and supports up to 200 pods. Although the individual steps are straightforward, the migration requires downtime and careful planning to avoid disrupting production workloads.
Technical Implementation
The migration process involves several key steps: scaling down existing Virtual Nodes workloads, disabling the legacy add-on, recreating the subnet with the required delegation to Microsoft.ContainerInstance/containerGroups, assigning appropriate permissions to the cluster's kubelet identity, and installing the new Helm chart. The new deployment creates a node named "virtualnode-0" instead of the legacy "virtual-node-aci-linux."
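The steps above can be sketched as the following command sequence. All resource names are hypothetical, and the Helm chart path is a placeholder; the published chart location should be taken from the project's GitHub repository:

```shell
# 1. Scale down workloads currently scheduled on the legacy virtual node
#    (hypothetical deployment name).
kubectl scale deployment my-aci-workload --replicas=0

# 2. Disable the legacy managed add-on.
az aks disable-addons \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --addons virtual-node

# 3. Recreate the subnet with the required delegation.
az network vnet subnet create \
  --resource-group myResourceGroup \
  --vnet-name myVnet \
  --name myVirtualNodeSubnet \
  --address-prefixes 10.241.0.0/16 \
  --delegations Microsoft.ContainerInstance/containerGroups

# 4. Install the new Helm chart (chart source is a placeholder).
helm install virtualnode ./virtualnode-chart \
  --namespace vn2 --create-namespace

# The new deployment should register a node named "virtualnode-0".
kubectl get nodes
```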
Node Selector Changes
Applications will need updates to their deployment configurations. The legacy Virtual Nodes used node selectors with "kubernetes.io/role: agent" and tolerations for "azure.com/aci," while the new version requires "virtualization: virtualnode2" and similar tolerations. These changes must be reflected in application manifests to ensure proper scheduling on the new Virtual Nodes.
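A minimal pod manifest targeting the new node selector might look as follows. The toleration key shown follows the common virtual-kubelet convention and should be verified against the chart's documentation; the image is an arbitrary public sample:

```shell
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: aci-demo
spec:
  containers:
  - name: demo
    image: mcr.microsoft.com/azuredocs/aks-helloworld:v1
  # New-generation selector, replacing the legacy
  # "kubernetes.io/role: agent" selector.
  nodeSelector:
    virtualization: virtualnode2
  # Assumed toleration; check the chart docs for the exact key,
  # as the legacy add-on used "azure.com/aci".
  tolerations:
  - key: virtual-kubelet.io/provider
    operator: Exists
    effect: NoSchedule
EOF
```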
Troubleshooting and Support
Common issues include pods remaining in Pending state due to insufficient resources in node pools, or the virtualnode-n pod crashing due to Managed Identity permissions problems. The cluster's agentpool MSI requires Contributor access on the infrastructure resource group and Network Contributor access to the ACI subnet. Detailed troubleshooting guidance is available in the official documentation, and support is provided through GitHub issues.
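The permission fixes described above can be sketched with role assignments against the kubelet (agentpool) identity. The subscription ID, infrastructure resource group, and subnet path below are placeholders:

```shell
# Look up the cluster's kubelet (agentpool) managed identity.
KUBELET_ID=$(az aks show \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --query identityProfile.kubeletidentity.objectId -o tsv)

# Contributor on the infrastructure (node) resource group.
az role assignment create \
  --assignee "$KUBELET_ID" \
  --role Contributor \
  --scope "/subscriptions/<sub-id>/resourceGroups/MC_myResourceGroup_myAKSCluster_eastus"

# Network Contributor on the ACI subnet.
az role assignment create \
  --assignee "$KUBELET_ID" \
  --role "Network Contributor" \
  --scope "/subscriptions/<sub-id>/resourceGroups/myResourceGroup/providers/Microsoft.Network/virtualNetworks/myVnet/subnets/myVirtualNodeSubnet"
```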
Migration Timeline
The migration is already available, with the new generation offering significantly enhanced capabilities. Organizations should evaluate their current Virtual Nodes usage and plan migration to take advantage of features like confidential containers, persistent volumes, and improved networking capabilities. The GitHub repository provides comprehensive documentation and examples for deployment and customization.
For organizations heavily invested in serverless container workloads on AKS, this migration represents an opportunity to modernize their infrastructure while gaining access to new capabilities that weren't available in the legacy implementation. The shift to Helm-based management also aligns with broader Kubernetes ecosystem practices, potentially simplifying operations for teams already familiar with Helm deployments.
