Istio Ambient Mesh – a sidecar-less data plane for Istio – represents true innovation in the years-old service mesh industry as it addresses serious concerns about complexity, manageability, and day-two operations compared to sidecar-based deployments. Workload onboarding, data plane upgrades and CVE patches now become much easier.
In addition, for large clusters with thousands of Pods, the resources requested by the sidecar containers are an expensive service mesh tax, as the memory usage of the Envoy sidecars grows linearly with the size of the service mesh. Istio Ambient Mesh alleviates these concerns as well.
We at Solo.io spearheaded its development, and recently this new sidecar-less pattern was integrated into the main branch of Istio. Introducing a new proxy architecture with a split for L4/L7 traffic, it’s quickly becoming interesting wherever a non-invasive approach to application networking is needed and where an incremental adoption of service mesh is the preferred path to production.
We want to validate that Ambient Mesh, in its current state, can already be deployed and used in managed Kubernetes services: not just in the simple examples that run on local development clusters, but in setups that more closely approximate real-world scenarios and quasi-production deployments. We chose Azure Kubernetes Service first because of its strength and popularity among enterprises and power users, and because AKS offers multiple network plugin options.
Currently, there are 5 main options available for AKS networking:
- The simpler, older kubenet, a non-CNI implementation based on NAT
- The Azure CNI with IP assignment for pods from an existing VNet
- Azure CNI Overlay with a Pod CIDR different from the VNet hosting the nodes
- Azure CNI with Cilium and IP assignment from an overlay network
- Bring-your-own CNI mode, where you can choose which CNI to deploy
As you can see from the table, the only viable option at this moment is to use Azure CNI without Cilium. As Ambient Mesh matures and starts supporting Cilium and other eBPF-based CNIs, we will update this blog with new information on deploying Ambient Mesh with eBPF-accelerated routing tables. Solo.io is committed to working with the upstream Istio community to continue driving the evolution of Istio Ambient, including integration with Cilium and eBPF, and to providing Azure Kubernetes Service users with the best possible service mesh experience.
Ambient Mesh in Azure Kubernetes Service with Azure CNI
If you want to try Ambient Mesh in Azure Kubernetes Service, you’ll need:
- An Azure account and the azure-cli command line tool (installation instructions here)
- Access to GitHub and the istio/istio repository
- Docker Desktop to run the istioctl image
First, let’s create an AKS cluster with the Azure CNI network plugin (at the time of writing, 1.25.5 is the latest supported version):
We suggest a minimum size of Standard_DS3_v2 to run Istio because the Kubernetes nodes should have at least 4 CPU cores.
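A minimal sketch of the cluster creation with the Azure CLI could look like the following. The resource group name, cluster name, region, and node count are hypothetical values chosen for illustration; only the Kubernetes version, the Azure CNI plugin, and the VM size come from the text above. The commands are wrapped in a function so that nothing runs until you call it.

```shell
# Hypothetical names and region; replace them with your own values.
RESOURCE_GROUP="${RESOURCE_GROUP:-ambient-demo-rg}"
CLUSTER_NAME="${CLUSTER_NAME:-ambient-demo}"
LOCATION="${LOCATION:-westeurope}"
K8S_VERSION="${K8S_VERSION:-1.25.5}"

create_aks_cluster() {
  # Resource group that will hold the cluster.
  az group create --name "$RESOURCE_GROUP" --location "$LOCATION" || return 1

  # "--network-plugin azure" selects Azure CNI; no Cilium data plane is requested.
  # The node count of 2 is an assumption, not a requirement from the text.
  az aks create \
    --resource-group "$RESOURCE_GROUP" \
    --name "$CLUSTER_NAME" \
    --kubernetes-version "$K8S_VERSION" \
    --network-plugin azure \
    --node-vm-size Standard_DS3_v2 \
    --node-count 2 \
    --generate-ssh-keys || return 1

  # Merge the cluster credentials into the local kubeconfig so kubectl can reach it.
  az aks get-credentials --resource-group "$RESOURCE_GROUP" --name "$CLUSTER_NAME"
}
```

Run `create_aks_cluster` after `az login`. Provisioning usually takes several minutes.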
Next, we add the Gateway API CRDs to the AKS cluster; Istio uses them for the waypoint proxies.
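One way to install the Gateway API CRDs is to render them with kustomize directly from the upstream repository, as sketched below. The pinned version `v0.6.1` is an assumption made for this example; check which Gateway API version your Istio build expects.

```shell
# Assumed Gateway API release; adjust to the version your Istio build supports.
GATEWAY_API_VERSION="${GATEWAY_API_VERSION:-v0.6.1}"

install_gateway_api_crds() {
  # Skip the installation if the Gateway CRD is already present in the cluster.
  if kubectl get crd gateways.gateway.networking.k8s.io >/dev/null 2>&1; then
    echo "Gateway API CRDs already installed"
    return 0
  fi
  # Render the CRDs from the upstream repository and apply them.
  kubectl kustomize "github.com/kubernetes-sigs/gateway-api/config/crd?ref=${GATEWAY_API_VERSION}" \
    | kubectl apply -f -
}
```

Call `install_gateway_api_crds` once per cluster; it does nothing if the CRDs are already there.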
Now that Ambient mode is included in the main branch, the nightly build containers published at gcr.io/istio-testing/ contain the ambient mode functionality. We can use the istioctl container to install Istio in our cluster (the command shown here deploys the Istio ingress gateway; please refer to this guide if you wish to deploy the new Gateway API-based ingress):
Confirm that all pods in the istio-system namespace are up and running:
kubectl get pod -n istio-system
Note the istio-cni and ztunnel daemonsets: the first takes care of modifying the iptables rules on each node to redirect mesh traffic to the ztunnel, and the latter is the L4 proxy that tunnels connections to and from pods that are part of the mesh.
Istio-CNI works in parallel with Azure CNI, and the two do not interfere with each other. The ztunnel source code lives outside of the istio repository and has its own issue tracker, which is important to know when looking for known issues about Istio Ambient.
The traffic interception in Ambient Mesh works by leveraging GENEVE tunnels and iptables interception, as explained in detail in this blog post by Peter Jausovec; it’s completely transparent to the user and only works on tagged traffic, allowing flexible interoperability of non-mesh and meshed applications.
Let’s deploy a sample application
Deploy the bookinfo demo app and tag the namespace to be part of the Ambient Mesh:
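A sketch of this step, assuming the sample manifests are taken from the master branch of the istio/istio repository and that Bookinfo goes into the `default` namespace (both are assumptions for this example):

```shell
# Bookinfo sample manifests in the istio/istio repository (master branch assumed).
BOOKINFO_BASE="https://raw.githubusercontent.com/istio/istio/master/samples/bookinfo"

deploy_bookinfo() {
  # Application workloads and services.
  kubectl apply -f "${BOOKINFO_BASE}/platform/kube/bookinfo.yaml" || return 1
  # Gateway and VirtualService that expose /productpage through the ingress gateway.
  kubectl apply -f "${BOOKINFO_BASE}/networking/bookinfo-gateway.yaml" || return 1
  # Label the namespace so its pods become part of the Ambient Mesh.
  kubectl label namespace default istio.io/dataplane-mode=ambient --overwrite
}
```

No pod restart is needed after labeling: the pods are captured by the ztunnel without any sidecar injection.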
To check if the traffic is encrypted we are going to:
- Find the IP address of the istio-ingressgateway, which is exposed through an Azure Load Balancer by a Kubernetes Service of type LoadBalancer in the istio-system namespace.
- Use curl to generate some traffic.
- Use Stern to look at logs of the ztunnel pods.
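The three steps above can be sketched as small shell helpers. The `/productpage` path assumes that the Bookinfo gateway is deployed, and the request count of five is arbitrary.

```shell
ingress_ip() {
  # External IP assigned by the Azure Load Balancer to the ingress gateway Service.
  kubectl get service istio-ingressgateway -n istio-system \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

generate_traffic() {
  ip="$(ingress_ip)"
  if [ -z "$ip" ]; then
    echo "ingress gateway has no external IP yet" >&2
    return 1
  fi
  # Five requests to the Bookinfo product page; print only the HTTP status codes.
  for i in 1 2 3 4 5; do
    curl -s -o /dev/null -w '%{http_code}\n' "http://${ip}/productpage"
  done
}

tail_ztunnel_logs() {
  # stern tails the logs of every pod whose name matches "ztunnel".
  stern ztunnel -n istio-system
}
```

Run `tail_ztunnel_logs` in one terminal and `generate_traffic` in another to watch the connections as they pass through the ztunnel pods.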
Notice the log lines confirming that the traffic flows through the ztunnel and into the application pods:
Note the traffic flowing outbound from one ztunnel and inbound into the second ztunnel, and the correct use of SPIFFE identities (which will come in handy in the next section).
Add an L7 Gateway
We can add a waypoint proxy with the new “x waypoint apply” command of istioctl; this will create a waypoint proxy in the same namespace as the application, associated with the service account bookinfo-reviews:
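As a rough sketch, the invocation could look like this. It assumes an `istioctl` binary (or a wrapper around the istioctl container) is available on the PATH, that Bookinfo runs in the current kubectl namespace, and that the experimental command accepts the `--service-account` flag as it did in builds from around the time of writing; the experimental CLI may change between builds.

```shell
add_reviews_waypoint() {
  # Create a waypoint proxy bound to the bookinfo-reviews service account.
  istioctl x waypoint apply --service-account bookinfo-reviews || return 1
  # Waypoints are modeled as Gateway API resources, so the new one should be listed here.
  kubectl get gateways.gateway.networking.k8s.io
}
```

After calling `add_reviews_waypoint`, a new waypoint pod should appear next to the application pods.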
A waypoint proxy will make sure that the L7 policies are applied to the connections to the pods using the service account, and that custom policies are enforced, such as request type limiting, network routing, etc.
This setup can be seen in the picture below.
When you execute the same request you can see the product page waypoint pod being used:
More examples of using L7 waypoint proxy are available in the preliminary Istio documentation.
Conclusion
We demonstrated how the latest version of Istio Ambient can be easily deployed in Azure Kubernetes Service, enabling its users to kick the tires of this new sidecar-less model even in a managed Kubernetes service. This new operating model for service mesh allows for progressive adoption and incremental enablement of your workloads, avoiding big-bang migrations and letting meshed applications coexist side by side with other applications in your cluster.