Istio v1.22: Trying the latest Ambient mode

Need for Istio

Istio is a service mesh for Kubernetes clusters. It offers features like a zero-trust network using mTLS, traffic management, routing, and observability using Kiali. How does it do all this? By injecting a lightweight sidecar proxy container into every pod, so that all traffic flows to and from these proxies. In layman's terms, if a Kubernetes cluster is a town with paved roads, Istio installs highways with traffic signals, CCTV cameras, and a centralized traffic HQ.

Istio’s traditional model deploys Envoy proxies as sidecars within the workloads’ pods
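For context, the classic sidecar model is opted into per namespace: the standard injection label below tells istiod to inject an istio-proxy container into every pod created in that namespace afterwards.

kubectl label namespace default istio-injection=enabled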

Limitations with Istio

The problem with Istio's sidecar model is that it needs dedicated resources for the sidecar containers. Generally, 128Mi of memory and 100m of CPU are requested for each Istio sidecar. Per pod that is minimal, but it adds up across a cluster: with 200 pods, that is roughly 25GiB of memory and 20 CPU cores reserved just for proxies.

The other problem is that traffic between pods makes two extra hops, through the sending and receiving sidecar containers, which adds latency. Yes, it is minimal (~10-20ms), but it adds up. This article explores the topic in more detail.

What is Istio Ambient Mesh

On the surface, it is just a new way of proxying without injecting sidecar containers. This ups Istio's game, bringing it on par with Cilium, which uses eBPF to manipulate traffic instead of sidecars. Istio now separates the concerns of L4 and L7 proxying by using two different components to proxy traffic. This revolutionises the way Istio is used.

For L4 proxying, Istio uses a per-node proxy called ztunnel. It is deployed as a DaemonSet on every node, intercepts bidirectional traffic, and secures it using mTLS.
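ztunnel upgrades pod-to-pod traffic to mTLS automatically. If you want to require mTLS rather than merely enable it, the usual mesh-wide PeerAuthentication policy still applies; here is the standard sketch (not ambient-specific):

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT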

For L7 proxying, Istio uses the waypoint proxy. It can be deployed as one per namespace or one per service, running in parallel with the application pods rather than inside them.

Testing Istio in Ambient Mode

For this, I am trying out the ambient mode test setup described in Istio's blog.

I have created a kind cluster:

➜  terraform_projects git:(main) ✗ kind create cluster --config=- <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: ambient
nodes:
- role: control-plane
- role: worker
- role: worker
EOF
Creating cluster "ambient" ...
 ✓ Ensuring node image (kindest/node:v1.29.2) 🖼 
 ✓ Preparing nodes 📦 📦 📦  
 ✓ Writing configuration 📜 
 ✓ Starting control-plane 🕹️ 
 ✓ Installing CNI 🔌 
 ✓ Installing StorageClass 💾 
 ✓ Joining worker nodes 🚜 
Set kubectl context to "kind-ambient"
You can now use your cluster with:

kubectl cluster-info --context kind-ambient

Thanks for using kind! 😊
➜  terraform_projects git:(main) ✗ k get nodes
NAME                    STATUS   ROLES           AGE     VERSION
ambient-control-plane   Ready    control-plane   5m58s   v1.29.2
ambient-worker          Ready    <none>          5m34s   v1.29.2
ambient-worker2         Ready    <none>          5m34s   v1.29.2

Now, to install Istio, I am using istioctl:

➜  ~ istioctl install --set profile=ambient

This will install the Istio 1.21.2 "ambient" profile (with components: Istio core, Istiod, CNI, and Ztunnel) into the cluster. Proceed? (y/N) y
✔ Istio core installed                                                                                                                                                                  
✔ Istiod installed                                                                                                                                                                      
✔ CNI installed                                                                                                                                                                         
✔ Ztunnel installed                                                                                                                                                                     
✔ Installation complete
Made this installation the default for injection and validation.

If we check the pods in the istio-system namespace:

➜  ~ k get pods -n istio-system 
NAME                     READY   STATUS    RESTARTS   AGE
istio-cni-node-blxdv     1/1     Running   0          8m2s
istio-cni-node-k5q9p     1/1     Running   0          8m2s
istio-cni-node-nmg86     1/1     Running   0          8m2s
istiod-ff94b9d97-6tzfp   1/1     Running   0          8m17s
ztunnel-dsfdh            1/1     Running   0          7m28s
ztunnel-kwnz7            1/1     Running   0          7m28s
ztunnel-z78gf            1/1     Running   0          7m28s

Both ztunnel and the Istio CNI node agent are installed as DaemonSets. ztunnel does the actual proxying, while the CNI agent configures redirection rules on each node so that all traffic entering and leaving ambient-enrolled pods is transparently routed through the node-local ztunnel, effectively forcing the traffic onto the ambient data plane.
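To see which workloads the node-local ztunnel knows about, istioctl ships a ztunnel-config inspection command (istioctl ztunnel-config workloads in recent releases; it lived under istioctl experimental earlier, so check your version):

istioctl ztunnel-config workloads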

To test it, I have installed the bookinfo application, along with the sleep and notsleep pods.

➜  ~ k apply -f https://raw.githubusercontent.com/istio/istio/release-1.22/samples/bookinfo/platform/kube/bookinfo.yaml
service/details created
serviceaccount/bookinfo-details created
deployment.apps/details-v1 created
service/ratings created
serviceaccount/bookinfo-ratings created
deployment.apps/ratings-v1 created
service/reviews created
serviceaccount/bookinfo-reviews created
deployment.apps/reviews-v1 created
deployment.apps/reviews-v2 created
deployment.apps/reviews-v3 created
service/productpage created
serviceaccount/bookinfo-productpage created
deployment.apps/productpage-v1 created
➜  ~ k get pods
NAME                             READY   STATUS              RESTARTS   AGE
details-v1-cf74bb974-8bmhv       0/1     ContainerCreating   0          32s
productpage-v1-87d54dd59-2mxrt   1/1     Running             0          32s
ratings-v1-7c4bbf97db-cxmps      1/1     Running             0          32s
reviews-v1-5fd6d4f8f8-sfzv4      0/1     ContainerCreating   0          32s
reviews-v2-6f9b55c5db-9wh4k      0/1     ContainerCreating   0          32s
reviews-v3-7d99fd7978-jm5sr      0/1     ContainerCreating   0          32s
➜  ~ kubectl apply -f https://raw.githubusercontent.com/linsun/sample-apps/main/sleep/sleep.yaml
serviceaccount/sleep created
service/sleep created
deployment.apps/sleep created
➜  ~ kubectl apply -f https://raw.githubusercontent.com/linsun/sample-apps/main/sleep/notsleep.yaml
serviceaccount/notsleep created
service/notsleep created
deployment.apps/notsleep created

Ambient mode is enabled per namespace, and we haven't enabled it on ours yet. So this is still plain Kubernetes networking: I can hit the productpage service from the sleep pod and get a response.

➜  ~ kubectl exec deploy/sleep -- curl -s http://productpage:9080/           
<!DOCTYPE html>
<html>
  <head>
    <title>Simple Bookstore App</title>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">

After enabling ambient mode with kubectl label namespace default istio.io/dataplane-mode=ambient, I could see that the traffic now flows through ztunnel:

➜  ~ kubectl label namespace default istio.io/dataplane-mode=ambient
namespace/default labeled
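Note that enrolling the namespace does not restart the application pods, and they stay at 1/1 containers; there is no sidecar to inject. A quick way to confirm the pod specs are untouched (plain kubectl, nothing ambient-specific):

kubectl get pods -o custom-columns=NAME:.metadata.name,CONTAINERS:.spec.containers[*].name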
➜  ~ kubectl exec deploy/sleep -- curl -s http://productpage:9080/ | head -n1
<!DOCTYPE html>
➜  ~ kubectl exec deploy/sleep -- curl -s http://productpage:9080/ | head -n5
<!DOCTYPE html>
<html>
  <head>
    <title>Simple Bookstore App</title>
<meta charset="utf-8">

Here are the ztunnel logs for these requests:

2024-05-17T20:25:18.143503Z     info    access  connection complete     src.addr=10.244.1.4:42786 src.workload=sleep-97576b68f-ftqsh src.namespace=default src.identity="spiffe://cluster.local/ns/default/sa/sleep" dst.addr=10.244.2.8:15008 dst.hbone_addr=10.244.2.8:9080 dst.service=productpage.default.svc.cluster.local dst.workload=productpage-v1-87d54dd59-jvj79 dst.namespace=default dst.identity="spiffe://cluster.local/ns/default/sa/bookinfo-productpage" direction="outbound" bytes_sent=79 bytes_recv=1918 duration="128ms"
2024-05-17T20:25:19.755188Z     info    access  connection complete     src.addr=10.244.1.4:42796 src.workload=sleep-97576b68f-ftqsh src.namespace=default src.identity="spiffe://cluster.local/ns/default/sa/sleep" dst.addr=10.244.2.8:15008 dst.hbone_addr=10.244.2.8:9080 dst.service=productpage.default.svc.cluster.local dst.workload=productpage-v1-87d54dd59-jvj79 dst.namespace=default dst.identity="spiffe://cluster.local/ns/default/sa/bookinfo-productpage" direction="outbound" bytes_sent=79 bytes_recv=1918 duration="5ms"
2024-05-17T20:25:21.033103Z     info    access  connection complete     src.addr=10.244.1.4:42810 src.workload=sleep-97576b68f-ftqsh src.namespace=default src.identity="spiffe://cluster.local/ns/default/sa/sleep" dst.addr=10.244.2.8:15008 dst.hbone_addr=10.244.2.8:9080 dst.service=productpage.default.svc.cluster.local dst.workload=productpage-v1-87d54dd59-jvj79 dst.namespace=default dst.identity="spiffe://cluster.local/ns/default/sa/bookinfo-productpage" direction="outbound" bytes_sent=79 bytes_recv=1918 duration="7ms"
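Both ends carry spiffe:// identities, and the destination is port 15008, the HBONE tunnel port, which confirms the requests are flowing through ztunnel over mTLS. To follow these logs yourself, tailing the DaemonSet pods should work (assuming the default app=ztunnel pod label):

kubectl logs -n istio-system -l app=ztunnel -f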

Beyond mTLS, you can set authorization policies for individual workloads. The policy below allows access to productpage only from the sleep service account and the ingress gateway:

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: productpage-viewer
  namespace: default
spec:
  selector:
    matchLabels:
      app: productpage
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/default/sa/sleep", "cluster.local/ns/istio-system/sa/istio-ingressgateway-service-account"]
After applying the policy, sleep can still reach productpage, but notsleep is rejected:

➜  ~ kubectl exec deploy/sleep -- curl -s http://productpage:9080/ | head -n1
<!DOCTYPE html>
➜  ~ kubectl exec deploy/notsleep -- curl -s http://productpage:9080/ | head -n1


command terminated with exit code 56

curl exit code 56 (failure receiving network data) means the connection was reset: notsleep's service account is not in the allowed principals, so ztunnel drops it. This identity-based L4 enforcement happens entirely in ztunnel, without a sidecar or a waypoint.

Using the Kubernetes Gateway API, you can bring up a waypoint proxy for L7 processing and load balancing. This creates standalone waypoint proxy pods that are not attached to the application pods.
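I did not get to waypoints in this test, but for reference, a waypoint is declared as a Gateway of class istio-waypoint. The sketch below mirrors what istioctl waypoint generate produces in 1.22; the istio.io/use-waypoint label to attach the namespace to it is my reading of the 1.22 flow, so treat this as a hedged example:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: waypoint
  namespace: default
  labels:
    istio.io/waypoint-for: service
spec:
  gatewayClassName: istio-waypoint
  listeners:
  - name: mesh
    port: 15008
    protocol: HBONE

kubectl label namespace default istio.io/use-waypoint=waypoint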

Trying on EKS

I also tried the same demo on an EKS cluster, but istiod and most of the ztunnel DaemonSet stayed stuck in Pending.

➜  ~ k get pods -n istio-system 
NAME                      READY   STATUS    RESTARTS   AGE
istio-cni-node-26bn9      1/1     Running   0          5m27s
istio-cni-node-kzd82      1/1     Running   0          5m27s
istio-cni-node-x99n2      1/1     Running   0          5m27s
istiod-5888647857-d9vdh   0/1     Pending   0          5m22s
ztunnel-7qr8l             0/1     Pending   0          5m22s
ztunnel-pgpsn             0/1     Pending   0          5m22s
ztunnel-z2kvc             0/1     Running   0          5m22s

The logs of the one running ztunnel pod show it failing with an XDS client connection error, which makes sense: istiod itself is Pending, so there is no control plane for ztunnel to connect to.

2024-05-17T17:33:58.934625Z     warn    xds::client:xds{id=29}  XDS client connection error: gRPC connection error:status: Unknown, message: "client error (Connect)", source: tcp connect error: Connection refused (os error 111), retrying in 15s
2024-05-17T17:34:13.939490Z     warn    xds::client:xds{id=30}  XDS client connection error: gRPC connection error:status: Unknown, message: "client error (Connect)", source: tcp connect error: Connection refused (os error 111), retrying in 15s
2024-05-17T17:34:28.943546Z     warn    xds::client:xds{id=31}  XDS client connection error: gRPC connection error:status: Unknown, message: "client error (Connect)", source: tcp connect error: Connection refused (os error 111), retrying in 15s
2024-05-17T17:34:43.947796Z     warn    xds::client:xds{id=32}  XDS client connection error: gRPC connection error:status: Unknown, message: "client error (Connect)", source: tcp connect error: Connection refused (os error 111), retrying in 15s
2024-05-17T17:34:58.950405Z     warn    xds::client:xds{id=33}  XDS client connection error: gRPC connection error:status: Unknown, message: "client error (Connect)", source: tcp connect error: Connection refused (os error 111), retrying in 15s
2024-05-17T17:35:13.955870Z     warn    xds::client:xds{id=34}  XDS client connection error: gRPC connection error:status: Unknown, message: "client error (Connect)", source: tcp connect error: Connection refused (os error 111), retrying in 15s
2024-05-17T17:35:28.960054Z     warn    xds::client:xds{id=35}  XDS client connection error: gRPC connection error:status: Unknown, message: "client error (Connect)", source: tcp connect error: Connection refused (os error 111), retrying in 15s
2024-05-17T17:35:43.964064Z     warn    xds::client:xds{id=36}  XDS client connection error: gRPC connection error:status: Unknown, message: "client error (Connect)", source: tcp connect error: Connection refused (os error 111), retrying in 15s
2024-05-17T17:35:58.968351Z     warn    xds::client:xds{id=37}  XDS client connection error: gRPC connection error:status: Unknown, message: "client error (Connect)", source: tcp connect error: Connection refused (os error 111), retrying in 15s
2024-05-17T17:36:13.972471Z     warn    xds::client:xds{id=38}  XDS client connection error: gRPC connection error:status: Unknown, message: "client error (Connect)", source: tcp connect error: Connection refused (os error 111), retrying in 15s
2024-05-17T17:36:28.975409Z     warn    xds::client:xds{id=39}  XDS client connection error: gRPC connection error:status: Unknown, message: "client error (Connect)", source: tcp connect error: Connection refused (os error 111), retrying in 15s
2024-05-17T17:36:43.979856Z     warn    xds::client:xds{id=40}  XDS client connection error: gRPC connection error:status: Unknown, message: "client error (Connect)", source: tcp connect error: Connection refused (os error 111), retrying in 15s
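Since istiod never scheduled, the ztunnel errors are a symptom rather than the cause. The next step would be checking the scheduler events, something like:

kubectl describe pod -n istio-system -l app=istiod
kubectl get events -n istio-system --sort-by=.lastTimestamp

On small EKS node groups, Pending pods often come down to insufficient CPU/memory or the per-node pod limit, but I have not confirmed the root cause here.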

Conclusion

I believe this is Istio's attempt to stay relevant and address customer concerns, and it has hit the nail on the head. Looking forward to testing it further.
