Kubernetes Networking
By the end of this exercise, you should be able to:
- Predict what routing table rules Calico will write to each host in your cluster
- Route and load balance traffic to deployments using ClusterIP and NodePort services
- Reconfigure a deployment into a DaemonSet (analogous to changing scheduling from 'replicated' to 'global' in a Swarm service)
Routing Traffic with Calico
Make sure you're on the master node node-0, and redeploy the nginx deployment defined in `deployment.yaml` from the last exercise. List your pods:

```
[centos@node-0 ~]$ kubectl get pods
```

Get some metadata on one of the pods found in the last step:

```
[centos@node-0 ~]$ kubectl describe pods <pod name>
```

which in my case results in:
```
Name:               nginx-deployment-69df458bc5-bb87w
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               node-2/10.10.43.25
Start Time:         Thu, 09 Aug 2018 17:29:52 +0000
Labels:             app=nginx
                    pod-template-hash=2589014671
Annotations:        <none>
Status:             Running
IP:                 192.168.247.10
Controlled By:      ReplicaSet/nginx-deployment-69df458bc5
Containers:
  nginx:
    Container ID:   docker://26e8eac8d5a89b7cf2f2af762de88d7f4fa234174881626a1427b813c06b1362
    Image:          nginx:1.7.9
    Image ID:       docker-pullable://nginx@sha256:e3456c851a152494c3e4ff5fcc26f240206abac0c9d794affb40e0714846c451
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Thu, 09 Aug 2018 17:29:53 +0000
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-fkf5d (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-fkf5d:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-fkf5d
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age  From               Message
  ----    ------     ---- ----               -------
  Normal  Scheduled  20s  default-scheduler  Successfully assigned default/nginx-deployment-69df458bc5-bb87w to node-2
  Normal  Pulled     19s  kubelet, node-2    Container image "nginx:1.7.9" already present on machine
  Normal  Created    19s  kubelet, node-2    Created container
  Normal  Started    19s  kubelet, node-2    Started container
```

We can see that in our case the pod has been deployed to
`node-2`, as indicated near the top of the output, and the pod has an IP of `192.168.247.10`.

Have a look at the routing table on `node-0` using `ip route`, which for my example looks like:

```
[centos@node-0 ~]$ ip route
default via 10.10.0.1 dev eth0
10.10.0.0/20 dev eth0 proto kernel scope link src 10.10.7.20
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
blackhole 192.168.39.192/26 proto bird
192.168.39.193 dev cali12b0eb5c038 scope link
192.168.39.194 dev calibe752d56965 scope link
192.168.84.128/26 via 10.10.24.89 dev tunl0 proto bird onlink
192.168.247.0/26 via 10.10.43.25 dev tunl0 proto bird onlink
```

Notice the last line; this rule was written by Calico to send any traffic on the 192.168.247.0/26 subnet (which the pod we examined above is on) to the host at IP 10.10.43.25 via IP-in-IP, as indicated by the `dev tunl0` entry. Look at your own routing table and list of VM IPs; what are the corresponding subnets, pod IPs and host IPs in your case? Does that make sense based on the host you found for the nginx pod above?

Curl your pod's IP on port 80 from
`node-0`; you should see the HTML for the nginx landing page. By default this pod is reachable at this IP from anywhere in the Kubernetes cluster.

Head over to the node this pod got scheduled on (`node-2` in the example above), and have a look at that host's routing table in the same way:

```
[centos@node-2 ~]$ ip route
default via 10.10.32.1 dev eth0
10.10.32.0/20 dev eth0 proto kernel scope link src 10.10.43.25
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
192.168.39.192/26 via 10.10.7.20 dev tunl0 proto bird onlink
192.168.84.128/26 via 10.10.24.89 dev tunl0 proto bird onlink
blackhole 192.168.247.0/26 proto bird
192.168.247.10 dev calia5daa4e7a1d scope link
192.168.247.11 dev cali9ff153fb143 scope link
```

Again notice the second-to-last line; this time, the pod IP is routed to a `cali***` device, which is a virtual ethernet endpoint in the host's network namespace, providing a point of ingress into that pod. Once again try `curl <pod IP>:80`; you'll see the nginx landing page HTML as before.

Back on `node-0`, fetch the logs generated by the pod you've been curling:

```
[centos@node-0 ~]$ kubectl logs <pod name>
10.10.52.135 - - [09/May/2018:13:58:42 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0" "-"
192.168.84.128 - - [09/May/2018:14:00:41 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0" "-"
```

We see records of the curls we performed above; like Docker containers, these logs are the STDOUT and STDERR of the containerized processes.
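The route selection Calico relies on above is ordinary longest-prefix matching on destination subnets. Here's a quick sketch using Python's `ipaddress` module, with the pod IP and subnets taken from the example output above (your values will differ):

```python
import ipaddress

# Values from the example `ip route` output above
pod_ip = ipaddress.ip_address("192.168.247.10")
remote_subnet = ipaddress.ip_network("192.168.247.0/26")  # routed via tunl0 to 10.10.43.25
local_subnet = ipaddress.ip_network("192.168.39.192/26")  # node-0's own pod subnet

# The kernel picks the route whose destination subnet contains the pod IP:
print(pod_ip in remote_subnet)  # True  -> encapsulate via IP-in-IP to node-2
print(pod_ip in local_subnet)   # False -> not a pod hosted on node-0
```

This is why the curl from `node-0` works: the destination matches the `192.168.247.0/26 via 10.10.43.25 dev tunl0` rule, and the packet is tunneled to the host actually running the pod.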
Routing and Load Balancing with Services
Above we were able to hit nginx at the pod IP, but there is no guarantee this pod won't get rescheduled to a new IP. If we want a stable IP for this deployment, we need to create a `ClusterIP` service. In a file `cluster.yaml` on your master `node-0`:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: cluster-demo
spec:
  selector:
    app: nginx
  ports:
  - port: 8080
    targetPort: 80
```

Create this service with `kubectl create -f cluster.yaml`. This maps the pod-internal port 80 to the cluster-wide port 8080; this IP and port will be reachable only from within the cluster. Also note the `selector: app: nginx` specification; that indicates that this service will route traffic to every pod that has `nginx` as the value of the `app` label in this namespace.

Let's see what services we have now:
```
[centos@node-0 ~]$ kubectl get services
NAME           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
kubernetes     ClusterIP   10.96.0.1       <none>        443/TCP    33m
cluster-demo   ClusterIP   10.104.201.93   <none>        8080/TCP   48s
```

The second one is the one we just created, and we can see that a stable IP address and port `10.104.201.93:8080` has been assigned to our nginx service. Let's try to access nginx now, from any node in our cluster:

```
[centos@node-0 ~]$ curl <nginx CLUSTER-IP>:8080
```

which should return the nginx welcome page. Even if pods get rescheduled to new IPs, this ClusterIP service will preserve a stable entrypoint for traffic to be load balanced across all pods matching the service's label selector.
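Conceptually, a ClusterIP service is a stable virtual IP in front of whatever pods currently match the selector; kube-proxy in its iptables mode picks a backend at random for each connection. A toy sketch of that selection (the endpoint IPs here are made up for illustration, not from the exercise):

```python
import random

# Hypothetical pod endpoints currently matching the selector app=nginx
endpoints = ["192.168.247.10:80", "192.168.247.11:80", "192.168.39.194:80"]

def route(service_addr: str) -> str:
    """Pick a backend pod for a connection to the service's stable IP:port.

    kube-proxy's iptables mode selects a backend at random per connection,
    rather than round-robin; rescheduled pods simply update the endpoint list.
    """
    return random.choice(endpoints)

backend = route("10.104.201.93:8080")
print(backend in endpoints)  # True
```

The point is that clients only ever see the stable `10.104.201.93:8080`; the endpoint list behind it can change freely as pods come and go.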
ClusterIP services are reachable only from within the Kubernetes cluster. If you want to route traffic to your pods from an external network, you'll need a NodePort service. On your master `node-0`, create a file `nodeport.yaml`:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nodeport-demo
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
  - port: 8080
    targetPort: 80
```

And create this service with `kubectl create -f nodeport.yaml`. Notice this is exactly the same as the ClusterIP service definition, but now we're requesting type `NodePort`.

Inspect this service's metadata:

```
[centos@node-0 ~]$ kubectl describe service nodeport-demo
```

Notice the `NodePort` field: this is a randomly selected port from the range 30000-32767 where your pods will be reachable externally. Try visiting your nginx deployment at any public IP of your cluster on the port you found above, and confirm you can see the nginx landing page.
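The default NodePort allocation range is 30000-32767 (it can be changed on the API server with the `--service-node-port-range` flag). A trivial helper to sanity-check a port you read off `kubectl describe` — a hypothetical helper for illustration, not part of kubectl:

```python
def is_valid_node_port(port: int, lo: int = 30000, hi: int = 32767) -> bool:
    """Check a port against Kubernetes' default NodePort allocation range."""
    return lo <= port <= hi

# A port like 30080 could have been allocated; 8080 never will be by default
print(is_valid_node_port(30080))  # True
print(is_valid_node_port(8080))   # False
```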
Clean up the objects you created in this section:
```
[centos@node-0 ~]$ kubectl delete deployment nginx-deployment
[centos@node-0 ~]$ kubectl delete service cluster-demo
[centos@node-0 ~]$ kubectl delete service nodeport-demo
```
Optional: Deploying DockerCoins onto the Kubernetes Cluster
First deploy Redis via `kubectl run`:

```
[centos@node-0 ~]$ kubectl run redis --image=redis
```

And now all the other deployments. To avoid too much typing, we do that in a loop:

```
[centos@node-0 ~]$ for DEPLOYMENT in hasher rng webui worker; do
      kubectl run $DEPLOYMENT --image=training/dockercoins_${DEPLOYMENT}:1.0
  done
```

Let's see what we have:

```
[centos@node-0 ~]$ kubectl get pods -o wide -w
```

In my case the result is:

```
hasher-6c64f78655-rgjk5   1/1   Running   0   53s   10.36.0.1   node-2
redis-75586d7d7c-mmjg7    1/1   Running   0   5m    10.44.0.2   node-1
rng-d94d56d4f-twlwz       1/1   Running   0   53s   10.44.0.1   node-1
webui-6d8668984d-sqtt8    1/1   Running   0   52s   10.36.0.2   node-2
worker-56756ddbb8-lbv9r   1/1   Running   0   52s   10.44.0.3   node-1
```

Pods have been distributed across our cluster.
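Since the node name is the last field of each line, you can tally the spread across nodes with a few lines of Python; a sketch using the example listing above:

```python
from collections import Counter

# Lines pasted from the example `kubectl get pods -o wide` output above
listing = """\
hasher-6c64f78655-rgjk5   1/1   Running   0   53s   10.36.0.1   node-2
redis-75586d7d7c-mmjg7    1/1   Running   0   5m    10.44.0.2   node-1
rng-d94d56d4f-twlwz       1/1   Running   0   53s   10.44.0.1   node-1
webui-6d8668984d-sqtt8    1/1   Running   0   52s   10.36.0.2   node-2
worker-56756ddbb8-lbv9r   1/1   Running   0   52s   10.44.0.3   node-1
"""

# The node name is the last whitespace-separated column of each line
per_node = Counter(line.split()[-1] for line in listing.splitlines())
print(per_node["node-1"], per_node["node-2"])  # 3 2
```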
We can also look at some logs:
```
[centos@node-0 ~]$ kubectl logs deploy/rng
[centos@node-0 ~]$ kubectl logs deploy/worker
```

The `rng` service (and also the `hasher` and `webui` services) seems to work fine, but the `worker` service reports errors. The reason is that unlike Swarm, Kubernetes does not automatically provide a stable networking endpoint for deployments. We need to create at least a `ClusterIP` service for each of our deployments so they can communicate.

List your current services:
```
[centos@node-0 ~]$ kubectl get services
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   46m
```

Expose `redis`, `rng` and `hasher` internally using services, specifying the correct (internal) port for each:

```
[centos@node-0 ~]$ kubectl expose deployment redis --port 6379
[centos@node-0 ~]$ kubectl expose deployment rng --port 80
[centos@node-0 ~]$ kubectl expose deployment hasher --port 80
```

List your services again:
```
[centos@node-0 ~]$ kubectl get services
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
hasher       ClusterIP   10.108.207.22    <none>        80/TCP     20s
kubernetes   ClusterIP   10.96.0.1        <none>        443/TCP    47m
redis        ClusterIP   10.100.14.121    <none>        6379/TCP   31s
rng          ClusterIP   10.111.235.252   <none>        80/TCP     26s
```

Evidently `kubectl expose` creates `ClusterIP` services, providing stable, internal reachability for your deployments, much like you did via yaml manifests for your nginx deployment in the last section. See the `kubectl` API docs for more command-line alternatives to yaml manifests.

Get the logs of the worker again:
```
[centos@node-0 ~]$ kubectl logs deploy/worker
```

This time you should see that the `worker` has recovered (give it at least 10 seconds to do so). The `worker` can now access the other services.

Now let's expose the `webui` to the public using a service of type `NodePort`:

```
[centos@node-0 ~]$ kubectl expose deploy/webui --type=NodePort --port 80
```

List your services one more time:

```
[centos@node-0 ~]$ kubectl get services
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
hasher       ClusterIP   10.108.207.22    <none>        80/TCP         2m
kubernetes   ClusterIP   10.96.0.1        <none>        443/TCP        49m
redis        ClusterIP   10.100.14.121    <none>        6379/TCP       2m
rng          ClusterIP   10.111.235.252   <none>        80/TCP         2m
webui        NodePort    10.108.88.182    <none>        80:32015/TCP   33s
```

Notice the
`NodePort` service created for `webui`. This type of service provides behavior similar to the Swarm L4 mesh net: a port (32015 in my case) has been reserved across the cluster; any external traffic hitting any cluster IP on that port will be directed to port 80 inside a `webui` pod.

Visit your DockerCoins web UI at `http://<node IP>:<port>`, where `<node IP>` is the public IP address of any of your cluster members. You should see the dashboard of our DockerCoins application.

Let's scale up the worker a bit and see the effect:

```
[centos@node-0 ~]$ kubectl scale deploy/worker --replicas=10
```

Observe the result of this scaling in the browser. We do not really get a 10-fold increase in throughput, just as when we deployed DockerCoins on Swarm; the `rng` service is causing a bottleneck.
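Why doesn't 10x the workers give 10x the throughput? The mining pipeline is capped by its slowest stage, so adding workers past the point where `rng` saturates buys nothing. A toy model (all rates are made up for illustration):

```python
def pipeline_throughput(worker_replicas: int,
                        per_worker_rate: float,
                        rng_capacity: float) -> float:
    """Hashes/sec for the pipeline: capped by whichever stage saturates first."""
    return min(worker_replicas * per_worker_rate, rng_capacity)

# Hypothetical rates: each worker could do 4 hashes/sec,
# but rng can only serve enough randomness for 10 hashes/sec total
print(pipeline_throughput(1, 4.0, 10.0))   # 4.0  -> scaling helps at first
print(pipeline_throughput(10, 4.0, 10.0))  # 10.0 -> rng-bound, not 40.0
```

This is exactly why the next step scales `rng` itself, by running one instance per node.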
rngon each node of the cluster. For this we use aDaemonSet. We do this by using a yaml file that captures the desired configuration, rather than through the CLI.Create a file
deploy-rng.yamlas follows:[centos@node-0 ~]$ kubectl get deploy/rng -o yaml --export > deploy-rng.yamlNote:
--exportwill remove "cluster-specific" informationEdit this file to make it describe a
DaemonSetinstead of aDeployment:- change
kindtoDaemonSet - remove the
progressDeadlineSecondsfield - remove the
replicasfield - remove the
strategyblock (which defines the rollout mechanism for a deployment) - remove the
status: {}line at the end
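After those edits, the manifest should look roughly like the sketch below (the labels, image tag, and apiVersion come from your exported file and may differ):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    run: rng
  name: rng
spec:
  selector:
    matchLabels:
      run: rng
  template:
    metadata:
      labels:
        run: rng
    spec:
      containers:
      - image: training/dockercoins_rng:1.0
        name: rng
```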
Now apply this YAML file to create the `DaemonSet`:

```
[centos@node-0 ~]$ kubectl apply -f deploy-rng.yaml
```

We can now look at the `DaemonSet` that was created:

```
[centos@node-0 ~]$ kubectl get daemonset
NAME   DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
rng    2         2         2       2            2           <none>          1m
```

DockerCoins performance should now improve, as illustrated by your web UI.
If we do a `kubectl get all`, we will see that we now have both a `deployment.apps/rng` AND a `daemonset.apps/rng`; deployments are not automatically converted to daemon sets. Let's delete the `rng` deployment:

```
[centos@node-0 ~]$ kubectl delete deploy/rng
```

Clean up your resources when done:
```
[centos@node-0 ~]$ for D in redis hasher rng webui; \
    do kubectl delete svc/$D; done
[centos@node-0 ~]$ for D in redis hasher webui worker; \
    do kubectl delete deploy/$D; done
[centos@node-0 ~]$ kubectl delete ds/rng
```

Make sure that everything is cleared:

```
[centos@node-0 ~]$ kubectl get all
```

This should show only the `svc/kubernetes` resource.
Conclusion
In this exercise, we looked at some of the key Kubernetes service objects that provide routing and load balancing for collections of pods: ClusterIP for internal communication, analogous to Swarm's VIPs, and NodePort for routing external traffic to an app, similar to Swarm's L4 mesh net. We also briefly touched on the inner workings of Calico, one of many Kubernetes network plugins and the one that ships natively with Docker's Enterprise Edition product. The key networking difference between Swarm and Kubernetes is their approach to default firewalling: while Swarm firewalls its software-defined networks automatically, all pods can reach all other pods on a Kubernetes cluster, in Calico's case via the BGP-updated control plane and IP-in-IP data plane you explored above.