Kubernetes Networking
By the end of this exercise, you should be able to:
- Predict what routing table rules Calico will write to each host in your cluster
- Route and load balance traffic to deployments using ClusterIP and NodePort services
- Reconfigure a deployment into a DaemonSet (analogous to changing scheduling from 'replicated' to 'global' in a Swarm service)
Routing Traffic with Calico
Make sure you're on the master node node-0, and redeploy the nginx deployment defined in `deployment.yaml` from the last exercise. List your pods:

```
[centos@node-0 ~]$ kubectl get pods
```

Get some metadata on one of the pods found in the last step:

```
[centos@node-0 ~]$ kubectl describe pods <pod name>
```

which in my case results in:
```
Name:               nginx-deployment-69df458bc5-bb87w
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               node-2/10.10.43.25
Start Time:         Thu, 09 Aug 2018 17:29:52 +0000
Labels:             app=nginx
                    pod-template-hash=2589014671
Annotations:        <none>
Status:             Running
IP:                 192.168.247.10
Controlled By:      ReplicaSet/nginx-deployment-69df458bc5
Containers:
  nginx:
    Container ID:   docker://26e8eac8d5a89b7cf2f2af762de88d7f4fa234174881626a1427b813c06b1362
    Image:          nginx:1.7.9
    Image ID:       docker-pullable://nginx@sha256:e3456c851a152494c3e4ff5fcc26f240206abac0c9d794affb40e0714846c451
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Thu, 09 Aug 2018 17:29:53 +0000
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-fkf5d (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-fkf5d:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-fkf5d
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age  From               Message
  ----    ------     ---- ----               -------
  Normal  Scheduled  20s  default-scheduler  Successfully assigned default/nginx-deployment-69df458bc5-bb87w to node-2
  Normal  Pulled     19s  kubelet, node-2    Container image "nginx:1.7.9" already present on machine
  Normal  Created    19s  kubelet, node-2    Created container
  Normal  Started    19s  kubelet, node-2    Started container
```

We can see that in our case the pod has been deployed to
`node-2`, as indicated near the top of the output, and the pod has an IP of `192.168.247.10`.

Have a look at the routing table on `node-0` using `ip route`, which for my example looks like:

```
[centos@node-0 ~]$ ip route
default via 10.10.0.1 dev eth0
10.10.0.0/20 dev eth0 proto kernel scope link src 10.10.7.20
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
blackhole 192.168.39.192/26 proto bird
192.168.39.193 dev cali12b0eb5c038 scope link
192.168.39.194 dev calibe752d56965 scope link
192.168.84.128/26 via 10.10.24.89 dev tunl0 proto bird onlink
192.168.247.0/26 via 10.10.43.25 dev tunl0 proto bird onlink
```

Notice the last line; this rule was written by Calico to send any traffic on the 192.168.247.0/26 subnet (which the pod we examined above is on) to the host at IP 10.10.43.25 via IP-in-IP, as indicated by the `dev tunl0` entry. Look at your own routing table and list of VM IPs; what are the corresponding subnets, pod IPs and host IPs in your case? Does that make sense based on the host you found for the nginx pod above?

Curl your pod's IP on port 80 from
`node-0`; you should see the HTML for the nginx landing page. By default this pod is reachable at this IP from anywhere in the Kubernetes cluster.

Head over to the node this pod got scheduled on (`node-2` in the example above), and have a look at that host's routing table in the same way:

```
[centos@node-2 ~]$ ip route
default via 10.10.32.1 dev eth0
10.10.32.0/20 dev eth0 proto kernel scope link src 10.10.43.25
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
192.168.39.192/26 via 10.10.7.20 dev tunl0 proto bird onlink
192.168.84.128/26 via 10.10.24.89 dev tunl0 proto bird onlink
blackhole 192.168.247.0/26 proto bird
192.168.247.10 dev calia5daa4e7a1d scope link
192.168.247.11 dev cali9ff153fb143 scope link
```

Again notice the second-to-last line; this time, the pod IP is routed to a `cali***` device, which is a virtual ethernet endpoint in the host's network namespace, providing a point of ingress into that pod. Once again try `curl <pod IP>:80`; you'll see the nginx landing page HTML as before.

Back on `node-0`, fetch the logs generated by the pod you've been curling:

```
[centos@node-0 ~]$ kubectl logs <pod name>
10.10.52.135 - - [09/May/2018:13:58:42 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0" "-"
192.168.84.128 - - [09/May/2018:14:00:41 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0" "-"
```

We see records of the curls we performed above; like Docker containers, these logs are the STDOUT and STDERR of the containerized processes.
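The route selection Calico relies on above is ordinary longest-prefix matching on destination subnets. Here's a quick sketch using Python's `ipaddress` module, with the pod IP and subnets taken from the example output above (your values will differ):

```python
import ipaddress

# Values from the example `ip route` output above
pod_ip = ipaddress.ip_address("192.168.247.10")
remote_subnet = ipaddress.ip_network("192.168.247.0/26")  # routed via tunl0 to 10.10.43.25
local_subnet = ipaddress.ip_network("192.168.39.192/26")  # node-0's own pod subnet

# The kernel picks the route whose destination subnet contains the pod IP:
print(pod_ip in remote_subnet)  # True  -> encapsulate via IP-in-IP to node-2
print(pod_ip in local_subnet)   # False -> not a pod hosted on node-0
```

This is why the curl from `node-0` works: the destination matches the `192.168.247.0/26 via 10.10.43.25 dev tunl0` rule, and the packet is tunneled to the host actually running the pod.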
Routing and Load Balancing with Services
Above we were able to hit nginx at the pod IP, but there is no guarantee this pod won't get rescheduled to a new IP. If we want a stable IP for this deployment, we need to create a `ClusterIP` service. In a file `cluster.yaml` on your master `node-0`:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: cluster-demo
spec:
  selector:
    app: nginx
  ports:
  - port: 8080
    targetPort: 80
```

Create this service with `kubectl create -f cluster.yaml`. This maps the pod-internal port 80 to the cluster-wide port 8080; this IP and port will be reachable only from within the cluster. Also note the `selector: app: nginx` specification; that indicates that this service will route traffic to every pod that has `nginx` as the value of the `app` label in this namespace.

Let's see what services we have now:
```
[centos@node-0 ~]$ kubectl get services
NAME           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
kubernetes     ClusterIP   10.96.0.1       <none>        443/TCP    33m
cluster-demo   ClusterIP   10.104.201.93   <none>        8080/TCP   48s
```

The second one is the one we just created, and we can see that a stable IP address and port `10.104.201.93:8080` has been assigned to our nginx service. Let's try to access nginx now, from any node in our cluster:

```
[centos@node-0 ~]$ curl <nginx CLUSTER-IP>:8080
```

which should return the nginx welcome page. Even if pods get rescheduled to new IPs, this ClusterIP service will preserve a stable entrypoint for traffic to be load balanced across all pods matching the service's label selector.
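Conceptually, a ClusterIP service is a stable virtual IP in front of whatever pods currently match the selector; kube-proxy in its iptables mode picks a backend at random for each connection. A toy sketch of that selection (the endpoint IPs here are made up for illustration, not from the exercise):

```python
import random

# Hypothetical pod endpoints currently matching the selector app=nginx
endpoints = ["192.168.247.10:80", "192.168.247.11:80", "192.168.39.194:80"]

def route(service_addr: str) -> str:
    """Pick a backend pod for a connection to the service's stable IP:port.

    kube-proxy's iptables mode selects a backend at random per connection,
    rather than round-robin; rescheduled pods simply update the endpoint list.
    """
    return random.choice(endpoints)

backend = route("10.104.201.93:8080")
print(backend in endpoints)  # True
```

The point is that clients only ever see the stable `10.104.201.93:8080`; the endpoint list behind it can change freely as pods come and go.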
ClusterIP services are reachable only from within the Kubernetes cluster. If you want to route traffic to your pods from an external network, you'll need a NodePort service. On your master `node-0`, create a file `nodeport.yaml`:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nodeport-demo
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
  - port: 8080
    targetPort: 80
```

And create this service with `kubectl create -f nodeport.yaml`. Notice this is exactly the same as the ClusterIP service definition, but now we're requesting type `NodePort`.

Inspect this service's metadata:

```
[centos@node-0 ~]$ kubectl describe service nodeport-demo
```

Notice the `NodePort` field: this is a randomly selected port from the range 30000-32767 where your pods will be reachable externally. Try visiting your nginx deployment at any public IP of your cluster on the port you found above, and confirm you can see the nginx landing page.
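The default NodePort allocation range is 30000-32767 (it can be changed on the API server with the `--service-node-port-range` flag). A trivial helper to sanity-check a port you read off `kubectl describe` — a hypothetical helper for illustration, not part of kubectl:

```python
def is_valid_node_port(port: int, lo: int = 30000, hi: int = 32767) -> bool:
    """Check a port against Kubernetes' default NodePort allocation range."""
    return lo <= port <= hi

# A port like 30080 could have been allocated; 8080 never will be by default
print(is_valid_node_port(30080))  # True
print(is_valid_node_port(8080))   # False
```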
Clean up the objects you created in this section:
```
[centos@node-0 ~]$ kubectl delete deployment nginx-deployment
[centos@node-0 ~]$ kubectl delete service cluster-demo
[centos@node-0 ~]$ kubectl delete service nodeport-demo
```
Optional: Deploying DockerCoins onto the Kubernetes Cluster
First deploy Redis via `kubectl run`:

```
[centos@node-0 ~]$ kubectl run redis --image=redis
```

And now all the other deployments. To avoid too much typing, we do that in a loop:

```
[centos@node-0 ~]$ for DEPLOYMENT in hasher rng webui worker; do
      kubectl run $DEPLOYMENT --image=training/dockercoins_${DEPLOYMENT}:1.0
  done
```

Let's see what we have:

```
[centos@node-0 ~]$ kubectl get pods -o wide -w
```

In my case the result is:

```
hasher-6c64f78655-rgjk5   1/1   Running   0   53s   10.36.0.1   node-2
redis-75586d7d7c-mmjg7    1/1   Running   0   5m    10.44.0.2   node-1
rng-d94d56d4f-twlwz       1/1   Running   0   53s   10.44.0.1   node-1
webui-6d8668984d-sqtt8    1/1   Running   0   52s   10.36.0.2   node-2
worker-56756ddbb8-lbv9r   1/1   Running   0   52s   10.44.0.3   node-1
```

Pods have been distributed across our cluster.
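Since the node name is the last field of each line, you can tally the spread across nodes with a few lines of Python; a sketch using the example listing above:

```python
from collections import Counter

# Lines pasted from the example `kubectl get pods -o wide` output above
listing = """\
hasher-6c64f78655-rgjk5   1/1   Running   0   53s   10.36.0.1   node-2
redis-75586d7d7c-mmjg7    1/1   Running   0   5m    10.44.0.2   node-1
rng-d94d56d4f-twlwz       1/1   Running   0   53s   10.44.0.1   node-1
webui-6d8668984d-sqtt8    1/1   Running   0   52s   10.36.0.2   node-2
worker-56756ddbb8-lbv9r   1/1   Running   0   52s   10.44.0.3   node-1
"""

# The node name is the last whitespace-separated column of each line
per_node = Counter(line.split()[-1] for line in listing.splitlines())
print(per_node["node-1"], per_node["node-2"])  # 3 2
```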
We can also look at some logs:
```
[centos@node-0 ~]$ kubectl logs deploy/rng
[centos@node-0 ~]$ kubectl logs deploy/worker
```

The `rng` service (and also the `hasher` and `webui` services) seems to work fine, but the `worker` service reports errors. The reason is that unlike Swarm, Kubernetes does not automatically provide a stable networking endpoint for deployments. We need to create at least a `ClusterIP` service for each of our deployments so they can communicate.

List your current services:
```
[centos@node-0 ~]$ kubectl get services
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   46m
```

Expose `redis`, `rng` and `hasher` internally using services, specifying the correct (internal) port for each:

```
[centos@node-0 ~]$ kubectl expose deployment redis --port 6379
[centos@node-0 ~]$ kubectl expose deployment rng --port 80
[centos@node-0 ~]$ kubectl expose deployment hasher --port 80
```

List your services again:
```
[centos@node-0 ~]$ kubectl get services
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
hasher       ClusterIP   10.108.207.22    <none>        80/TCP     20s
kubernetes   ClusterIP   10.96.0.1        <none>        443/TCP    47m
redis        ClusterIP   10.100.14.121    <none>        6379/TCP   31s
rng          ClusterIP   10.111.235.252   <none>        80/TCP     26s
```

Evidently `kubectl expose` creates `ClusterIP` services, providing stable, internal reachability for your deployments, much like you did via yaml manifests for your nginx deployment in the last section. See the `kubectl` API docs for more command-line alternatives to yaml manifests.

Get the logs of the worker again:
```
[centos@node-0 ~]$ kubectl logs deploy/worker
```

This time you should see that the `worker` has recovered (give it at least 10 seconds to do so). The `worker` can now access the other services.

Now let's expose the `webui` to the public using a service of type `NodePort`:

```
[centos@node-0 ~]$ kubectl expose deploy/webui --type=NodePort --port 80
```

List your services one more time:

```
[centos@node-0 ~]$ kubectl get services
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
hasher       ClusterIP   10.108.207.22    <none>        80/TCP         2m
kubernetes   ClusterIP   10.96.0.1        <none>        443/TCP        49m
redis        ClusterIP   10.100.14.121    <none>        6379/TCP       2m
rng          ClusterIP   10.111.235.252   <none>        80/TCP         2m
webui        NodePort    10.108.88.182    <none>        80:32015/TCP   33s
```

Notice the
`NodePort` service created for `webui`. This type of service provides behavior similar to the Swarm L4 mesh net: a port (32015 in my case) has been reserved across the cluster; any external traffic hitting any cluster IP on that port will be directed to port 80 inside a `webui` pod.

Visit your DockerCoins web UI at `http://<node IP>:<port>`, where `<node IP>` is the public IP address of any of your cluster members. You should see the dashboard of our DockerCoins application.

Let's scale up the worker a bit and see the effect:

```
[centos@node-0 ~]$ kubectl scale deploy/worker --replicas=10
```

Observe the result of this scaling in the browser. We do not really get a 10-fold increase in throughput, just as when we deployed DockerCoins on Swarm; the `rng` service is causing a bottleneck.
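Why doesn't 10x the workers give 10x the throughput? The mining pipeline is capped by its slowest stage, so adding workers past the point where `rng` saturates buys nothing. A toy model (all rates are made up for illustration):

```python
def pipeline_throughput(worker_replicas: int,
                        per_worker_rate: float,
                        rng_capacity: float) -> float:
    """Hashes/sec for the pipeline: capped by whichever stage saturates first."""
    return min(worker_replicas * per_worker_rate, rng_capacity)

# Hypothetical rates: each worker could do 4 hashes/sec,
# but rng can only serve enough randomness for 10 hashes/sec total
print(pipeline_throughput(1, 4.0, 10.0))   # 4.0  -> scaling helps at first
print(pipeline_throughput(10, 4.0, 10.0))  # 10.0 -> rng-bound, not 40.0
```

This is exactly why the next step scales `rng` itself, by running one instance per node.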
rngon each node of the cluster. For this we use aDaemonSet. We do this by using a yaml file that captures the desired configuration, rather than through the CLI.Create a file
deploy-rng.yamlas follows:[centos@node-0 ~]$ kubectl get deploy/rng -o yaml --export > deploy-rng.yamlNote:
--exportwill remove "cluster-specific" informationEdit this file to make it describe a
DaemonSetinstead of aDeployment:- change
kindtoDaemonSet - remove the
progressDeadlineSecondsfield - remove the
replicasfield - remove the
strategyblock (which defines the rollout mechanism for a deployment) - remove the
status: {}line at the end
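After those edits, the manifest should look roughly like the sketch below (the labels, image tag, and apiVersion come from your exported file and may differ):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    run: rng
  name: rng
spec:
  selector:
    matchLabels:
      run: rng
  template:
    metadata:
      labels:
        run: rng
    spec:
      containers:
      - image: training/dockercoins_rng:1.0
        name: rng
```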
Now apply this YAML file to create the `DaemonSet`:

```
[centos@node-0 ~]$ kubectl apply -f deploy-rng.yaml
```

We can now look at the `DaemonSet` that was created:

```
[centos@node-0 ~]$ kubectl get daemonset
NAME   DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
rng    2         2         2       2            2           <none>          1m
```

DockerCoins performance should now improve, as illustrated by your web UI.
If we do a `kubectl get all`, we will see that we now have both a `deployment.apps/rng` AND a `daemonset.apps/rng`; deployments are not automatically converted to daemon sets. Let's delete the `rng` deployment:

```
[centos@node-0 ~]$ kubectl delete deploy/rng
```

Clean up your resources when done:
```
[centos@node-0 ~]$ for D in redis hasher rng webui; \
    do kubectl delete svc/$D; done
[centos@node-0 ~]$ for D in redis hasher webui worker; \
    do kubectl delete deploy/$D; done
[centos@node-0 ~]$ kubectl delete ds/rng
```

Make sure that everything is cleared:

```
[centos@node-0 ~]$ kubectl get all
```

This should show only the `svc/kubernetes` resource.
Conclusion
In this exercise, we looked at some of the key Kubernetes service objects that provide routing and load balancing for collections of pods: ClusterIP for internal communication, analogous to Swarm's VIPs, and NodePort for routing external traffic to an app, similar to Swarm's L4 mesh net. We also briefly touched on the inner workings of Calico, one of many Kubernetes network plugins and the one that ships natively with Docker's Enterprise Edition product. The key networking difference between Swarm and Kubernetes is their approach to default firewalling: while Swarm firewalls its software-defined networks automatically, all pods can reach all other pods on a Kubernetes cluster, in Calico's case via the BGP-updated control plane and IP-in-IP data plane you explored above.