Scaling and Scheduling Services

By the end of this exercise, you should be able to:

  • Define the desired number of containers running as part of a service via the deploy block in a docker compose file
  • Schedule services in global mode to ensure exactly one replica runs on every node in your swarm

Scaling up a Service

If we've written our services to be stateless, we might hope for linear performance scaling in the number of replicas of that service. For example, our worker service requests a random number from the rng service and hands it off to the hasher service; the faster we make those requests, the higher our throughput of dockercoins should be, as long as there are no other confounding bottlenecks.

  1. Modify the worker service definition in stack.yml to set the number of replicas to create using the replicas key:

    worker:
      image: training/dc_worker:1.0
      networks:
        - dockercoins
      deploy:
        endpoint_mode: dnsrr
        replicas: 2
    
  2. Update your app by running the same command you used to launch it in the first place, and check to see when your new worker replica is up and running:

    PS: node-0 orchestration-workshop-net> docker stack deploy -c 'stack.yml' dc
    PS: node-0 orchestration-workshop-net> docker service ps dc_worker
    
  3. Once both replicas of the worker service are live, check the web frontend; you should see about double the number of hashes per second, as expected.

  4. Scale up even more by changing the worker replicas to 10. A small improvement should be visible, but certainly not an additional factor of 5. Something else is bottlenecking dockercoins.
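The diminishing returns in step 4 can be sketched with a simple capacity model: aggregate throughput grows linearly with worker replicas only until a shared dependency saturates. The numbers below are illustrative assumptions, not measurements from dockercoins:

```python
def expected_throughput(workers, per_worker_rate, shared_capacity):
    """Aggregate mining rate grows linearly with worker replicas
    until a shared dependency (here, the rng service) saturates."""
    return min(workers * per_worker_rate, shared_capacity)

# Hypothetical numbers: one worker mines 3.5 hashes/s, and the rng
# service can serve at most 10 requests/s.
print(expected_throughput(2, 3.5, 10))   # 7.0 - two workers, double the rate
print(expected_throughput(10, 3.5, 10))  # 10 - capped at the rng ceiling
```

Whatever the real numbers are, the shape is the same: past the saturation point, adding workers buys almost nothing until the shared service itself is scaled.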

Scheduling Services

Something other than worker is bottlenecking dockercoins' performance; the first place to look is the services that worker interacts with directly.

  1. First, we need to expose ports for the rng and hasher services on their hosts, so we can probe their latency. Update their definitions in stack.yml with a ports key:

    rng:
      image: training/dc_rng:1.0
      networks:
        - dockercoins
      deploy:
        endpoint_mode: dnsrr
      ports:
        - target: 80
          published: 8001
          mode: host
    
    hasher:
      image: training/dc_hasher:1.0
      networks:
        - dockercoins
      deploy:
        endpoint_mode: dnsrr
      ports:
        - target: 80
          published: 8002
          mode: host
    

    Update the services by redeploying the stack file:

    PS: node-0 orchestration-workshop-net> docker stack deploy -c 'stack.yml' dc
    

    If this is successful, a docker stack ps dc should show rng and hasher exposed on the appropriate ports.

  2. Check your Dockercoins web frontend. You may find that your mining speed has dropped to zero! When you reconfigured and rescheduled your rng and hasher services, their containers may have received new IPs. If worker doesn't re-check which IPs the DNS entries rng and hasher resolve to, it can end up sending traffic to dead containers after such a reschedule. In the long term, we should update our application logic to re-poll those IPs periodically; for now, we can force a re-poll by scaling the worker service down and back up:

    PS: node-0 orchestration-workshop-net> docker service scale dc_worker=0
    PS: node-0 orchestration-workshop-net> docker service scale dc_worker=10
    

    Double check your web frontend to make sure your mining speed is what it was before you rescheduled rng and hasher.
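The re-polling logic that worker lacks can be sketched in a few lines. This is an illustrative example of the technique, not dockercoins' actual implementation; in the swarm the hostname would be a service name like rng rather than localhost:

```python
import socket

def resolve_all(hostname, port=80):
    """Return the current set of IPs behind a DNS name. Under DNSRR,
    each live replica appears as its own A record, so repeating this
    lookup after a reschedule picks up the new container IPs."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})

# A resilient worker would re-resolve before each request (or on a
# short timer) instead of caching the first answer forever:
ips = resolve_all("localhost")  # inside the swarm, this would be "rng"
```

Re-resolving on every request is the simplest fix; a timer-based cache with a short TTL trades a little staleness for fewer lookups.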

  3. With rng and hasher exposed, we can use httping to probe their latency; in each command below, <public IP> is the public IP of the node exposing rng on 8001 or hasher on 8002, respectively:

    PS: node-0 orchestration-workshop-net> httping -c 5 <public IP>:8001
    PS: node-0 orchestration-workshop-net> httping -c 5 <public IP>:8002
    

    rng is much slower to respond, suggesting that it might be the bottleneck. If this random number generator is based on an entropy collector (random voltage microfluctuations in the machine's power supply, for example), it can't generate random numbers faster than a physically limited rate; we need more machines collecting entropy in order to scale up. This is a case where it makes sense to run exactly one copy of the service per machine, via global scheduling, as opposed to the default replicated scheduling, which places replicas wherever the scheduler decides, potentially several on one machine.
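If httping isn't installed on your workstation, a rough equivalent can be improvised with Python's standard library; the URL passed in is whichever endpoint you want to probe:

```python
import time
import urllib.request

def probe_latency(url, count=5):
    """Time `count` sequential GET requests; returns latencies in ms."""
    latencies = []
    for _ in range(count):
        start = time.perf_counter()
        with urllib.request.urlopen(url) as resp:
            resp.read()
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies

# e.g. probe_latency("http://<public IP>:8001/") for rng
```

Sequential timing like this includes connection setup on every request, so treat the absolute numbers loosely; the relative gap between rng and hasher is what matters here.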

  4. Modify the definition of our rng service in stack.yml to be globally scheduled:

    rng:
      image: training/dc_rng:1.0
      networks:
        - dockercoins
      deploy:
        endpoint_mode: dnsrr
        mode: global
      ports:
        - target: 80
          published: 8001
          mode: host
    
  5. Scheduling can't be changed on the fly, so we need to stop our app and restart it:

    PS: node-0 orchestration-workshop-net> docker stack rm dc
    PS: node-0 orchestration-workshop-net> docker stack deploy -c 'stack.yml' dc
    
  6. Check the web frontend again (note it may be on a different node); you may still not see much improvement in overall performance, depending on how worker traffic is getting distributed across random number generators. Try scaling your worker service down to one replica and then back up to ten, and you should finally see the factor of 10 improvement in performance versus a single worker container, from 3-4 coins per second to around 35.

Conclusion

In this exercise, you explored the performance gains a distributed application can enjoy by scaling a key service up to have more replicas, and by correctly scheduling a service that needs to be replicated across different hosts. Also, bear in mind the behavior of DNSRR service name resolution; it's up to your application logic to periodically check the list of IPs being returned by the DNS lookup, and rebalance traffic across new instances as those services scale up (or down). Alternatively, rescaling the service originating the request can cause that service's replicas to rebalance their requests across the new replicas of the destination service.