Scaling and Scheduling Services
By the end of this exercise, you should be able to:
- Define the desired number of containers running as part of a service via the `deploy` block in a Docker Compose file
- Schedule services in global mode to ensure exactly one replica runs on every node in your swarm
Scaling up a Service
If we've written our services to be stateless, we might hope for linear performance scaling in the number of replicas of that service. For example, our worker service requests a random number from the rng service and hands it off to the hasher service; the faster we make those requests, the higher our throughput of dockercoins should be, as long as there are no other confounding bottlenecks.
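Before scaling anything, it can help to note the baseline. A hypothetical check (assuming the stack has already been deployed under the name `dc`, as in the commands used throughout this exercise):

```shell
[centos@node-0 dockercoins]$ docker service ls --filter name=dc_
[centos@node-0 dockercoins]$ docker service ps dc_worker
```

The first command lists the services in the stack with their current replica counts; the second shows where the single `worker` task is running.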
Modify the `worker` service definition in `docker-compose.yml` to set the number of replicas to create using the `deploy` and `replicas` keys:

```yaml
worker:
  image: user/dockercoins_worker:1.0
  networks:
    - dockercoins
  deploy:
    replicas: 2
```

Update your app by running the same command you used to launch it in the first place, and check to see when your new worker replica is up and running:
```shell
[centos@node-0 dockercoins]$ docker stack deploy -c docker-compose.yml dc
[centos@node-0 dockercoins]$ docker service ps dc_worker
```

Once both replicas of the `worker` service are live, check the web frontend; you should see about double the number of hashes per second, as expected.

Scale up even more by changing the `worker` replicas to 10. A small improvement should be visible, but certainly not an additional factor of 5. Something else is bottlenecking dockercoins.
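The diminishing returns can be sketched with a crude serial-bottleneck model. This is an illustration only, not measured data: the per-worker rate and the `rng` throughput cap below are made-up numbers chosen to mimic the behavior described.

```shell
# Toy model: worker throughput scales linearly until a shared service
# (here, a hypothetical rng cap of 4 requests/sec) saturates.
rng_cap=4        # assumed max rng requests/sec (made up for illustration)
per_worker=0.9   # assumed hashes/sec per worker replica (made up)
for n in 1 2 5 10; do
  awk -v n="$n" -v cap="$rng_cap" -v w="$per_worker" \
    'BEGIN { t = n * w; if (t > cap) t = cap; printf "%2d workers -> ~%.1f hashes/sec\n", n, t }'
done
```

With these assumed numbers, going from 2 to 10 workers only moves throughput from ~1.8 to the ~4.0 cap: once the shared service saturates, extra worker replicas buy almost nothing.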
Scheduling Services
Something other than worker is bottlenecking dockercoins's performance; the first place to look is in the services that worker directly interacts with.
The `rng` and `hasher` services are exposed on host ports 8001 and 8002, so we can use `httping` to probe their latency:

```shell
[centos@node-0 dockercoins]$ httping -c 5 localhost:8001
[centos@node-0 dockercoins]$ httping -c 5 localhost:8002
```

`rng` is much slower to respond, suggesting that it might be the bottleneck. If this random number generator is based on an entropy collector (random voltage microfluctuations in the machine's power supply, for example), it won't be able to generate random numbers beyond a physically limited rate; we need more machines collecting more entropy in order to scale this up. This is a case where it makes sense to run exactly one copy of this service per machine, via `global` scheduling (as opposed to potentially many copies on one machine, or wherever the scheduler decides, as in the default `replicated` scheduling).

Modify the definition of our `rng` service in `docker-compose.yml` to be globally scheduled:

```yaml
rng:
  image: user/dockercoins_rng:1.0
  networks:
    - dockercoins
  ports:
    - "8001:80"
  deploy:
    mode: global
```

Scheduling mode can't be changed on the fly, so we need to stop our app and restart it:
```shell
[centos@node-0 dockercoins]$ docker stack rm dc
[centos@node-0 dockercoins]$ docker stack deploy -c docker-compose.yml dc
```

Check the web frontend again; the overall factor of 10 improvement (from ~3 to ~35 hashes per second) should now be visible.
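To confirm that global scheduling took effect, you can list the `rng` tasks; with the stack deployed as `dc`, there should be exactly one `dc_rng` task on each node in the swarm:

```shell
[centos@node-0 dockercoins]$ docker service ps dc_rng
```

Each row of the output corresponds to one task, and the NODE column should show every swarm node exactly once.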
Conclusion
In this exercise, you explored the performance gains a distributed application can enjoy by scaling a key service up to more replicas, and by globally scheduling a service that needs exactly one replica on every host.