Updating a Service

By the end of this exercise, you should be able to:

  • Update a swarm service's underlying image, controlling update parallelism, speed, and rollback contingencies

Creating Rolling Updates

  1. First, let's change one of our services a bit: open orchestration-workshop-net/worker/Program.cs in your favorite text editor, and find the following section:

    private static void WorkOnce(){
        Console.WriteLine("Doing one unit of work");
        Thread.Sleep(100);  // 100 ms
    

    Change the 100 to 10, save the file, and exit the text editor.

  2. Rebuild the worker image with a tag of <Docker ID>/dc_worker:1.1, and push it to Docker Hub.

  3. Start the update:

    PS: node-0 orchestration-workshop-net> docker service update dc_worker `
        --image <Docker ID>/dc_worker:1.1
    

    Tasks are updated to the new 1.1 image one at a time. Run docker service ps dc_worker every few seconds and keep watching until all of your workers have been updated to the new image version.

Parallelizing Updates

  1. We can also set our updates to run in batches by configuring some options associated with each service. Change the update parallelism to 2 and the delay to 5 seconds on the worker service by editing its definition in stack.yml:

    worker:
      image: training/dc_worker:1.0
      networks:
        - dockercoins
      deploy:
        endpoint_mode: dnsrr
        replicas: 10
        update_config:
          parallelism: 2
          delay: 5s
    
  2. Roll the worker service back to 1.0 by redeploying the stack, whose stack.yml still specifies the 1.0 image:

    PS: node-0 orchestration-workshop-net> docker stack deploy -c 'stack.yml' dc
    

    Run docker service ps dc_worker every couple of seconds. You should see pairs of worker tasks being shut down and replaced with the 1.0 version, with a 5 second delay between pairs. This is perhaps easiest to see in the NAME column: every worker replica starts with one dead task left over from the upgrade in the last step, so as the rollback moves through the list two at a time, you should notice pairs of tasks with two dead ancestors.
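
As a rough way to reason about the timing in this section, the following Python sketch lays out the earliest moment each group of tasks could start updating. It is a simplified model, not Swarm's actual scheduler: the function name batch_schedule is hypothetical, each update is assumed to finish instantly, and tasks are assumed to update in slot order, which Swarm does not guarantee.

```python
def batch_schedule(replicas, parallelism, delay_s):
    """Return a list of (earliest start in seconds, task slots) per update group.

    Simplifying assumption: every update completes instantly, so only the
    configured delay separates one group from the next.
    """
    if parallelism <= 0:
        # In Swarm, a parallelism of 0 means "update all tasks at once".
        parallelism = replicas
    slots = list(range(1, replicas + 1))
    schedule = []
    for i in range(0, replicas, parallelism):
        group_index = i // parallelism
        schedule.append((group_index * delay_s, slots[i:i + parallelism]))
    return schedule


# The values from stack.yml in this exercise: 10 replicas, parallelism 2, delay 5s.
schedule = batch_schedule(replicas=10, parallelism=2, delay_s=5)
for start, group in schedule:
    print(f"t >= {start:>2}s: update slots {group}")
```

With 10 replicas, a parallelism of 2, and a 5 second delay, this model gives five groups, the last starting no earlier than 20 seconds in. A real rollout takes longer, since each task also needs time to stop and restart.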

Auto-Rollback Failed Updates

In the event of an application or container failure on deployment, we'd like to automatically roll the update back to the previous version.

  1. Update the worker service, this time from the CLI, with some parameters to define rollback:

    PS: node-0 orchestration-workshop-net> docker service update `
        --update-max-failure-ratio=0.2 `
        --update-monitor=20s `
        --update-failure-action=rollback `
        dc_worker
    

    These parameters trigger a rollback if more than 20% of the service's tasks fail within the first 20 seconds after being updated.

  2. Make a broken version of the worker service to trigger a rollback; for example, try removing all the using directives at the top of worker/Program.cs. Then rebuild the worker image with the tag <Docker ID>/dc_worker:bugged, push it to Docker Hub, and attempt to update your service:

    PS: node-0 orchestration-workshop-net> docker image build `
        -t <Docker ID>/dc_worker:bugged worker
    PS: node-0 orchestration-workshop-net> docker image push `
        <Docker ID>/dc_worker:bugged
    PS: node-0 orchestration-workshop-net> docker service update dc_worker `
        --image <Docker ID>/dc_worker:bugged
    
  3. Several worker tasks should attempt to update to the :bugged image, but after enough of these updates fail, the update is halted and rolled back to the previous version of the image automatically:

    PS: node-0 worker>docker service update dc_worker --image training/dc_worker:bugged
    dc_worker
    overall progress: rolling back update: 0 out of 10 tasks
    1/10: starting  [============================================>      ]
    2/10: ready     [======================================>            ]
    3/10: ready     [===========>                                       ]
    4/10: ready     [======================================>            ]
    5/10: ready     [======================================>            ]
    6/10: ready     [===========>                                       ]
    7/10: starting  [============================================>      ]
    8/10: ready     [===========>                                       ]
    9/10: assigned  [===========================>                       ]
    10/10: running   [==================================================>]
    rollback: update rolled back due to failure or early termination of task potgxcb3mhxvsq04hrsbfrviz
    
  4. Finally, run docker service ps dc_worker one last time to see the results of the rollback:

    PS: node-0 worker>docker service ps dc_worker
    
    ID                  NAME                IMAGE                       ...  DESIRED STATE       CURRENT STATE
    b11oaafcp8by        dc_worker.1         training/dc_worker:1.0      ...  Running             Running 21 seconds ago
    zn75m2i3ej9w         \_ dc_worker.1     training/dc_worker:bugged   ...  Shutdown            Failed 33 seconds ago
    hiuj9nbcqu4g         \_ dc_worker.1     training/dc_worker:bugged   ...  Shutdown            Failed about a minute ago
    m9vw8eabc8oa         \_ dc_worker.1     training/dc_worker:bugged   ...  Shutdown            Failed about a minute ago
    1ne2992nbfq0         \_ dc_worker.1     training/dc_worker:bugged   ...  Shutdown            Failed about a minute ago
    l5q8tu7hkaoi        dc_worker.2         training/dc_worker:1.0      ...  Running             Running 34 seconds ago
    yn6syswiw964         \_ dc_worker.2     training/dc_worker:bugged   ...  Shutdown            Failed 45 seconds ago
    3dpphwi4ipqz         \_ dc_worker.2     training/dc_worker:bugged   ...  Shutdown            Failed about a minute ago
    kjey2u5pi07z         \_ dc_worker.2     training/dc_worker:1.0      ...  Shutdown            Shutdown about a minute ago
    8u6nabylcwc9         \_ dc_worker.2     training/dc_worker:1.1      ...  Shutdown            Shutdown 5 minutes ago
    

    The failed deployment of :bugged images has been rolled back to the previous :1.0 image.
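
The rollback threshold set in step 1 comes down to simple arithmetic. The Python sketch below assumes the failure ratio is the number of failed tasks divided by the number of tasks being updated, and that the comparison is strictly "more than", as described in step 1. The function names are hypothetical, and Swarm's internal bookkeeping may differ in detail.

```python
def exceeds_failure_ratio(failed, total, max_failure_ratio):
    """True when the share of failed tasks is strictly above the threshold."""
    return failed / total > max_failure_ratio


def min_failures_to_roll_back(total, max_failure_ratio):
    """Smallest number of failed tasks that would trip the threshold, or None."""
    for failed in range(1, total + 1):
        if exceeds_failure_ratio(failed, total, max_failure_ratio):
            return failed
    return None  # a ratio of 1.0 or more can never be exceeded


# The values used in this exercise: 10 worker replicas and
# --update-max-failure-ratio=0.2.
print(min_failures_to_roll_back(total=10, max_failure_ratio=0.2))
```

Under these assumptions, two failed tasks out of ten (exactly 20%) are tolerated, and a third failure triggers the rollback.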

Shutting Down a Stack

  1. To shut down a running stack:

    PS: node-0 orchestration-workshop-net> docker stack rm <stack name>
    

    The stack name can be found in the output of docker stack ls.

Conclusion

In this exercise, we explored deploying and redeploying an application as stacks and services. Note that relaunching a running stack updates all the objects it manages in the least disruptive way possible; there is usually no need to remove a stack before updating it. In production, rollback contingencies should always be used to upgrade images cautiously, cutting off potential damage before an entire service is taken down.