Docker for SDL

Why?

There are two recurring problems in SDL projects that often block forward progress: builds that work on one developer's machine but fail on another's, and automated tests that get skipped because running them is a manual step.

Docker is a general-purpose tool that has lots of uses, but our initial focus will be on using it to solve the above two problems.

Docker Basics

Docker is a platform for building and running applications. It is similar to an operating system such as Windows or Linux, but with a full operating system, installed software is shared by all users. With Docker you can install your own software packages separate from any others, allowing a build to use just the packages it needs without conflicting with the build requirements of other software systems. In this sense Docker resembles a virtual machine, in which a computer runs a program that translates all operations for one machine (say, Windows) into commands that can be interpreted by another (say, Linux). The difference is that Docker reuses services provided by the host operating system, so installations do not need to repeat work that is already done. This makes for a significantly lighter footprint.

Docker runs your application in a container. In many cases, a container is an operating system combined with whatever software packages you need to build and run your code. For example, you might use Ubuntu Linux as your operating system, version 23 of the Java Development Kit, and Apache as your web server. Commonly, Docker is used to create an image, and that image is then run on a Docker server to create a (running) container.

Docker build steps are recorded in a Dockerfile. This is a sequence of steps; each step creates a new image by applying a command to the previous image. For example, you might have a Dockerfile such as

    # Start from the Ubuntu 18.04 base image
    FROM ubuntu:18.04
    # Install the compiler and other build tools
    RUN apt-get update; apt-get install -y g++ cmake git libgtest-dev
    # Build and install the GoogleTest library
    RUN cd /usr/src/gtest; cmake CMakeLists.txt; make; cp *.a /usr/lib
    # Copy the project source into the image
    COPY . /cube
    # Compile the code and run the tests
    RUN cd /cube; make clean; make test

The first step declares the base image to be version 18.04 of Ubuntu. Docker searches standard registries for the given image and copies it to your system. Next, Docker executes apt-get to install various packages such as G++ and CMake. Third, it executes build commands to build the gtest library; this is a simple C++ testing library from Google. The COPY command copies your source files into the image in the folder /cube, and the last RUN command builds the C++ code and executes tests.

A key feature of Docker is that each step in the Dockerfile is stored as a separate image. If the only changes affect the later steps, Docker reuses the images from the earlier steps. For example, if you are using the above Dockerfile and the only thing changing is your source code (the code copied into the image by COPY), then Docker does not have to repeat the work of downloading Ubuntu or reinstalling software packages. This will not be quite as fast as building your system in an IDE, but it will be close. And a win is that the Dockerfile documents a sequence of reproducible steps to build and run the system.

Another win is that you can publish images you create to a registry. Once an image is in a registry, you can download it to cloud servers and run your application on those servers without repeating build steps. This is known as “containerizing” your application, and it is how many systems are deployed. Handling high demand (“scaling up”) is as simple as running the container on yet another cloud server. In fact, this is the basis for building out a microservice application:

    microservices [can be seen as] a design pattern where individual application features are developed, deployed, and managed as small independent apps running as containers. (Poulton, Nigel. Getting Started with Docker: Learn Docker in a Day!)

To get started, install Docker Desktop on your computer: https://www.docker.com/products/docker-desktop/

To check that your installation is working, start Docker Desktop and click on > Terminal in the lower right corner. This opens a terminal session. You can then type

    docker run hello-world

Be sure to use a dash, not an underscore. If you get no response, you might have to first type docker login and then retype the run command. When the run succeeds, you will see Docker pull the image, print Hello from Docker!, and then list some suggested next steps. You will also see a hello-world container in the Docker Desktop Containers tab. You can delete it by clicking on the trash can icon. Switch to the Images tab and notice the hello-world image is still present. Re-executing the run command will create a new container and execute it. If you like, you can delete both the image and the container(s) to clean up.
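
You can do the same cleanup from the terminal; these are standard Docker commands (the container IDs on your machine will differ):

    docker ps -a            # list all containers, including stopped ones
    docker rm <container>   # remove a container by ID or name
    docker images           # list the images on your machine
    docker rmi hello-world  # remove the hello-world image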

One of the hello-world suggestions is to start an Ubuntu container. Type

    docker run -it ubuntu bash

This will pull ubuntu:latest and open up a Bash prompt. Type ls to list the top-level directories, and ls bin to list the commands in the bin directory. Note that you are executing as root; that is, in the super-user account. Normally this would be a bad idea, but since Docker isolates the container from the rest of your computer, you will not introduce security issues. One thing you can do is install some classic terminal games:

    apt update
    apt install bsdgames

List the games you just installed by typing ls /usr/games, and execute hangman by typing /usr/games/hangman. Exit the game by pressing Control-C (that is, hold down the control key and press the c key).

Before you exit Ubuntu, type ps -e. The ps command lists the process status of jobs on the machine, and -e makes it list all jobs, including those run by root. There are few jobs because Docker uses the host’s operating system to perform low-level actions rather than starting its own services. When you started bash in the docker run command, it was the only job running in the container; any further jobs are ones you start from Bash. This is what makes Docker such a lightweight mechanism.
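
The output will look something like this (the process IDs will differ):

    PID TTY          TIME CMD
      1 pts/0    00:00:00 bash
      9 pts/0    00:00:00 ps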

To exit the shell, you can either press Control-D or type exit. Control-D is end-of-file in Unix. It works at the shell prompt because there is a loop in the shell interpreter that exits at end-of-file; you will get used to using Control-D to exit applications in Unix. But typing exit works as well.

Using Docker as a Build System

Open Docker Terminal and type the following commands to retrieve a simple project to build and test a C++ program:

    git clone https://gitlab.com/hasker/cube.git
    cd cube
    docker build .

There will be a number of steps downloading system updates, C++, and various build tools. The large number of downloaded files suggests we will want a slightly modified process for production; the sections below address this. But at the end you will see commands copying the code into the image (in the directory /cube), building the executable, and running the tests:

    #9 [4/5] COPY . /cube
    #9 DONE 0.0s

    #10 [5/5] RUN cd /cube; make clean; make test
    #10 0.106 rm -f *.o cube cube_test 
    #10 0.108 g++ -std=c++14 -Wall -c cube.cpp
    #10 0.138 g++ -std=c++14 -Wall -c cube_test.cpp
    #10 0.406 g++ -pthread -std=c++14 -Wall cube.o cube_test.o -lgtest_main -lgtest -lpthread -o cube_test
    #10 0.459 ./cube_test
    #10 0.460 [==========] Running 3 tests from 1 test case.
    #10 0.460 [----------] Global test environment set-up.
    #10 0.460 [----------] 3 tests from CubeTest
    #10 0.460 [ RUN      ] CubeTest.SmallNaturalNumbers
    #10 0.460 [       OK ] CubeTest.SmallNaturalNumbers (0 ms)
    #10 0.460 [ RUN      ] CubeTest.Zero
    #10 0.460 [       OK ] CubeTest.Zero (0 ms)
    #10 0.460 [ RUN      ] CubeTest.NegativeNumbers
    #10 0.460 [       OK ] CubeTest.NegativeNumbers (0 ms)
    #10 0.460 [----------] 3 tests from CubeTest (0 ms total)
    #10 0.460 
    #10 0.460 [----------] Global test environment tear-down
    #10 0.460 [==========] 3 tests from 1 test case ran. (0 ms total)
    #10 0.460 [  PASSED  ] 3 tests.
    #10 DONE 0.5s

The RUN and OK lines are output from the test framework this project uses, GoogleTest; they show that all tests passed.

Edit cube.cpp in the folder and introduce a bug; simple changes are to negate x or to add another * x. Re-run the docker build command; the build will now fail, but without much information. Go to the Builds tab in Docker Desktop and click on the ID of the topmost (most recent) build. You should see a red banner; click on error logs on that line. This gives the full output, which (in this case) includes expected values and the actual computed values.
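
For example, assuming cube.cpp defines the function along these lines (a sketch; the actual file in the repository may differ), negating the result is a one-character bug:

    // Hypothetical sketch of the buggy edit -- the real cube.cpp may differ
    int cube(int x) {
        return -x * x * x;  // bug introduced: was x * x * x
    }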

It is tempting to think that this is “bad”. But it is not! If your code contains errors, you want the build environment to let you know! A key part of professional development is that if you make a mistake and introduce an error, you want to know about it as soon as possible. Waiting until your client finds the error is bad form. Worse, it is inefficient: if you fix a problem you found yourself, you just do it and move on. But if a client finds the error, you have to reproduce it, document the change, likely add a whole set of tests to restore faith in your product, and probably hold several meetings over the whole thing. The value of strong testing and early detection of errors cannot be overstated.

Fix the code. You could undo your edits in cube.cpp or simply type

    git restore .

Then re-execute the docker build command. Note that the setup operations are not repeated; the only actions executed are to copy the modified code into the /cube directory and re-execute the compile and test steps. This is because Docker caches results. If you try another docker build . when there have been no changes, all steps are reported as cached and the tests are not actually re-executed. You have to change the source code to force recompiling and re-running the tests.
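
When a step is cached, the build output marks it explicitly; for each reused step you will see something like this (the step numbers will vary):

    #8 [2/5] RUN apt-get update; apt-get install -y g++ cmake git libgtest-dev
    #8 CACHED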

CI

Up to now we have been running Docker manually. This works as long as every developer remembers to run the tests, but people get rushed. The real value of automated tests is that they can be run by the repository system, such as GitLab. This is known as continuous integration or, more often, CI. In this context, “integration” means integrating all of the pieces of the code together and ensuring the integrated system works; that is, passes all tests.

Setting up CI in GitLab is simple: create a .gitlab-ci.yml file. The extension “yml” refers to YAML, originally short for “yet another markup language”; it is a format where items are grouped by indentation. Preserve the spaces at the beginnings of lines and you will not have to spend a lot of time on the format details. See the .gitlab-ci.yml in the cube project; it specifies using a docker image to run the test, that there is a single build stage, and that the script for the build step is docker build . (a sketch appears below). You could create your own GitLab repository from this project to see the CI execute, or you can just visit https://gitlab.com/hasker/cube, navigate to the Build tab, and select Jobs. Hopefully the most recent run is marked Passed. Click on the job number followed by build (as of this writing, #10453822658: build) to see the full output of the run.
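
Based on that description, the file will be something like this sketch (the .gitlab-ci.yml in the cube repository is the authoritative version):

    image: docker

    stages:
      - build

    build:
      stage: build
      script:
        - docker build .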

The basic strategy is to build out a test suite that can be run from Docker and then set up a pipeline to execute that test suite on each commit, or at least each commit to your dev and main branches.

Setting up a GitLab Runner

Note that the build output for the cube code is very long, and each build repeats all of the steps. Docker normally caches results, but these caches are ignored by the computers that execute pipelines (the “runners”) because the runners are shared; using cached results would allow one organization to capture data from another. There are two solutions: setting up your own runner, and publishing a prebuilt image to a registry (the subject of the next section). This section describes how to set up another machine as a runner.

If your team has access to a machine with a name like sdlstudentvmXX.msoe.edu (where the X’s are digits), then you can try the following. Otherwise, skip to the next section for now.

To set up a runner on an SDLStudentVM machine:

  1. Use the following to add the GitLab repository:
        curl -L "https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh" | sudo bash
    
  2. sudo apt install gitlab-runner
  3. See https://docs.gitlab.com/runner/register/ to register a GitLab runner:
    1. Create a project runner token:
      • Set the name to the name of the machine: `sdlstudentvmXX`
      • Use sudo for the gitlab-runner command
      • GitLab instance: select default
      • Pick docker for the executor
      • Set the default image to what you would put in the .gitlab-ci.yml file
  4. Follow the remaining prompts to finish setting up the runner; the process is sketched below
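
The registration will look something like this (run on the VM; the exact prompts may vary with the runner version, and the token comes from the project runner page created above):

    sudo gitlab-runner register
    # You will be prompted for:
    #   the GitLab instance URL (use the default, https://gitlab.com)
    #   the runner token created in the previous step
    #   the executor (enter docker)
    #   the default image (e.g., ubuntu:18.04, matching your .gitlab-ci.yml)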

Docker Registries

Another way to improve the build time is to create your own Docker image and publish it in GitLab. You then inject your source code into this image to build the project. To do this with the cube project:

    cp Dockerfile CIDockerfile
    docker init

Then edit Dockerfile to be just the lines starting with FROM, RUN apt-get, and RUN cd. That is, the Dockerfile will download Ubuntu, install the build tools, and set up the gtest library.
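
The trimmed Dockerfile is just the three setup lines from the original:

    FROM ubuntu:18.04
    RUN apt-get update; apt-get install -y g++ cmake git libgtest-dev
    RUN cd /usr/src/gtest; cmake CMakeLists.txt; make; cp *.a /usr/lib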

Next, set up a project access token:

  1. Visit your project repository in GitLab
  2. Select Settings and then Access tokens
  3. Click on Add new token
  4. Set the Token name to something like “registry [year]” and the Expiration date to a year from now
  5. Set the role to Developer
  6. Set the Selected scopes to read_registry and write_registry
  7. Click on Create project access token
  8. Save the project access token somewhere you can get to it later, preferably in a password manager

You will use this token to publish your container:

  1. Go to your Docker Terminal
  2. Enter the command docker login registry.gitlab.com
  3. Enter your GitLab username (not the email address)
  4. For the password, enter the registry access token you created using the above steps
  5. Go to your project repository in GitLab
  6. Open the Deploy menu and select Container registry
  7. Execute the build and push steps given by the instructions on the page; they will look something like the commands sketched below
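
Assuming the project lives under registry.gitlab.com/XXX/cube (with XXX being your GitLab namespace, as shown on the registry page), the commands will be along these lines:

    docker build -t registry.gitlab.com/XXX/cube .
    docker push registry.gitlab.com/XXX/cube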

The build step will likely use cached results. You can see the container in the registry by refreshing the page in GitLab. To use the container:

  1. Assuming you are still on the Container registry page, click on the clipboard icon by your container name
  2. Edit CIDockerfile and change the FROM line to be FROM [your container]; this line will look something like FROM registry.gitlab.com/XXX/cube
  3. Delete the installation lines (they are now in Dockerfile), leaving just the COPY line and the final RUN line.
  4. Type docker build -f CIDockerfile . (including the trailing dot). Assuming this works, you can then edit your .gitlab-ci.yml file to use the new build, changing the last line to
        - docker build -f CIDockerfile .
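
The resulting CIDockerfile is just three lines:

    FROM registry.gitlab.com/XXX/cube
    COPY . /cube
    RUN cd /cube; make clean; make test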

Minimal Images and Versions

Ubuntu is useful because it is a fairly full-featured version of Linux. But for builds, you do not need a full version. Generally, builds will happen much faster if you switch to Alpine Linux. The file Dockerfile.alpine illustrates using Alpine Linux to build the project; in testing, it built and tested the Cube project in about 9 seconds compared to 34 seconds for the Ubuntu-based version. Review Dockerfile.alpine and note that Alpine uses the apk package manager rather than apt-get and provides sh rather than bash. You can explore an Alpine container and its packages interactively:

        docker run -it alpine sh
        apk update
        apk search g++
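
An Alpine-based build file will be something along these lines (a sketch; the base image tag and package names here are assumptions, and Dockerfile.alpine in the repository is the authoritative version):

    # Sketch only -- see Dockerfile.alpine in the cube repository
    FROM alpine:3.19
    RUN apk update; apk add g++ make cmake gtest-dev
    COPY . /cube
    RUN cd /cube; make clean; make test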

Cleanup

If you click on the Containers and Images tabs you will likely see a lot of entries at this point. You can delete them one at a time, but that can take a while. A faster way is to enter the command

    docker system prune -a

This deletes all stopped containers, all images without at least one associated container, and build caches. This often reclaims a significant amount of disk space.