There are two recurring problems in SDL projects that often block forward progress:
It works on my machine: Tests run on one person’s computer but not another’s, typically because someone has installed a slightly different library or has configured some option slightly differently. Software has to work all of the time; it is not enough that one person on the team can run the software. Everyone, including clients, must be able to run it using published configurations.
Endless time spent setting up a development environment: because setup is based on capturing detailed instructions and everyone’s computer is slightly different, getting a new project to build on your machine means debugging someone else’s build instructions. The core issue is that the build documentation is written by hand, and people make errors when capturing steps. This soaks up a lot of productive time, leading to the mistake (an “anti-pattern”) of spending whole sprints doing nothing but trying to get build environments set up.
Docker is a general-purpose tool with many uses, but our initial focus will be on using it to solve the above two problems.
Docker is a platform for building and running applications. This is similar to an operating system such as Windows or Linux, but with a full operating system, installed software is available to all users. With Docker you can install your own software packages separate from any others, allowing a build to use just the packages it needs without conflicting with the build requirements of other software systems. This is similar to a virtual machine, in which a computer runs a program that translates all operations for one machine (say, Windows) into commands that can be interpreted by another (say, Linux). The difference is that Docker reuses the host operating system’s existing services rather than emulating a whole machine, so installations do not need to repeat work that is already done. This makes for a significantly lighter footprint.
Docker runs your application in a container. In many cases, a container combines an operating system environment with the software packages you need to build and run your code. For example, you might use Ubuntu Linux as your operating system, version 23 of the Java Development Kit, and Apache as a web server. Commonly, Docker is used to create an image, and then that image is run on a Docker server to create a (running) container.
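For example, the basic workflow looks like the following; the image name myapp here is a hypothetical placeholder:
docker build -t myapp:1.0 .     # build an image from the Dockerfile in the current directory and tag it
docker run --rm myapp:1.0       # start a container from that image; each run creates a new container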
Docker build steps are recorded in a Dockerfile. This is a sequence of steps; each step creates a new image by applying a command to the previous image. For example, you might have a Dockerfile such as
FROM ubuntu:18.04
RUN apt-get update; apt-get install -y g++ cmake git libgtest-dev
RUN cd /usr/src/gtest; cmake CMakeLists.txt; make; cp *.a /usr/lib
COPY . /cube
RUN cd /cube; make clean; make test
The first step declares the base image to be version 18.04 of Ubuntu. Docker searches standard registries for the given image and copies it to your system. Next, Docker executes apt-get to install various packages such as G++ and CMake. Third, it executes build commands to build the gtest library; this is a simple C++ testing library from Google. The COPY command copies your source files into the image in the folder /cube, and the last RUN command builds the C++ code and executes tests.
A key feature of Docker is that each step in the Dockerfile is stored as a separate image. If the only changes affect later steps, Docker reuses the images from earlier steps. For example, if you are using the above Dockerfile and the only thing changing is your source code (the code being copied into the image by COPY), then it does not have to repeat the work of downloading Ubuntu or reinstalling software packages. This will not be quite as fast as building your system in an IDE, but it will be close. And a win is that the Dockerfile documents a sequence of reproducible steps to build and run the system. Another win is that you can publish images that you create to a registry. Once an image is in a registry, you can download it to cloud servers and run your application on those servers without repeating build steps. This is known as “containerizing” your application, and it is how many systems are deployed. Handling high demand (“scaling up”) is as simple as running the container on yet another cloud server. In fact, this is the basis for building out a microservice application:
| microservices [can be seen as] a design pattern where individual application features are developed, deployed, and managed as small independent apps running as containers.
(Poulton, Nigel. Getting Started with Docker: Learn Docker in a Day!)
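As a sketch, publishing and reusing an image looks like the following; the registry host, project path, and tag are hypothetical placeholders:
docker build -t registry.example.com/myteam/myapp:1.0 .   # build and tag the image
docker login registry.example.com                         # authenticate to the registry
docker push registry.example.com/myteam/myapp:1.0         # publish the image to the registry
docker run registry.example.com/myteam/myapp:1.0          # on a cloud server: pull and run, no rebuild needed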
To get started, install Docker Desktop on your computer. On a Mac with Homebrew you can use
brew install --cask docker
or follow the instructions at https://docs.docker.com/desktop/setup/install/mac-install/. To check that your installation is working, start Docker Desktop and click the >_ (terminal) icon in the lower right corner. This opens a terminal session. You can then type
docker run hello-world
Be sure to use a dash, not an underscore. If you get no response, you might have to first type docker login and then retype the run command. When it is successful, you will see a pull command for the image followed by Hello from Docker! followed by some information about next steps. You will also notice a hello-world container in the Docker Desktop Containers tab. You can delete it by clicking on the trash can icon. Switch to the Images tab and notice the hello-world image is still present. Re-executing the run will start a new container and execute it. If you like, you can delete both the image and the container(s) to clean up.
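The same cleanup can be done from the terminal if you prefer; for example:
docker ps -a                # list all containers, including stopped ones
docker rm <container-id>    # remove a stopped container by its ID
docker images               # list downloaded images
docker rmi hello-world      # remove the hello-world image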
One of the hello-world suggestions is to start an Ubuntu container. Type
docker run -it ubuntu bash
This will pull ubuntu:latest and open up a Bash prompt. Type ls to list the top level directories, and ls bin to list the commands in the bin directory. Note that you are executing as root; that is, in the super-user account. Normally this would be a bad idea, but since Docker isolates the container from the rest of your computer, you will not put the rest of your system at risk. One thing you can do is install some classic terminal games:
apt update
apt install bsdgames
List the games you just installed by typing ls /usr/games, and execute hangman by typing /usr/games/hangman. Exit the game by pressing Control-C (that is, hold down the control key and press the c key).
Before you exit Ubuntu, type ps -e. The ps command lists the process status of jobs on the machine, and -e makes it list all jobs, including those run by root. There are few jobs because Docker uses the host’s operating system to perform low-level actions rather than starting its own services. When you started bash in the docker run command, that was the only job running in the container; any further jobs are ones you start from Bash. This makes Docker a light-weight mechanism.
To exit the shell, you can either press Control-D or type exit. Control-D is end-of-file in Unix. It works at the shell prompt because there is a loop in the shell interpreter that exits at end-of-file; you will get used to using Control-D to exit applications in Unix. But typing exit works as well.
Open Docker Terminal and type the following commands to retrieve a simple project to build and test a C++ program:
git clone https://gitlab.com/hasker/cube.git
cd cube
docker build .
There will be a number of steps downloading system updates, C++, and various build tools. The fact that a large number of files are downloaded shows that we might need a slightly modified process for production. But at the end you will see commands copying the code into the image (in the directory /cube), building the executable, and running the tests:
#9 [4/5] COPY . /cube
#9 DONE 0.0s
#10 [5/5] RUN cd /cube; make clean; make test
#10 0.106 rm -f *.o cube cube_test
#10 0.108 g++ -std=c++14 -Wall -c cube.cpp
#10 0.138 g++ -std=c++14 -Wall -c cube_test.cpp
#10 0.406 g++ -pthread -std=c++14 -Wall cube.o cube_test.o -lgtest_main -lgtest -lpthread -o cube_test
#10 0.459 ./cube_test
#10 0.460 [==========] Running 3 tests from 1 test case.
#10 0.460 [----------] Global test environment set-up.
#10 0.460 [----------] 3 tests from CubeTest
#10 0.460 [ RUN ] CubeTest.SmallNaturalNumbers
#10 0.460 [ OK ] CubeTest.SmallNaturalNumbers (0 ms)
#10 0.460 [ RUN ] CubeTest.Zero
#10 0.460 [ OK ] CubeTest.Zero (0 ms)
#10 0.460 [ RUN ] CubeTest.NegativeNumbers
#10 0.460 [ OK ] CubeTest.NegativeNumbers (0 ms)
#10 0.460 [----------] 3 tests from CubeTest (0 ms total)
#10 0.460
#10 0.460 [----------] Global test environment tear-down
#10 0.460 [==========] 3 tests from 1 test case ran. (0 ms total)
#10 0.460 [ PASSED ] 3 tests.
#10 DONE 0.5s
The RUN and OK lines are output from the test environment this project uses, GoogleTest. This shows all tests passed.
Edit cube.cpp in the folder and introduce a bug; simple ones are to negate x or add another * x. The build will now fail, but there will not be much information. Go to the Builds tab in Docker Desktop and click on the ID for the topmost (most recent) build. You should see a red banner; click on error logs on that line. This gives the full output, which (in this case) includes expected values and actual computed values.
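For illustration, assuming cube.cpp defines a function along these lines (a hypothetical sketch; the real file may differ), either suggested edit is a small change that the tests will catch:
// Hypothetical sketch of cube.cpp; the actual file in the repository may differ.
int cube(int x) {
    return x * x * x;            // correct: returns the cube of x
    // Bug option 1: return -x * x * x;     (negates the result)
    // Bug option 2: return x * x * x * x;  (returns the fourth power)
}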
It is tempting to think that this is “bad”. But it is not! If your code contains errors, you want the build environment to let you know! A key thing about professional development is that if you make a mistake and introduce an error, you want to know about it as soon as possible. Waiting until your client finds the error is bad form. Worse, it is inefficient: if you fix a problem you found, you just do it and move on. But if a client finds the error, you have to reproduce it for yourself, document the change, likely add a whole bunch of tests to restore faith in your product, and probably have several meetings over the whole thing. The value of strong testing and early detection of errors cannot be overestimated.
Fix the code. You could undo your edits in cube.cpp or simply type
git restore .
Then re-execute the docker build command. Note that the setup operations are not repeated; the only actions executed are to copy the modified code into the /cube directory and re-execute the compile and run steps. This is because Docker caches results. If you try another docker build . when there have been no changes, all steps are reported as cached and the tests are not actually re-executed. You have to change the source code to force recompiling and re-running the tests.
Up to now we have been running Docker manually. This works as long as every developer remembers to run the tests, but people get rushed. The real value of automated tests is that they can be run by the repository system, such as GitLab. This is known as continuous integration, or more often, as CI. In this context, “integration” means integrating all of the pieces of the code together and ensuring the integrated system works; i.e., passes all tests.
Setting up CI in GitLab is simple: create a .gitlab-ci.yml file. The extension “yml” is short for YAML (“YAML Ain’t Markup Language,” originally “yet another markup language”), a format where items are grouped by indentation. Preserve the spaces at the beginnings of lines and you do not have to spend a lot of time on the format details. See the .gitlab-ci.yml in the cube project; it specifies using a docker image to run the test, that there is a single build stage, and that the script for the build step is docker build .
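A minimal sketch of such a file, assuming a Docker-in-Docker setup (the actual file in the cube repository may differ in its details):
image: docker:latest      # run the CI job in an image that has the docker client
services:
  - docker:dind           # Docker-in-Docker service so docker commands work inside the job

stages:
  - build

build:
  stage: build
  script:
    - docker build .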
You could create your own GitLab repository from this project to see the CI execute, or you can just visit https://gitlab.com/hasker/cube, navigate to the Build tab, and select Jobs. Hopefully the most recent run is marked Passed. Click on the job number followed by build (as of this writing, #10453822658: build) to see the full output of the run.
The basic strategy is to build out a test suite that can be run from Docker and then set up a pipeline to execute that test suite on each commit, or at least each commit to your dev and main branches.
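If you want to limit the pipeline to those branches, one way (a sketch using GitLab’s rules keyword) is to add a condition to the job in .gitlab-ci.yml:
build:
  stage: build
  script:
    - docker build .
  rules:
    - if: '$CI_COMMIT_BRANCH == "main" || $CI_COMMIT_BRANCH == "dev"'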
Note that the build output for the cube code is very long, and each build repeats all of the steps. Docker normally caches results, but these caches are ignored by the computers that execute pipelines (the “runners”) because the runners are shared; using cached results would allow one organization to capture data from another. There are two solutions. This section describes how to set up another machine as a runner.

If your team has access to a machine with a name like sdlstudentvmXX.msoe.edu (where the X’s are digits), then you can try the following. Otherwise, skip to the next section for now.
To set up a runner on an SDLStudentVM machine:
curl -L "https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh" | sudo bash
sudo apt install gitlab-runner
Then register the runner with your GitLab project, using sudo for the gitlab-runner command and choosing docker for the executor.
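The registration step itself might look like the following sketch; GitLab shows the exact URL and token to use under the project’s runner settings, and the prompts can vary by version:
sudo gitlab-runner register    # prompts for the GitLab URL, the token, a description, tags, and the executor (answer docker)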
Another way to improve the build time is to create your own docker image and publish it in GitLab. You then inject your source code into this image to build the project. Doing this with the cube project:
cp Dockerfile CIDockerfile
docker init
Then edit Dockerfile to be just the lines starting with FROM, RUN apt-get, and RUN cd. That is, the Dockerfile will download Ubuntu, install the build tools, and set up the gtest library.
Next, set up a project access token in GitLab. You will use this token to publish your container:
docker login registry.gitlab.com
Then execute the build and push steps given by the instructions on the project’s Container Registry page. The build step will likely use cached results. You can see the container in the registry by refreshing the page in GitLab. To use the container, edit CIDockerfile and change its FROM line to be FROM [your container]; this line will look something like FROM registry.gitlab.com/XXX/cube. Remove the lines that are now in Dockerfile, leaving the COPY line and the final RUN line. Then test the build with
docker build -f CIDockerfile .
Assuming this works, you can then edit your .gitlab-ci.yml file to use the new build, changing the last line to - docker build -f CIDockerfile .
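For reference, the two files end up looking roughly like this (your registry path will differ). The base image, built once from Dockerfile and pushed to the registry:
FROM ubuntu:18.04
RUN apt-get update; apt-get install -y g++ cmake git libgtest-dev
RUN cd /usr/src/gtest; cmake CMakeLists.txt; make; cp *.a /usr/lib
and CIDockerfile, which pulls that published image and injects the source:
FROM registry.gitlab.com/XXX/cube
COPY . /cube
RUN cd /cube; make clean; make test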
Ubuntu is useful because it is a pretty full-featured version of Linux. But for builds, you do not need a full version. Generally, builds will happen much faster if you change to Alpine Linux. The file Dockerfile.alpine illustrates using Alpine Linux to build the project. This has been tested at about 9 seconds to build and test the Cube project compared to the 34 seconds for the Ubuntu-based version. Review Dockerfile.alpine and note the following:
- The FROM line indicates a specific version of Alpine (after the colon). If you do not specify the version, Docker defaults to latest. This is great for exploration, but leads to unstable builds because new versions can change behavior unexpectedly. You should also specify package versions with the package manager (such as apt-get).
- The RUN line uses apk to install packages rather than apt-get; apk is the package installer used by Alpine Linux. The package names overlap, but sometimes you need to find the right package yourself. To see all g++ packages (for example), type
docker run -it alpine sh
apk update
apk search g++
- The apk package gtest-dev means that there is no need to build the Google Test package using CMake. There is probably a similar package available for apt-get.
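A sketch of what Dockerfile.alpine might contain (the Alpine version tag here is an assumption; check the file in the repository for the real contents):
FROM alpine:3.19
RUN apk update && apk add --no-cache g++ make gtest-dev
COPY . /cube
RUN cd /cube; make clean; make test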
If you click on the Containers and Images tabs you will likely see a lot of entries at this point. You can delete them one at a time, but that can take a while. A faster way is to enter the command
docker system prune -a
This deletes all stopped containers, all images without at least one associated container, and build caches. This often reclaims a significant amount of disk space.