Docker started out as an internal project at dotCloud, a platform-as-a-service company, and was released as open source in 2013 to build single-application Linux containers. Since then, Docker containers have taken off as a popular development tool and are increasingly used as a runtime environment. RedMonk’s James Governor said of Docker, “We have never seen a technology become ubiquitous so quickly.”
One reason Docker is so popular is that it delivers on the promise of “develop once, run anywhere.” Docker offers a simple way to package an application and its runtime dependencies into a single container, and provides a runtime abstraction that enables the container to run across different versions of the Linux kernel. Using Docker, a developer can build a containerized application on their workstation and then easily deploy the container to any Docker-enabled server without having to retest or retune the container for the server environment, be that in the cloud or on premises.
In addition, Docker provides a software sharing and distribution mechanism that allows developers and operations teams to easily share and reuse container content. This distribution mechanism, coupled with portability across machines, is what gave rise to Docker’s incredible popularity with developers and operations teams.
Docker is both a development tool and a runtime environment. To understand Docker, we must first understand the concept of a Docker container image. A container always starts with an image and is considered an instantiation of that image.
An image is a static specification of what the container should be at runtime, including the application code inside the container and its runtime configuration settings. Docker images consist of read-only layers, which means that once an image is created it is never modified.
Figure 3 shows an example of a container image. This image depicts an Ubuntu image with an Apache installation. The image is a composition of three base Ubuntu layers plus an update layer, with an Apache layer and a custom file layer on top.
Figure 3: Docker image view
A running Docker container is an instantiation of an image. Containers derived from the same image are identical to each other in terms of their application code and runtime dependencies. But unlike images, which are read-only, each running container includes a writable layer (a.k.a. the container layer) on top of the read-only content. Runtime changes, including any writes and updates to data and files, are saved in the container layer only. Thus multiple concurrently running containers that share the same underlying image may have different container layers.
When a running container is deleted, the writable container layer is deleted with it and does not persist. To preserve those changes in an image, you run an explicit “docker commit” before deleting the container. When you run “docker commit,” the container content, including the writable layer, is written into a new container image and stored on disk. This becomes a new image that is distinct from the prior image from which the container was instantiated.
Using this explicit “commit” command, one can create a successive, discrete set of Docker images, each one built on top of the previous image. In addition, Docker uses a Copy-on-Write strategy to minimize the disk footprint of containers and images that share the same base components. This helps to optimize storage space and minimize container start time.
Figure 4: From images to containers
Figure 4 depicts the difference between an image and a running container. Note that each running container can have a different writable layer.
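The relationship between read-only image layers, per-container writable layers, and commit can be sketched in a few lines of Python. This is a toy in-memory model for illustration only, not how Docker stores layers on disk; the `Image` and `Container` classes, their methods, and the file paths are all hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass, field
from types import MappingProxyType


@dataclass(frozen=True)
class Image:
    """A stack of read-only layers; each layer maps file paths to contents."""
    layers: tuple  # read-only mappings, base layer first

    def read(self, path):
        # Search from the top layer down, so upper layers shadow lower ones.
        for layer in reversed(self.layers):
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)


@dataclass
class Container:
    """A running instance: a shared image plus a private writable layer."""
    image: Image
    writable: dict = field(default_factory=dict)

    def write(self, path, data):
        # Writes never touch the image; they land in the container layer.
        self.writable[path] = data

    def read(self, path):
        if path in self.writable:
            return self.writable[path]
        return self.image.read(path)

    def commit(self):
        # Freeze a copy of the writable layer and stack it on the old layers.
        # The original image is left unchanged.
        frozen = MappingProxyType(dict(self.writable))
        return Image(self.image.layers + (frozen,))


base = Image((
    MappingProxyType({"/etc/os-release": "ubuntu"}),
    MappingProxyType({"/var/www/index.html": "hello"}),
))

a = Container(base)
b = Container(base)
a.write("/var/www/index.html", "changed in a")

print(a.read("/var/www/index.html"))  # changed in a
print(b.read("/var/www/index.html"))  # hello

new_image = a.commit()
print(new_image.read("/var/www/index.html"))  # changed in a
print(base.read("/var/www/index.html"))       # hello
```

Both containers share the same two base layers, which is the space saving the copy-on-write strategy aims for; only the container that wrote a file carries its own copy of it.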
Beyond the image concept, Docker also has a few specific components that are different from those in Linux containers.
- Docker daemon: Also known as the Docker Engine. The Docker daemon is a thin layer between the containers and the Linux OS (see Figure 2 in Part 1). It is the persistent runtime environment that manages application containers. Any Docker container can run on any server that is Docker-daemon enabled, regardless of the underlying Linux distribution.
- Dockerfile: Developers use Dockerfiles to build container images, which then become the basis of running containers. A Dockerfile is a text document that contains all the configuration information and commands needed to assemble a container image. With a Dockerfile, Docker daemon can automatically build a container image. This process greatly simplifies the steps for container creation. More specifically, in a Dockerfile, you first specify a “base image” from which the build process starts. You then specify a succession of commands, following which a new container image can be built.
- Docker Command Line Interface (CLI) tools: Docker provides a set of CLI commands for managing the lifecycle of image-based containers. Docker commands span development functions such as build, export, and tagging, as well as runtime functions such as running, deleting, starting and stopping a container, and more.
You can execute Docker commands against a particular Docker daemon or a registry. For instance, if you execute “docker ps,” the command will return a list of containers running on the daemon.
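To make the Dockerfile structure described above concrete, here is a minimal Python sketch that reads a small Dockerfile and separates the base image from the succession of commands that follow it. The Dockerfile content is a hypothetical example (an Apache install on Ubuntu, echoing Figure 3), and `parse_dockerfile` is an illustrative helper written for this sketch, not part of Docker. It assumes one instruction per line, no line continuations, and a single FROM line.

```python
DOCKERFILE = """\
# Hypothetical Dockerfile for an Apache image on Ubuntu
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y apache2
COPY index.html /var/www/html/index.html
CMD ["apachectl", "-D", "FOREGROUND"]
"""


def parse_dockerfile(text):
    """Return (base_image, steps) from simplified Dockerfile text."""
    base_image = None
    steps = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        instruction, _, argument = line.partition(" ")
        if instruction.upper() == "FROM":
            base_image = argument  # the image the build starts from
        else:
            steps.append((instruction.upper(), argument))
    if base_image is None:
        raise ValueError("no FROM line found")
    return base_image, steps


base_image, steps = parse_dockerfile(DOCKERFILE)
print("base image:", base_image)
for number, (instruction, argument) in enumerate(steps, start=1):
    print(f"step {number}: {instruction} {argument}")
```

With a real Docker installation, a file like this would typically be built with “docker build” from the directory that contains it.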
Content distribution with Docker
In addition to the runtime environment and container formats, Docker provides a software distribution mechanism, commonly known as “Registry,” that facilitates container content discovery and distribution.
The concept of a registry is critical to the success of Docker, as it provides a set of utilities to pack, ship, store, discover, and reuse container content. Docker itself also runs a free, public registry called Docker Hub.
Registry: A Docker registry is a place where container images are published and stored. A registry can be remote or on premises. It can be public, so everyone can use it, or private, restricted to an organization or a set of users. A Docker registry comes with a set of common APIs that allow users to build, publish, search, download, and manage container images.
Docker Hub: Docker Hub is a public, cloud-based container registry managed by Docker. Docker Hub provides image discovery, distribution, and collaboration workflow support. In addition, Docker Hub has a set of official images that are certified by Docker. These are images from known software publishers such as Canonical, Red Hat, and MongoDB. Users can use official images as a basis for building their own images or applications.
Figure 5 depicts a workflow in which a user builds an image and uploads it to the registry. Others can then pull the image from the registry to create production containers and deploy them to Docker hosts, wherever they are.
Figure 5: Docker content distribution through registries
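The push, pull, and search workflow can be modeled with a small in-memory registry. This is a sketch under stated assumptions, not the Docker registry API: `ToyRegistry`, its method names, the `example/web` repository name, and the dictionary used to stand in for an image are all hypothetical. It does reflect one idea real registries share, namely that content is identified by a digest while tags are movable names pointing at that content.

```python
import hashlib
import json


class ToyRegistry:
    """In-memory stand-in for a registry: stores images by digest and
    maps human-readable "name:tag" references to those digests."""

    def __init__(self):
        self._blobs = {}  # digest -> serialized image
        self._tags = {}   # "name:tag" -> digest

    def push(self, reference, image):
        blob = json.dumps(image, sort_keys=True).encode("utf-8")
        digest = "sha256:" + hashlib.sha256(blob).hexdigest()
        self._blobs[digest] = blob  # identical content is stored once
        self._tags[reference] = digest
        return digest

    def pull(self, reference):
        digest = self._tags[reference]
        return json.loads(self._blobs[digest])

    def search(self, term):
        return sorted(ref for ref in self._tags if term in ref)


registry = ToyRegistry()
image = {"layers": ["ubuntu-base", "apache", "site-files"]}

# One user publishes the same image under two tags.
digest_v1 = registry.push("example/web:1.0", image)
digest_latest = registry.push("example/web:latest", image)

# Another user discovers and downloads it.
print(registry.search("example/web"))
pulled = registry.pull("example/web:1.0")
print(pulled == image)
print(digest_v1 == digest_latest)
```

Because both tags resolve to the same digest, the toy registry keeps a single copy of the content, which mirrors the reuse that registries are meant to encourage.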
The Immutability of Docker Containers
One of the most interesting properties of Docker containers is their immutability and the resulting statelessness of containers.
As we described in the previous section, a Docker image, once created, does not change. A running container derived from the image has a writable layer to temporarily house runtime changes. If the container is committed prior to deletion with “docker commit”, the changes in the writable layer will be saved into a new image that is distinct from the previous one.
Why is immutability good? Immutable images and containers lead to an immutable infrastructure, and an immutable infrastructure has many interesting benefits that are hard to achieve with traditional systems. For example:
- Version control: With this explicit commit method, Docker forces you to do version control. You can keep track of successive versions of an image; rolling back to a previous image (therefore to a previous system component) is entirely possible, as previous images are kept and never modified.
- Cleaner updates and more manageable state changes: With immutable infrastructure, you no longer upgrade running servers in place, which means no editing configuration files, applying software updates, or upgrading the operating system on a live system. When changes are needed, you simply make new containers and push them out to replace the old ones. This is a much more discrete and manageable method for state change.
- Minimized drift: To avoid drift, you can periodically and proactively refresh all the components in your system to ensure they all contain the latest version. This practice is a lot easier with containers that encapsulate a smaller component of the system than it is with traditional, bulky software.
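The three benefits above can be illustrated with a toy deployment model in Python. It is a minimal sketch, not a real orchestration tool: the `Service` and `Container` classes, the `has_drifted` check, and the `example/web` tags are hypothetical names invented for this example. The point is that an update is a replacement rather than an in-place change, so rolling back just means starting a container from an earlier, unmodified image.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Container:
    """A running container, pinned to the image tag it was started from."""
    image_tag: str


class Service:
    """Toy service that is updated only by replacing its container."""

    def __init__(self, image_tag):
        self.history = [image_tag]  # versions deployed so far, oldest first
        self.container = Container(image_tag)

    def deploy(self, image_tag):
        # No in-place upgrade: discard the old container, start a new one.
        self.history.append(image_tag)
        self.container = Container(image_tag)

    def rollback(self):
        # Earlier images are never modified, so the previous tag is
        # still exactly what it was when it was first deployed.
        if len(self.history) < 2:
            raise RuntimeError("no earlier version to roll back to")
        self.history.pop()
        self.container = Container(self.history[-1])

    def has_drifted(self, desired_tag):
        # Drift here simply means "not running the version we want".
        return self.container.image_tag != desired_tag


web = Service("example/web:1.0")
web.deploy("example/web:1.1")
print(web.container.image_tag)             # example/web:1.1
print(web.has_drifted("example/web:1.1"))  # False

web.rollback()
print(web.container.image_tag)             # example/web:1.0
print(web.has_drifted("example/web:1.1"))  # True
```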
The Docker Difference
Docker’s image format, its extensive APIs for container management, and the innovative software distribution mechanism via registries have made it a popular platform for development and operations teams alike.
Docker brings these notable benefits to an organization:
- Minimal, declarative systems: Docker containers are at their best if they are small, single-purpose applications. This gives rise to containers that are minimal in size, which in turn leads to rapid delivery, continuous integration and deployment.
- Predictable operations: The biggest headache of system operations has always been the seemingly random behavior of the infrastructure or the applications. Docker forces you to make smaller, more manageable updates and provides a mechanism to minimize system drift; both capabilities are exactly what one needs to build predictable systems. When drift is eliminated or minimized, you can have assurance that the same system or application will behave in an identical manner, no matter how many times you deploy it.
- Extensive software reuse: Docker containers reuse layers from other images, which promotes software reuse. The way Docker shares images via registries is another great means to further the sharing and reuse of components.
- True multi-cloud portability: Docker brings true platform independence, which allows containers to migrate freely between different cloud platforms, on-premises infrastructures, and even development workstations.
Docker is already changing the way organizations build systems and deliver services. It is also starting to reshape the way we think about software design and the economics of software delivery. But before these changes truly take root, organizations need to better understand how to manage security and policies for the Docker environment, which is the topic of our next chapter.
Next Chapter: Container security (coming soon)
Back to Chapter 1: Containers, LXC, and Docker