This installment of Open source Java projects introduces Java developers to Docker Swarm. You'll learn why so many enterprise shops have adopted container-managed development via Docker, and why clustering is an important technique for working with Docker containers. You'll also find out how two popular Docker clustering technologies--Amazon ECS and Docker Swarm--compare, and get a quick guide to choosing the right solution for your shop or project. The introduction concludes with a hands-on demonstration of using Docker Swarm to develop and manage a two-node enterprise cluster.
What's the deal with Docker?
Docker is an open platform for building, shipping, and running distributed applications. Dockerized applications can run locally on a developer's machine, and they can be deployed to production across a cloud-based infrastructure. Docker lends itself to rapid development and enables continuous integration and continuous deployment like almost no other technology does. Because of these features, it's a platform that every developer should know how to use.
It's essential to understand that Docker is a containerization technology, not a virtualization technology. Whereas a virtual machine contains a complete operating system and is managed by a heavyweight process called a hypervisor, a container is designed to be very lightweight and self-contained. Each server runs a daemon process called a Docker engine that runs containers and translates operating system calls inside the container into native calls on the host operating system. A container, which is analogous to a virtual machine, only much smaller, hosts your application, runtime environment, and a barebones operating system. Containers typically run on virtual machines. Whereas a virtual machine can take minutes to start up, a container can do it in seconds.
Figure 1 illustrates the difference between a container and a virtual machine.
Docker containers are self-contained, which means that they include everything that they need to run your application. For example, for a web application running in Tomcat, the container would include:
- A WAR file
- Tomcat
- A Java runtime (JVM)
- The base operating system
Figure 2 shows the architecture of a web app inside a Docker container.
In the case of Docker, each virtual machine runs a daemon process called the Docker engine. You build your application, such as your WAR file, and then create a corresponding Dockerfile. A Dockerfile is a text file that describes how to build a Docker image, which is a binary file containing everything needed to run the application. As an example, you could build a Dockerfile from a Tomcat base image containing a base Linux OS, Java runtime, and Tomcat. After instructing Docker to copy a WAR file to Tomcat's webapps directory, the Dockerfile would be compiled into a Docker image consisting of the base OS, JVM, Tomcat, and your WAR file. You can run the Docker image locally, but you will ultimately publish it to a Docker repository, like DockerHub.
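The Dockerfile described above might look like the following minimal sketch. The tomcat:8-jre8 base image tag and the myapp.war file name are illustrative assumptions, not files from this article:

```dockerfile
# Start from a base image containing Linux, a JVM, and Tomcat
FROM tomcat:8-jre8

# Copy the application WAR into Tomcat's webapps directory,
# where Tomcat will deploy it on startup
COPY myapp.war /usr/local/tomcat/webapps/
```

Running docker build against a file like this produces the layered image (base OS, JVM, Tomcat, WAR) that you can then run locally or push to a repository.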
While a Docker image is the binary, on-disk form of your application and its environment, a running instance of a Docker image is called a Docker container. Docker containers are run by your Docker engine. The machine that runs your Docker engine is called the Docker host; this could be your local laptop or a cloud platform, depending on the scale of your application.
The basics in this section provide a foundation for understanding why clustering is an important addition to your Docker toolkit. See my introduction to Docker for more.
Most developers getting started with Docker will build a Dockerfile and run it locally on a laptop. But there's more to container-managed development than running individual Docker containers locally. Docker's superpower is its ability to dynamically scale containers up or down. In production, this means running Docker in a cluster across a host of machines or virtual machines.
Various Docker clustering technologies are available, but the two most popular are Amazon EC2 Container Service (ECS) and Docker Swarm.
Amazon's Docker clustering technology leverages Amazon Web Services (AWS) to create a cluster of virtual machines that can run Docker containers. An ECS cluster consists of managed ECS instances, which are EC2 instances with a Docker engine and an ECS agent. ECS uses an autoscaling group to expand and contract the number of instances based on CloudWatch policies. For example, when the average CPU usage of the ECS instances is too high, you can request ECS to start more instances, up to the maximum number of instances defined in the autoscaling group.
Docker containers are managed by an ECS service and configured by the amount of compute capacity (CPU) and RAM that the container needs to run. The ECS service has an associated Elastic Load Balancer (ELB). As it starts and stops Docker containers, the ECS service registers and deregisters those containers with the ELB. Once you've set up the rules for your cluster, Amazon ECS ensures that you have the desired number of containers running and those containers are all accessible through the ELB. Figure 3 shows a high-level view of Amazon ECS.
It is important to distinguish between ECS instances and tasks. The ECS cluster manages your ECS instances, which are special EC2 instances that run in an autoscaling group. The ECS service manages the tasks, which can contain one or more Docker containers, and which run on the cluster. An ELB sits in front of the ECS instances that are running your Docker containers and distributes load to them. The relationship between ECS tasks and Docker containers is that a task definition tells the ECS service which Docker containers to run and the configuration of those containers. The ECS service runs the task, which starts the Docker containers.
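To make the task/container relationship concrete, here is a minimal sketch of an ECS task definition. The family name, image, and resource figures are illustrative assumptions, not values from this article:

```json
{
  "family": "webapp-task",
  "containerDefinitions": [
    {
      "name": "webapp",
      "image": "mycompany/webapp:latest",
      "cpu": 256,
      "memory": 512,
      "essential": true,
      "portMappings": [
        { "containerPort": 8080, "hostPort": 80 }
      ]
    }
  ]
}
```

When the ECS service runs this task, it starts the listed container(s) with the requested CPU and memory reservations and registers them with the associated ELB.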
Docker Swarm, Docker's native clustering technology, allows you to run multiple Docker containers across a cluster of virtual machines. Docker Swarm defines a manager container, running on a virtual machine, that manages the environment, deploys containers to the various agents, and reports the container status and deployment information for the cluster.
When running a Docker Swarm, the manager is the primary interface into Docker. Agents are "docker machines" running on virtual machines that register themselves with the manager and run Docker containers. When the client sends a request to the manager to start a container, the manager finds an available agent to run it. It uses a least-utilized algorithm to ensure that the agent running the fewest containers will run the newly requested container. Figure 4 shows a sample Docker Swarm configuration, which you'll develop in the next section.
The manager process knows about all the active agents and the containers running on those agents. When the agent virtual machines start up, they register themselves with the manager and are then available to run Docker containers. The example in Figure 4 has two agents (Agent1 and Agent2) that are registered with the manager. Each agent is running two Nginx containers.
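The least-utilized placement described above can be modeled with a small toy script. This is only an illustration of the scheduling idea, not Swarm's actual implementation; the agent names and container counts are hypothetical:

```shell
#!/bin/sh
# Toy model of Swarm's least-utilized placement strategy: given a list
# of "agent:count" pairs, pick the agent currently running the fewest
# containers. Agent names and counts here are hypothetical.
pick_agent() {
  printf '%s\n' "$@" | sort -t: -k2,2n | head -n 1 | cut -d: -f1
}

# With agent1 busier than agent2, the next container goes to agent2:
pick_agent agent1:2 agent2:1   # prints: agent2
```

In the Figure 4 scenario, where both agents run two containers each, either agent is an equally valid target for the next container.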
Docker Swarm vs Amazon ECS
This article features Docker Swarm, but it's useful to compare container technologies. Whereas Amazon ECS offers a well-developed turnkey solution, Docker Swarm gives you the freedom to configure more of your own infrastructure. As an example, Amazon ECS manages both containers and load balancers, whereas with Docker Swarm you would configure your own load-balancing solution, such as Cisco LocalDirector, F5 BigIp, or an Apache or Nginx software process.
If you're already running your app in AWS, then ECS makes it much easier to run and manage Docker containers than an external solution would. As an AWS developer, you're probably already leveraging autoscaling groups, ELBs, virtual private clouds (VPC), identity and access management (IAM) roles and policies, and so forth. ECS integrates well with all of them, so it's the way to go. But if you aren't running in AWS, then Docker Swarm's tight integration with the Docker tools makes it a great choice.
Getting Started with Docker Swarm
In the previous section you saw a sample architecture for a two-node Docker Swarm cluster. Now you'll develop that cluster using two Nginx Docker container instances. Nginx is a popular web server, publicly available as a Docker image on DockerHub. Because this article is focused on Docker Swarm, I wanted a Docker container that is quick and easy to start and straightforward to test. You are free to use any Docker container you wish, but for illustrative purposes I chose Nginx for this example.
My introduction to Docker includes a guide to setting up Docker in your development environment. If you've installed and set up the Docker Toolbox, then you already have everything you need to run Docker Swarm. See Docker's official documentation for further setup instructions.
Docker Swarm on the command line
If you've previously used Docker, then you're familiar with using the docker command line to start and stop containers. When using Docker Swarm, you'll trade the docker command for docker-machine. Docker Machine is defined as follows in the Docker documentation:
Docker Machine is a tool that lets you install Docker Engine on virtual hosts, and manage the hosts with docker-machine commands. You can use Machine to create Docker hosts on your local Mac or Windows box, on your company network, in your data center, or on cloud providers like AWS or Digital Ocean. Using docker-machine commands, you can start, inspect, stop, and restart a managed host, upgrade the Docker client and daemon, and configure a Docker client to talk to your host.
If you've installed Docker then your installation already includes Docker Machine. To get started with Docker Swarm, start Docker and open a terminal on your computer. Execute the following
docker-machine ls command to list all the VMs on your local machine:
$ docker-machine ls
NAME      ACTIVE   DRIVER       STATE     URL                         SWARM
default   *        virtualbox   Running   tcp://192.168.99.100:2376
If you've only run Docker from your local machine, then you should have the default Docker virtual machine running with an IP address of
192.168.99.100. To conserve resources on your local machine you can stop this virtual machine by executing:
docker-machine stop default.
Create a swarm
A Docker swarm consists of two or more virtual machines running Docker instances. For this demo, we'll create three new virtual machines: manager, agent1, and agent2. Create your virtual machines using the
docker-machine create command:
$ docker-machine create -d virtualbox manager
$ docker-machine create -d virtualbox agent1
$ docker-machine create -d virtualbox agent2
The docker-machine create command creates a new "machine." Passing it the -d argument lets you specify the driver used to create the machine; running locally, that should be virtualbox. The first machine created is the manager, which will host the manager process. The last two machines, agent1 and agent2, are the agent machines that will host the agent processes.
At this point, you've created the virtual machines but you haven't created the actual Swarm manager or agents. To view the virtual machines and their state execute the
docker-machine ls command:
$ docker-machine ls
NAME      ACTIVE   DRIVER       STATE     URL                         SWARM   DOCKER    ERRORS
agent1    -        virtualbox   Running   tcp://192.168.99.101:2376           v1.11.1
agent2    -        virtualbox   Running   tcp://192.168.99.102:2376           v1.11.1
default   -        virtualbox   Stopped                                       Unknown
manager   *        virtualbox   Running   tcp://192.168.99.100:2376           v1.11.1
You now have three running machines: manager, agent1, and agent2, as well as one stopped machine: default. Note the asterisk in the manager's
ACTIVE column. This means that all Docker commands will be sent to the manager. You'll learn how to change the active machine a little later, when you set up the environment.
Create a discovery token
Next you'll need to obtain a Swarm discovery token. The discovery token is a unique identifier for your Swarm cluster. You'll use it to start your manager and to connect your agents to your manager. This discovery token should only be used in test environments; production deployments are a little more complex. Create a discovery token by running the Swarm container and passing it the
create command, as follows:
$ docker run --rm swarm create
Unable to find image 'swarm:latest' locally
latest: Pulling from library/swarm
eada7ab697d2: Pull complete
afaf40cb2366: Pull complete
7495da266907: Pull complete
a3ed95caeb02: Pull complete
Digest: sha256:12e3f7bdb86682733adf5351543487f581e1ccede5d85e1d5e0a7a62dcc88116
Status: Downloaded newer image for swarm:latest
7c14cbf2a86ecd490a7ea7ae4b795a6b
Notes about this command:
- The docker run command launches the specified Docker image, which in this case is swarm (more specifically, swarm:latest).
- The command that you've passed to the Swarm container is create, which is defined on the container and tells the Swarm application to connect to the DockerHub discovery service and retrieve a unique Swarm ID, the discovery token.
- The --rm argument tells Docker to automatically remove the container when it exits. This command can be read: run the latest version of the swarm container, execute the create command, and, when it completes, remove the swarm container from the local machine.
- The last line in the output is the discovery token, which in this example is 7c14cbf2a86ecd490a7ea7ae4b795a6b.
Save your discovery token in a safe place: you'll need it for the next step.
Run the Swarm manager and agents
Next you'll start the Swarm manager and create agents to join your Swarm cluster. Both activities are accomplished by launching the
swarm container and passing different arguments. The manager is already the "active" machine. Create the Swarm cluster manager with the following command:
$ docker run -d -p 3376:3376 -t \
    -v ~/.docker/machine/machines/manager:/certs:ro \
    swarm manage -H 0.0.0.0:3376 \
    --tlsverify \
    --tlscacert=/certs/ca.pem \
    --tlscert=/certs/server.pem \
    --tlskey=/certs/server-key.pem \
    token://7c14cbf2a86ecd490a7ea7ae4b795a6b
This command runs the swarm container with the following configuration:
- -d (--detach): run the swarm container in the background and print its container ID after it starts
- -t: allocate a pseudo-TTY for terminal output
- -p: map port 3376 on the Docker container to port 3376 on the Docker host (your local laptop); this is the default port that the docker command expects when connecting to a Swarm manager
- -v: mount the local volume (~/.docker/machine/machines/manager) on the container at the specified location (/certs) with read-only access (ro)
Depending on your version of Docker and your operating system, your certificates directory may be in a different location; this was the only problem that I ran into when starting the Swarm manager. On a Mac, Docker creates a .docker folder in your home directory, and when you created your manager virtual machine, its configuration was written beneath it at machine/machines/manager. If Swarm fails at startup because it cannot find files such as server-key.pem, locate those files in your own Docker configuration and update your volume mounting accordingly.
- manage: you've seen that the swarm Docker container has a create argument that starts the Swarm container, connects to DockerHub to obtain a discovery token, and then exits. Here you can see the manage argument in use: it tells the swarm container to run in "manage" mode, which essentially means that it will start your Swarm manager process.
- -H: tells Swarm what host and port to bind to, which in this case is 0.0.0.0:3376
- --tlsverify, --tlscacert, --tlscert, --tlskey: the various TLS arguments tell the Swarm manager where to find its certificate files for TLS (HTTPS) communication
- token: references the discovery token that we created earlier. (Be sure to use the token that you created and not the one that I created for this example!)
Once you have the Swarm manager running, your next step is to start your agents and tell them to join the cluster. Accomplish this by running the Swarm container in "join" mode. Before you do that, you need to tell your local
docker command-line client to send commands to the "agent1" machine that you created earlier. Do so with the command:
$ eval $(docker-machine env agent1)
This command tells the docker client to send all docker commands to the Docker engine running on the "agent1" machine. Now run the following command to start agent1:
$ docker run -d swarm join --addr=$(docker-machine ip agent1):2376 token://7c14cbf2a86ecd490a7ea7ae4b795a6b
Unable to find image 'swarm:latest' locally
latest: Pulling from library/swarm
eada7ab697d2: Pull complete
afaf40cb2366: Pull complete
7495da266907: Pull complete
a3ed95caeb02: Pull complete
Digest: sha256:12e3f7bdb86682733adf5351543487f581e1ccede5d85e1d5e0a7a62dcc88116
Status: Downloaded newer image for swarm:latest
99c5ec703dc3230fcf769eb13e639079803ee36c33447a0290a2fb7ffe5e7952
This command, similar to the previous one, tells Docker to run the
swarm container, but this time in "join" mode. You'll tell it to run in detached mode (
-d) and pass it two arguments:
--addr: The address and port of the agent, which is used to advertise the presence of the agent to the manager.
token: the discovery token that we created earlier and used to start the manager.
With agent1 running, execute the same command to start agent2:
$ eval $(docker-machine env agent2)
$ docker run -d swarm join --addr=$(docker-machine ip agent2):2376 token://7c14cbf2a86ecd490a7ea7ae4b795a6b
Unable to find image 'swarm:latest' locally
latest: Pulling from library/swarm
eada7ab697d2: Pull complete
afaf40cb2366: Pull complete
7495da266907: Pull complete
a3ed95caeb02: Pull complete
Digest: sha256:12e3f7bdb86682733adf5351543487f581e1ccede5d85e1d5e0a7a62dcc88116
Status: Downloaded newer image for swarm:latest
0b16ee511399c27d849c6a6c628822375c27755b14719b5295c9038f97ede72a
At this point, the manager and both agents should be running. Next you'll configure the
docker client to connect to the Docker Swarm manager and retrieve information about the environment. First, set your
DOCKER_HOST environment variable to point to the manager Docker machine:
$ export DOCKER_HOST=tcp://$(docker-machine ip manager):3376
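Because the manager was started with TLS enabled, the docker client may also need to know where to find the certificates. A hedged sketch of the fuller setup, assuming the default docker-machine certificate layout used earlier:

```shell
# Point the docker client at the Swarm manager over TLS.
# The certificate path assumes the default docker-machine layout;
# adjust it if your .docker folder lives elsewhere.
export DOCKER_TLS_VERIFY=1
export DOCKER_CERT_PATH=~/.docker/machine/machines/manager
export DOCKER_HOST=tcp://$(docker-machine ip manager):3376
```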
You can retrieve the manager's IP address by executing the
docker-machine ip command and passing it the name of the machine for which you want to retrieve the IP address (
manager). Likewise, on Windows you can execute the
SET DOCKER_HOST command to set up the environment. With the
DOCKER_HOST environment set, you can now retrieve information about your Swarm cluster by executing the
docker info command:
$ docker info
Containers: 2
 Running: 2
 Paused: 0
 Stopped: 0
Images: 2
Server Version: swarm/1.2.2
Role: primary
Strategy: spread
Filters: health, port, containerslots, dependency, affinity, constraint
Nodes: 2
 agent1: 192.168.99.101:2376
  - ID: RDNQ:VD3I:AZPE:LSWW:7NND:XV7C:KHGH:5KR5:MZHG:4I7H:7RMU:XGQG
  - Status: Healthy
  - Containers: 1
  - Reserved CPUs: 0 / 1
  - Reserved Memory: 0 B / 1.021 GiB
  - Labels: executiondriver=, kernelversion=4.4.8-boot2docker, operatingsystem=Boot2Docker 1.11.1 (TCL 7.0); HEAD : 7954f54 - Wed Apr 27 16:36:45 UTC 2016, provider=virtualbox, storagedriver=aufs
  - Error: (none)
  - UpdatedAt: 2016-05-22T19:03:35Z
  - ServerVersion: 1.11.1
 agent2: 192.168.99.102:2376
  - ID: DXN7:FLLA:RMDW:HSPS:WT74:YM2I:CM3G:QBY7:FR7G:4WEO:LJ72:XB6L
  - Status: Healthy
  - Containers: 1
  - Reserved CPUs: 0 / 1
  - Reserved Memory: 0 B / 1.021 GiB
  - Labels: executiondriver=, kernelversion=4.4.8-boot2docker, operatingsystem=Boot2Docker 1.11.1 (TCL 7.0); HEAD : 7954f54 - Wed Apr 27 16:36:45 UTC 2016, provider=virtualbox, storagedriver=aufs
  - Error: (none)
  - UpdatedAt: 2016-05-22T19:03:32Z
  - ServerVersion: 1.11.1
Plugins:
 Volume:
 Network:
Kernel Version: 4.4.8-boot2docker
Operating System: linux
Architecture: amd64
CPUs: 2
Total Memory: 2.042 GiB
Name: 77d61b0fe67f
Docker Root Dir:
Debug mode (client): false
Debug mode (server): false
WARNING: No kernel memory limit support
From this output you can see that you're running two containers (the swarm containers running in "join" mode) for your two agents, and that those agents are healthy. Note that these are Swarm's own infrastructure containers, not your custom Docker containers, such as Nginx containers. You can view the running custom containers by executing the
docker ps command:
$ docker ps
CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES
As expected, no custom containers are running.
Running containers in Docker Swarm
Thus far you've created three Docker machines and created a discovery token for your cluster. You've started one instance of the
swarm container in "manage" mode and two instances of the
swarm container in "join" mode. You have a running cluster, but it's not yet running any containers. In this section you'll start an Nginx container and connect to it. Begin with the following command:
$ docker run -d -p 80:80 nginx
cc6d627873f7b33f910129fafdcc5c544048cc864ef5433e667afc9a88632931
In this example, you run an
nginx container (note that omitting a version defaults to
latest) in detached mode and bind port 80 on the container to port 80 on the Docker host. Use the
docker client to view your running container:
$ docker ps
CONTAINER ID   IMAGE   COMMAND                  CREATED          STATUS          PORTS                                NAMES
cc6d627873f7   nginx   "nginx -g 'daemon off"   28 seconds ago   Up 27 seconds   192.168.99.101:80->80/tcp, 443/tcp   agent1/goofy_bassi
The Docker manager deployed the Nginx container to agent1. You can connect to it by opening a browser to http://192.168.99.101, the agent1 address shown in the PORTS column above.
If successful you should see a website similar to the interface in Figure 5.
To complete this example, let's start up a second Nginx container and review its deployment:
$ docker run -d -p 80:80 nginx
$ docker ps
CONTAINER ID   IMAGE   COMMAND                  CREATED              STATUS              PORTS                                NAMES
737d5d37d5a6   nginx   "nginx -g 'daemon off"   About a minute ago   Up About a minute   192.168.99.102:80->80/tcp, 443/tcp   agent2/condescending_galileo
cc6d627873f7   nginx   "nginx -g 'daemon off"   3 minutes ago        Up 3 minutes        192.168.99.101:80->80/tcp, 443/tcp   agent1/goofy_bassi
As you can see, Swarm deployed the first Nginx container to agent1 and the second to agent2: it finds the agent running the fewest containers and deploys the newly requested container there. You can connect to the new instance at http://192.168.99.102.
You now have a Swarm cluster with two agents running two Nginx containers. Verify that you can access both. When you're finished, you can clean up your environment by stopping containers as you normally would, with the
docker stop command:
$ docker stop 737
737
$ docker ps
CONTAINER ID   IMAGE   COMMAND                  CREATED         STATUS         PORTS                                NAMES
cc6d627873f7   nginx   "nginx -g 'daemon off"   6 minutes ago   Up 6 minutes   192.168.99.101:80->80/tcp, 443/tcp   agent1/goofy_bassi
$ docker stop cc6
cc6
$ docker ps
CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES
Both Nginx containers are now stopped. You can stop Docker Swarm using the
docker-machine stop command:
$ docker-machine ls
NAME      ACTIVE   DRIVER       STATE     URL                         SWARM   DOCKER    ERRORS
agent1    -        virtualbox   Running   tcp://192.168.99.101:2376           v1.11.1
agent2    -        virtualbox   Running   tcp://192.168.99.102:2376           v1.11.1
default   -        virtualbox   Stopped                                       Unknown
manager   -        virtualbox   Running   tcp://192.168.99.100:2376           v1.11.1
$ docker-machine stop agent1
Stopping "agent1"...
Machine "agent1" was stopped.
$ docker-machine stop agent2
Stopping "agent2"...
Machine "agent2" was stopped.
$ docker-machine stop manager
Stopping "manager"...
Machine "manager" was stopped.
$ docker-machine ls
NAME      ACTIVE   DRIVER       STATE     URL   SWARM   DOCKER    ERRORS
agent1    -        virtualbox   Stopped                 Unknown
agent2    -        virtualbox   Stopped                 Unknown
default   -        virtualbox   Stopped                 Unknown
manager   -        virtualbox   Stopped                 Unknown
At this point the containers have been stopped as well as the Docker Swarm machines. You can restart those as you wish, using the discovery token that you created earlier.
This article provided an overview of Docker Swarm and demonstrated how to create a Swarm cluster on your local machine. It began with a primer on Docker itself, reviewed the two most popular Docker clustering technologies, namely Amazon ECS and Docker Swarm, and then demonstrated, step by step, how to create a Docker Swarm cluster. Along the way we discussed Swarm discovery tokens as the unique identifier for a cluster, managers that manage the cluster and deploy containers to agents, and agents that run containers. At this point you should understand Docker Swarm in enough detail to set up and run a local environment.
From here, I recommend that you review the online documentation about running Docker Swarm in production. Production deployment could warrant a full article in and of itself, but the core concepts are similar: run a virtual machine with the Docker Engine, run a discovery backend (their example uses Consul), run the manager by starting the
swarm Docker container in "manage" mode, run agents (or nodes as they call them) by starting
swarm containers in "join" mode, and deploy containers to the manager, which in turn will distribute them to agents in the cluster.