Docker is an open platform for building, shipping, and running distributed applications. Dockerized applications can run locally on a developer's machine, and they can be deployed to production across a cloud-based infrastructure. Docker lends itself to rapid development and enables continuous integration and continuous deployment like almost no other technology does. In short, Docker is a platform that every developer should be familiar with.
This installment of Open source Java projects introduces Java developers to Docker. I'll explain why it's important to developers, walk you through setting up and deploying a Java application to Docker, and show you how to integrate Docker into your build process.
A little over a decade ago, software applications were large and complex things, deployed to large machines. In the Java world, we developed enterprise archives (EARs) that contained both Enterprise JavaBeans (EJB) and web components (WARs), then we deployed them to large application servers. We did everything that we could to design our applications to run optimally on large machines, maximizing all of the resources available to us.
In the early 2000s, with the advent of the cloud, developers began using virtual machines and server clusters to scale out applications to meet user demand. Applications deployed virtually had to be designed quite differently from the monoliths of years past. Lighter weight, service-oriented applications were the new standard. We learned to design software as a collection of interconnected services, with each component being as stateless as possible. The concept and implementation of scalable infrastructure transformed; rather than depend on the vertical scalability of a single large machine, developers and architects started thinking in terms of horizontal scalability: how to deploy a single application across numerous lightweight machines.
Docker takes this virtualization a step further, providing a lightweight layer that sits between the application and the underlying hardware. Docker runs the application as a process on the host operating system. Figure 1 compares a traditional virtual machine to Docker.
A traditional virtual machine setup runs a hypervisor on the host operating system. The hypervisor, in turn, runs a full guest operating system inside the virtual machine. The guest operating system then hosts the binaries and libraries required to run an application.
Docker, on the other hand, provides a Docker engine, which is a daemon that runs on the host operating system. Containers share the host's kernel, so operating system calls made inside a Docker container are handled natively by the host operating system rather than by a guest operating system. A Docker image, which is the template from which Docker containers are created, contains a bare-bones operating system layer and only the binaries and libraries required to run an application.
The differences might seem subtle, but in practice they are profound.
Understanding process virtualization
When we look at the operating system in a virtual machine, we see the virtual machine's resources, such as its CPU and memory. When we run a Docker container, we directly see the host machine's resources. I liken Docker to a process virtualization platform rather than a machine virtualization platform. Essentially, your application is running as a self-contained and isolated process on the host machine. Docker achieves isolation by leveraging a handful of Linux constructs, such as cgroups and namespaces, to ensure that each process runs as an independent unit on the operating system.
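To make the namespace and cgroup idea a little more concrete, here is a minimal sketch that lists the namespaces and cgroups a process belongs to. It assumes a Linux host with /proc mounted (for Docker Toolbox users, that would be the Linux virtual machine rather than the Mac or Windows desktop), and the function name show_isolation is a hypothetical one chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the namespaces and cgroups of a process.
# Assumes a Linux host with /proc mounted; elsewhere it only says so.
show_isolation() {
    pid="${1:-self}"
    if [ -d "/proc/$pid/ns" ]; then
        echo "Namespaces for process $pid:"
        for entry in "/proc/$pid/ns"/*; do
            # Each entry is a symlink whose target looks like "pid:[4026531836]".
            printf '  %s\n' "$(readlink "$entry" 2>/dev/null || echo "${entry##*/}")"
        done
        echo "Cgroups for process $pid:"
        sed 's/^/  /' "/proc/$pid/cgroup" 2>/dev/null || echo "  (not readable)"
    else
        echo "No /proc/$pid/ns found; this sketch assumes a Linux host."
    fi
}

# With no argument, inspect the current shell.
show_isolation
```

Roughly speaking, namespaces control what a process can see (process IDs, mounts, network interfaces), while cgroups control how much of the host's resources it can use. Two processes in different containers would normally report different namespace identifiers.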
Because Dockerized applications run much like processes on the host machine, their design is different from applications that run on a virtual machine. To illustrate, we might normally run Tomcat and a MySQL database on a single virtual machine, but Docker would have us run the app server and database in their own, respective Docker containers. This allows Docker to better manage the individual processes as self-contained units on the host operating system. It also means that in order to use Docker effectively, we need to design our applications as fine-grained services, like microservices.
Microservices in Docker
In a nutshell, microservices is a software architectural style that facilitates a modular approach to system building. In a microservices architecture, complex applications are composed of smaller, independent processes. Each process performs one or more specific tasks, communicating with other processes via language-independent APIs.
Microservices are very fine-grained, highly decoupled services that perform a single function, or a collection of related functions, very well. For example, if you are managing a user's profile and shopping cart, rather than packaging them together as a set of user services, you might opt to define them separately, as user profile services and user shopping cart services. In practical terms, building microservices means building web services, most commonly RESTful web services, and grouping them by functionality. In Java, we will package these as WAR files and deploy them to a container, such as Tomcat, then run Tomcat and our services inside a Docker container.
Setting Up Docker
Before we dive into Docker, let's get your local environment set up. It's great if you are running Linux: you can just install Docker directly and start running it. For those of us using Windows or a Mac, Docker is available through a tool called the Docker Toolbox, which installs a virtual machine (using Oracle's Virtual Box technology) that runs Linux with the Docker daemon. We can then use the Docker client to execute commands that are sent to the daemon for processing. Note that you won't be managing the virtual machine; you'll just be installing the toolbox and executing the docker command line tool.
I use a Mac, so I downloaded the Mac version of Docker Toolbox and ran the installation file. Once the install completed I ran the Docker Quickstart Terminal, which started the Virtual Box image and provided a command shell. The setup should be more or less the same for Windows users, but see the Windows instructions for more information.
DockerHub: The Docker image repository
Before we start using Docker, take a minute to visit DockerHub, the official repository for Docker images. Explore DockerHub and you'll find that it hosts thousands of images, both official ones and many built by independent developers. You'll find base operating systems like CentOS, Ubuntu, and Fedora as well as configured images for Java, Tomcat, Jetty, and more. You can also find almost any popular application out-of-the-box, including MySQL, MongoDB, Neo4j, Redis, Couchbase, Cassandra, Memcached, Postgres, Nginx, Node.js, WordPress, Joomla, PHP, Perl, Ruby, and so on. Before you build an image, make sure it's not already on DockerHub!
As an exercise, try running a simple CentOS image. Enter the following command in your Docker Toolbox command prompt:
$ docker run -it centos
The docker command is your primary interface for communicating with the Docker daemon. The run directive tells Docker to run the specified image, downloading it first if it isn't already on your computer. Alternatively, you can download an image without running it by using the pull directive. There are two arguments: -i tells Docker to run this image in interactive mode, and -t tells it to allocate a TTY for the shell. Note that unofficial images are named with the convention username/image-name, while official images are run without a username, which is why we only need to specify "centos" as the image we want to run. Finally, you can specify a version of the image to run by appending :version-number to the end of the image name, such as centos:7. Each image defines a latest version that is used by default; in the case of CentOS, the latest version at the time of writing is 7.
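The naming convention can be sketched as a small shell function that splits an image reference into its repository and tag. This is an illustration only, not how Docker itself parses names: it assumes the simple forms used in this article, and both the function name parse_image_ref and the image someuser/myapp are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: split an image reference into repository and tag.
# Assumes the simple forms name, name:tag, and user/name:tag; references
# that include a registry host with a port are not handled.
parse_image_ref() {
    ref="$1"
    case "$ref" in
        *:*) repo="${ref%%:*}"; tag="${ref#*:}" ;;
        *)   repo="$ref";       tag="latest" ;;   # no tag means "latest"
    esac
    case "$repo" in
        */*) kind="user image" ;;       # username/image-name
        *)   kind="official image" ;;   # no username prefix
    esac
    echo "$repo $tag ($kind)"
}

parse_image_ref centos               # centos latest (official image)
parse_image_ref centos:7             # centos 7 (official image)
parse_image_ref someuser/myapp:1.0   # someuser/myapp 1.0 (user image)
```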
When you execute docker run -it centos, you should see Docker downloading the image, after which it will present output similar to the following:
$ docker run -it centos
[root@dcd69de89aad /]#
Because we ran this container in interactive mode, it presented us with a root command shell. Browse around the operating system for a little bit and then exit by executing the exit command.
You can see all images that you have downloaded by executing docker images:
$ docker images
REPOSITORY   TAG      IMAGE ID       CREATED        VIRTUAL SIZE
java         8        5282faca75e8   4 weeks ago    817.6 MB
tomcat       latest   71093fb71661   8 weeks ago    347.7 MB
centos       latest   7322fbe74aa5   11 weeks ago   172.2 MB
You can see that I have the latest versions of CentOS and Tomcat, as well as Java 8.
Running Tomcat inside Docker
Starting a Tomcat instance inside Docker is a little more complex than starting the CentOS image. Issue the command:
$ docker run -d -p 8080:8080 tomcat
In this example, we're running tomcat as a daemon process, using the -d argument. We're exposing port 8080 on our Docker container as port 8080 on our Docker host (the Virtual Box virtual machine). When you run this command you should see output similar to the following:
$ docker run -d -p 8080:8080 tomcat
bdbedc47836028b9d1fee3d4a96eee4d89838d7b6b0b4298d9e5a7d117292003
This horribly long hexadecimal number is the container ID, which we will use in a minute. You can see all the running processes by executing docker ps:
$ docker ps
CONTAINER ID   IMAGE    COMMAND             CREATED         STATUS         PORTS                    NAMES
bdbedc478360   tomcat   "catalina.sh run"   3 seconds ago   Up 3 seconds   0.0.0.0:8080->8080/tcp   focused_morse
You'll notice that the CONTAINER ID column shows the first 12 characters of the ID above. This ID is painful to type, so Docker allows you to specify just enough of it in commands to uniquely identify the container. For example, you could specify "bdb" and that would be enough for Docker to uniquely identify this instance. In order to see when Tomcat has finished loading, you would typically tail the catalina.out file. The alternative in the Docker world is to view the standard output using the docker logs command. Specify the -f argument to follow the logs:
$ docker logs -f bdb
09-Sep-2015 02:15:21.611 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server version: Apache Tomcat/8.0.24
...
09-Sep-2015 02:15:25.137 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in 3188 ms
Exit the log tailing by pressing Ctrl-C.
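The "unique prefix" rule for container IDs can be pictured with a short sketch. This is an illustration of the idea only, not Docker's actual implementation; the function name resolve_prefix is hypothetical, and the second ID below is made up to show an ambiguous case.

```shell
#!/bin/sh
# Hypothetical helper: resolve an ID prefix against a list of full IDs.
# A prefix is accepted only when exactly one ID starts with it.
resolve_prefix() {
    prefix="$1"; shift
    match=""
    count=0
    for id in "$@"; do
        case "$id" in
            "$prefix"*) match="$id"; count=$((count + 1)) ;;
        esac
    done
    if [ "$count" -eq 1 ]; then
        echo "$match"
    elif [ "$count" -eq 0 ]; then
        echo "no container matches '$prefix'" >&2
        return 1
    else
        echo "'$prefix' is ambiguous ($count matches)" >&2
        return 1
    fi
}

# bdbedc478360 is the Tomcat container above; bd01f2a9c7e4 is a made-up ID.
resolve_prefix bdb bdbedc478360 bd01f2a9c7e4
resolve_prefix bd bdbedc478360 bd01f2a9c7e4 || echo "a longer prefix is needed"
```

With only one container running, even a single character would be enough; the more containers you have, the longer the prefix you are likely to need.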
Testing and exploring
To test Tomcat, you need to find the address of your Virtual Box host. When you started the Docker Quickstart Terminal, you should have seen a line that looked like the following:
docker is configured to use the default machine with IP 192.168.99.100
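Alternatively, you can look at the DOCKER_HOST environment variable (for example, with echo $DOCKER_HOST) to find the machine's IP address.
Open a browser window to port 8080 on the Docker host. With the IP address shown above, that is http://192.168.99.100:8080.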
You should see the standard Tomcat homepage.
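If you would rather not copy the IP address by hand, the URL can be derived from DOCKER_HOST. The following is a minimal sketch, assuming DOCKER_HOST has the form tcp://IP:PORT as set by Docker Toolbox; the function name docker_host_url and the fallback value (including daemon port 2376) are assumptions used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: build a URL for a published port from DOCKER_HOST.
# Assumes a value of the form tcp://IP:PORT.
docker_host_url() {
    host="${1#tcp://}"    # strip the scheme
    host="${host%%:*}"    # strip the daemon's own port
    echo "http://$host:${2:-8080}"   # second argument: published port
}

# Falls back to an example value when DOCKER_HOST is not set.
docker_host_url "${DOCKER_HOST:-tcp://192.168.99.100:2376}"
```

The optional second argument is the host-side port from the -p argument, so a container started with -p 9090:8080 would be reached with docker_host_url "$DOCKER_HOST" 9090.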
Before we stop our instance, use the docker command line tool to learn a little more about our processes:
docker stats CONTAINER_ID displays the CPU, memory, and network I/O for the container.
docker inspect CONTAINER_ID displays the configuration of the container.
docker info displays information about the Docker host.
When you're finished, you can stop Tomcat by executing the docker stop command with your container ID:
$ docker stop bdb
You can confirm that Tomcat is no longer running by executing docker ps again and verifying that it is no longer listed. You can see the history of all the containers you have run by executing docker ps -a:
$ docker ps -a
CONTAINER ID   IMAGE    COMMAND             CREATED          STATUS                       PORTS   NAMES
bdbedc478360   tomcat   "catalina.sh run"   26 minutes ago   Exited (143) 5 seconds ago           focused_morse
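The Exited (143) status is worth a second look. By common convention, an exit status above 128 means the process was terminated by a signal whose number is the status minus 128. Here 143 - 128 = 15, which is SIGTERM, the signal that docker stop sends by default. A minimal sketch of that arithmetic follows; the function name explain_exit_status is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: interpret a container's exit status.
# Relies on the convention that a status above 128 means
# "terminated by signal (status - 128)".
explain_exit_status() {
    status="$1"
    if [ "$status" -gt 128 ]; then
        echo "terminated by signal $((status - 128))"
    elif [ "$status" -eq 0 ]; then
        echo "exited normally"
    else
        echo "exited with error status $status"
    fi
}

explain_exit_status 143   # terminated by signal 15
```

So a status of 143 after docker stop indicates that Tomcat was shut down by the stop request rather than crashing on its own.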
At this point you should understand how to find images on DockerHub, how to download and run instances, how to view running instances, how to view an instance's runtime statistics and logs, and how to stop an instance. Now let's turn our attention to how Docker images are defined.