Building a project is complex business. Due to the dozens of tasks required in converting your hodge-podge of files into a working program, there exist literally hundreds of tools that do everything from generating source code, to compiling, to testing, to distribution, to brewing your morning coffee (if you find one, dear reader, let me know). Many of these programs are excellent at what they do. Unfortunately, for those of us who manage large-scale build systems for a living, there is rarely much commonality; each program requires its own disparate installation and esoteric configuration. It has become an inevitable fact of our lives that the majority of build systems are custom built by hand-gluing these tools with several homebrew scripts (yeah, Ant scripts count).
More than another build tool, Maven is a build framework. It cleanly separates your code from configuration files, documentation, and dependencies. Maven is surprisingly flexible in letting users configure most aspects of their code, as well as in controlling the behavior of plug-ins, individual goals, and even the build lifecycle itself. Maven is the actual structure, and within these walls, your project dwells; it wants to be an accommodating host.
But the problem still remains: managing the work of thousands of custom build scripts within a single framework is tough and, to be done correctly, requires much information. Fortunately, the Maven 2 team has been quite successful. Having learned from the mistakes of Maven 1 and from countless user requests, tweaks, and updates, the team has made Maven 2 more powerful than ever. Unfortunately, with great power comes great configuration. In order for Maven 2 artifacts to be easily portable units, that complex configuration falls into a single file. Enter the Maven POM.
What is the POM?
POM stands for project object model. It is an XML representation of a Maven project held in a file named pom.xml. In the presence of Maven folks, speaking of a project is speaking in the philosophical sense, beyond a mere collection of files containing code. A project contains configuration files, as well as the developers involved and the roles they play, the defect tracking system, the organization and licenses, the URL where the project lives, the project's dependencies, and all the other little pieces that come into play to give code life. A project is a one-stop shop for all things related to it. In fact, in the Maven world, a project need not contain any code at all, merely a pom.xml. We will encounter a couple of such project types later in the article.
A quick structural overview
The POM is large and complex, so breaking it into pieces eases digestion. For the purposes of this discussion, these pieces are regrouped into four logical units, as shown in Figure 1: POM relationships, project information, build settings, and build environment. We shall begin by discussing POM relationships.
Below is a listing of the elements directly under the POM's project element. Notice that modelVersion contains 4.0.0. That is currently the only supported POM version for Maven 2 and is always required. The Maven 4.0.0 XML schema definition is located at http://maven.apache.org/maven-v4_0_0.xsd. Its top-level elements are as follows:
<project>
  <modelVersion>4.0.0</modelVersion>

  <!-- POM Relationships -->
  <groupId>...</groupId>
  <artifactId>...</artifactId>
  <version>...</version>
  <parent>...</parent>
  <dependencyManagement>...</dependencyManagement>
  <dependencies>...</dependencies>
  <modules>...</modules>

  <!-- Project Information -->
  <name>...</name>
  <description>...</description>
  <url>...</url>
  <inceptionYear>...</inceptionYear>
  <licenses>...</licenses>
  <developers>...</developers>
  <contributors>...</contributors>
  <organization>...</organization>

  <!-- Build Settings -->
  <packaging>...</packaging>
  <properties>...</properties>
  <build>...</build>
  <reporting>...</reporting>

  <!-- Build Environment -->
  <!-- Environment Information -->
  <issueManagement>...</issueManagement>
  <ciManagement>...</ciManagement>
  <mailingLists>...</mailingLists>
  <scm>...</scm>

  <!-- Maven Environment -->
  <prerequisites>...</prerequisites>
  <repositories>...</repositories>
  <pluginRepositories>...</pluginRepositories>
  <distributionManagement>...</distributionManagement>
  <profiles>...</profiles>
</project>
Our first order of business is to investigate project relationships, represented in Figure 2 as the top-left corner of the chart in Figure 1.
Projects must relate to each other in some way. Since the creation of the first assemblers, software projects have had dependencies; Maven has introduced more forms of relationships hitherto unused in such a form for Java projects. These relationships are Maven coordinates, coordinate-based dependencies, project inheritance, and aggregation.
Each Maven project contains its own unique identifier, dubbed the project's coordinates, which acts like an artifact's address, giving it a unique place in the Maven universe. If projects had no way of relating to each other, coordinates would not be needed. That is, if a universe had just one house, why would it need an address like 315 Cherrywood Lane?
The code below is the minimum POM that Maven 2 will allow: <groupId>, <artifactId>, and <version> are all required fields. They act as a vector in Maven space with the elements grouper, identifier, and timestamp.
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>a</artifactId>
  <version>1</version>
</project>
In the Maven world, these three main elements (the Maven trinity: behold its glory!) make up a POM's coordinates. The coordinates are represented by Figure 3.
Perhaps this POM is not so impressive by itself. It gets better.
One of the most powerful aspects of Maven is its handling of project dependencies, and in Maven 2, that includes transitive dependencies. Figure 4 illustrates how we shall represent them graphically.
Dependency management has a long tradition of being a complicated mess for anything but the most trivial of projects. "Jarmageddon" quickly ensues as the dependency tree becomes huge, complicated, and embarrassing to architects who are scorned by new graduates who "totally could have done it better." "Jar Hell" follows, where versions of dependencies on one system are not quite the same versions as those used for development; they have either the wrong version or conflicting versions between similarly named JARs. Hence, things begin breaking and pinpointing why proves difficult. Maven solves both of these problems by having a common local repository from which to link to the correct projects, versions and all.
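As a minimal sketch of what a coordinate-based dependency looks like, the POM below extends the minimal example with a single dependency. The JUnit coordinates, version, and test scope are chosen purely for illustration and do not come from the discussion above.

```xml
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>a</artifactId>
  <version>1</version>
  <dependencies>
    <!-- Illustrative assumption: any artifact's coordinates could appear here -->
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <!-- test scope: needed only when compiling and running tests -->
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>
```

A dependency is addressed by the same three coordinates that identify a project. Maven looks the artifact up in the local repository, fetching it from a remote repository first if it is not yet there, and then does the same for that artifact's own dependencies.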
One of the features that Maven 2 brings from the Maven 1 days is project inheritance, as represented in Figure 5. In build systems, such as Ant, inheritance can certainly be simulated, but Maven has taken the extra step in making project inheritance explicit to the project object model.
The following code defines a parent POM in Maven 2:
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>b</artifactId>
  <version>2</version>
  <packaging>pom</packaging>
</project>
This parent looks similar to our first POM, with a minor difference. Notice that we have set the packaging type as pom, which is required for both parent and aggregator projects (we will cover more on packaging in the "Build Settings" section). If we want to use the above project as a parent, we can alter the project org.codehaus.mojo:a POM to be:
<project>
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>b</artifactId>
    <version>2</version>
  </parent>
  <!-- Notice no groupId or version. They were inherited from parent-->
  <artifactId>a</artifactId>
</project>
It is important to note that all POMs inherit from a parent whether explicitly defined or not. This base POM is known as the "super POM," and contains values inherited by default. An easy way to look at the default configurations of the super POM is by creating a simple pom.xml with nothing but modelVersion, groupId, artifactId, and version, and running the command mvn help:effective-pom.
Beyond simply setting values to inherit, parents also have the power to create default configurations for their children without actually imposing values upon them. Dependency management is an especially powerful instrument for configuring a set of dependencies through a common location (a POM's parent). The dependencyManagement element syntax is similar to that of the dependency section. What it does, however, is allow children to inherit dependency settings, but not the dependency itself. Adding a dependency with the dependencyManagement element does not actually add the dependency to the POM, nor does it add a dependency to the children; it creates a default configuration for any dependency that a child may choose to add within its own dependency section. Settings by dependencyManagement also apply to the current POM's dependency configuration (although configurations overridden inside the dependency element always take precedence).
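The following sketch shows how this might look for the parent b and child a used throughout. The JUnit coordinates, version, and scope are assumptions made for illustration only. First, the parent declares default settings without adding the dependency to anything:

```xml
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>b</artifactId>
  <version>2</version>
  <packaging>pom</packaging>
  <dependencyManagement>
    <dependencies>
      <!-- Defaults only: neither b nor its children depend on JUnit yet -->
      <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>3.8.1</version>
        <scope>test</scope>
      </dependency>
    </dependencies>
  </dependencyManagement>
</project>
```

The child then opts in by naming the dependency in its own dependencies section and leaving the managed settings out:

```xml
<project>
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>b</artifactId>
    <version>2</version>
  </parent>
  <artifactId>a</artifactId>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <!-- version and scope are taken from the parent's dependencyManagement -->
    </dependency>
  </dependencies>
</project>
```

If the child did specify its own version here, that value would take precedence over the managed default.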
A project with modules is known as a multimodule project. Modules are projects that a POM lists, executed as a set. Multimodule projects know of their modules, but the reverse is not necessarily true, as represented in Figure 6.
Assuming that the parent POM resides in the parent directory of where the POM for project a lives, and that project a resides in a directory of the same name, we may alter the parent POM b to aggregate the child a by adding it as a module:
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>b</artifactId>
  <version>2</version>
  <packaging>pom</packaging>
  <modules>
    <module>a</module>
  </modules>
</project>
Now if we run mvn compile in the base directory, we will see the build start with:
[INFO] Scanning for projects...
[INFO] Reactor build order:
[INFO]   Unnamed - org.codehaus.mojo:b:pom:2
[INFO]   Unnamed - org.codehaus.mojo:a:jar:2
The Maven lifecycle will now execute up to the specified lifecycle phase in the correct order; that is, each artifact is built one at a time, and if one artifact requires another to be built first, it will be.
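To make the ordering concrete, here is a sketch in which the aggregator lists a second module. The module c is hypothetical and is assumed to inherit from b just as a does, giving it the coordinates org.codehaus.mojo:c:2.

```xml
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>b</artifactId>
  <version>2</version>
  <packaging>pom</packaging>
  <modules>
    <!-- Listed first, but depends on c (see the POM for a below) -->
    <module>a</module>
    <!-- Hypothetical second module, added for illustration -->
    <module>c</module>
  </modules>
</project>
```

Project a then declares c as an ordinary dependency:

```xml
<project>
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>b</artifactId>
    <version>2</version>
  </parent>
  <artifactId>a</artifactId>
  <dependencies>
    <!-- a needs c, so c must be built before a -->
    <dependency>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>c</artifactId>
      <version>2</version>
    </dependency>
  </dependencies>
</project>
```

Under these assumptions, Maven should build c before a even though a appears first in the modules list, because the build order follows the dependencies between modules rather than the order in which they are listed.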
A note on inheritance vs. aggregation
Inheritance and aggregation create a nice dynamic for controlling builds through a single, high-level POM. You will often see projects that are both parents and multimodules, such as the example above. Because they complement each other, they are a natural match. Even the Maven 2 project core runs through a single parent/multimodule POM, org.apache.maven:maven, so building a Maven 2 project can be executed by a single command: mvn compile. Although they are often used in conjunction, however, a multimodule and a parent are not one and the same, and should not be confused. A POM project (acting as a parent) may be inherited from, but that parent project does not necessarily aggregate any modules. Conversely, a POM project may aggregate projects that do not inherit from it.
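The second case can be sketched as follows. The aggregator org.codehaus.mojo:all is a hypothetical project invented for this illustration, assumed to sit in the directory above project a. It lists a as a module:

```xml
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.codehaus.mojo</groupId>
  <!-- Hypothetical aggregator; this name does not appear elsewhere in the article -->
  <artifactId>all</artifactId>
  <version>1</version>
  <packaging>pom</packaging>
  <modules>
    <module>a</module>
  </modules>
</project>
```

Project a, meanwhile, has no parent element and so inherits only from the super POM:

```xml
<project>
  <modelVersion>4.0.0</modelVersion>
  <!-- No parent element: groupId and version must be stated explicitly -->
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>a</artifactId>
  <version>1</version>
</project>
```

Here the aggregator builds a as part of its set, yet nothing configured in the aggregator is inherited by a.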
When all four pieces of the equation are put together, hopefully you will see the power of the Maven 2 relationship mechanism, as shown in Figure 7.
Maven gives us a nice framework for relating projects to each other, and through these relationships, we may create plug-ins reusable by any project following Maven's conventions. But the ability to manage project relationships is only a part of the overall Maven equation. The rest of the POM is concerned not with other projects, but with its build settings, its information, and with its environment. With a quick understanding of how projects relate to one another out of the way, let's begin to look at how a POM contains information about the project proper.