Java Tip: Orthogonality by example

The principles of modular and maintainable design in Log4j

Orthogonality is a concept often used to describe modular and maintainable software, but it's more easily understood by way of a case study. In this article, Jens Dietrich demystifies orthogonality and some related design principles by demonstrating their use in the popular Log4j utility library. He also shows where Log4j violates orthogonality in a couple of instances and discusses possible workarounds to the issues raised.

The concept of orthogonality is based on the Greek word orthogōnios, meaning "right-angled." It is often used to express the independence between different dimensions. When an object moves along the x-axis in a three-dimensional space, its y and z coordinates don't change. Change in one dimension does not cause change in another dimension, which means that one dimension cannot cause side-effects for others.

This explains why the concept of orthogonality is often used to describe modular and maintainable software design: thinking about systems as points in a multi-dimensional space (spanned by independent, orthogonal dimensions) helps software developers ensure that changes to one aspect of a system will not have side-effects for another.

It happens that Log4j, a popular open source logging package for Java, is a good example of a modular design based on orthogonality.

The dimensions of Log4j

Logging is just a fancier version of the System.out.println() statement, and Log4j is a utility package that abstracts the mechanics of logging on the Java platform. Among other things, Log4j features allow developers to do the following:

  • Log to different appenders (not only the console but also to files, network locations, relational databases, operating system log utilities, and more)
  • Log at several levels (such as ERROR, WARN, INFO, and DEBUG)
  • Centrally control how much information is logged at a given logging level
  • Use different layouts to define how a logging event is rendered into a string

While Log4j does have other features, I will focus on three dimensions of its functionality (appenders, levels, and layouts) in order to explore the concept and benefits of orthogonality. Note that my discussion is based on Log4j version 1.2.17.

Considering Log4j types as aspects

Appenders, level, and layout are three aspects of Log4j that can be seen as independent dimensions. I use the term aspect here as a synonym for concern, meaning a piece of interest or focus in a program. In this case, it is easy to define these three concerns based on the questions that each addresses:

  • Appender: Where should the log event data be sent for display or storage?
  • Layout: How should a log event be presented?
  • Level: Which log events should be processed?

Now try considering these aspects together in three-dimensional space. Each point within this space represents a valid system configuration, as shown in Figure 1. (Note that I am offering a slightly simplified view of Log4j: Each point in Figure 1 is actually not a global system-wide configuration, but a configuration for one particular logger. The loggers themselves can be considered as a fourth dimension.)

Figure 1. An orthogonal system in three dimensions

Listing 1 is a typical code snippet that uses Log4j:

Listing 1. A Log4j usage example

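import org.apache.log4j.Appender;
import org.apache.log4j.ConsoleAppender;
import org.apache.log4j.Layout;
import org.apache.log4j.Level;
import org.apache.log4j.Logger;
import org.apache.log4j.TTCCLayout;

// set up logging: pick one point in each dimension
Logger logger = Logger.getLogger("Foo");
Appender appender = new ConsoleAppender();
Layout layout = new TTCCLayout();
appender.setLayout(layout);
logger.addAppender(appender);
logger.setLevel(Level.INFO);
// start logging
logger.warn("Hello World");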

What I want you to notice about this code is that it is orthogonal: you could change the appender, layout, or level aspect without breaking the code, which would remain completely functional. In an orthogonal design, each point in the given space of the program is a valid system configuration. No constraint is allowed to restrict which points in the space of possible configurations are valid or not.

Orthogonality is a powerful concept because it enables us to establish a relatively simple mental model for complex application use cases. In particular, we can focus on one dimension while ignoring other aspects.

Testing is a common and familiar scenario where orthogonality is useful. We can test the functionality of log levels using a suitable fixed pair of an appender and a layout. Orthogonality assures us that there will be no surprises: log levels will work the same way with any given combination of appender and layout. Not only is this convenient (there is less work to do) but it is also necessary, because it would be impossible to test log levels with every known combination of appender and layout. This is especially true given that Log4j, like many software tools and utilities, is designed to be extended by third parties.
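The testing argument can be sketched in a few lines. The example below does not use Log4j itself. SketchLevel, SketchLayout, SketchAppender, and SketchLogger are hypothetical, heavily simplified stand-ins for the three dimensions, and the level check is exercised against one fixed in-memory appender and one fixed layout.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the three dimensions; these are not Log4j classes.
enum SketchLevel { DEBUG, INFO, WARN, ERROR }

interface SketchLayout {
    String format(SketchLevel level, String message);
}

interface SketchAppender {
    void append(String rendered);
}

// The logger depends only on the abstract types of the other dimensions.
final class SketchLogger {
    private final SketchAppender appender;
    private final SketchLayout layout;
    private final SketchLevel threshold;

    SketchLogger(SketchAppender appender, SketchLayout layout, SketchLevel threshold) {
        this.appender = appender;
        this.layout = layout;
        this.threshold = threshold;
    }

    void log(SketchLevel level, String message) {
        // Level filtering never inspects which appender or layout is in use.
        if (level.compareTo(threshold) >= 0) {
            appender.append(layout.format(level, message));
        }
    }
}

final class LevelFilterDemo {
    // Exercises the level check with one fixed appender/layout pair.
    static List<String> run(SketchLevel threshold) {
        List<String> sink = new ArrayList<>();
        SketchAppender inMemory = sink::add;
        SketchLayout plain = (level, message) -> level + " " + message;
        SketchLogger logger = new SketchLogger(inMemory, plain, threshold);
        logger.log(SketchLevel.DEBUG, "too detailed");
        logger.log(SketchLevel.WARN, "worth noting");
        logger.log(SketchLevel.ERROR, "broken");
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(run(SketchLevel.INFO));
    }
}
```

Because SketchLogger.log() only touches the interfaces, swapping the in-memory appender or the plain layout for another implementation would not change which events pass the threshold.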

The reduction in complexity that orthogonality brings to software programs is similar to how dimensions are used in geometry, where the complicated movement of points in an n-dimensional space is broken down to the relatively simple manipulation of vectors. The entire field of linear algebra is based on this powerful idea.

Designing and coding for orthogonality

If you are now wondering how to design and code orthogonality into your programs, then you are in the right place. The key idea is to use abstraction. Each dimension of an orthogonal system addresses one particular aspect of the program. Such a dimension will usually be represented by a type (class, interface, or enumeration). The most common solution is to use an abstract type (interface or abstract class). Each of these types represents a dimension, while its instances represent the points within that dimension. Because abstract types cannot be directly instantiated, concrete classes are also needed.

Figure 2. Inside the Appender dimension

In some cases we can do without concrete classes. For instance, we don't need them when the type is just a marker and doesn't encapsulate behavior. Then we can just instantiate the type representing the dimension itself, and often predefine a fixed set of instances, either by using static variables or by using an explicit enumeration type. In Listing 1 this rule would apply to the "level" dimension.

Figure 3. Inside the Level dimension
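As a rough illustration of a dimension whose points are a fixed set of predefined instances, here is a sketch that uses an enumeration type. SketchLevel, its numeric severities, and isAtLeast() are assumptions made for this example and do not reproduce Log4j's own Level class.

```java
// Hypothetical sketch: a dimension represented by a fixed set of instances.
enum SketchLevel {
    DEBUG(10), INFO(20), WARN(30), ERROR(40);

    // Severity values are arbitrary and chosen only for this example.
    private final int severity;

    SketchLevel(int severity) {
        this.severity = severity;
    }

    // True if an event at this level passes the given threshold.
    boolean isAtLeast(SketchLevel threshold) {
        return severity >= threshold.severity;
    }
}

final class LevelDemo {
    public static void main(String[] args) {
        System.out.println(SketchLevel.WARN.isAtLeast(SketchLevel.INFO));  // prints true
        System.out.println(SketchLevel.DEBUG.isAtLeast(SketchLevel.INFO)); // prints false
    }
}
```

No concrete subclasses are needed here: the type itself is instantiated, and the set of valid points in the dimension is closed.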

Writing generic code

The general rule of orthogonality is to avoid references to specific concrete types representing other aspects (dimensions) of the program. This enables you to write generic code that will work the same way for all possible instances. Such code can still reference properties of instances, as long as they are part of the interface of the type defining the dimension.

For instance, in Log4j the abstract type Layout defines the method ignoresThrowable(). This method returns true if the layout ignores the throwable attached to a logging event, and false if the layout renders the throwable itself. When an appender uses a layout, it is perfectly fine to write conditional code on ignoresThrowable(). A file appender, for example, could print exception stack traces on System.err when using a layout that does not handle exceptions.
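A minimal sketch of such conditional code follows. It uses hypothetical stand-in types (SketchLayout, MessageOnlyLayout, SketchWriterAppender) rather than Log4j's classes, and it collects output in a StringBuilder instead of writing to a file or to System.err, so that the behavior is easy to inspect.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical, simplified stand-in for a layout type; not Log4j's Layout.
abstract class SketchLayout {
    abstract String format(String message, Throwable thrown);

    // True means: "this layout does not render the throwable".
    abstract boolean ignoresThrowable();
}

final class MessageOnlyLayout extends SketchLayout {
    @Override
    String format(String message, Throwable thrown) {
        return message + "\n";
    }

    @Override
    boolean ignoresThrowable() {
        return true;
    }
}

final class SketchWriterAppender {
    private final SketchLayout layout;
    private final StringBuilder out = new StringBuilder();

    SketchWriterAppender(SketchLayout layout) {
        this.layout = layout;
    }

    void append(String message, Throwable thrown) {
        out.append(layout.format(message, thrown));
        // The condition uses only the abstract layout type, never a concrete class.
        if (thrown != null && layout.ignoresThrowable()) {
            StringWriter trace = new StringWriter();
            thrown.printStackTrace(new PrintWriter(trace, true));
            out.append(trace.toString());
        }
    }

    String written() {
        return out.toString();
    }
}

final class IgnoresThrowableDemo {
    public static void main(String[] args) {
        SketchWriterAppender appender = new SketchWriterAppender(new MessageOnlyLayout());
        appender.append("request failed", new IllegalStateException("boom"));
        System.out.print(appender.written());
    }
}
```

The appender compensates for a layout that ignores throwables, yet it would work unchanged with any other SketchLayout implementation.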

In a similar manner, a Layout implementation could refer to a particular Level when rendering logging events. For instance, if the log level was Level.ERROR, an HTML-based layout implementation could wrap the log message in tags rendering it in red. Again, the point is that Level.ERROR is defined by Level, the type representing the dimension.
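The HTML example might look roughly like the sketch below. SketchLevel, SketchLayout, and SketchHtmlLayout are hypothetical names used for illustration, and the markup is an assumption rather than the output of any real Log4j layout.

```java
// Hypothetical stand-ins; these are not Log4j classes.
enum SketchLevel { DEBUG, INFO, WARN, ERROR }

interface SketchLayout {
    String format(SketchLevel level, String message);
}

final class SketchHtmlLayout implements SketchLayout {
    @Override
    public String format(SketchLevel level, String message) {
        // Escape characters that would otherwise be read as markup.
        String escaped = message
                .replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
        // Referring to SketchLevel.ERROR is fine: it is defined by the
        // type that represents the level dimension itself.
        if (level == SketchLevel.ERROR) {
            return "<p style=\"color:red\">" + escaped + "</p>";
        }
        return "<p>" + escaped + "</p>";
    }
}

final class HtmlLayoutDemo {
    public static void main(String[] args) {
        SketchLayout layout = new SketchHtmlLayout();
        System.out.println(layout.format(SketchLevel.ERROR, "disk <full>"));
        System.out.println(layout.format(SketchLevel.INFO, "started"));
    }
}
```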

You should, however, avoid references to specific implementation classes for other dimensions. If an appender uses a layout then there is no need to know what kind of layout it is. Figure 4 illustrates good and bad references.

Figure 4. Violating orthogonality

Several patterns and frameworks make it easier to avoid dependencies on implementation types, including dependency injection and the service locator pattern.
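To make the two patterns a little more concrete, here is a small sketch under simplifying assumptions. All names (SketchLayout, PlainLayout, BracketLayout, SketchAppender, LayoutLocator) are hypothetical. The appender receives its layout through its constructor, and a tiny hand-written locator is the only place that names concrete layout classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for the layout dimension.
interface SketchLayout {
    String format(String message);
}

final class PlainLayout implements SketchLayout {
    @Override
    public String format(String message) {
        return message;
    }
}

final class BracketLayout implements SketchLayout {
    @Override
    public String format(String message) {
        return "[" + message + "]";
    }
}

// Constructor injection: the appender never names a concrete layout class.
final class SketchAppender {
    private final SketchLayout layout;

    SketchAppender(SketchLayout layout) {
        this.layout = layout;
    }

    String render(String message) {
        return layout.format(message);
    }
}

// A minimal service locator: the single place that knows the implementations.
final class LayoutLocator {
    private static final Map<String, Supplier<SketchLayout>> LAYOUTS = new HashMap<>();

    static {
        LAYOUTS.put("plain", PlainLayout::new);
        LAYOUTS.put("bracket", BracketLayout::new);
    }

    static SketchLayout lookup(String name) {
        Supplier<SketchLayout> factory = LAYOUTS.get(name);
        if (factory == null) {
            throw new IllegalArgumentException("Unknown layout: " + name);
        }
        return factory.get();
    }
}

final class InjectionDemo {
    public static void main(String[] args) {
        SketchAppender appender = new SketchAppender(LayoutLocator.lookup("bracket"));
        System.out.println(appender.render("Hello World")); // prints [Hello World]
    }
}
```

In a real project a dependency injection framework or a configuration file would typically play the role of LayoutLocator.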

Violating orthogonality

Overall, Log4j is a good example of the use of orthogonality. However, some code in Log4j violates this principle.

Log4j contains an appender called JDBCAppender, which is used to log to a relational database. Given the scalability and popularity of relational databases, and the fact that this makes log events easily searchable (with SQL queries), JDBCAppender is an important use case.

JDBCAppender is intended to address the problem of logging to a relational database by turning log events into SQL INSERT statements. It solves this problem by using a PatternLayout.

PatternLayout uses templating to give the user maximum flexibility to configure the strings generated from log events. The template is defined as a string, and the variables used in the template are instantiated from log events at runtime, as shown in Listing 2.

Listing 2. PatternLayout

String pattern =
"%p [@ %d{dd MMM yyyy HH:mm:ss} in %t] %m%n";
Layout layout =
new org.apache.log4j.PatternLayout(pattern);
appender.setLayout(layout);

JDBCAppender uses a PatternLayout with a pattern that defines the SQL INSERT statement. In particular, the following code can be used to set the SQL statement used:

Listing 3. SQL insert statement

public void setSql(String s) {
   sqlStatement = s;
   if (getLayout() == null) {
      this.setLayout(new PatternLayout(s));
   }
   else {
       ((PatternLayout)getLayout()).setConversionPattern(s);
   }
}

Built into this code is the implicit assumption that the layout, if it has already been set using the setLayout(Layout) method defined in Appender, is in fact an instance of PatternLayout. In terms of orthogonality, this means that suddenly a lot of points in the 3D cube that use JDBCAppender with layouts other than PatternLayout do not represent valid system configurations anymore! That is, any attempt to set the SQL string after configuring a different layout results in a ClassCastException at runtime.

Figure 5. JDBCAppender violating orthogonality
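The failure mode can be reproduced without a database. The sketch below mirrors the structure of Listing 3 with hypothetical stand-in classes (SketchLayout, SketchPatternLayout, SketchSimpleLayout, SketchJdbcAppender). It is not the Log4j source, and the SQL string is only an illustrative value.

```java
// Hypothetical stand-ins that mirror the shape of the classes in Listing 3.
abstract class SketchLayout {
    abstract String format(String message);
}

final class SketchPatternLayout extends SketchLayout {
    private String pattern;

    SketchPatternLayout(String pattern) {
        this.pattern = pattern;
    }

    void setConversionPattern(String pattern) {
        this.pattern = pattern;
    }

    @Override
    String format(String message) {
        return pattern.replace("%m", message);
    }
}

final class SketchSimpleLayout extends SketchLayout {
    @Override
    String format(String message) {
        return message;
    }
}

final class SketchJdbcAppender {
    private SketchLayout layout;

    void setLayout(SketchLayout layout) {
        this.layout = layout;
    }

    void setSql(String sql) {
        if (layout == null) {
            layout = new SketchPatternLayout(sql);
        } else {
            // Assumes one concrete layout class; fails for every other layout.
            ((SketchPatternLayout) layout).setConversionPattern(sql);
        }
    }
}

final class CastDemo {
    // Returns true if this configuration fails with a ClassCastException.
    static boolean failsWith(SketchLayout layout) {
        SketchJdbcAppender appender = new SketchJdbcAppender();
        appender.setLayout(layout);
        try {
            appender.setSql("INSERT INTO log (msg) VALUES ('%m')");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsWith(new SketchPatternLayout("%m"))); // prints false
        System.out.println(failsWith(new SketchSimpleLayout()));      // prints true
    }
}
```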

Bypassing precompiled statements

There is another reason that JDBCAppender's design is questionable. JDBC has its own template engine: prepared statements. By using PatternLayout, however, this template engine is bypassed. This is unfortunate because JDBC precompiles prepared statements, which can improve performance considerably when the same statement is executed many times. There is no easy fix for JDBCAppender's dependency on PatternLayout. The obvious approach would be to control what kind of layout can be used in JDBCAppender by overriding the setter as follows.

Listing 4. Overriding setLayout()

public void setLayout(Layout layout) {
   if (layout instanceof PatternLayout) {
      super.setLayout(layout);
   }
   else {
      throw new IllegalArgumentException("Layout is not valid");
   }
}

Unfortunately, this approach also has problems. The method in Listing 4 throws a runtime exception, and applications calling this method may not be prepared to catch it. In other words, the overriding setLayout(Layout layout) method cannot guarantee that no runtime exception will be thrown; it therefore weakens the guarantees (postconditions) made by the method it overrides. If we look at it in terms of preconditions, setLayout requires that the layout is an instance of PatternLayout, and therefore has stronger preconditions than the method it overrides. Either way, we've violated a core object-oriented design principle: the Liskov substitution principle, which is used to safeguard inheritance.
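The substitution problem can be shown with a short sketch. The classes below (SketchLayout, SketchAppender, StrictJdbcAppender, and so on) are hypothetical stand-ins, and configure() plays the role of client code that was written against the base type only.

```java
// Hypothetical stand-ins; these are not Log4j classes.
abstract class SketchLayout {}

final class SketchPatternLayout extends SketchLayout {}

final class SketchSimpleLayout extends SketchLayout {}

class SketchAppender {
    protected SketchLayout layout;

    // Contract of the base type: accepts any layout and does not throw.
    void setLayout(SketchLayout layout) {
        this.layout = layout;
    }
}

class StrictJdbcAppender extends SketchAppender {
    @Override
    void setLayout(SketchLayout layout) {
        // Stronger precondition than the overridden method.
        if (!(layout instanceof SketchPatternLayout)) {
            throw new IllegalArgumentException("Layout is not valid");
        }
        super.setLayout(layout);
    }
}

final class LspDemo {
    // Client code that knows only the base type. The catch block exists
    // solely so that the demo can report the outcome.
    static boolean configure(SketchAppender appender) {
        try {
            appender.setLayout(new SketchSimpleLayout());
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(configure(new SketchAppender()));     // prints true
        System.out.println(configure(new StrictJdbcAppender())); // prints false
    }
}
```

The same call succeeds for the base type and fails for the subtype, so the subtype cannot be substituted safely wherever the base type is expected.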

Workarounds

The fact that there is no easy solution to fix the design of JDBCAppender indicates that there is a deeper problem at work. In this case, the level of abstraction chosen when designing the core abstract types (in particular Layout) needs fine-tuning. The core method defined by Layout is format(LoggingEvent event). This method returns a string. However, when logging to a relational database, a tuple of values (a row), and not a string, needs to be generated.

One possible solution would be to use a more sophisticated data structure as a return type for format. However, this would imply additional overhead in situations where you might actually want to generate a string. Additional intermediate objects would have to be created and then garbage-collected, compromising the performance of the logging framework. Using a more sophisticated return type would also make Log4j more difficult to understand. Simplicity is a very desirable design goal.

Another possible solution would be to use "layered abstraction" by using two abstract types, Appender and CustomizableAppender, where CustomizableAppender extends Appender. Only CustomizableAppender would then define the method setLayout(Layout layout). JDBCAppender would only implement Appender, while other appender implementations such as ConsoleAppender would implement CustomizableAppender. The drawback of this approach is the increased complexity (e.g., in how Log4j configuration files are processed), and the fact that developers must make an informed decision early on about which level of abstraction to use.
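A rough sketch of this layered abstraction is shown below. The interfaces and classes are hypothetical and only illustrate the shape of the idea. In particular, SketchJdbcAppender stores messages in a list as a stand-in for issuing a parameterized INSERT.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the layout dimension.
interface SketchLayout {
    String format(String message);
}

// Base abstraction: every appender can receive events, nothing more.
interface SketchAppender {
    void append(String message);
}

// Only appenders that work with an arbitrary layout expose the setter.
interface SketchCustomizableAppender extends SketchAppender {
    void setLayout(SketchLayout layout);
}

final class SketchConsoleAppender implements SketchCustomizableAppender {
    private SketchLayout layout = message -> message;
    private final StringBuilder out = new StringBuilder();

    @Override
    public void setLayout(SketchLayout layout) {
        this.layout = layout;
    }

    @Override
    public void append(String message) {
        out.append(layout.format(message)).append('\n');
    }

    String written() {
        return out.toString();
    }
}

// No setLayout() here, so a mismatched layout cannot even be configured.
final class SketchJdbcAppender implements SketchAppender {
    private final List<String> rows = new ArrayList<>();

    @Override
    public void append(String message) {
        rows.add(message); // stand-in for a parameterized INSERT
    }

    List<String> rows() {
        return rows;
    }
}

final class LayeredDemo {
    public static void main(String[] args) {
        SketchConsoleAppender console = new SketchConsoleAppender();
        console.setLayout(message -> "[" + message + "]");
        console.append("Hello World");
        System.out.print(console.written()); // prints [Hello World]
    }
}
```

With this split, an invalid combination of appender and layout is rejected by the compiler rather than by a runtime exception.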
