Code generation using Javadoc

Extend Javadoc by creating custom doclets

Automatic code generation is becoming increasingly common in software development, a result of the need to hide complexity from the software developer and the acceptance of various standard and de facto standard application programming interfaces. Two examples of hiding complexity are the stub and skeleton classes that CORBA tools generate from interface definition language (IDL) descriptions, and the adapter code that some object-oriented databases generate to persist and retrieve objects.

Java contains many APIs that Java developers regard as de facto standards. The complexity of these APIs ranges from those that constitute the "core" of the Java language to those found in the Java 2 Platform, Enterprise Edition. For example, the Java Database Connectivity API presents a unifying interface for interacting with databases from various companies. Suppose that you want a Java object to be able to persist itself to a database by implementing a simple save() method that maps the object's attributes to a database table. That method would extract the attributes from the object and use the JDBC API to build up a JDBC statement that is executed against the database. After implementing the save() method for a few classes, you begin to see the similarities in the code structure and the repetitive nature of implementing that method. Often the basic attributes of an object need to be transliterated and "plugged in" to the appropriate Java API. That is when a code generator can be a useful tool to have in your programming toolbox.
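To make the repetition concrete, here is a minimal sketch of a hand-written save() method. The PersistentOrder class, the ORDERS table, and its column names are assumptions made for illustration; they do not come from the article's source code. Only standard java.sql calls are used.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Hypothetical business object with a hand-written save() method. */
final class PersistentOrder {
    // Assumed table and column names; a real schema would differ.
    static final String INSERT_SQL =
        "INSERT INTO ORDERS (SYMBOL, QUANTITY, PRICE) VALUES (?, ?, ?)";

    private final String symbol;
    private final int quantity;
    private final float price;

    PersistentOrder(String symbol, int quantity, float price) {
        this.symbol = symbol;
        this.quantity = quantity;
        this.price = price;
    }

    /** Binds each attribute to one statement parameter, then runs the insert. */
    void save(Connection connection) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(INSERT_SQL)) {
            statement.setString(1, symbol);
            statement.setInt(2, quantity);
            statement.setFloat(3, price);
            statement.executeUpdate();
        }
    }
}
```

Every persistent class would repeat this shape: one SQL string, one setter call per attribute, one execute call. Only the attribute names and types change, which is what makes the method a good candidate for generation.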

By using a code generator you can automate the process of some tedious, repetitive, and error-prone coding tasks. The fact that you are plugging in to well-known APIs increases the utility of such a tool, since it is applicable to a wide audience of developers. Furthermore, some typically "in-house" domain-specific frameworks can also be considered as fixed API targets for code generators.

A code generator can be a timesaving tool that increases code quality and introduces a more formal and automated approach to part of the development cycle. Another advantage of automated code generation is the synchronization of object definitions across various programming languages. In many tightly bound applications, the same business object (for example, an order to purchase a stock) must be represented consistently in C++, Java, and SQL. The ability to output different representations from a common model is available in various modeling tools; however, I have found it awkward to use those tools to achieve the level of customization required. A dedicated custom code generator is simple enough to create and does not tie you into a specific modeling tool.

The path to Javadoc

The path my team took to choosing Javadoc for code-generation purposes was somewhat long, and probably common. In early implementations, we used Perl scripts to parse a custom metadata grammar in a text file. This was an ad hoc solution, and adding output formats was difficult. Our second, short-lived attempt was to modify an existing Java-based IDL compiler. We soon realized that additional IDL keywords would have to be introduced to send hints to the code generator. Neither extending IDL nor starting from scratch with tools such as lex and yacc (which split a source file into tokens and define code that is invoked for each recognized token) was palatable to us. (See Resources for more information.)

A third, more promising solution was to describe the class metadata using XML. Defining an XML DTD and creating XML documents to describe classes seemed like a natural fit. The files could then be validated and easily parsed. To avoid starting from scratch, I figured that someone must have tried to create a similar XML DTD, and I soon came across XMI. XMI is a full-blown description of UML using XML, and it is now used as an exchange format between UML tools. (See Resources for more information.)

However, the XML documents that described classes were extremely verbose and difficult to edit manually. There are simply too many seemingly superfluous tags and descriptions to weed through in order for you to change one class attribute. Also, manipulating XML files at the application-domain level can be quite tedious. IBM alphaWorks produces an XMI toolkit that makes the processing of XMI-based XML documents much easier, but the XMI toolkit API for manipulating class descriptions is extremely similar to the Java Reflection or Doclet API. With that in mind, my organization decided to use the doclet approach, which has been successful.

Introducing Javadoc

Javadoc is the program used to create the HTML-format Java API documentation. It is distributed as part of the Java SDK and its output stage is designed to be extensible through doclet creation. The Doclet API provides the infrastructure to access all aspects of a Java source-code file that has been parsed by Javadoc. By using the Doclet API, which is similar to the Reflection API, you can walk through a Java class description, access custom Javadoc tags, and write output to a file. The standard doclet used to produce the HTML documentation does just that; it writes out HTML files as it traverses all the Java source code. More detailed information on Javadoc can be found in Resources.
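Because the Doclet API resembles the Reflection API, a Reflection-based walk gives a feel for the style of traversal. The sketch below is only an analogy: it uses java.lang.reflect on a compiled class rather than the Doclet API on source code, and the ReflectionWalk and Order classes are hypothetical. Unlike a doclet, Reflection cannot see comment text or Javadoc tags.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Walks a compiled class with Reflection, much as a doclet walks parsed source. */
final class ReflectionWalk {

    /** Hypothetical sample class, used only for this sketch. */
    static final class Order {
        private String symbol;
        private int quantity;
        public String getSymbol() { return symbol; }
        public int getQuantity() { return quantity; }
    }

    /** Returns one descriptive line per declared method and field. */
    static List<String> describe(Class<?> type) {
        List<String> lines = new ArrayList<>();
        for (Method method : type.getDeclaredMethods()) {
            lines.add("Method: name = " + method.getName());
        }
        for (Field field : type.getDeclaredFields()) {
            lines.add("Field: name = " + field.getName()
                + ", type = " + field.getType().getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describe(Order.class)) {
            System.out.println(line);
        }
    }
}
```

The order in which Reflection returns methods and fields is not specified, so the printed lines may appear in any order.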

By creating simple Java classes that contain attributes and some custom Javadoc tags, you allow those classes to serve as a simple metadata description for code generation. Javadoc parses those metadata classes, and custom doclets access the metadata class information to create concrete implementations of the metadata class in specific programming languages such as Java, C++, or SQL. You can also create variations of the standard doclet that produce simple HTML tables describing the metadata class, which would be appropriate to include in a word processing document. Those metadata Java classes serve the same purpose as an IDL description, whose syntax is similar to C++.
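The idea of producing several representations from one description can be sketched without the Doclet API. In the example below, FieldInfo is a hypothetical stand-in for the name and type a doclet would read from a parsed field, and the SQL and C++ type mappings are assumptions chosen for illustration, not the mappings used by the author's generators.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Emits SQL and C++ text from one field list; all names here are hypothetical. */
final class MultiTargetSketch {

    /** Stand-in for the field name and type that a doclet would supply. */
    static final class FieldInfo {
        final String name;
        final String javaType;
        FieldInfo(String name, String javaType) {
            this.name = name;
            this.javaType = javaType;
        }
    }

    // Assumed type mappings; unmapped types are not handled in this sketch.
    private static final Map<String, String> SQL_TYPES = new HashMap<>();
    private static final Map<String, String> CPP_TYPES = new HashMap<>();
    static {
        SQL_TYPES.put("int", "INTEGER");
        SQL_TYPES.put("float", "REAL");
        SQL_TYPES.put("java.lang.String", "VARCHAR(64)");
        CPP_TYPES.put("int", "int32_t");
        CPP_TYPES.put("float", "float");
        CPP_TYPES.put("java.lang.String", "std::string");
    }

    /** Builds a CREATE TABLE statement with one column per field. */
    static String toSql(String table, List<FieldInfo> fields) {
        StringBuilder sql = new StringBuilder("CREATE TABLE " + table + " (\n");
        for (int i = 0; i < fields.size(); i++) {
            FieldInfo field = fields.get(i);
            sql.append("  ").append(field.name).append(' ')
               .append(SQL_TYPES.get(field.javaType))
               .append(i < fields.size() - 1 ? ",\n" : "\n");
        }
        return sql.append(");\n").toString();
    }

    /** Builds a C++ struct with one member per field. */
    static String toCpp(String struct, List<FieldInfo> fields) {
        StringBuilder cpp = new StringBuilder("struct " + struct + " {\n");
        for (FieldInfo field : fields) {
            cpp.append("  ").append(CPP_TYPES.get(field.javaType))
               .append(' ').append(field.name).append(";\n");
        }
        return cpp.append("};\n").toString();
    }

    public static void main(String[] args) {
        List<FieldInfo> fields = Arrays.asList(
            new FieldInfo("Symbol", "java.lang.String"),
            new FieldInfo("Quantity", "int"),
            new FieldInfo("Price", "float"));
        System.out.print(toSql("SimpleOrder", fields));
        System.out.print(toCpp("SimpleOrder", fields));
    }
}
```

Both outputs are derived from the same field list, so a change to the description reaches every target on the next run.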

Using Javadoc as a code generation tool has several benefits:

  • You don't need to write any parsing code; the parsing of the metadata classes is performed by Javadoc, and presented in an easy-to-use API.
  • By using custom Javadoc tags, you add just enough flexibility to define special hooks during code generation.
  • Java types are well defined (an int is always 32 bits, for example), so you don't have to introduce additional primitive type keywords to achieve that level of clarity.
  • You can check the Java metadata classes for syntax and other errors by compilation.
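As an illustration of the points above, a metadata class might look like the following sketch. The OrderMetadata class and the @column and @length tags are hypothetical; a custom doclet would decide what such tags mean. The class compiles like any other Java source, which is how syntax errors in the metadata are caught.

```java
/**
 * Hypothetical metadata class. It holds no behavior; its fields and
 * tags exist only to be read by a code-generation doclet.
 */
final class OrderMetadata {
    /**
     * A valid stock symbol.
     * @column SYMBOL
     * @length 8
     */
    private String symbol;

    /**
     * The total order volume.
     * @column QTY
     */
    private int quantity;

    /** Prints the width of int, which is the same on every Java platform. */
    public static void main(String[] args) {
        System.out.println("int is " + Integer.SIZE + " bits");
    }
}
```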

Introducing doclets

Before jumping into the doclet used for code generation, I'll present a simple "Hello World" example that exposes the relevant parts of how to create, run, and play with the Doclet API. The sample code for SimpleDoclet is given below. (You can obtain the source code for this article in Resources.) If you consider this code somewhat lengthy for a true "Hello World" program, the Sun Website presents an even simpler doclet to help you get started. (See Resources.)

package codegen.samples;
import com.sun.javadoc.*;
import java.text.*;
public class SimpleDoclet {
  public static boolean start(RootDoc root) {
    //iterate over all classes.
    ClassDoc[] classes = root.classes();
    for (int i=0; i< classes.length; i++) {
      //iterate over all methods and print their names.
      MethodDoc[] methods = classes[i].methods();
      for (int j=0; j<methods.length; j++) {
        out("Method: name = " + methods[j].name());
      }
      //iterate over all fields, printing name, comment text, and type.
      FieldDoc[] fields = classes[i].fields();
      for (int j=0; j<fields.length; j++) {
        Object[] field_info = {fields[j].name(), fields[j].commentText(),
                               fields[j].type()};
        out(FIELDINFO.format(field_info));
        //iterate over all field tags and print their values.
        Tag[] tags = fields[j].tags();
        for (int k=0; k<tags.length; k++) {
          out("\tField Tag Name= " + tags[k].name());
          out("\tField Tag Value = " + tags[k].text());
        }
      }
    }
    //No error processing done, simply return true.
    return true;
  }
  //wrap standard output.
  private static void out(String msg) {
    System.out.println(msg);
  }
  private static MessageFormat METHODINFO =
    new MessageFormat("Method: return type {0}, name = {1};");
  private static MessageFormat FIELDINFO =
    new MessageFormat("Field: name = {0}, comment = {1}, type = {2};");
}

The above doclet prints descriptive information about the methods, fields, and Javadoc tags of the class listed below:

public class SimpleOrder  {
  public SimpleOrder() { }
  public String getSymbol() {
    return Symbol;
  }
  public int getQuantity() {
    return Quantity;
  }
  /**
   * A valid stock symbol.
   * @see A big book of valid symbols for more information.
   */
  private String Symbol;
  /**
   * The total order volume.
   * @mytag My custom tag.
   */
  private int Quantity;
  private String OrderType;
  private float Price;
  private String Duration;
  private int AccountType;
  private int TransactionType;
}

After compiling these files, you invoke the Javadoc tool using this command:

javadoc -private -doclet codegen.samples.SimpleDoclet

The -private option tells Javadoc to expose private field and method information, and the -doclet option tells Javadoc what doclet to invoke. The last parameter is the file to be parsed. The output of the program is the following:

Loading source file
Constructing Javadoc information...
Method: name = getSymbol
Method: name = getQuantity
Field: name = Symbol, comment = A valid stock symbol., type = java.lang.String;
      Field Tag Name= @see
      Field Tag Value = A big book of valid symbols for more information.
Field: name = Quantity, comment = The total order volume., type = int;
      Field Tag Name= @mytag
      Field Tag Value = My custom tag.
Field: name = OrderType, comment = , type = java.lang.String;
Field: name = Price, comment = , type = float;
Field: name = Duration, comment = , type = java.lang.String;
Field: name = AccountType, comment = , type = int;
Field: name = TransactionType, comment = , type = int;

The sample code shows that the Doclet API is contained in the package com.sun.javadoc. Since you are plugging in to the Javadoc tool and are not creating a standalone application, Javadoc calls your doclet from the method public static boolean start(RootDoc root).

When Javadoc invokes the start method, RootDoc holds all the information parsed by Javadoc. You can then walk through all the parsed classes by invoking the method classes() on RootDoc. That method returns a ClassDoc array describing all the parsed classes. ClassDoc in turn contains methods such as fields() and methods(), which return FieldDoc and MethodDoc arrays that describe all the fields and methods of the parsed class. All the "Doc" classes contain the method tags(), which returns a Tag array describing both custom and standard Javadoc tags. The standard tag used in this example is @see.

The out() method simply wraps the standard output, and the MessageFormat class helps format the output according to a fixed template.
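The template mechanism can be tried in isolation. This minimal sketch reuses the FIELDINFO pattern from the doclet; the MessageFormatSketch class and its describeField method are names introduced here for illustration.

```java
import java.text.MessageFormat;

/** Formats one field description with the same template as FIELDINFO. */
final class MessageFormatSketch {
    private static final MessageFormat FIELDINFO =
        new MessageFormat("Field: name = {0}, comment = {1}, type = {2};");

    static String describeField(String name, String comment, String type) {
        // Each array element replaces the placeholder with the same index.
        Object[] fieldInfo = {name, comment, type};
        return FIELDINFO.format(fieldInfo);
    }

    public static void main(String[] args) {
        // Prints: Field: name = Quantity, comment = The total order volume., type = int;
        System.out.println(describeField("Quantity", "The total order volume.", "int"));
    }
}
```

Keeping the template in one constant means the output layout can change without touching the traversal code.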

Reusable classes for code generation

In light of the above example, I hope you agree that creating your own doclets and extracting the class information using the Doclet API is easy. Going one step further, from parsing the Java classes to generating code in a file, is relatively straightforward. To make creating code-generation doclets easier, I developed a small set of interfaces and abstract base classes. The class diagram of these utility classes is shown below.

Utility classes

The interface Maker defines the method signature public void make(ClassDoc classdoc) that you will use to interact with your code generators. The abstract class CodeMaker provides default implementations for manipulating files and indentation, which are common to all code generators. Specific code generators inherit from the abstract base class and provide an implementation of the make method. The make method has the class ClassDoc as an argument, not RootDoc. That causes the Maker to enter the code generation logic at the class level.
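The division of labor between the interface, the abstract base class, and a concrete generator might look roughly like the sketch below. It is not the article's actual code: ClassInfo is a hypothetical stand-in for ClassDoc so that the example compiles without the Doclet API, the indent, outdent, and line helpers are assumed names, and output goes to a PrintWriter supplied by the caller rather than to a file.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

/** Hypothetical stand-in for ClassDoc so the sketch needs no Doclet API. */
final class ClassInfo {
    final String name;
    final String[] fieldNames;
    ClassInfo(String name, String... fieldNames) {
        this.name = name;
        this.fieldNames = fieldNames;
    }
}

/** Entry point shared by all generators; the article's version takes a ClassDoc. */
interface Maker {
    void make(ClassInfo classInfo);
}

/** Assumed base class: owns the output target and the indentation level. */
abstract class CodeMaker implements Maker {
    private final PrintWriter out;
    private int depth;

    protected CodeMaker(PrintWriter out) { this.out = out; }

    protected void indent() { depth++; }
    protected void outdent() { depth--; }

    /** Writes one line, prefixed by two spaces per indentation level. */
    protected void line(String text) {
        for (int i = 0; i < depth; i++) {
            out.print("  ");
        }
        out.print(text + "\n");
    }
}

/** Concrete generator: describes a class and its fields. */
final class DescriptionMaker extends CodeMaker {
    DescriptionMaker(PrintWriter out) { super(out); }

    @Override
    public void make(ClassInfo classInfo) {
        line("Class: " + classInfo.name);
        indent();
        for (String fieldName : classInfo.fieldNames) {
            line("Field: " + fieldName);
        }
        outdent();
    }
}

final class MakerDemo {
    public static void main(String[] args) {
        StringWriter buffer = new StringWriter();
        PrintWriter writer = new PrintWriter(buffer);
        new DescriptionMaker(writer).make(new ClassInfo("SimpleOrder", "Symbol", "Quantity"));
        writer.flush();
        System.out.print(buffer);
    }
}
```

Because the shared plumbing lives in the base class, each new output format only has to supply its own make method.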

All classes parsed by Javadoc are looped over in the doclet's plug-in method start. An example of how that is done is shown below:

public static boolean start(RootDoc root) {
  ClassDoc[] classes = root.classes();
  //Set up CodeMakers to run
  Maker simplemaker = new SimpleCodeMaker("Description Maker");
  //Iterate through all classes and execute the "make" method of the Maker
  for (int i=0; i < classes.length; i++ ) {
    ClassDoc classdoc = classes[i];
    simplemaker.make(classdoc);
  }
  return true;
}

Following are parts of the code from a simple code generator called SimpleCodeMaker, which performs the same task as the SimpleDoclet previously listed. Instead of sending the output to the screen, SimpleCodeMaker saves it to a file in the subdirectory genclasses. The implementation of the make method is also becoming more structured with separate methods to process fields and methods. Only the methods make and processMethods are listed here for brevity.
