Persist data with Java Data Objects, Part 2

Sun JDO vs. Castor JDO

In Part 1, I provided an overview of available persistence mechanisms and their implementations, and introduced the Java Data Objects (JDO) approach to persistence. In Part 2, I conclude this series by looking more closely at the two competing JDO standards: the Sun Microsystems JDO and the Castor JDO. Both specifications provide unified, simple, and transparent persistence interfaces between Java application objects and data stores, and create an interesting alternative to entity beans.



The Sun JDO architecture gives application programmers a Java-centric view of persistent information. Sun's specification defines a standard API for data contained in various enterprise information systems, such as enterprise resource planning, mainframe transaction processing, and database systems. The architecture also follows the J2EE Connector Architecture (JCA), which defines a set of mechanisms for integrating an enterprise information system with an application server.


Figure 1 illustrates the Sun JDO architecture. The specification allows multiple JDO implementations -- possibly attached to different data stores -- to be plugged into an application server or used directly in a two-tier architecture. This approach lets application components access the underlying data stores using one consistent Java-centric data view. The JDO implementation provides the necessary mapping from Java objects into the underlying data store's special data types and relationships.

Figure 1. The Sun JDO architecture.

Before delving into the Sun JDO interfaces, I discuss the specification's three fundamental concepts:

  • JDO instance
  • First-class objects
  • Second-class objects

JDO instance

A JDO instance is a Java class instance that implements the application functions and represents data in an enterprise data store. A JDO instance has an important limitation: its class must always implement the PersistenceCapable interface (defined below), either explicitly by the class writer or implicitly by the enhancer's results.

First-class objects

Instances of the PersistenceCapable classes that have JDO identity represent first-class objects; these objects are stored in a data store with their associated second-class objects (if any) and primitive values. First-class objects are unique: when a PersistenceManager (defined below) instantiates one into memory, that same PersistenceManager manages a single instance representing that first-class object, though other PersistenceManagers might manage other instances of that same class.

Second-class objects

Also PersistenceCapable instances, second-class objects differ from first-class objects in that they have no JDO identity of their own. Second-class objects notify their first-class objects of their modification; that modification reflects as a change to that first-class object. Second-class objects are stored in the data store as part of a first-class object only.

A good example of a first-class object: an object of class Order. An Order usually has one or more instances of OrderLine items. OrderLine is a second-class object example.
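The Order and OrderLine relationship can be sketched in plain Java. This is only an illustration of the idea, not JDO API code: the class shapes, the `orderId` field standing in for the JDO identity, and the `markDirty()` notification method are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Second-class object: no identity of its own, always owned by one Order.
class OrderLine {
    private final Order owner;
    private final String product;
    private int quantity;

    OrderLine(Order owner, String product, int quantity) {
        this.owner = owner;
        this.product = product;
        this.quantity = quantity;
    }

    String getProduct() { return product; }
    int getQuantity() { return quantity; }

    void setQuantity(int quantity) {
        this.quantity = quantity;
        owner.markDirty(); // a change to the line counts as a change to its order
    }
}

// First-class object: carries the identity under which the whole graph is stored.
class Order {
    private final long orderId; // hypothetical stand-in for the JDO identity
    private final List<OrderLine> lines = new ArrayList<>();
    private boolean dirty;

    Order(long orderId) { this.orderId = orderId; }

    OrderLine addLine(String product, int quantity) {
        OrderLine line = new OrderLine(this, product, quantity);
        lines.add(line);
        dirty = true;
        return line;
    }

    void markDirty() { dirty = true; }
    void clearDirty() { dirty = false; }
    boolean isDirty() { return dirty; }
    long getOrderId() { return orderId; }
    List<OrderLine> getLines() { return Collections.unmodifiableList(lines); }
}
```

Changing a line's quantity flags the owning order as modified, which mirrors how a second-class object's change is reflected as a change to its first-class object.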

Now, I'll detail the specification's major public interfaces:

  • PersistenceManagerFactory
  • PersistenceManager
  • PersistenceCapable
  • Transaction
  • Query


PersistenceManagerFactory

The PersistenceManagerFactory creates PersistenceManager instances for application use. It also lets application developers configure the persistence layer behavior -- set transaction options, perform connection pool administration, and so on.

In a managed environment, the application uses JNDI (Java Naming and Directory Interface) lookup to retrieve an environment-named object, which is then cast to javax.jdo.PersistenceManagerFactory.


PersistenceManager

The JDO PersistenceManager provides the primary interface for JDO-aware application components. PersistenceManager administers persistent instances' lifecycles, and provides transaction and cache management. It also acts as the Query interface's factory.


PersistenceCapable

As mentioned before, any user domain class must implement the PersistenceCapable interface. Note: There are no special methods for persistence; the PersistenceCapable interface is really empty. You can implement this interface in one of three ways: through source code, an enhancer, or generation (tool-based).


Transaction

Persistence operations usually occur within a transactional context. A one-to-one relationship exists between the PersistenceManager and the Transaction. In managed environments, the container provides actual transaction services, but the Transaction interface provides methods for managing transaction options. In a standalone environment, the Transaction implementation, provided by the JDO software vendor, must ensure a successful transaction -- commit or rollback.

Cache management and JDO instance lifecycle

Every JDO object (instance) goes through a series of state changes in its lifetime. The Sun JDO specification defines 10 JDO instance states. It requires seven transactional instance states (the remaining three are optional):

  • Transient: When an instance is transient, the object lacks persistent identity. A transient instance changes its state to persistent-new (see below) in one of two ways: when it is passed as an argument to the makePersistent() method, or when it is referenced by a persistent field of a persistent instance at the time that instance commits.
  • Persistent-new: An instance that is newly persistent in the current transaction. When an application component requests an instance to become persistent, that instance assumes the persistent-new state and receives a persistent identity.
  • Persistent-dirty: An instance's state when one or more of its attributes have changed (within the current transaction), but not yet persisted.
  • Hollow: A JDO instance that represents specific persistent data in the data store, but whose values are not in the JDO instance.
  • Persistent-clean: A JDO instance that represents specific transactional persistent data in the data store, and whose values have not changed (within the current transaction).
  • Persistent-deleted: JDO instances that represent specific persistent data in the data store and have been deleted in the current transaction.
  • Persistent-new-deleted: JDO instances that were made persistent and then deleted within the current transaction.

For more information regarding state transitions and JDO instances' persistence states, consult the Sun JDO spec.
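To make the seven required states more concrete, here is a minimal sketch of them as a Java enum with a handful of transitions. It is a simplification under stated assumptions: the method names (`onMakePersistent`, `onWrite`, `onDelete`, `onCommit`) are invented for this example, only a few transitions are modeled, and the commit behavior assumes field values are not retained after commit. The specification remains the authority on the full transition table.

```java
// Simplified model of the required JDO instance states; not part of the JDO API.
enum InstanceState {
    TRANSIENT, PERSISTENT_NEW, PERSISTENT_DIRTY, HOLLOW,
    PERSISTENT_CLEAN, PERSISTENT_DELETED, PERSISTENT_NEW_DELETED;

    // Passing a transient instance to makePersistent() makes it persistent-new.
    InstanceState onMakePersistent() {
        return this == TRANSIENT ? PERSISTENT_NEW : this;
    }

    // Changing a field of an unmodified persistent instance makes it dirty.
    InstanceState onWrite() {
        switch (this) {
            case HOLLOW:
            case PERSISTENT_CLEAN:
                return PERSISTENT_DIRTY;
            default:
                return this;
        }
    }

    // Deleting a persistent instance within the current transaction.
    InstanceState onDelete() {
        switch (this) {
            case PERSISTENT_NEW:
                return PERSISTENT_NEW_DELETED;
            case PERSISTENT_CLEAN:
            case PERSISTENT_DIRTY:
            case HOLLOW:
                return PERSISTENT_DELETED;
            default:
                return this;
        }
    }

    // Commit, assuming values are not retained: survivors become hollow,
    // deleted instances become transient again.
    InstanceState onCommit() {
        switch (this) {
            case PERSISTENT_NEW:
            case PERSISTENT_CLEAN:
            case PERSISTENT_DIRTY:
                return HOLLOW;
            case PERSISTENT_DELETED:
            case PERSISTENT_NEW_DELETED:
                return TRANSIENT;
            default:
                return this;
        }
    }
}
```

For instance, `InstanceState.TRANSIENT.onMakePersistent().onCommit()` walks an object from transient through persistent-new to hollow.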


Query

An important part of every data manipulation subsystem is its query language. In the Sun JDO, the PersistenceManager instance is a factory for query instances, and queries execute in the context of the PersistenceManager instance.

Queries must conform to the Object Query Language (OQL) grammar, defined in the Object Data Management Group (ODMG) 3.0 specification. The JDO OQL resembles SQL, except that it operates on Java classes and objects, not tables. A JDO OQL query has at least three elements:

  • Class of result.
  • JDO instances' candidate collection (usually extent).
  • Query filter.

Optional query elements include the following:

  • Parameter declaration(s) (follows formal Java syntax parameters).
  • Value(s) to be bound.
  • Ordering specification.

Query filters can be the following:

  • Names of instance fields in the Java objects.
  • Operators (a subset of Java operators, for example: ==, !=, >, ||, and so on). Of course not all Java operators make sense for data selection.

Please consult the JDO spec for more details regarding queries.

Below is a JDO query example:

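// Example of a query: select employees whose empId lies between 1 and 10.
// Assumes javax.jdo.Extent, javax.jdo.Query, and java.util.Collection are imported,
// pm is a PersistenceManager, and Employee has a persistent field named empId.
Class target = Employee.class;
Extent extent = pm.getExtent(target, false);   // false: do not include subclasses
String filter = "empId >= 1 && empId <= 10";   // filters name instance fields, not getters
Query query = pm.newQuery(extent, filter);
query.setClass(target);
query.compile();
Collection result = (Collection) query.execute();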
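As a plain-Java analogy for the query elements listed earlier, the sketch below evaluates a candidate collection, a filter, and an ordering entirely in memory. It does not use the JDO API; `InMemoryQuery`, the `Employee` shape, and its `empId` field are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical domain class used only for this illustration.
class Employee {
    final int empId;
    final String name;

    Employee(int empId, String name) {
        this.empId = empId;
        this.name = name;
    }
}

class InMemoryQuery {
    // candidates ~ the extent, filter ~ the query filter,
    // ordering ~ the optional ordering specification.
    static <T> List<T> run(List<T> candidates, Predicate<T> filter, Comparator<T> ordering) {
        List<T> result = new ArrayList<>();
        for (T candidate : candidates) {
            if (filter.test(candidate)) {
                result.add(candidate);
            }
        }
        result.sort(ordering);
        return result;
    }

    public static void main(String[] args) {
        List<Employee> extent = Arrays.asList(
                new Employee(12, "Ann"), new Employee(7, "Bob"), new Employee(3, "Cy"));
        List<Employee> hits = run(extent,
                (Employee e) -> e.empId >= 1 && e.empId <= 10,
                Comparator.comparingInt((Employee e) -> e.empId));
        for (Employee e : hits) {
            System.out.println(e.empId + " " + e.name);
        }
    }
}
```

A real JDO implementation would normally translate the filter into the data store's own query language rather than iterate over objects in memory; the sketch only shows how the three elements fit together.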

Sun JDO: The good, the bad, and the ugly

Let's see how the Sun JDO compares to the ideal persistence layer presented in Part 1. As you might recall, the desirable traits I outlined include the following (Note: I reference only the most important features):

  • Simplicity
  • Minimal intrusion
  • Transparency
  • Consistent, concise persistence APIs

The Sun JDO's major advantage: it provides a unified, standard persistence interface supported by multiple vendors delivering competing implementations. Another advantage is its transparency, or its data-store type independence.

The Sun JDO almost fulfils the simplicity trait, the only offending part being the OQL Query specification, which could be simplified. Sun also fails to offer minimal intrusion: the class enhancer and the obligatory PersistenceCapable interface are requirements that Sun's JDO could do without.

Some designed-by-committee traits are also evident: The API is not always consistent. Also, specification development progresses slowly; Sun JDO has been in draft form for several months now.

Overall, however, its standard persistence interface and transparency really set the tone for the Sun JDO. Its future looks quite bright.

Castor JDO

Castor is an open source data-binding framework for Java. Castor's multiple projects target mapping between Java objects, XML documents, SQL tables, and LDAP (Lightweight Directory Access Protocol) directories. The Castor JDO project focuses on persisting Java objects to relational data stores. Despite its name and resemblance, Castor JDO is not compatible with Sun's spec. Note: Because Castor JDO concentrates on relational data stores exclusively, it does not support data-storage type transparency. Among the nonproprietary API (open source or multivendor) solutions available, the Castor JDO feature set is more than sufficient for most projects, and the price can't be beat.


Figure 2 illustrates the Castor JDO architecture. Unlike the Sun JDO, which can have multiple PersistenceManager instances in the JVM, Castor JDO maintains only one instance of its PersistenceEngine. Castor does not require application objects to implement a special interface, but it provides a callback interface, Persistent, which an object can implement if it wants to receive notification of Castor events -- such as object creation or deletion. This interface allows the creation of user-defined actions that an object will perform at various times during its lifecycle.

Figure 2. The Castor JDO architecture.

The Castor JDO's principal workhorse -- the one you will use most -- is the Database module. It represents an active connection to the database, which you can use to perform transactional operations on the database. Database's major methods:

  • public Object load( Class type, Object identity ): Loads an object of a specified type and given identity into the cache. Once loaded, the object is marked as persistent.
  • public void create( Object object ): Creates a new object in persistent storage. The object will persist only if the transaction commits.
  • public void remove( Object object ): Removes the object from persistent storage -- only if the transaction commits.
  • public void update( Object object ): Updates a data object queried/loaded/created in another transaction.

The Database module also has the usual methods used for transaction demarcation:

  • public void begin()
  • public void commit()
  • public void rollback()
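The commit-or-nothing behavior of create() and remove() can be illustrated with a small in-memory stand-in. This is a sketch under assumptions, not Castor code: `InMemoryDatabase` is a hypothetical class, and unlike Castor's create( Object object ) it takes the identity as an explicit argument to stay short.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in that mimics transactional visibility only.
class InMemoryDatabase {
    private final Map<Object, Object> committed = new HashMap<>();
    private Map<Object, Object> working; // null when no transaction is active

    void begin() {
        if (working != null) {
            throw new IllegalStateException("transaction already active");
        }
        working = new HashMap<>(committed); // private copy for this transaction
    }

    void create(Object identity, Object object) {
        requireTransaction();
        if (working.containsKey(identity)) {
            throw new IllegalStateException("duplicate identity: " + identity);
        }
        working.put(identity, object);
    }

    Object load(Object identity) {
        requireTransaction();
        return working.get(identity);
    }

    void remove(Object identity) {
        requireTransaction();
        working.remove(identity);
    }

    void commit() {
        requireTransaction();
        committed.clear();
        committed.putAll(working); // changes become durable only here
        working = null;
    }

    void rollback() {
        requireTransaction();
        working = null; // discard every pending change
    }

    int committedCount() {
        return committed.size();
    }

    private void requireTransaction() {
        if (working == null) {
            throw new IllegalStateException("no active transaction");
        }
    }
}
```

Calling create() followed by rollback() leaves the committed state untouched, while the same create() followed by commit() makes the object visible to later transactions.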

Castor JDO is designed to work in managed environments -- J2EE (Java 2 Platform, Enterprise Edition) application servers, for example -- and nonmanaged environments. Castor does not need or use a preprocessor (also known as a precompiler) or class enhancer (bytecode modification) for data-binding and object persistence.

Cache management

Castor JDO's central concept is the cache. Castor implements a data cache to reduce database access. The cache properly locks and isolates data objects from other transactions. Castor provides several alternative LRU (least recently used)-based caching strategies, such as instance eviction based on a cache's number of instances or an instance's age. Castor JDO also offers dirty-checking and deadlock detection. Caching modes are self-explanatory: none, count-limited, time-limited, and unlimited. The cache is write-through, as all changes to a transaction's objects persist at commit time without delay. Castor supports several locking modes, including shared, exclusive, database-locked, and read-only. In addition, Castor also supports long transactions, which allow objects to be read in one transaction, and modified and then committed in a second transaction, with built-in dirty-checking to prevent data that has changed since the initial transaction from being overwritten.
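To show what a count-limited, LRU-based strategy means in practice, here is a minimal sketch built on the standard java.util.LinkedHashMap. It is not Castor's implementation; `CountLimitedCache` is a hypothetical name, and locking, isolation, and dirty-checking are deliberately left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical count-limited cache: evicts the least recently used entry
// once the number of cached instances exceeds the configured capacity.
class CountLimitedCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int capacity;

    CountLimitedCache(int capacity) {
        super(16, 0.75f, true); // true = iterate in access order, oldest access first
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // called by put(); true drops the eldest entry
    }
}
```

With a capacity of two, caching objects A and B, reading A, and then caching C evicts B, because B is the entry that was used least recently.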

Modeling relationships

Castor distinguishes the relationships between two objects as either dependent or related, and maintains the lifecycle differently for the two relationship types. The mapping file explicitly defines the relationships. Castor JDO supports different relationship cardinalities, including one-to-one, one-to-many, and many-to-many. The framework distinguishes between related (i.e., association) and dependent (i.e., aggregation) relationships during an object's lifecycle, automatically creating and deleting dependent objects at appropriate times. It also supports multiple-column primary keys and a variety of unique key generators.
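The difference between dependent and related objects can be sketched as follows. This is an illustration only, with hypothetical `PersistentNode` and `ObjectStore` classes: creating or removing an owner cascades to its dependents, while related objects keep their own lifecycle.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical persistent object with both kinds of relationship.
class PersistentNode {
    final String name;
    final List<PersistentNode> dependents = new ArrayList<>(); // aggregation
    final List<PersistentNode> related = new ArrayList<>();    // association

    PersistentNode(String name) {
        this.name = name;
    }
}

// Hypothetical store that applies the two lifecycle rules.
class ObjectStore {
    private final Set<PersistentNode> stored = new HashSet<>();

    void create(PersistentNode node) {
        stored.add(node);
        for (PersistentNode dependent : node.dependents) {
            create(dependent); // dependents are created along with their owner
        }
        // related objects are not touched: they must be created on their own
    }

    void remove(PersistentNode node) {
        stored.remove(node);
        for (PersistentNode dependent : node.dependents) {
            remove(dependent); // dependents are deleted along with their owner
        }
        // related objects survive the removal of this node
    }

    boolean contains(PersistentNode node) {
        return stored.contains(node);
    }
}
```

In a real Castor project the mapping file, not code like this, declares which relationships are dependent; the sketch only shows the resulting lifecycle behavior.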


The mapping module reads and parses the XML-based mapping file, and then provides information to Castor JDO for automatic translation between database data and Java objects.

Loading a mapping file is simple:

mapping = new Mapping(this.getClass().getClassLoader());
mapping.loadMapping("mapping.xml"); // "mapping.xml" is a placeholder file name


Castor JDO OQL is a subset of OQL, but not compatible with Sun JDO OQL. Like the Sun JDO OQL, Castor JDO OQL resembles SQL, except that it performs operations directly on Java objects instead of database tables, making the language more appropriate for use within a Java-based application.

The public interface org.exolab.castor.jdo.Query is also simple:

      public void bind( Object value )
      public QueryResults execute()

Note: The Query interface actually contains many bind(...) methods overloaded to allow for different argument types to be bound. Please consult Castor JDO documentation for details.
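Castor OQL marks query parameters with numbered placeholders ($1, $2, and so on), and each call to bind() supplies the value for the next one. The sketch below imitates that positional behavior with a hypothetical `PositionalQuery` class. It substitutes values into the text purely so the pairing is visible; a real engine passes bound values to the database separately rather than splicing them into the query string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper showing how positional binding pairs values with $n.
class PositionalQuery {
    private final String oql;
    private final List<Object> values = new ArrayList<>();

    PositionalQuery(String oql) {
        this.oql = oql;
    }

    // The first call binds $1, the second $2, and so on.
    void bind(Object value) {
        values.add(value);
    }

    // For display only: shows which value each placeholder received.
    String describe() {
        String text = oql;
        // Work from the highest index down so "$1" never clobbers "$10".
        for (int i = values.size(); i >= 1; i--) {
            text = text.replace("$" + i, String.valueOf(values.get(i - 1)));
        }
        return text;
    }
}
```

Binding "J%" and then true to a query ending in `name LIKE $1 AND married = $2` pairs the first value with $1 and the second with $2.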

Below, you will find some JDO OQL query examples:

OQLQuery query = db.getOQLQuery("SELECT p FROM Person p WHERE name LIKE $1 AND dob > $2 AND married=$3");
OQLQuery query2 = db.getOQLQuery("select a from a where to_lower(loginname) like $1 order by lastname, firstname");
OQLQuery query3 = db.getOQLQuery("SELECT c FROM Course c WHERE categories = $1 AND program = $2 AND status = $3");

More advanced query features are still under development -- consult the documentation for more information.

Data-store support

Castor JDO supports many relational databases. The list includes the usual suspects, such as Oracle and SQL Server, but also open source data stores, such as MySQL, InterBase, and SAP DB. Castor JDO only supports relational databases.

Castor JDO: The good, the bad, and the ugly

Castor JDO compares to the ideal persistence layer quite favorably. It is elegant in its simplicity, nonintrusiveness, and transparency -- although its transparency is limited to relational data stores. The API is concise and well thought out.

As an open source project, Castor JDO is free and modifiable. But along with open source benefits come its drawbacks: limited support and lack of tutorials, or nonskeletal documentation for that matter. Lack of support and documentation can add hours to your first Castor project. Open or not, single implementation and limited support is not a good combination. You might stumble upon an issue currently not handled by Castor, grinding your project to a halt. With multiple implementations, you could look at alternative JDO products. This issue, though, is not Castor-specific; it is common with all open source software.

However, from a purely technical viewpoint, Castor is well thought out and developer friendly. It will remain popular with developers for a long time.


Let's now compare JDO with JDBC (Java Database Connectivity). Look at some sample code written using Castor JDO running in a J2EE environment (the session bean method sets transaction boundaries). The following code creates a new Incident instance, adds three child objects of class EventLogItem, and stores the object and its children in a database:

// Add a new Incident
try {
  Incident i = new Incident(title);
  i.addEventLogItem(new EventLogItem(username, comment));
  i.addEventLogItem(new EventLogItem(username, comment2));
  i.addEventLogItem(new EventLogItem(username, comment3));
  // pm is an instance of the PersistenceEngine
  Long id = (Long) pm.create(i);
  return id.longValue();
} catch (Exception ex) {
  // Handle exception
}

The same logic implemented with JDBC would look as follows:

// Create Incident
try {
    Connection conn = ds.getConnection();
    PreparedStatement ps1 = conn.prepareStatement("select MAX(incidentID) from incidents");
    ResultSet rs = ps1.executeQuery();
    if (rs.next()) {
        incidentID = rs.getInt(1);
    }
    PreparedStatement ps2 = conn.prepareStatement(createIncidentSQL);
    ps2.setLong(1, incidentID);
    ps2.setString(2, title);
    int rows = ps2.executeUpdate();
    Incident incident = new Incident(title);
    PreparedStatement ps3 = conn.prepareStatement("select MAX(itemID) from " + LOG_TABLE);
    for (int i = 0; i < 3; i++) {
        int itemID = 0;
        ResultSet rs3 = ps3.executeQuery();
        if (rs3.next()) {
            itemID = rs3.getInt(1) + 10;
        }
        PreparedStatement ps4 = conn.prepareStatement(createLogSQL);
        ps4.setLong(1, itemID);
        // Parameter 2 is not set in the original listing
        ps4.setLong(3, System.currentTimeMillis());
        ps4.setString(4, username);
        ps4.setString(5, comment);
        ps4.setInt(6, 0);
        ps4.setInt(7, 0);
        int logrows = ps4.executeUpdate();
        EventLogItem eli = new EventLogItem(username, comment);
    }
} catch (Exception e) {
    // Handle exception
}

JDO's one clear advantage: less code to write and maintain. In most nontrivial cases, I found that the JDBC API produces an average of three times more code than the JDO API.

JDO vs. entity beans

Many in the developer community debate, rather heatedly, the advantages of JDO over those of entity beans. If a developer wants to take advantage of Enterprise JavaBeans (EJB) lifecycle management, security, and the distributed nature of entity beans, then EJB is the right choice. However, both the Sun and Castor JDO specifications have much less overhead and provide more design freedom -- data objects are still Java objects, not entity beans. In most cases, developers don't need entity beans' remote capability because they access the entity beans through appropriate session beans. That approach follows the Façade pattern, which you can use with JDO; using JDO to model data and session beans as the interface feels natural.
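The Façade arrangement can be sketched in plain Java. Everything here is hypothetical: `IncidentFacade` plays the role a session bean would play, `IncidentStore` stands for whatever JDO-backed persistence code sits behind it, and the in-memory implementation exists only to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Plain data object: no EJB interfaces, just fields.
class IncidentRecord {
    final long id;
    final String title;

    IncidentRecord(long id, String title) {
        this.id = id;
        this.title = title;
    }
}

// Hypothetical persistence port; a JDO-backed class would implement it in practice.
interface IncidentStore {
    long save(String title);
    IncidentRecord find(long id);
}

// In-memory implementation used only to make the sketch runnable.
class InMemoryIncidentStore implements IncidentStore {
    private final Map<Long, IncidentRecord> rows = new HashMap<>();
    private long nextId = 1;

    public long save(String title) {
        long id = nextId++;
        rows.put(id, new IncidentRecord(id, title));
        return id;
    }

    public IncidentRecord find(long id) {
        return rows.get(id);
    }
}

// Plays the role of the session bean: the only type clients talk to.
class IncidentFacade {
    private final IncidentStore store;

    IncidentFacade(IncidentStore store) {
        this.store = store;
    }

    long reportIncident(String title) {
        if (title == null || title.trim().isEmpty()) {
            throw new IllegalArgumentException("title is required");
        }
        return store.save(title.trim());
    }

    String titleOf(long id) {
        IncidentRecord record = store.find(id);
        return record == null ? null : record.title;
    }
}
```

Clients see only the facade's coarse-grained methods, so the persistence mechanism behind `IncidentStore` can change without affecting them.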

JDO in real life

Every technology has its strengths and weaknesses. Contrary to widespread expectations, JDO in the real world doesn't necessarily free architects from knowing their data store.

Developers must also devote considerable effort to keeping the data model, the mapping file, and the database model in sync. You must consider that issue for major projects with changing requirements -- requirements change in probably 99 percent of software projects. Luckily, the available Java code and/or DDL (data definition language) generators are growing more mature. Also, you will find that keeping data models in sync proves even more painful with alternative approaches to persistence -- JDBC comes to mind.

To use JDO-based products, you must master JDO-specific approaches to data-type and relationship mapping, caching mechanisms, and many other powerful features. Persistence is a complex subject, and the product documentation does not necessarily tell you the whole story. However, once you've completed the learning phase, your development productivity will shoot through the roof with JDO.

Software architects working on large, scalable environments should note that JDO does not currently support distributed caching. Developers have two obvious options available:

  • Implement caching, but not clustering
  • Implement clustering, but not caching

Help is on the way, however. Several vendors, like ObjectFrontier and GemStone, have announced distributed cache products, and some will work with JDO.

JDO: The way to go

Regardless of which solution you decide on, JDO is an exciting, new, and almost mission-critical-ready technology. The main advantage of JDO over entity beans is its relative simplicity; it lets developers concentrate more on business logic. Less code to write means fewer errors, which means higher software development productivity. Just as Java's automatic memory management frees developers from mundane and completely unnecessary pointer arithmetic details, JDO relieves developers from labor-intensive persistence issues.

More JDO-compliant products are appearing on the market. JDO makes it easier to write data-oriented applications, and the advantage is especially pronounced in the J2EE development area.

Both the Sun and Castor JDO approaches come closer than any other alternative to the ideal persistence layer presented in Part 1. No other technology available offers a comparable combination of portability, power, and simplicity. What's not to like?

Jacek Kruszelnicki is president of Numatica Corporation, an information technology consulting firm providing expertise in information-systems strategy development, analysis, and planning; software development; and training. Jacek (pronounced Yatsek) received his master's degree in computer science from Northeastern University in Boston, Mass., and has more than 15 years' experience delivering maintainable, large-scale, distributed enterprise solutions.
