Build an object database, Part 2: Object storage backend

Implement relational database storage for Java objects

Mapping objects into relational databases and vice versa is almost always a hassle. A general solution is difficult because of the inherent differences in the relational and object-oriented approaches to data modeling.

Tools do exist to help with this process, usually at a "generous fee" (with respects to the late Curtis Mayfield). Sometimes the tools are necessary, especially when mapping from an existing relational data model to an object model. However, when the relational model is not set in stone, it is possible to let the object model drive the relational model, allowing storage options to surface closer to home.

TEXTBOX:

TEXTBOX_HEAD: Build an object database: Read the whole series!

:END_TEXTBOX

The most obvious solution: serialize each Java object using the object streams and slap the result into a database as a binary blob. While this is certainly a valid option and the JDBC explicitly supports it, blobs cannot be readily manipulated (or even read) by anything other than other Java applications. So interoperability, along with human readability, go out the door.

The relational storage backend we'll build alleviates this data- interoperability problem. Our backend actually creates relational tables for each class and maps each instance's variables into them as columns, creating a sort of poor man's object-relational mapping, if you will. (To download this article's complete source code, go to Resources.)

Framework overview revisited

The object-storing framework introduced in the January Java Step by Step time divides the work of persisting Java objects into two tasks:

  1. Frontend: Scatter the object into its fields, preserving its type information -- the responsibility of an ObjectStorer implementation.

  2. Backend: Store the values of the fields along with type information -- the responsibility of an ObjectStorage implementation.
Figure 1. The object storage architecture

The object-storing framework enables object storers and object storage implementations to vary independently. The object storer doesn't care how the object storage implements its storage behavior, and the object storage doesn't know that the object storer exists. This is as it should be.

The ObjectStorer interfaces look like this:

 public interface ObjectStorer {
  public void put (Object key, Object object) throws IOException;
  public Object get (Object key) throws IOException,
                                        ClassNotFoundException, 
                                        IllegalAccessException, 
                                        InstantiationException;
}

In the January Java Step by Step, Merlin provides implementations of this interface -- most notably a SerializationStorer.

In this article, we'll complete the framework by implementing the ObjectStorage interface, as seen below:

 public interface ObjectStorage {
  public void put (Object key, StorageFields object) throws IOException;
  public RetrievalFields get (Object key) throws IOException;
  } 

The ObjectStorage interface uses the StorageFields and the RetrievalFields classes to pass information about fields that are stored and retrieved, as we'll see next.

Class StorageFields

As you'll no doubt recall, the StorageFields is a collection of the scattered fields, as well as the stored Java object's type information. The ObjectStorer implementation hands an instance of this class to the ObjectStorage implementation as a parameter to the put() method.

Class RetrievalFields

The RetrievalFields class is a collection of fields returned by the ObjectStorage implementation in response to a get() call. It is practically the same as the StorageFields class, except that there is no need to explicitly state type information since it is represented by the class of the object being returned.

We'll call our ObjectStorage implementation class SQLObjectStorage.

SQLObjectStorage's data model and constraints

Before we delve into the code, let's outline how our SQLObjectStorage's relational data model will work.

Figure 2. SQLObjectStorage's data model

The main concept behind this extremely simple data model is: SQLObjectStorage creates a relational table for each class to be stored in the database. SQLObjectStorage then stores instances of a given class in the table created to correspond to that class.

Each table includes a set of columns that correspond to the fields defined in the class. SQLObjectStorage dynamically creates tables and columns using a database-specific mapping from Java types to column types supported by the database.

A bit of overhead information -- the mapping from each key to its respective instance in the database -- is kept in the database in addition to the stored instances. To achieve this end, SQLObjectStorage creates a key column in each table to key the instances therein. It also creates a special key table in the database to store every key and the table where it resides.

In addition, SQLObjectStorage imposes the following constraints:

  • Each database's keys are unique: A key uniquely identifies an object in a specific database. If a key/value pair exists in the database and another value is entered with the same key, the new value replaces the old value.
  • There are no duplicate entries per key: Each key will store against only one value.
  • One table exists per primary class type (classname): Each class is stored as an instance of its primary class, with all of its fields as well as superclass fields flattened out into the columns of the table that bears its name.
  • Class and field names are escaped: In order to avoid illegal database characters, encode class names with escape characters to get table names. The same process applies to mapping fields to column names.
  • Values are escaped: Field values are escaped to avoid problems with SQL statement syntax.

Abstract class SQLObjectStorage

So now we come to the ObjectStorage implementation. We'll look at just the important methods here. (For the complete source code, see Resources.)

First, we look at the ObjectStorage-interface method implementations, starting with put().

The put() implementation takes a key and a StorageFields object as arguments. It then removes any existing entries in the database under the key. If there is an object to store, it creates a table and stores the object's field values to it as follows:

  public synchronized void put (Object key, StorageFields object) throws 
IOException {
    if (key == null) return;
    if (DEBUG) System.out.println (this + "::put: key is '" + key + "', object 
is " + object);  
    if (DEBUG_2) System.out.println (object);
    String keyString = key.toString ();
    try {
      removeEntries (keyString); // always remove old entries under key, if 
they exist
      if (object != null) {
        createTable (object);
        storeValuesToTable (keyString, object);
      }
    } catch (SQLException ex) {
      if (DEBUG) {
        System.err.println (this + "::put: caught exception... ");
        System.err.println ("SQLException: " + ex.getMessage ());
        System.err.println ("SQLState:     " + ex.getSQLState ());
        System.err.println ("VendorError:  " + ex.getErrorCode ());
      }
      throw new IOException (ex.getMessage ());
    }
  }

Next, the get() method returns a RetrievalFields. The first step is to get the class table for the key. Then the get() method obtains a RetrievalFields for the key in the correct class table:

  public synchronized RetrievalFields get (Object key) throws IOException {
    if (key == null) return null;
    if (DEBUG) System.out.println (this + "::get: key is '" + key + "'");
    RetrievalFields fields = null;
    try {
      String keyString = key.toString ();
      String classTable = getClassTableForKey (keyString);
      fields = getEntryForKeyInTable (keyString, classTable);
    } catch (SQLException ex) {
      if (DEBUG) {
        System.err.println (this + "::get: caught exception... ");
        System.err.println ("SQLException: " + ex.getMessage ());
        System.err.println ("SQLState:     " + ex.getSQLState ());
        System.err.println ("VendorError:  " + ex.getErrorCode ());
      }
      throw new IOException (ex.getMessage ());
    }
    return fields;
  }

Moving on, the getConnection(), getTypeString(), and getObjectFromResultSet() methods are deferred to subclasses. The getConnection() method knows how to make a connection to the specific database a subclass represents. getTypeString() and getObjectFromResultSet() mapped type information from the database column types to Java types and vice versa:

  // subclasses define driver-specific connection tasks and set the connection 
variable
  protected abstract void getConnection () throws SQLException;
  // subclasses define database-specific type mapping
  protected abstract String getTypeString (Class type);
  // subclasses define type information when reconstructing objects and 
sqlDecode () strings
  protected abstract Object getObjectFromResultSet (ResultSet rs, 
                                                    int colNum,
                                                    String colName,
                                                    int jdbcType) throws 
SQLException;

Next is an example of the flavor of the methods that SQLObjectStorage uses to store objects in a generic SQL database:

  protected void createTable (StorageFields object) throws SQLException {
    // CREATE TABLE TABLE_NAME (COL1_NAME COL1_TYPE, COL2_NAME COL2_TYPE, ...);
    Statement statement = null;
    try {
      StringBuffer sqlBuffer = new StringBuffer ();
      String tableName = sqlEncode (object.getClassName ()); // to get rid of . 
in classname
      if (doesTableExist (tableName)) return;
      sqlBuffer.append ("CREATE TABLE " +
                        tableName + " (" +
                        KEY_COLUMN_NAME +
                        " VARCHAR(" +
                        getMaxKeyLength () +
                        "), "); 
      Iterator fieldNames = object.getFieldNames ();   
      while (fieldNames.hasNext ()) {
        String fieldName = (String) fieldNames.next ();
        Class type = object.getType (fieldName);
        String typeString = getTypeString (type);
        sqlBuffer.append (sqlEncode (fieldName) + " " + typeString);
        if (fieldNames.hasNext ()) sqlBuffer.append (", ");
      }
      sqlBuffer.append (")");
      if (DEBUG) System.out.println (this + "::createTable: sending: " + 
sqlBuffer);
      statement = connection.createStatement ();
      statement.executeUpdate (sqlBuffer.toString ());
    } finally {
      if (statement != null) {
        try {
          statement.close ();
        } catch (SQLException ignored) {
        }
      }
    }
  }

The createTable() method above creates a StringBuffer which it uses to build an SQL statement, starting with CREATE TABLE, followed by the name of the key column that is added to each table as it is defined.

Of course, subclasses can override these methods if absolutely necessary, although the SQL used should work with drivers for standard databases.

Note that everything that goes to the database is encoded (KEY_COLUMN_NAME and similar constants are pre-encoded). This ensures that no errors will be raised because of attempts to create column names containing . and such. This would be of concern in any class that is in a package other than the default package -- java.lang.Integer for example, or practically any real-world class for that matter.

Next, we see a (typically) lengthy method used by the get() method. The getEntryForKeyInTable() method populates a RetrievalFields object based on the fields and values obtained from the row in table classTable, which itself corresponds to key:

  protected RetrievalFields getEntryForKeyInTable (String keyString,
                                                   String classTable) throws 
SQLException {
    if (classTable == null) return null;
    // find first entry for key in class table (there should be only one...)
    Statement statement = null;
    ResultSet rs = null;
    try {
      String className = sqlDecode (classTable);
      RetrievalFields fields = new RetrievalFields (className);
      if (DEBUG) System.out.println (this + "::getEntryForKeyInTable: class 
name: " + className);
      StringBuffer sqlBuffer = new StringBuffer ();
      sqlBuffer.append ("SELECT * FROM " +
                        classTable +
                        " WHERE " +
                        KEY_COLUMN_NAME + " = '" +
                        sqlEncode (keyString) + "'");
      if (DEBUG) System.out.println (this + "::getEntryForKeyInTable: sending: 
" + sqlBuffer);
      statement = connection.createStatement ();
      rs = statement.executeQuery (sqlBuffer.toString ());
      rs.next (); // advance to the first entry
      ResultSetMetaData meta = rs.getMetaData ();
      int n = meta.getColumnCount ();      
      for (int colNum = 1; colNum <= n; ++ colNum) { 
        String colName = meta.getColumnLabel (colNum);
        int jdbcType = meta.getColumnType (colNum);
        if (DEBUG) System.out.println (this + "::getEntryForKeyInTable: found 
colName: '" + colName + "' of JDBC type: " + jdbcType);
        if (!colName.equals (KEY_COLUMN_NAME)) {
          Object object = getObjectFromResultSet (rs, colNum, colName, 
jdbcType);
          String field = sqlDecode (colName); 
          if (DEBUG) System.out.println (this + "::getEntryForKeyInTable: field 
'" + field + "' value is: " + object);
          fields.addField (field, object);
        }
      }
      return fields;
    } finally {
      if (rs != null) {
        try {
          rs.close ();
        } catch (SQLException ignored) {
        }
      }
      if (statement != null) {
        try {
          statement.close ();
        } catch (SQLException ignored) {
        }
      }
    }
  }

Note the use of metadata to figure out the columns that the table contains, as well as their respective types. Metadata proves useful for this sort of thing; otherwise we would have no way to get properties of the database itself.

getObjectFromResultSet (rs, colNum, colName, jdbcType) is one of the deferred superclass methods that a database-specific subclass implements to return an object of the correct type for the field.

Class MySQLObjectStorage

The MySQLObjectStorage concrete class specializes SQLObjectStorage for the MySQL database. A counterpart to MySQLObjectStorage must be defined for every database that is to be used with the framework:

  protected void getConnection () throws SQLException {
    String connectionString = "jdbc:" +
                               driverIDString +
                               "://" +
                               hostname +
                               (port.equals ("") ? "" : ":") +
                               port +
                               "/" +
                               databaseName +
                               "?user=" +
                               username +
                               "&password=" +
                               password;
    connection = DriverManager.getConnection (connectionString);
  }

getConnection() serves as an example of one of the methods that specialize the superclass for use with the MySQL database. Because of getConnection(), MySQLObjectStorage knows how to make a connection to the MySQL database using a MySQL-specific connection string.

Using the framework

With the code safely implemented, we can turn our attention to how to use it -- if we use the MySQL database. If you're not using MySQL, you'll have to make your own subclass of SQLObjectStorage first. (The complete MySQLObjectStorage class (for reference purposes) is provided with the code in Resources.)

You'll also need to obtain the JDBC driver for your database. Look at Resources for Sun's listing of JDBC drivers, or contact your vendor directly.

With that out of the way, let's proceed to a discussion of how a containing class will use the framework. The containing class will use your subclass and the framework in a manner similar to the following:

 ObjectStorage storage = new MySQLObjectStorage (DRIVER_CLASS_NAME,
                                                 DRIVER_ID_STRING,
                                                 PORT,
                                                 HOSTNAME,
                                                 USER_NAME,
                                                 PASSWD,
                                                 DATABASE_NAME);
    ObjectStorer storer = new SerializationStorer (storage);
    storer.put ("Helms", new RepublicanSenator ("Helms", 16000000000L));
    storer.put ("Clinton", new Senator ("Clinton", 52, true));
    storer.put ("Nobody", new LibertarianSenator ("Nobody", 0, 0.00001F)); 
    LibertarianSenator lib = (LibertarianSenator) storer.get ("Nobody");

Simple enough, but when using the code, keep the following in mind:

  • The user USER_NAME must have permission to read to, write from, and create and drop tables in the database.
  • Keys should be strings or return useful toString() representations.
  • Remember that in this version of the framework, object fields that are reference types (that is, graphs of objects) aren't stored as reference types. The framework supports simple types only.

Furthermore, when you run your code you will need to incorporate the JDBC driver class files for your database. If these are contained in a JAR file, you might issue something like the following command (under Linux):

java -cp mysql_2_comp.jar:. DBStorageTest

Enhancements

As always, there are a number of things you can do to the supplied code to add features and increase robustness:

  • Identify and escape SQL reserved words in class and variable names.
  • Implement a Connection pool, or alternatively a database bridge that exposes only needed methods such as getMetadata() and createStatement(), but encapsulates failure recovery and multiple Connections.
  • Implement some other kind of Connection failure-recovery mechanism.

Conclusion

We have reached the end of our little trip ... and what a long, strange trip it's been. We now have a fairly functional object-storage framework that persists objects to a standard relational database using standard database types and column structures.

In the next edition of Java Step By Step, Merlin will return to iterate on the framework. Meanwhile, please send in comments about the framework -- loves, hates, errors, omissions, and/or improvements.

Until next time, bon appétit!

Here's my bumper sticker idea for this election year: Don't blame me -- I voted for NOTA! http://www.nota.org

Learn more about this topic