Build an object database

Construct a frontend to translate between Java objects and relational database records

One of the great strengths of Java is its ability to automate tedious programming tasks. For example, in the realm of I/O, object serialization automates the encoding of an arbitrary object as a stream of bytes. It could be done manually, but serialization makes it much more accessible and effective. Similarly, in the realm of networking, RMI (Remote Method Invocation) automates network requests between objects. That, again, could be done manually, but RMI makes it more accessible and effective. Finally, in the world of XML, work is under way to automate the generation of Java maps of XML document types, which will provide automated facilities for encoding and decoding XML documents. Again, much better than doing it manually.

One notable gap in this capability is the realm of databases. In this article, we will present our first go at overcoming this limitation by providing automated transformations between Java objects and records in a database.

(Some credit for this article must be extended to JavaWorld reader Alasdair Gilmour, who graciously pointed us in the direction of accessing private object fields.)

Note: The article's full source code can be downloaded from Resources.

Terminology

Let's first get some terminology down. You see, I'm a firm believer in flat files, binary editors, and linear searches. There's really nothing you can't do with them. Since I'm now venturing into the scary world of databases, I want to make sure you know what I mean by ftang when I say ftang.

A database is a collection of tables. A table is a collection of records. A record is a collection of fields. Fields are the basic units of information in a database.

So, a database is a big bad backend system containing all of your information. Remember, information is power. A table is a specific subset of information of a particular type: for example, a table of all the employees in your company. A record is a single row within that table: for example, all information about a particular employee. Finally, a field is a particular datum within that record: for example, an employee's expendability rating.

In this article, we want to map between database records and Java objects. That means we want to be able to automatically translate between a Java User object and a particular record within, for example, an employees table in a database. Consequently, the name variable of the Java object will be mapped to the name field of the database record, and vice versa.

To accomplish this, we could simply serialize the Java object and throw the resulting binary blob into the database. But that's no fun. It does not lend itself to convenient access from anything but a Java program. In particular, we lose interoperability and we lose human accessibility. So we're not going there.

Architecture

Let's look at the 33,000-foot view of storing a Java object in a database:

  1. Vivisect a Java object to determine all its fields and their values

  2. Store those fields and values into the backend database

Dropping rapidly to sea level, we have a couple of options for vivisecting the object: reflection allows us to programmatically query the object's public fields, and serialization allows us to automatically encode all the fields of the object. Alternatively, a public interface would allow the object to decide on its own encoding. Similarly, we have a variety of options for storing the fields in the backend database: JDBC (Java Database Connectivity), flat files, maps, and so on.
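As a rough sketch of the reflection option, the following collects an object's public instance fields into a name-to-value map. The Employee and FieldDump classes are hypothetical, invented for this illustration; only the java.lang.reflect calls are standard.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical example class; not part of the article's source code.
class Employee {
    public String name = "Ada";
    public int rating = 7;
    public static int count = 0;   // static: not per-object state, skipped below
    private String secret = "x";   // private: not returned by getFields()
}

class FieldDump {
    // Collects the public, non-static fields of an object as name -> value.
    // Primitive values come back wrapped (an int arrives as an Integer).
    static Map<String, Object> publicFields(Object object) throws IllegalAccessException {
        Map<String, Object> result = new LinkedHashMap<String, Object>();
        for (Field field : object.getClass().getFields()) {
            if (Modifier.isStatic(field.getModifiers())) {
                continue;
            }
            result.put(field.getName(), field.get(object));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Prints the name and rating entries; the order of fields is not guaranteed.
        System.out.println(publicFields(new Employee()));
    }
}
```

Note that getFields() also returns public fields inherited from superclasses, while private fields such as secret are simply not visible this way.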

Conveniently, the logical separation between vivisection and cold storage provides a natural break between this article (mine) and the next (Michael Shoffner's). I'll be your vivisectionist on this little tour; Mike, your mortician. I mean, database consultant. The separation also allows a variety of runtime configuration options, because any frontend can be paired with any backend.

Having briefly outlined the broad division of labor, I'm now going to nail down two interfaces that describe the processes:

  • ObjectStorer. This interface describes the frontend object database process: scattering a Java object into fields within backend storage, and gathering those fields back into a Java object.

  • ObjectStorage. This interface describes the backend object database process: storing object fields in a database and retrieving them from the database.

Figure: The object storage architecture

To actually make use of the system, you must connect an ObjectStorer with an ObjectStorage. So, we next need to nail down the data structures that will be communicated between these interfaces:

  • StorageFields. This class encapsulates information about the object being stored, including the names, types, and values of all its fields. An ObjectStorer will hand the information to an ObjectStorage for placement in the database.

  • RetrievalFields. This class encapsulates information about the record that was retrieved from the database, including the names and values of all fields. The difference from StorageFields is that the backend need not maintain type information about the fields; that is automatically determined during the retrieval process.

This is about as far as I can go without getting my feet wet, so I guess it's about time to code ...

Implementation

I start with the code for the architectural classes. After this, I'll go through a reflection-based object storer, a serialization-based object storer, and finally a map-based object storage so that we can check that things are working out.

Interface ObjectStorer

The ObjectStorer interface defines the frontend of our object storage system:

import java.io.*;
public interface ObjectStorer {
  public void put (Object key, Object object) throws IOException;
  public Object get (Object key) throws IOException,
    ClassNotFoundException, IllegalAccessException, InstantiationException;
}

The API is quite simple: You insert an object into storage with the put() method. Along with an object, you specify a key with which it will be stored in the backend. The meaning of this key is backend specific. For a database, it might identify a particular record within a table. For a flat file, it might be the record's index number. Any object already stored under that key is replaced. If you put null into storage, then the record is emptied.

Similarly, you retrieve an object from storage with the get() method, specifying the key under which the object was stored. Again, the key is backend specific. The result has type Object; you must cast it to the particular type that you expect. The type returned depends on the backend and what was originally stored there. If no record is found, then null is returned.

Various exceptions may be raised, depending on whether there is an I/O error communicating with storage, or if there is a problem constructing a retrieved type.
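To make the contract concrete, here is a minimal sketch of a caller using the interface. The interface is repeated (without its public modifier) so that the example compiles as a single file. InMemoryStorer and StorerDemo are hypothetical stand-ins that simply keep objects in a map; they do no vivisection and are not part of the article's source code.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Repeated from the listing above so that the sketch compiles on its own.
interface ObjectStorer {
    void put(Object key, Object object) throws IOException;
    Object get(Object key) throws IOException,
        ClassNotFoundException, IllegalAccessException, InstantiationException;
}

// Hypothetical stand-in: keeps whole objects in memory.
class InMemoryStorer implements ObjectStorer {
    private final Map<Object, Object> records = new HashMap<Object, Object>();

    public void put(Object key, Object object) {
        if (object == null) {
            records.remove(key);       // putting null empties the record
        } else {
            records.put(key, object);  // replaces anything stored under the key
        }
    }

    public Object get(Object key) {
        return records.get(key);       // null if no record is found
    }
}

class StorerDemo {
    public static void main(String[] args) throws Exception {
        ObjectStorer storer = new InMemoryStorer();
        storer.put("emp-1", "Merlin");
        String name = (String) storer.get("emp-1");  // the caller casts the result
        System.out.println(name);                    // prints "Merlin"
        storer.put("emp-1", null);
        System.out.println(storer.get("emp-1"));     // prints "null"
    }
}
```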

Interface ObjectStorage

The ObjectStorage interface defines the backend of our object storage system:

import java.io.*;
public interface ObjectStorage {
  public void put (Object key, StorageFields object) throws IOException;
  public RetrievalFields get (Object key) throws IOException;
}

The API is again quite simple: An object, represented as a StorageFields data structure containing all the object's fields, is placed into backend storage under the user-specified key. If object is null, then the key should be removed from the database.

Similarly, retrieval from the database involves extracting an object of type RetrievalFields corresponding to the user-specified key.

Class StorageFields

The StorageFields class represents an object's fields that are to be placed in backend storage. Rather than bore you with details, I'll just provide a skeleton of the API to this class. Internally, it's a few maps and lists. (See Resources for the complete source code.)

import java.util.*;
public class StorageFields {
  public StorageFields (String className) ...
  public void addField (String field, Class type, Object value) ...
  public String getClassName () ...
  public Iterator getFieldNames () ...
  public Class getType (Object field) ...
  public Object getValue (Object field) ...
}

The constructor for this class accepts the class name of the object it is representing (for example, org.merlin.Employee). The addField() method then allows object fields to be added. Each field is fully specified as a name ("value," for example), type (represented by the appropriate Class object), and value. Primitive values are wrapped in the appropriate holder (for example, java.lang.Integer).

To query this class, getClassName() returns the name of the represented class and getFieldNames() returns an iteration of the field names. For each field name, getType() returns the corresponding type, and getValue() the corresponding value.
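The real implementation is in the downloadable source. As a guess at what "a few maps and lists" might look like, here is a minimal sketch with the same method names. StorageFieldsSketch and its internals are assumptions made for illustration, not the article's code, and the sketch uses generics where the original skeleton uses raw types.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// A guess at the internals; the article's real class may differ.
class StorageFieldsSketch {
    private final String className;
    // LinkedHashMap keeps the fields in the order in which they were added.
    private final Map<String, Class<?>> types = new LinkedHashMap<String, Class<?>>();
    private final Map<String, Object> values = new LinkedHashMap<String, Object>();

    StorageFieldsSketch(String className) {
        this.className = className;
    }

    void addField(String field, Class<?> type, Object value) {
        types.put(field, type);
        values.put(field, value);
    }

    String getClassName() { return className; }
    Iterator<String> getFieldNames() { return types.keySet().iterator(); }
    Class<?> getType(Object field) { return types.get(field); }
    Object getValue(Object field) { return values.get(field); }
}

class StorageFieldsDemo {
    public static void main(String[] args) {
        StorageFieldsSketch fields = new StorageFieldsSketch("org.merlin.Employee");
        fields.addField("name", String.class, "Merlin");
        // A primitive int is recorded with type int.class and a wrapped Integer value.
        fields.addField("value", int.class, Integer.valueOf(42));

        for (Iterator<String> names = fields.getFieldNames(); names.hasNext(); ) {
            String name = names.next();
            System.out.println(name + " : " + fields.getType(name).getName()
                + " = " + fields.getValue(name));
        }
    }
}
```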

Now, there are a few caveats involved in storing an object from this representation. First is the issue of the class name. To recreate an object from its fields, we must (at the very least -- I'll discuss this more along with serialization) know its class name. This information should, therefore, typically be stored along with the fields. However, it may be that this information is implicitly obvious. For example, a particular table may only ever hold org.merlin.Employee objects in bondage, in which case the information need not be stored along with each record; it can instead be retrieved based on the implicit context.

Next we have the issue of duplicate field names. If a superclass and a subclass each declare a field with the same name, then we would end up with a clash of names in the fields to be stored. To overcome this, the storer must explicitly rename duplicate field names. For example, the subclass field might be named value and the superclass field renamed value'. The storer can then reverse the process during retrieval.
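One way the renaming might work is sketched below: walk the class hierarchy from the subclass upward and append an apostrophe to any name that has already been taken. The classes Base, Derived, and FieldNamer are hypothetical, and the apostrophe scheme is just the one suggested in the text; a real storer could choose any reversible convention.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes that declare a field of the same name.
class Base { public int value = 1; }
class Derived extends Base { public int value = 2; }  // hides Base.value

class FieldNamer {
    // Returns the storage names for all instance fields of a class,
    // subclass fields first; a clashing name gains a trailing apostrophe.
    static List<String> storageNames(Class<?> type) {
        List<String> names = new ArrayList<String>();
        for (Class<?> c = type; c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field field : c.getDeclaredFields()) {
                if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                    continue;  // only per-object, source-declared state is stored
                }
                String name = field.getName();
                while (names.contains(name)) {
                    name += "'";
                }
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(storageNames(Derived.class));  // prints "[value, value']"
    }
}
```

Because the walk always proceeds in the same order, the storer can repeat it during retrieval to work out which class each renamed field belongs to.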

Class RetrievalFields

The RetrievalFields class represents an object's fields that have been retrieved from backend storage. Again, rather than bore you with details, I'll simply provide a skeleton of the API to this class. Internally, it's just a few maps, lists, and a bit of reflection for good measure. The full code is supplied in the Resources section at the end of this article.

import java.util.*;
import java.lang.reflect.*;
public class RetrievalFields {
  public RetrievalFields (String className) ...
  public void addField (String field, Object value) ...
  public String getClassName () ...
  public Iterator getFieldNames () ...
  public Object getValue (Object field, Class type) ...
}

The constructor for the class accepts the class name of the object it is representing (for example, org.merlin.Employee). It should be retrieved either from backend storage or else from context (for example, which table was accessed). The addField() method then allows retrieved object fields to be added. Each field is specified as a name ("value," for example) and value. Ordinarily, the value is encoded as expected (using the appropriate type holder for primitive types). However, to support storage systems that do not maintain type information, the value can also be expressed as a string, which will be decoded as appropriate during retrieval. Other value-encoding mechanisms can be supported by a storage-specific subclass.

To query this class, the getClassName() method returns the name of the represented class and the getFieldNames() method returns an iteration of the field names. For each field name, the getValue() method returns the corresponding value as an instance of the specified class (or, for primitive classes, of the appropriate holder).
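The string-decoding step might look something like the following sketch. ValueDecoder is a hypothetical helper that covers only a handful of types. It is an assumption about how getValue() could behave, not the article's implementation.

```java
class ValueDecoder {
    // Returns the value as an instance of the requested type (or of the
    // holder class for a primitive type). Values that are already typed
    // pass through untouched; strings are parsed according to the type.
    static Object decode(Object value, Class<?> type) {
        if (!(value instanceof String) || type == String.class) {
            return value;
        }
        String text = (String) value;
        if (type == int.class || type == Integer.class) {
            return Integer.valueOf(text);
        }
        if (type == long.class || type == Long.class) {
            return Long.valueOf(text);
        }
        if (type == double.class || type == Double.class) {
            return Double.valueOf(text);
        }
        if (type == boolean.class || type == Boolean.class) {
            return Boolean.valueOf(text);
        }
        throw new IllegalArgumentException("Cannot decode a string as " + type.getName());
    }

    public static void main(String[] args) {
        // A backend without type information hands back "42" as a string.
        Object decoded = decode("42", int.class);
        System.out.println(decoded.getClass().getName() + " " + decoded);  // prints "java.lang.Integer 42"
    }
}
```

The caller supplies the type because only the frontend, which knows the target class, can say what each field is supposed to be.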

Class GeneralStorer

The GeneralStorer class, a convenience implementation of ObjectStorer, handles some basic services on behalf of a full implementation:

import java.io.*;
import java.util.*;
public abstract class GeneralStorer implements ObjectStorer {
  protected GeneralStorer (ObjectStorage storage) ...
  public void put (Object key, Object object) throws IOException ...
  protected abstract StorageFields getFields (Object object) throws IOException;
  public Object get (Object key) throws IOException,
    ClassNotFoundException, IllegalAccessException, InstantiationException ...
  protected abstract Object setFields (RetrievalFields object) throws IOException,
    ClassNotFoundException, IllegalAccessException, InstantiationException;
}

A subclass should pass the appropriate ObjectStorage object into the constructor of this class. The class then implements the put() method to call on the subclass getFields() method and then to store the result in backend storage. Similarly, it implements the get() method to retrieve fields from backend storage, which are then passed to the subclass setFields() method for object reconstruction.
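The division of labor described here is the classic template-method pattern. The sketch below shows one way the elided method bodies might read. To keep it compilable as a single file, StorageFields and RetrievalFields are reduced to stand-ins that carry only a class name, and the sketch does not declare that it implements ObjectStorer. MapStorage and ClassNameStorer are hypothetical helpers, and the null handling is an assumption based on the interface descriptions earlier in the article.

```java
import java.io.IOException;
import java.lang.reflect.InvocationTargetException;
import java.util.HashMap;
import java.util.Map;

// Stand-ins so the sketch compiles alone; the article's classes hold real field data.
class StorageFields { final String className; StorageFields(String c) { className = c; } }
class RetrievalFields { final String className; RetrievalFields(String c) { className = c; } }

interface ObjectStorage {
    void put(Object key, StorageFields object) throws IOException;
    RetrievalFields get(Object key) throws IOException;
}

abstract class GeneralStorerSketch {
    private final ObjectStorage storage;

    protected GeneralStorerSketch(ObjectStorage storage) { this.storage = storage; }

    // Asks the subclass to take the object apart, then hands the fields to the backend.
    public void put(Object key, Object object) throws IOException {
        storage.put(key, object == null ? null : getFields(object));
    }

    // Fetches fields from the backend, then asks the subclass to rebuild the object.
    public Object get(Object key) throws IOException,
            ClassNotFoundException, IllegalAccessException, InstantiationException {
        RetrievalFields fields = storage.get(key);
        return fields == null ? null : setFields(fields);
    }

    protected abstract StorageFields getFields(Object object) throws IOException;
    protected abstract Object setFields(RetrievalFields fields) throws IOException,
            ClassNotFoundException, IllegalAccessException, InstantiationException;
}

// Hypothetical backend: remembers only the class name stored under each key.
class MapStorage implements ObjectStorage {
    private final Map<Object, String> classNames = new HashMap<Object, String>();

    public void put(Object key, StorageFields object) {
        if (object == null) classNames.remove(key);
        else classNames.put(key, object.className);
    }

    public RetrievalFields get(Object key) {
        String name = classNames.get(key);
        return name == null ? null : new RetrievalFields(name);
    }
}

// Hypothetical subclass: rebuilds an empty instance from the class name alone.
class ClassNameStorer extends GeneralStorerSketch {
    ClassNameStorer(ObjectStorage storage) { super(storage); }

    protected StorageFields getFields(Object object) {
        return new StorageFields(object.getClass().getName());
    }

    protected Object setFields(RetrievalFields fields)
            throws ClassNotFoundException, IllegalAccessException, InstantiationException {
        Class<?> type = Class.forName(fields.className);
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (NoSuchMethodException e) {
            throw new InstantiationException(e.toString());
        } catch (InvocationTargetException e) {
            throw new InstantiationException(e.toString());
        }
    }
}
```

With these pieces, new ClassNameStorer(new MapStorage()) can store a java.util.ArrayList under a key and get a fresh, empty ArrayList back. A real subclass would restore the field values as well.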

Reflection-based object storage

We'll now look at perhaps the most obvious solution to the object-storage problem. We will use the Java reflection API to introspect the fields of an object to be stored in a database and then restore those fields.

Class ReflectionStorer

The ReflectionStorer class, an ObjectStorer, uses reflection to divine the public fields of an object for storage and to restore them after retrieval:

import java.io.*;
import java.lang.reflect.*;
import java.util.*;
public class ReflectionStorer extends GeneralStorer {

We extend GeneralStorer to avail ourselves of the general support provided by that class.
