Reflections on Java, Beans, and relational databases

Here's how to get to your relational database with the new 1.1 APIs

A relational database basically consists of a set of interconnected tables, where each table stores one sort of data relevant to the application. An address book database might, for example, have tables for people, addresses, phone numbers, and so on. Each of these entities would be stored in the database as series of strings, integers, and other primitive types. The definition of the database tables would describe how the relevant information on each type of entity is to be stored in a table's columns. You could, for example, have a "person" table with columns meant to contain strings to store "first name" and "last name." Each table should have one or more columns containing a value to uniquely identify each row. These identifiers or "keys" are used to connect the information in different tables. You could, for example, assign a unique "person_number" to each person in the "person" table and use the same numbers in an extra column of the "address" table. You could then link people to addresses by matching values in both "person_number" columns.

Relational database systems have been around since the 1970s, and nowadays they are the predominant way to store large amounts of data. Hence, Java software tools are needed to access the databases built using these systems.

There are, basically, two problems that need to be solved before a relational database can be used from within a Java application. First, you need some basic middleware to establish the connection with the database, send SQL queries to it, and so forth. Second, you would like to manipulate the results just as you do any other piece of information in Java -- as objects. The former already has been solved by Sun and several database vendors; the latter is left for us to work on.

Sun has been cooperating with many software companies to define a large number of APIs for common programming tasks. The API for Java database connectivity (JDBC) was among the first JDK 1.1 APIs to stabilize, and there are numerous implementations of it available from various sources. Some of these are 100 percent pure Java. Others use a mixture of Java and native code to connect to, for example, existing ODBC data sources (see Figure 1). The JavaSoft people have put an extensive overview of available JDBC drivers on their Web site at http://splash.javasoft.com/jdbc/jdbc.drivers.html.

The pros and cons of each of these implementations obviously depend heavily on your particular set-up and environment, and therefore I will not discuss each of them individually. There are just too many drivers from too many different vendors to get all this information into one article. Furthermore, most of the Java files available for downloading on the 'Net contain detailed release notes describing the installation procedures. Therefore, I will assume throughout the rest of this article that you have at least some Java development environment up and running, and that you have successfully installed and tested either a JDBC driver or a driver plus Sun's JDBC/ODBC bridge. In other words: I will assume you are ready to develop some JDBC-based software.

Figure 1: A typical JDBC or JDBC/ODBC set-up

The JDBC API

The JDBC API comes as a single Java package (java.sql) containing a series of classes. These classes provide just the type of middleware needed to handle a relational database. Basically, they allow you to establish a connection to a database and send queries. You can process the results of these queries, retrieve meta-information on your database, and handle the exceptions that might occur in the midst of all this.

A simple JDBC example

Let us, just briefly, look at a how a simple example query can be performed using Java JDBC classes. The extremely small database in Table 1 suffices to demonstrate the basic principles. The code in Listing 1 is about the smallest possible piece of Java code that allows you to execute a single SQL query against a relational database.

Person#First nameLast name
43674SandorSpruit
90329JohnDoe
65435DonaldDuck

Table 1: A simple "Sample" database with three rows and three columns from a "Person" table

        String url = "jdbc:odbc:sample";
        String query = "SELECT * FROM PERSON";
        boolean more;
        try
        {
                Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
                Connection con = DriverManager.getConnection(url,"sandor","guest");
                Statement stmt = con.createStatement();
                ResultSet rs = stmt.executeQuery(query);
                while (more = rs.next())
                {
                        int number = rs.getInt("PERSON#");
                        String firstName = rs.getString("FIRST_NAME");
                        String lastName = rs.getString("LAST_NAME");
                        System.out.println(number + " " + firstName + " " + lastName);
                }
                rs.close();
                stmt.close();
                con.close();
        }
        catch (SQLException ex)
        {
                ex.printStackTrace();
        }
Listing 1: A single SQL query using JDBC

If you have ever written some Java yourself, you'll see that this code is fairly straightforward. It loads Sun's JDBC/ODBC driver, establishes a connection to the database pointed to by jdbc:odbc:sample, and executes the simple SELECT query against it. Unless an SQLException is caught, it loops through the ResultSet to extract the rows from the database one field at a time and display the results on the screen.

Now, what is wrong with this code? I mean, all programmers with an urge to produce elegant code will feel very uncomfortable writing this sort of code. I think there are two things that are fundamentally wrong with the sort of code presented in Listing 1.

  • This code can only be written using a considerable amount of database meta-information, and this information is all hard-coded into the program. If you want to fetch a number, for example, you have to know in advance whether you will actually receive an integer, a float, or a double. It is difficult -- if not impossible -- to write a class that can handle any database; each minor adjustment to the database forces you to carefully check and modify the code.

  • The database information is always delivered as a single RecordSet instance, which is not a real object. The RecordSet class -- not unlike other database class wrappers -- is more like a pointer or a cursor, providing access to database information via methods. The instances of RecordSet do not actually contain information, they merely represent the means to get to it. This is exactly why you have to work your way through the RecordSet -- advancing a cursor using ResultSet.next() -- while calling other RecordSet methods to get to some real information. In fact, the JDBC classes do only deliver such sets of loosely coupled fields. These are of little use, even if you know all the inner details of the database's structure, because in Java, real objects provide the predominant way to store and process information.

Thus, ideally you would have to find some elegant way to extract the records and fields from a database one by one (via a RecordSet) and stuff the information you get into fresh objects.

The primary key (pun intended) to such a solution is the similarity between a relational and an object-oriented model of a certain data set. The definitions of classes and tables partially serve similar purposes, while objects and records share some properties too. You can readily think of records as sets of values to initialize an object's data members. However, you need to call a class constructor in order to create the fresh objects mentioned above, once you have fetched an arbitrary record from an arbitrary table. You could easily create objects from records, if only you could automatically deliver each record to the right constructor.

When you are developing a small application, it is possible to deliver each record to some constructor that creates an object from it. You can always use an Object reference to manipulate any value extracted from the database, because any object you obtain using a RecordSet is an extended java.lang.Object in the end. It may be possible to define one BigClass implementing the features you would otherwise put into separate classes. You can use references to an Object to manipulate any value extracted from the database, because any object you obtain using a RecordSet is an extended java.lang.Object in the end. The BigClass implementation could use the Java instanceof operator to determine the runtime type of the database information and a huge switch to jump to the right code.

You could also define a similar BigClass with several constructors, each having a slightly different signature. You could use BigClass(int,int), BigClass(int,float), and so on, depending on the sort of information you get while looping through your RecordSet. (This implementation would undoubtedly include a lot of nasty casting.) However, neither of these suggestions would really solve anything, because the relations between the records and constructors would still have to be hard-coded into your program. For generic database code to work with anything but toy applications, we obviously need some different way to automatically connect tables and constructors.

This is where a nice Java feature comes to our rescue. The code fragment in Listing 2 creates a new instance of any Java class, given nothing but a class name. This means we can use names of classes and tables to find out which constructor can handle a record fetched from a table. It is easy to obtain a complete list of table names from a database using standard JDBC classes, so we can put this little Java trick to good use! You simply develop a Java class per database table, where the class name matches the table name. Whenever your program fetches a record from a table, it creates an object by passing the table name to Class.forName(). It's that easy!

        Class c = Class.forName("Person");
        Person p = (Person)c.newInstance();
        System.out.println("... just created a " + c.getName());
Listing 2: A simple Class.forName() example

However, there is a complication. The forName() method will call the void constructor for the given class, so we can not pass our RecordSet variable directly to the constructor. We need an initialization method with a ResultSet parameter that extracts one record from it, and uses it to assign the values to the object's data members. It may be wise to introduce a superclass as the common ancestor to all the classes linked to a table, because they all need it. In fact, this class plays a key role in querying the database, as I will demonstrate shortly.

Querying the database

It is all very well to create objects from records, but you still need SQL statements to query a database, right? You can not construct these statements without having some insight into the database structure, so it seems we are back where we started. The SQL statements would still have to be hand-coded, even if we can automatically match table and class names. This means you would still have to manually edit these statements every time you update your database structure. Notice, however, that we can overcome this hurdle using the approach outlined above once more. After all, there is little you need to know to query a database table. You will normally query a database table using names and values for those fields that are part of either the primary key or an index. In other words, you can extract records (or objects) from a table if someone supplies you with suitable values for all the right fields. A quick look at the JDBC specification reveals that DatabaseMetaData objects can be used just to retrieve lists of table names (see above) but also to obtain lists of primary key and index fields. Similar problem, similar solution.

A relational database can be queried using a relatively small piece of code, by feeding it a series of proper (name, value) pairs. You can match all the names in the pairs with the field names of the primary keys and the indices. Whenever you find a complete primary key or an index in the list of names, you can use the corresponding values to build an SQL statement, execute it to get a RecordSet, and turn it into objects via the Class.forName() construct. Easy.

This idea requires instances of any class associated with a database table to have methods that allow access to its data members as (name, value) pairs, but such methods can be implemented perfectly well by the common ancestor we introduced in the previous section. Listings 3 and 4 present this method in pseudo-code.

        Open the database connection
        Retrieve a list of user defined tables
        For each table
        {
                Check whether there is a corresponding class file
                if (it is available)
                {       load the class file
                        Retrieve lists of key fields and indices for this table
                        Store these lists in hashtables for easy access
                }
                else throw an exception
        }
Listing 3: Pseudocode to initialize a database connection
1 2 Page 1
Page 1 of 2