Java Tip 82: Build data-type-independent JDBC applications

Use JDBC metadata and precompiled SQL to turn data into database data types

Suppose you need to write a program that populates a table in a database with data from a text file. Say the table contains ticket information for an airline, including ticket number, date of purchase, date of travel, departure and arrival location, and the ticket's price. Each piece of information has a specific data type associated with it in the database, such as number, text, date, or currency. The program must read information for several such tickets from a text file, translate each piece of information to the appropriate data type in the database, and load the table in the database with this translated information. In order to perform such a translation, the program needs to know the database-specific type of each piece of information. A simple approach would be to hardcode the type conversion in your program -- that is, to treat data types as static information known at programming time. By adopting this simple approach, you would need to write similar code in another program to populate a second table of the same database, such as a customer table with customer information. You could reuse a lot of boilerplate code from the previous program, but you would need to redo all the database-specific data-type-handling code.

As the number of tables that need to be populated increases, you'll need to write more code. But even this is only the case for a single database. If the data is ported to a different database system, you must rewrite all the existing code to account for the different data types in the new database. The issue of cross-database data typing has earned well-deserved notoriety in database programming.

Create reusable code

How can you avoid all this rewriting? The code you write to populate the table of a database from a text file can make the database name, table name, and column names into parameters. The question is, how can you do the same for the data types, so that you can write a class, or collection of classes, just once to populate any set of tables in any database? The answer is to postpone the determination of data types until runtime, using the DatabaseMetaData interface in the java.sql package. By writing to this interface, you avoid the hardcoding of data types and develop generic, reusable code. The DatabaseMetaData interface provides metadata information for a database.

Metadata is data that describes data. For instance, our airline database table contains ticketing information. This is data. In this scenario, metadata would contain information on things like the number of columns in the table, the data types of these columns, whether a column can have null values, and so on. This is data about data. The specific focus of this article is on the data types of columns. In the following discussion, I will show you how to develop a reusable library of three Java classes that are organized in layers, from closest to the database to farthest. The farthest layer from the database becomes the nearest layer to the application that uses the library. Although this library is meant for populating a database table, you can use the techniques illustrated in building it, as well as some of the code, to build libraries for data-type-independent database queries and updates.
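
Before going further, it may help to see how a program gets hold of a DatabaseMetaData instance in the first place. The following is a minimal sketch, not code from the library; the JDBC URL, user name, and password are placeholders you would replace with your own:

    import java.sql.Connection;
    import java.sql.DatabaseMetaData;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    public class MetaDataDemo {
        public static void main(String[] args) throws SQLException {
            // Placeholder connection details -- substitute your own JDBC URL and
            // credentials. (Depending on your driver, you may also need to load
            // its driver class first.)
            Connection conn = DriverManager.getConnection(
                    "jdbc:your_subprotocol:your_database", "user", "password");
            try {
                // The DatabaseMetaData instance describes the database itself:
                // its tables, their columns, supported features, and so on.
                DatabaseMetaData dbMeta = conn.getMetaData();
                System.out.println("Connected to " + dbMeta.getDatabaseProductName());
            } finally {
                conn.close();
            }
        }
    }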

Follow the scenario

Suppose you need to populate a table called emp in an organization database. Here is a sample data file that contains the table name on the first line and a set of column names on the second. Every subsequent line contains row data for the specified columns.

  

  emp
  hiredate    sal       ename    empno
  1996-09-01  1250.00   jackson  7123
  1980-01-01  2500.50   walsh    7124
  1985-01-01  12345.67  gates    7125

While all input data is ASCII text, the data in these columns will be translated to the following database-specific types:

  column    database type
  --------  -------------
  hiredate  date
  sal       decimal
  ename     ASCII text
  empno     number

The structure of emp is shown in the top row of Figure 1.

Figure 1. Mapping from input columns to database table columns

Not all columns in a table need to be populated at once. In this example, only those columns marked as true are to be populated; the active columns are empno, ename, hiredate, and sal.

Also, the order in which the columns appear in the input file need not be the same as the order in which they are arranged in the database table. An index array maintains a mapping from the columns in the input to their respective positions in the table. In this example, the activeColumnOrder array is that index array. If i is the position of a column in the input file (hiredate is 0, sal is 1, and so forth), then activeColumnOrder[i] is the position of that column in the table (activeColumnOrder[0] is 5, which means hiredate is the fifth column in the emp table).
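
To make the mapping concrete, here is a small illustrative sketch, not code from the library: the input columns and their 1-based table positions are held in two parallel arrays. The value 5 for hiredate comes from the example above; the positions of the other columns are inferred from the emp column order (empno, ename, job, mgr, hiredate, sal, comm, deptno) shown in the prepared statement later in this article.

    // Illustrative sketch of the index-array mapping described above.
    public class ColumnMappingDemo {
        public static void main(String[] args) {
            // Columns as they appear in the input file, in input order.
            String[] activeColumns     = { "hiredate", "sal", "ename", "empno" };
            // 1-based position of each input column within the emp table.
            int[]    activeColumnOrder = { 5,          6,     2,       1 };

            for (int i = 0; i < activeColumns.length; i++) {
                System.out.println(activeColumns[i] + " is column "
                        + activeColumnOrder[i] + " of the emp table");
            }
        }
    }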

Building library class layers

Our library consists of three layers:

  • A TableColumns class, closest to the database, which discovers and manages table column information

  • A TableMediator class, which prepares for populating the table using information managed in TableColumns

  • A TableBuilder class, farthest from the database and thus closest to the application, which reads the data from input and uses the TableMediator class to populate the given table.

Figure 2 illustrates this layering.

Figure 2. Application layers

These classes are all collected into a package called tablebuild.

The Java source files for the tablebuild package are available in the Resources section below. In order to avoid clutter, only relevant portions of the code appear in this article.

Layer 1: The TableColumns class

The TableColumns class discovers information about the given columns of a database table by querying the given database metadata instance. It stores column information in two parallel arrays: a string array named columnNames, and an array of short integers named columnTypeCodes. The types of all the columns in a given table can be discovered using DatabaseMetaData's getColumns method:

    public abstract ResultSet getColumns(String catalog,
                                  String schemaPattern,
                                  String tableNamePattern,
                                  String columnNamePattern)
        throws SQLException

The getColumns method takes four parameters: a catalog name, a schema name, a table name, and a column name. Of these, the last three, which carry the Pattern suffix (see the code above), are interesting in that they let you search for all target strings that match a pattern expression. In other words, you specify search criteria through these parameters. In effect, the call says:

Find all columns that satisfy the following criteria: they belong to the given catalog, to ANY of the schemas that match the given schema pattern, and to ANY of the tables that match the given table name pattern. Their names also match the given column name pattern.

The pattern is specified with syntax similar to that used in SQL statements, in which the LIKE phrase is used to match names. In particular, the underscore character (_) in a pattern expression matches any single character of a target string, and the percent character (%) matches any number of consecutive characters in a target string. For instance, the pattern expression j_b would match the target strings "job" and "jab," while the pattern expression j%b would match any target string that starts with "j" and ends with "b," with any number (including zero) of intervening characters.

The following is the invocation of the getColumns method in the TableColumns constructor, where dbMeta is a DatabaseMetaData instance for a given database:

    ResultSet rset = dbMeta.getColumns(null, null, 
                    table.toUpperCase(), "%");

Notice that the values of the catalog name and schema pattern parameters are null. A null value indicates that a parameter should be dropped from the search criteria. Also note that we specifically send in the uppercase version of the given table name for the table name pattern. It is not clear whether this is a JDBC requirement or a JDBC driver preference. I spent some frustrating moments looking at all kinds of other bugs before I discovered that the uppercase version of the table name makes my code work. Finally, the catch-all pattern "%" is specified for the column name pattern. This all translates to: "Get ALL columns in the GIVEN table."

The java.sql.ResultSet returned by the getColumns method consists of one row per matching column. Each row consists of 18 descriptive fields, which are the result-set columns. The relevant fields for our class are the column name and the column type code, which are the fields numbered four and five, respectively. The column type code is a constant in the java.sql.Types class that represents the SQL type of the column. In this example, the getColumns invocation with emp for the table name would return a description of all eight columns, one per row of the result set. The type codes for the columns are extracted and stored in the array called columnTypeCodes for the TableMediator class to use. We'll examine this process next.
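
The extraction step might look roughly like the following sketch. This is not the TableColumns source from Resources; the class name and the use of java.util lists to collect the results are assumptions:

    import java.sql.DatabaseMetaData;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical sketch: collect column names and java.sql.Types codes
    // for a table, roughly as TableColumns does.
    public class ColumnDiscovery {
        public static void describe(DatabaseMetaData dbMeta, String table)
                throws SQLException {
            ResultSet rset = dbMeta.getColumns(null, null, table.toUpperCase(), "%");
            List<String> names = new ArrayList<String>();
            List<Short> typeCodes = new ArrayList<Short>();
            while (rset.next()) {
                names.add(rset.getString(4));     // field 4: the column name
                typeCodes.add(rset.getShort(5));  // field 5: the java.sql.Types code
            }
            rset.close();
            System.out.println(names + " -> " + typeCodes);
        }
    }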

Layer 2: The TableMediator class

The TableMediator class encapsulates the following functionality:

  • It constructs a prepared statement (a precompiled SQL statement with parameters for which values are supplied at runtime) to populate a table row. The statement parameters correspond to the values of the active columns.

  • It executes the prepared statement for a given set of active columns.

The TableMediator class uses the TableColumns class to get the complete column information for the given table, and then matches this information against the given set of active columns to arrive at the configuration illustrated in Figure 1.

The prepareRowInserts method constructs the required prepared statement. In this example, the prepared statement would look like this:

 
    insert into emp (empno, ename, job, mgr, hiredate, sal, comm, deptno)
               values (?, ?, ?, ?, ?, ?, ?, ?)

All the columns in the table will be populated for every row, with values for the inactive columns (those that do not appear in the input file) set to null.
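
Building that statement is a matter of concatenating the full column list and one ? placeholder per column. Here is a minimal sketch of the idea; the wrapper class and method signature are my own assumptions, not the prepareRowInserts source from the article:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    // Hypothetical sketch: build the parameterized INSERT for a table,
    // given all of its column names (as TableColumns would supply them).
    public class InsertBuilder {
        public static PreparedStatement prepareRowInserts(Connection conn,
                String table, String[] columnNames) throws SQLException {
            StringBuilder cols = new StringBuilder();
            StringBuilder params = new StringBuilder();
            for (int i = 0; i < columnNames.length; i++) {
                if (i > 0) {
                    cols.append(", ");
                    params.append(", ");
                }
                cols.append(columnNames[i]);
                params.append("?");
            }
            String sql = "insert into " + table
                    + " (" + cols + ") values (" + params + ")";
            return conn.prepareStatement(sql);  // precompile with ? parameters
        }
    }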

The first key to making your library portable is to use the DatabaseMetaData interface to discover column types, as seen in the TableColumns class. The second is to use the generic setObject method of the java.sql.PreparedStatement interface:

    ps.setObject(activeColumnOrder[i], columnValues[i]);

This line of code appears in the insertRow method of the TableMediator class, where ps is the PreparedStatement instance constructed earlier.

The setObject method takes two parameters: the first is the parameter position in the associated prepared statement, and the second is the value of this parameter. (Recall that these parameters stand for columns in a table.) In general, the second parameter can be any type of object; in this code, it is a string. The JDBC driver is responsible for converting this string into the appropriate database type.

In this statement, we have not used the column types discovered earlier. However, we do need the column types for values that are null -- to populate a column for which no value is supplied, for example. In that case, we would use a supporting setNull method in the PreparedStatement interface. Here is how it is used in the insertRow method of the TableMediator class:

    ps.setNull(i, tc.columnTypeCodes[i]);

This statement sets the value of prepared statement parameter i to null, but this setting requires that the type of the corresponding column be sent as the additional, second parameter.
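
Putting setObject and setNull together, an insertRow-style method might look roughly like the following. This is a sketch, not the article's insertRow source; the parameters beyond those already introduced (in particular the isActive flags) are assumptions:

    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    // Hypothetical sketch: inactive table columns get SQL NULL (with an explicit
    // type code); active columns get the input strings, which the driver converts.
    public class RowInserter {
        public static void insertRow(PreparedStatement ps, String[] columnValues,
                int[] activeColumnOrder, boolean[] isActive, short[] columnTypeCodes)
                throws SQLException {
            // Prepared statement parameters are 1-based; the arrays are 0-based.
            for (int col = 0; col < columnTypeCodes.length; col++) {
                if (!isActive[col]) {
                    ps.setNull(col + 1, columnTypeCodes[col]);
                }
            }
            for (int i = 0; i < columnValues.length; i++) {
                ps.setObject(activeColumnOrder[i], columnValues[i]);
            }
            ps.executeUpdate();  // insert one row into the table
        }
    }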

Layer 3: The TableBuilder class

Once the TableColumns and TableMediator classes are established, the code for populating the table with input data is straightforward, as developed in the TableBuilder class.

The TableBuilder class encapsulates the following functionality:

  • It reads the table name from the first line of the text file, and creates an instance of the TableMediator class

  • It reads the column names from the second line of the text file, and asks the TableMediator instance to set the active column list to these columns

  • It asks the TableMediator instance to prepare for inserting rows into the table

  • It reads the column data from each subsequent line of the text file, and asks the TableMediator instance to insert this data into the next row of the table
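
The read-and-insert loop behind these steps might look something like the sketch below. The TableMediator constructor and methods called here (setActiveColumns, prepareRowInserts, insertRow) are assumptions consistent with the description above, not necessarily the signatures in the article's source:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.sql.Connection;
    import java.util.StringTokenizer;

    // Hypothetical sketch of a TableBuilder-style read loop: line 1 is the table
    // name, line 2 the active column names, and each remaining line is one row.
    public class TableBuilderSketch {
        public static void buildTable(Connection connection, String fileName)
                throws Exception {
            BufferedReader in = new BufferedReader(new FileReader(fileName));
            String table = in.readLine().trim();
            TableMediator mediator = new TableMediator(connection, table);

            mediator.setActiveColumns(tokenize(in.readLine()));
            mediator.prepareRowInserts();

            String line;
            while ((line = in.readLine()) != null) {
                if (line.trim().length() > 0) {
                    mediator.insertRow(tokenize(line));  // one table row per line
                }
            }
            in.close();
        }

        // Split a line of the input file on whitespace.
        private static String[] tokenize(String line) {
            StringTokenizer st = new StringTokenizer(line);
            String[] tokens = new String[st.countTokens()];
            for (int i = 0; i < tokens.length; i++) {
                tokens[i] = st.nextToken();
            }
            return tokens;
        }
    }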

Using the tablebuild library

Given the tablebuild library, it is very easy to write an application that populates the table of a database. The following is a sketch of such a sample application.
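
In this sketch, the TableBuilder constructor (taking a Connection and the name of the input file) and its build method are assumptions rather than the library's confirmed interface, and the connection details are placeholders:

    import java.sql.Connection;
    import java.sql.DriverManager;

    import tablebuild.TableBuilder;

    // Hypothetical sample application: load emp.dat into the emp table.
    public class LoadEmp {
        public static void main(String[] args) throws Exception {
            // Placeholder connection details -- substitute your own JDBC URL
            // and credentials.
            Connection conn = DriverManager.getConnection(
                    "jdbc:your_subprotocol:your_database", "user", "password");
            try {
                new TableBuilder(conn, "emp.dat").build();
            } finally {
                conn.close();
            }
        }
    }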
