Eliminate tedious programming: Recover data with XML and Reflection

Automate ResultSet parsing using XML and Reflection

Enterprise data consists of various types of functional information, such as product specifications, vendor details, invoices, and sales orders. Whether this data is critical or not, its persistence should not be compromised in any enterprise application. Because of their robustness and proven history tackling persistence, relational databases often persist enterprise data. Thus, data retrieval from a relational database is an integral task for any middleware application. Java's Enterprise JavaBeans architecture is fast becoming the most obvious choice for developing robust middleware applications. The JDBC API facilitates an application layer that performs the data retrieval. This data-access layer translates, or maps, the data in the database entities -- rows, columns, and tables -- into instances of Java classes. In this article, I will demonstrate how to establish the mapping between the database entities and Java classes through XML. I will also use XML to show you how you can rid yourself of certain mundane steps involved in data retrieval.

Typical data retrieval

Simple value-holder classes -- popularly called value classes -- encapsulate functional data in Java. Typically, such classes are made up of one or more private fields and their get() and set() accessor methods. Each field maps to one of the data entities in the database. A simple data retrieval will involve:

  1. Writing the appropriate fetch statement (in SQL)
  2. Executing the fetch statement on an open database connection
  3. Parsing the returned ResultSet
  4. Creating value objects for each row retrieved and adding to a collection that is returned

Although the JDBC API provides powerful support for each of the above steps, it cannot completely eliminate them. Hence, developers end up writing repetitious and functionally similar pieces of code; writing the SQL is the only imaginative part of data retrieval. By purging this task's mundane code, object-relational mapping tools can keep programming somewhat interesting. But, for simple data retrieval, the same benefits can be realized using XML and Reflection.

Sample implementation

Let's assume this standard example: Our employee database features columns such as EMPLOYEE_FIRST_NAME, EMPLOYEE_LAST_NAME, EMPLOYEE_NUMBER, and so on. An EmployeeInfo class with the corresponding fields below encapsulates the employee information:

      public class EmployeeInfo
      {
         private String employeeNumber;
         private String employeeFirstName;
         private String employeeLastName;
         private int    employeeExperience;
         private Timestamp employeeDOB;
         private String employeeEmail;
         private Timestamp employeeDOJ;
         public String getEmployeeNumber()
         {
            return this.employeeNumber;
         }
         .
         .
         public void setEmployeeNumber(String employeeNumber)
         {
            this.employeeNumber=employeeNumber;
         }
         .
         .
         .
         .
      }

Now, let's assume that some business functionality requires employee objects to be fetched from the relational database. Typical code that creates the EmployeeInfo objects from the data in the database would look like this:

     public Vector getActiveEmployees() throws SQLException
     {
        String sQuery = "select employee_first_name, "+
                        " employee_last_name, "+
                        " employee_dob,"+
                        " employee_experience,"+
                        " employee_DOJ, "+
                        " employee_email "+
                        "from employee";
        Statement querySmt = null;
        ResultSet rs = null;
        Vector vecEmployees = null;
        try
        {
           querySmt = databaseConnect.createStatement();
           rs = querySmt.executeQuery(sQuery);
           //LOOP
           while(rs.next())
           {
              EmployeeInfo empInfo = new EmployeInfo();
              empInfo.setEmployeeFirstName(rs.getString("EMPLOYEE_FIRST_NAME"));
              empInfo.setEmployeeLastName(rs.getString("EMPLOYEE_LAST_NAME"));
              empInfo.setEmployeeNumber(rs.getString("EMPLOYEE_NUMBER"));
              empInfo.setEmployeeDOB(rs.getTimestamp("EMPLOYEE_DOB"));
              empInfo.setEmployeeDOJ(rs.getTimestamp("EMPLOYEE_DOJ"));
              empInfo.setEmployeeEmail(rs.getString("EMPLOYEE_EMAIL"));
              empInfo.setEmployeeExperience(rs.getInt("EMPLOYEE_EXPERIENCE"));
              vecEmployees.addElement(empInfo);
           }//END LOOP
        }
        finally
        {
           try
           {
              if(smt != null) smt.close();
              if(rs != null) rs.close();
              }
           catch(SQLException sqlE)
           {
           }
        }
     }
        

The code regions between LOOP and END LOOP denote the most unimaginative and repetitious portion of this assignment. A developer would have to complete similar processing in other database fetches. Most programmers, whether lazy or not, would fail to enthusiastically embrace this particular challenge. The combination of XML and Reflection provides an approach that helps developers circumvent this type of tedious processing.

Let's add the class ResultSetParser with the getCollection() method, which returns the value objects' collection and takes the ResultSet as one of the arguments. The ResultSetParser class diminishes the above code to:

     public Vector getActiveEmployees() throws SQLException
     {
        String sQuery = "select employee_first_name, "+
                        " employee_last_name, "+
                        " employee_dob,"+
                        " employee_experience,"+
                        " employee_DOJ, "+
                        " employee_email "+
                        "from employee";
        Statement querySmt = null;
        ResultSet rs = null;
        Vector vecEmployees = null;
        try
        {
           querySmt = databaseConnect.createStatement();
           rs = querySmt.executeQuery(sQuery);
           ResultSetParser rsp = new ResultSetParser(new File(Constants.LOAD_XML_PATH+
                                                     "LOAD_EMPLOYEE.XML"));
           vecEmployees = rsp.getCollection(rs);
        }
        finally
        {
           try
           {
              if(querySmt != null) querySmt.close();
              if(rs != null) rs.close();
           }
           catch(SQLException sqlE)
           {
           }
        }
        return vecEmployees;
     }

The code reduction in the above example may not seem very significant, but using ResultSetParser does make a difference for value classes with numerous state variables. The other advantage to this approach is that any ResultSet can be parsed.

ResultSetParser's generic method uses XML and Reflection to parse the ResultSet and return the collection. The following code is the XML used for the data access above:

     < VALUES CLASS="myproj.employee.model.EmployeeInfo" >
        < COLUMNS NAME="EMPLOYEE_FIRST_NAME"    METHOD="setEmployeeFirstName"
/>
        < COLUMNS NAME="EMPLOYEE_LAST_NAME"     METHOD="setEmployeeLastName"
/>
        < COLUMNS NAME="EMPLOYEE_EXPERIENCE"   
METHOD="setEmployeeExperience" />
        < COLUMNS NAME="EMPLOYEE_DOB"           METHOD="setEmployeeDOB" />
        < COLUMNS NAME="EMPLOYEE_DOJ"           METHOD="setEmployeeDOJ" />
        < COLUMNS NAME="EMPLOYEE_EMAIL"         METHOD="setEmployeeEmail"
/>
     < /VALUES>

The VALUES element holds the class name as an attribute. The child elements map each database column to the corresponding set() methods in the VALUES CLASS, which are used to set a state variable in the valueClass.

A simple SAX parser can now parse this XML and store the mapping in a hashtable. The following code excerpt completes this task:

      .
      .
      .
      private class PropertyParser extends HandlerBase
      {
         //mappingTable is a hashtable in the Outer Class
         public void startElement(String strElement,
                                  AttributeList attrList)
         {
            if(strElement.equals("VALUES"))
            {
               valueClassName = attrList.getValue("CLASS");
            }
            else if(strElement.equals("COLUMNS"))
            {
               String columnName = attrList.getValue("NAME");
               String methodName = attrList.getValue("METHOD");
               mappingTable.put(columnName,methodName);
            }
         }
         public void endElement(String strElement)
         {
         }
         public void parsePropertyFile(File xmlFile)
         {
            try
            {
                SAXParser saxParser = SAXParserFactory
                                      .newInstance()
                                      .newSAXParser();
                saxParser.parse(xmlFile,this);
            }
            catch(Exception e)
            {
               System.out.println(""+e);
            }
         }
      }

An inner class called PropertyParser is used for parsing the XML document. An instance of SAXParser completes the actual parsing of the XML document and fires different parsing events, which are handled by the PropertyParser. We implement only one of the events via startElement() to obtain the mapping between the method name and the database column name.

Retrieving data from the ResultSet will require a call to its appropriate get() method. ResultSet implementations support a wide variety of ways to obtain data; the method used depends on the database's data type. For example, ResultSet's getString() method can retrieve numeric data type from the database as a String. There are two ways to ensure that the most appropriate get() method is called on the ResultSet:

  1. Obtain the data type of the column whose data is being retrieved. Use the get() method that returns the equivalent data type in Java. For example, to fetch data that stores as VARCHAR(2) in Oracle, you would call getString() on the ResultSet.
  2. Obtain the argument type of the valueClass's corresponding set() method, as defined by the XML mapping. Call the get() method on the ResultSet that returns the compatible data type. For example, if the retrieved data originates from a column called EMPLOYEE_DOB, then getTimestamp() is used on the ResultSet, since the corresponding set() method -- setEmployeeDOB() -- takes java.sql.Timestamp as an argument.

Option 1 may not be broad enough to support different data types across different database types. Option 2 is simpler to realize using Reflection. Since Reflection is necessary for instantiating the value classes (known only at runtime), option 2 would be the most obvious choice.

Next, you must obtain the arguments for each of the set() methods indicated in the XML. The code excerpt below attempts to accomplish this:

      private void getMethodArguments() throws ClassNotFoundException
      {
         Class c = Class.forName(valueClassName);
         Method[] valueClassMethods = c.getMethods();
         for(int i=0;i<valueClassMethods.length;i++)
         {
            String methodName = valueClassMethods[i].getName();
            Class[] parameter = valueClassMethods[i]
                                .getParameterTypes();
            if(parameter.length > 0)
               methodArguments.put(methodName,parameter[0].getName());
         }
      }

The methodArguments is a hashtable in the outer class. The code above uses Reflection to obtain the argument type for each method in the ValueClass. For each class, the Java Runtime Environment maintains an immutable Class object that contains the information about the class. The call to the static method forName() obtains the Class object corresponding to the valueClass, whose name is known only at runtime. getMethods() is an instance method of Class that returns an array of Method objects for the class. Method objects encapsulate the method information -- such as the method's name, return type, parameter types, set of modifiers, and set of throwable exceptions -- in a class. In the code above, we obtain the method name and its parameter type for each method in the valueClass and store them in a Hashtable instance called methodArguments. Since the valueClass has a set() method for every field, we assume that it takes only one argument.

ResultSetMetaData encapsulates the column's information -- such as the number of columns, their names, and the column data type and its precision -- contained in a ResultSet. The getColumnName() of the ResultSetMetaData object returns the column's name in the ResultSet:

             .
             .
             .
             ResultSetMetaData rsmd = rs.getMetaData();
             int columnCount = rsmd.getColumnCount();
             columnNames = new String[columnCount+1];
             for(int i=1;i<=columnCount;i++)
                     columnNames[i]=rsmd.getColumnName(i);
                     
             .
             .

The code above obtains all the information needed about the ResultSet. The next step is to retrieve data from the ResultSet and create a collection of value classes.

Obtaining the collection of the value class instances requires:

  1. Instantiating the class
  2. Setting the state variables to the data retrieved in the ResultSet

Invoking the newInstance() method on the corresponding Class object in turn calls the default no-arg constructor, which then obtains an instance of the valueClass for each row in the ResultSet.

The code excerpt below demonstrates this usage:

               .
               .
        Object valueClassInstance = valueClass.newInstance();
               .
               .
               .

Calling the appropriate get() method on the ResultSet instance obtains the data in each column. In turn, the most appropriate get() method depends on the column's data type or, in our case, the valueClass's corresponding set() method's parameter type. We obtain the set() method that corresponds to each column name. Based on the parameter type of the set() method -- which was obtained and stored in a hashtable -- we call the most appropriate get() method on the ResultSet. For example, if the fetched data comes from a column called "EMPLOYEE_EXPERIENCE", the corresponding set() method to be invoked in the valueClass is setEmployeeExperience. Because the parameter type for setEmployeeExperience is long, the most appropriate get() method for this column would be getLong(). Thus, handling each data type used by the valueClass might entail a bulky if-else block. Once get() obtains the column's data, it has to be set in a valueClass instance. This requires the invocation of the corresponding set() method. We invoke the set() method on the instance of the valueClass using Reflection. Implementing the valueClass's instance and the column data as arguments, we call the invoke() method of the Method object corresponding to set().

The code below illustrates this approach:

     .
     .
     .
     .
     while(rs.next())
     {
         Object valueClassInstance = valueClass.newInstance();
         for(int i=1;i < columnNames.length;++i)
         {
             String methodName = (String)mappingTable.get(columnNames[i]
                                         .toString());
             String paramName = (String)methodArguments.get(methodName);
             Class  paramTypes[] = new Class[1];
             Object arguments[] = new Object[1];
             if(paramName.equals("long"))
             {
                 System.out.println("long type");
                 long l = rs.getLong(columnNames[i]);
                 paramTypes[0]=long.class;
                 arguments[0] = new Long(l);
             
             }
             else if(paramName.equals("java.sql.Timestamp"))
             {
                 System.out.println("timestame type");
                 Timestamp ts = rs.getTimestamp(columnNames[i]);
                 paramTypes[0] = java.sql.Timestamp.class;
                 arguments[0] = ts;
             }
             else if(paramName.equals("java.lang.String"))
             {
                 System.out.println("String type");
                 String s = rs.getString(columnNames[i]);
                 paramTypes[0] = java.lang.String.class;
                 arguments[0] = s;
             }
             Method instanceMethod = valueClass
                                  .getMethod(methodName,paramTypes);
             instanceMethod.invoke(valueClassInstance,arguments);
         }
         databaseRows.addElement(valueClassInstance);
     }
     .
     .
     .

Using columnNames as the key on the mappingTable obtains the set() method. The class's name that is the argument for the set() method is obtained from the hashtable (methodArguments) using the methodName as the key. To invoke the set() method on the instance, obtain the class's Method object corresponding to the valueClass's set() method, and call invoke() with the valueClassInstance and an Object array as arguments. After all the row's columns are read and the values set, the valueClassInstance is added to a collection, which returns after the ResultSet is parsed.

Implementation requirements

The above methodology requires that you define the XML files for each valueClass and that the valueClass provides a no-arg constructor and a set() method for each field.

You can implement a simple frontend tool, which might take the columnName and the methodName mapping in a couple of text fields, to generate the XML files. This approach makes maintaining the XML files and enforcing standards relatively easy. The tool can also include logic to validate the mapping (class existence or column names, for example). The figure below illustrates a screen shot of the tool I developed for generating the XML:

Figure 1. A sample mapping tool to generate the XML file

This approach would also require the value classes to map onto the database columns cleanly; you shouldn't have to complete any processing to set the values in the valueClass. (If required, such logic could be included in the set() methods of the valueClass itself.)

The date fields, however, present a potential issue. Databases feature different ways of storing dates -- date, time, or timestamp, for example -- but most methods in java.util.Date have deprecated from JDK 1.1. The usage of java.util.Calendar is recommended for any date manipulations. If a project standard calls for using Calendar in all value classes to represent the date (which would most likely be the case), then to set it in the valueClass, the date fetched from the ResultSet should convert to Calendar. You will have to incorporate this task in the ResultSetParser. The code below converts the Timestamp retrieved from the database to a Calendar object by invoking a method provided in Utility.

 
        .
        .
        .
     else if(paramName.equals("java.util.Calendar"))
     {
        Timestamp ts = rs.getTimestamp(columnNames[i]);
        Calendar cal = Utility.getCalendar(ts);
        paramTypes[0] = java.util.Calendar.class;
        arguments[0] = cal;
     }
        .
        .
        .

One of the advantages of converting the date during data retrieval is that all the conversion logic remains in one place so that changes will not be unwieldy to manage. Date conversions could become messy if you have an application that spans multiple time zones.

Possible extensions

You can easily extend this approach to dispense with the DataAccess object. Since the ResultSetParser manages data retrieval, you only use the DataAccess object to create the SQL String -- which might be dynamically generated, depending on various conditions -- and execute the String on a Statement to obtain a ResultSet.

If your application does not use PreparedStatments, then dispense with the DataAccess object. I advise adding a method in the ResultSetParser class that takes a String (your SQL query) and returns the Vector of the corresponding value classes. The method below is from the ResultSetParser class:

     public Vector executeQuery(String query) throws SQLException
     {
        Connection databaseConnect = null;
        Statement querySmt = null;
        ResultSet rs = null;
        Vector rows = null;
        try
        {
           databaseConnect = Utility.getDBConnection();
           if(!xmlLoadFile.exists()) 
              throw new SQLException("Unable to fetch data access properties
file");
           querySmt = databaseConnect.createStatement();
           rs = querySmt.executeQuery(query);
           rows = getCollection(rs);
        }
        finally
        {
           try
           {
              if(querySmt != null) querySmt.close();
              if(rs != null) rs.close();
              if(databaseConnect != null) databaseConnect.close();
           }
           catch(SQLException sqlE)
           {
           }
        }
        return rows;
     }

The above code is beneficial in that it easily enforces any type of discipline associated with ResultSets and Statements. Most system administrators get mad when they find code that doesn't close ResultSets and Statements. By limiting developers to the composition of crafty SQL statements, you avoid this rogue code. Of course, this approach can be accomplished only if Statements execute queries.

In addition, your DataAccess objects simplify. They will only craft the SQL statements and will take on the appearance of the code below:

     public class ASimpleDAO
     {
        public Vector getActiveEmployees() throws SQLException
        {
           String sQuery = "select EMPLOYEE_FIRST_NAME, "+
                           "EMPLOYEE_LAST_NAME, "+
                           "EMPLOYEE_NUMBER, "+
                           "EMPLOYEE_DOB, "+
                           "EMPLOYEE_DOJ "+
                           "from employee "+
                           "where employee_experience > 2";
          ResultSetParser rsp = new ResultSetParser(new File(Constants.LOAD_XML_PATH+
                                                    "LOAD_EMPLOYEE.XML"));
          return rsp.executeQuery(sQuery);
        }
     }   

No Connections. No Statements. No ResultSets. Only happy developers!

The dark side

Implementing XML parsing and Reflection affects performance. The parsing of the XML file and the valueClass analysis (Reflection) is completed only once, prior to invoking the get() methods on the ResultSet. Therefore, if the data access is only fetching a single row in the database, performance may suffer. But if the database fetches include multiple rows -- and hence multiple instances of the value objects -- the performance cost may be slight.

The above example returns the value classes as a Vector. It does not support different collection implementations such as hashtables or hashmaps. Retrieval in these forms will require a more complex XML format and definitely more intricate parsing logic.

A note on the example code

Most of the code presented throughout this article originated from the example implementation that I managed to bundle. The code consists of the ResultSetParser class, which implements the data retrieval. It supports the get() methods for only the basic data types -- int, long, Timestamp, Calendar, and String. The sample application's Main class demonstrates the nature of a typical client to the ResultSetParser. The application fetches the employee information from the database and displays a formatted output on the console. The two methods in the application, getEmployeeData() and getEmployeeDataUsingSimpleDAO(), demonstrate the two ways of retrieving data using ResultSetParser. getEmployeeDataUsingSimpleDAO() uses a DataAccess object that does not handle Connections, Statements, or ResultSets. The DataAccess object only constructs a rather silly SQL statement. getEmployeeData(), on the other hand, uses the ResultSetParser only to parse the ResultSet.

The employee information is contained in a database table (Oracle) called EMPLOYEE with the following structure:

     EMPLOYEE_NUMBER                VARCHAR2(10)
     EMPLOYEE_FIRST_NAME            VARCHAR2(50)
     EMPLOYEE_LAST_NAME             VARCHAR2(50)
     EMPLOYEE_DOB                   DATE
     EMPLOYEE_DOJ                   DATE
     EMPLOYEE_EXPERIENCE            NUMBER(3)
     EMPLOYEE_EMAIL                 VARCHAR2(50)

I used an Oracle thin driver to establish database connectivity.

Please note that the ResultSetParser does not offer any elaborate exception handling. All the checked exceptions are caught and wrapped around SQLException and thrown. I found this the most prudent approach to preventing any impact on the existing DataAccess objects.

Conclusion

This article has demonstrated how to map the relational entities to the value classes in Java with XML. We then used mapping to automate ResultSet parsing to obtain a collection of value classes. The article also showed how the methodology can be extended to remove the JDBC calls from DataAccess objects, and also avoid any rouge code that might originate from neglectful usage of JDBC objects like Connections, Statements, and ResultSets.

Abhilash Koneri is a senior systems engineer working for Wipro Technologies in India. Abhilash is a Sun-certified Java programmer for the Java 2 Platform, Standard Edition (J2SE).

Learn more about this topic

Join the discussion
Be the first to comment on this article. Our Commenting Policies