JDBC scripting, Part 1

Introducing JudoScript, a functional scripting language

JDBC (Java Database Connectivity) is a standard Java API for SQL database accesses. What does JDBC scripting mean? It means using JDBC for manipulating SQL databases in a way comparable to and beyond what native database clients do. JDBC scripting is an attractive, practical matter, because SQL databases are and will be the dominant data sources in any enterprise system. It is also important to note that all major RDBMS (relational database management system) vendors provide pure-Java JDBC drivers for their own products, so a capable JDBC scripting tool can easily take advantage of that capability. All you need is a JDBC URL and the JDBC driver jar file in the classpath. You can connect to multiple databases simultaneously for multisource, heterogeneous data processing.

Philosophically, JDBC scripting is part of a grandiose idea: using the Java platform, in addition to Java's traditional role as an object-oriented language, to regularly perform practical computing tasks. The JudoScript language was designed for this idea.

Today, enterprise software handles not only relational databases, but also richer data formats such as XML and SGML, abstract data types such as Enterprise JavaBeans (EJB), Web services, as well as old friends such as flat files and spreadsheets. Spreadsheets are so popular in business—they are touted as the language of business. Traditionally, and naturally, people use various tools for data querying, reporting, and processing. Wouldn't it be nice to have a unified tool for all data-processing needs? The benefits are obvious:

  • You can do any processing with any data sources of any format simultaneously
  • Correct solutions and results are faster to reach
  • You have less tools to learn, install, configure, and launch
  • A single, free tool is cheaper to own
  • The problem solving process is mentally less taxing because you can focus on the problem at hand without constant worries over environmental issues, context switching, and integration of heterogeneous components

Java looks like a candidate, but it falls short as a practical tool because as a system language, Java is designed for creating object-oriented software systems; it is too verbose and detail-oriented for practical uses. Various Java scripting languages may help; they can "script" Java components and are easier to code, but they are just as programming-centric and detail-oriented as Java or Pascal.

A good scripting tool should have abstract functional features for people to intuitively specify today's common computing tasks and, at the same time, not lose programming power. This is the design goal of the JudoScript language.

Pure SQL is so limited, simple query tools are weak in capability. JudoScript has many powerful data structures and is fully potent to script Java components. You can use JudoScript to easily collect data from database A, compute some intermediate result in memory, and finish the task with database B. If you use Oracle, for example, you will need to use PL/SQL, database links, and possibly temporary tables. If database B is not an Oracle database, things can quickly spin out of control. JudoScript can easily deal with flat files, spreadsheets, XML, SGML, EJBs, etc. It is the synergy of all these capabilities that makes JudoScript's JDBC scripting so cool and useful, especially for J2EE applications.

This two-part series introduces JDBC scripting with JudoScript and its related features. You will discover how much you can easily accomplish with nothing but JudoScript and the underlying Java platform. In Part 1, I discuss JudoScript's support for JDBC scripting.

Note: You can download the source code that accompanies this article from Resources.

Read the whole series on JDBC scripting:

JDBC support in JudoScript

JudoScript has dedicated syntactic support for running SQL. The data definition language (DDL) and data manipulation language (DML) statements can be run individually or in groups or batches. You can prepare parametized SQL statements and bind parameters to the prepared statement to run; you can call stored procedures with in-, out- and in-out parameters. Let's see a simple example:

connect to 'jdbc:oracle:thin:@dbsvr:1521:dbname', 'user', 'pass';
// Create table and insert a few rows.
executeSQL {
  CREATE TABLE emp(emp_no      INTEGER PRIMARY KEY,
                   first_name  VARCHAR(100),
                   last_name   VARCHAR(100),
                   birth_date  DATE,
                   salary      NUMBER);
  CREATE INDEX emp_fname ON emp(first_name);
  CREATE INDEX emp_lname ON emp(last_name);
  INSERT INTO emp(emp_no,first_name,last_name,birth_date,salary)
  VALUES(100, 'Jim', 'Billups', to_date('1954-1-3','yyyy-mm-dd'), 86500.0);
  INSERT INTO emp(emp_no,first_name,last_name,birth_date,salary)
  VALUES(101, 'Linda', 'Jordan', to_date('1980-7-24','yyyy-mm-dd'),45250.0);
}
// Query and print out rows.
executeQuery qry:
  SELECT emp_no, first_name, last_name, salary
    FROM emp
    WHERE salary < 50000
    ORDER BY salary ASC
;
while qry.next() {
  println '#', qry[1], '  ', qry.last_name, ', ', qry.first_name, ': ',
qry.salary;
}
disconnect(); // From database

The first thing to note is that this script is all about business! This program connects to the database, runs a number of SQL DDL and DML statements, and finally executes a query to print the results in the table. In the executeQuery statement, the qry variable is fundamentally a java.sql.ResultSet instance, so you can call its next() method. Columns in a row can be accessed via their names or indices (starting at 1). The next example demonstrates how values can be bound; it omits the connection parts:

// Prepare a SQL
prepare qry:
  SELECT emp_no, first_name, last_name, salary
    FROM emp
    WHERE salary < ?
    ORDER BY salary ASC
;
// Run the query
executeQuery qry with @1:number = 5000.0;
while qry.next() {
  println '#', qry[1], '  ', qry.last_name, ', ', qry.first_name, ': ',
qry.salary;
}

The @1:number syntax is used to bind parameter values. If no type is appended, it is assumed to be a string.

You can dynamically create SQL statements as well:

// Construct an IN list for the query
last_names = [ 'Olajuwon', 'Yao' ]; // an array
in_expr = last_names.toCsv(',', function(x){ return "'"+x+"'"; });
// Now run the query
exeucteQuery qry:
  SELECT * FROM emp WHERE last_name IN ( (* in_expr *) )
;

The (* *) syntax within SQL statements can take any expression. In the code above, we used Array's toCsv() method and supplied an anonymous function to quote the values. The resultant SQL statement is: SELECT * FROM emp WHERE last_name IN ( 'Olajuwon','Yao' ).

We have seen examples that dynamically construct SQL statements and run SQL directly. Another mode of running SQL is batch, either statically (by using executeBatch instead of executeSQL) or dynamically. I don't discuss batch running in detail in this series.

Next, we dive into the particulars of major JDBC scripting topics. Keep in mind that JDBC scripting (actually, JudoScript itself) is intended for everyone's use; because of the nature of this audience, I relate more to the underlying Java classes.

Database connections

You obtain a database connection like this:

connect mycon to 'jdbc:oracle:thin:@dbsvr:1521:dbname', 'user', 'pass';

The obtained connection is stored in the variable mycon. If the connection variable name is omitted, JudoScript uses a predefined global variable, $$con, which is typically used for single-connection programs. You can specify connection attributes like this:

connect mycon ( autoCommit=false ) to
  'jdbc:oracle:thin:@dbsvr:1521:dbname', 'user', 'pass';

How does JudoScript load the JDBC driver? JudoScript maintains a list of known JDBC driver class names and their corresponding URL prefixes. For instance, when JudoScript sees oracle in jdbc:oracle:..., it loads the Oracle JDBC driver class oracle.jdbc.driver.JdbcDriver. If a driver is absent from the list, it can be specified as a driver attribute or loaded via the old-fashioned Java style:

// JudoScript style
connect ( driver='my.db.jdbc.driver' ) to 'jdbc:mydb:...', '', '';
// Java style
(java::Class).forName('my.db.jdbc.driver');

The connection object is fundamentally a java.sql.Connection object, (or more precisely, a wrapper object of java.sql.Connection, so you can access all public methods of that Java class.) Sometimes you can obtain a connection, say, via a Java method call; it can be used just like one obtained through the connect statement. The use clause in all SQL execution statements designates which connection to use:

executeSQL use mydb { ... }

Once the database operations are complete, you should call the connection object's disconnect() method. The disconnect() system function calls the name-sake method of the global $$con object. A connection object also has many other methods, such as commit() and rollback().

Directly execute queries and query results

Individual DML statements, which include SELECT, INSERT, UPDATE, and DELETE, are run in these modes:

Table 1. Modes of running SQL

Running modeQueryUpdate
Run directlyexecuteQuery executeUpdate
Prepare and runprepare executeQuery prepare executeUpdate

Let's first look at how to run queries and process the results.

Run queries

The general syntax for executeQuery in BNF (Backus Naur Form) is:

  executeQuery variable [ ( attributes ) ] [ use variable ] :
    sql_statement ; [ with bind_list ; ] | executeQuery variable with bind_list ;

The syntax really includes two forms of the executeQuery statement; the second form runs a prepared statement stored in the variable, which I explain later. Both forms return result sets.

The use variable clause indicates which database connection to use, where the variable holds a connection object. sql_statement can be any query statement that returns a result set. SELECT is the most common query SQL statement, but is not the only one. Many JDBC drivers support their RDBMS's own commands that also return result sets, such as MySQL's show commands.

The query object (stored in the variable immediately following the executeQuery keyword) also represents the result set. It is a built-in object that has many methods. The most frequently used method is next(), which loops through the result set. For the query object's role as result set, columns in each row can be referenced either by indices (starting at 1) or by column names. JudoScript converts the returned values to JudoScript data types:

executeQuery qry:
  SELECT emp_no, first_name, last_name, birthday, salary
    FROM emp
    WHERE salary < 50000
    ORDER BY salary ASC
;
while qry.next() {
  println '#', qry[1], '  ', qry[2], ', ', qry[3],
          ' (', qry[4].fmtDate('yyyy-MM-dd'), '): ', qry.salary;
}

In the example above, the first column is an integer, the second and third are strings, the fourth is a date, and the last is a number. Because the fourth column is a date, we can call its fmtDate() method.

Beyond SQL

The power of JDBC scripting is not only its ability to run SQL statements, but also its synergy of SQL and computation. Suppose we have a Web server log stored in a table like this:

CREATE TABLE web_log (
  uri      VARCHAR(1500),
  referer  VARCHAR(1500),
  time     TIMESTAMP
);

We need to report the numbers of hits per day, per week, and per month. The daily and monthly reports may be possible with SQL if the RDBMS supports functions in the GROUP BY clause; the idea is to convert the time stamp into a string and use that in GROUP BY. But the weekly report is tricky. Let's see how JudoScript handles this:

 1: executeQuery qry: SELECT time from web_log;
2:
3: // Step 1. Collect weekly counts
4: counts = new Object;
5: while qry.next() {
6:   time = qry.time;
7:   token = time.year + '-week-';
8:   if time.weekOfYear < 10 {
9:     token += '0';  // Fill in a 0 for single digit numbers.
10:   }
11:   token += time.weekOfYear; // The week-token for the day.
12:   ++ counts.(token);
13: }
14:
15: // Step 2. Print out weekly counts
16: for wk in counts.keysSorted() {
17:   println wk, ':', counts.(wk) :>8; // Right-aligned, 8-digits
18: }

The keys to this simple solution are the built-in data structure Object and the Date value's features. Object is essentially a map; it stores name-value pairs. On lines 12 and 16, the .() operator gets and sets the value for a key. In lines 6-11, we construct a weekly token for the time and increment the count in the Object. If the key did not exist, the value would be null or, numerically, 0. On line 15, we obtain the keys sorted by natural text order (for this reason, we make all week numbers two digits in lines 8-10.) The result looks like this:

2004-week-06:    8438
2004-week-07:   21409
2004-week-08:   34940
2004-week-09:  128343
2004-week-10:   99827
2004-week-11:   78343
2004-week-12:   30968
2004-week-13:   44021

I further discuss the data structures and general programming techniques in Part 2.

Query result methods

The query object has some useful methods, as shown in Table 2.

Table 2. Methods of the result set object

Method Description
getColumnAttributes()Returns the query's column attributes as a TableData object.
getResult(limit)Returns up to limit (or a large hard limit if this is not specified) rows of the query result. If there is only one column in the query, the result is an array; otherwise it is a TableData object.
getSQL()Returns the SQL statement.
getPreparedStatement() Returns the prepared statement (an instance of java.sql.PreparedStatement) if the query has been prepared.
getResultSet Returns the result set (an instance of java.sql.ResultSet) when the query executes.
getResultSetMetaData()Returns an instance of java.sql.ResultSetMetaData when the query executes.

getColumnAttributes() returns TableData, which is a 2D data structure. I discuss it further in Part 2. The following example uses getColumnAttributes() to describe a table structure:

function tableDesc tableName, dbcon {
  if dbcon == null { dbcon = $$con; }
  executeQuery qry use dbcon:
     SELECT * FROM (* tableName *) WHERE 0 > 1
  ;
  println [[*
    ----------------------------------------------------------------------
    Name            Type       Display  Precision   Scale Nullable Class
                    Name          Size                             Name
    ----------------------------------------------------------------------
  *]];
  printTable qry.getColumnAttributes()
         for column('name')               :<16,
             column('type')               :<10,
             column('displaySize')        :>8,
             column('precision')          :>11,
             column('scale')              :>8,
             column('nullable').fmtBool() :>9,
             ' ', column('className'), nl; // Newline
}
// Try it out
connect to dbUrl, dbUser, dbPassword;
tableDesc 'emp';
disconnect();

The result (from an Oracle database) is:

----------------------------------------------------------------------
Name            Type       Display  Precision   Scale Nullable Class
                Name          Size                             Name
----------------------------------------------------------------------
EMP_NO          NUMBER          21         38       0    false
FIRST_NAME      VARCHAR2       100        100       0     true
LAST_NAME       VARCHAR2       100        100       0     true
BIRTH_DATE      DATE             7          0       0     true
SALARY          NUMBER          21          0       0     true

The last four methods in Table 2 return Java objects that can be manipulated or passed into Java methods. The following example uses getResultSetMetaData() to dump the result into a comma-separated value file:

function printResultsetAsCSV outfile, rs, sep, closeOnExit {
  if outfile == null {
    outfile = getSysOut();
    closeOnExit = false;
  }
  rsmd = rs.getResultSetMetaData();
  cnt = rsmd.getColumnCount();
  // Print headers
  for i from 1 to cnt {
    if i>1 { print <outfile> sep; }
    print <outfile> rsmd.getColumnName(i);
  }
  println <outfile>;
  // Print results
  while rs.next() {
    for i from 1 to cnt {
      if i>1 { print <outfile> SEP; }
      print <outfile> rs[i];
    }
    println <outfile>;
  }
  if closeOnExit { outfile.close(); }
}
// Try it out
connect to dbUrl, dbUser, dbPass;
executeQuery qry: SELECT * FROM emp;
printResultsetAsCSV openTextFile('result.csv', 'w'), qry, ',', true;

You have now seen what you can do with query results. The code is intuitive enough even though we have not covered programming yet. I present a case study in Part 2, and you can adapt the code samples for your practical uses.

Calling the result set's methods

As we know, in JDBC, column values are accessed via the result set object's various getXXX() methods. JudoScript allows you to access columns as properties via their names or indices, but you can still call any method. For instance, suppose a table has a LONG column in Oracle:

CREATE TABLE error_log(
  log_id   INTEGER PRIMARY KEY,
  note     LONG,
  encoding VARCHAR(30)
);

You can call getBytes() to get the bytes and convert them to text:

executeQuery qry:
  SELECT * FROM error_log;
;
while qry.next() {
  bytes = qry.getBytes('note');
  println '========== ', qry.log_id, ' ==========', nl,
          encode(bytes, neverEmpty(qry.encoding, 'UTF8'));
}

If we accessed the column with qry.note, the result will differ depending on the RDBMS. By calling qry.getBytes('note'), we know precisely what we are doing. Also, calling the result set's methods is the only way to access nonstandard JDBC-driver-specific features.

Now that we have learned how to handle query results, let's move on to running SQL update statements.

Directly execute SQL updates

SQL updates are simpler than queries, because the only return value is an update count. SQL update statements include UPDATE, INSERT, and DELETE. The following example shows how to execute an UPDATE:

executeUpdate upd:
  UPDATE SET salary = 55000 WHERE salary < 50000
;
println unit(upd.getResult(), 'person has', 'people have'), ' got raise.';

In the code, unit() is a utility function. If the first parameter is 1, it returns the second parameter; otherwise it returns the plural form, which is the third parameter if specified, or the second parameter plus an s.

So far, we have run SQL statements as-is. Next, we will see how to run parametized SQL statements.

Prepare and execute SQL

The JDBC specification supports running parametized SQL statements, that is, SQL statements with question marks as placeholders for parameters. Parametized SQL statements are prepared first and can be run multiple times by binding values to the parameters. Another reason to prepare SQL statements before running is that, if the SQL statement is expected to run repeatedly even without parameters, preparing it first allows the JDBC driver and/or database server to pre-parse and optimize it. In JudoScript, the prepare statement is used for both SQL queries and updates; prepared queries are executed via executeQuery and updates via executeUpdate, respectively, like this:

prepare qry: SELECT emp_no, salary FROM emp WHERE salary<?;
prepare upd: UPDATE SET salary=? WHERE emp_no=?;
// Give a 10% raise for those earning less than 50,000
executeQuery qry with @1:number = 50000;
while qry.next() {
  executeUpdate upd with @1:number = qry.salary * 1.10,
                         @2:int    = qry.emp_no;
}
commit(); // Assuming auto-commit turned off.

The syntax to bind a parameter is the with clause and @n:type, where n is the bind parameter index starting at 1, and type is one of the following: boolean, byte, date, double, float, int, long, number, short, String, varchar, struct, array, bytes, bigDecimal, blob, clob, time, timestamp, numeric, ref, bit, longvarchar, other, java_object, oracle_rowid, oracle_cursor and oracle_bfile. By default, if :type is not specified, it is assumed to be String. In fact, many JDBC drivers, such as the Oracle JDBC driver, can take a string bind value for most types.

Execute database scripts as-is

We saw earlier that executeSQL { ... } can execute multiple SQL statements. SQL statements in the block are delimited by a semicolon (;). Sometimes semicolons are legitimate text to be sent to the database. In such situations, use the executeAny statement; all text is sent to the database server as-is. The following example creates an Oracle stored procedure that we will use in the next section:

executeAny [[*
  CREATE PROCEDURE test_proc(
    param_io IN OUT NUMBER,
    param_i  IN     VARCHAR,
    param_o  OUT    VARCHAR)
  AS BEGIN
    param_o := param_i;
    IF param_io IS NOT NULL THEN
      param_io := param_io + 1;
    ELSE
      param_io := -1000;
    END IF;
  END;
*]];

The [[* *]] syntax is used to quote a chunk of text that may include new lines. The leading spaces in each line are stripped, so you can indent them in the source, making your code look nice. You can also embed expressions with (* *) syntax.

Invoke stored procedures

Major RDBMSs support stored procedures. JDBC defines a standard syntax for calling stored procedure like this:

{ ? = call foo(?,?,?) }

The parameters, unlike regular SQL, can be IN, OUT, and IN OUT.

Let's call the procedure created in the previous executeAny example. It returns a value in parameter param_io and passes the value of param_i to param_o. This is the code:

prepareCall: { call test_proc(?,?,?) };
x = null;
y = 'abcd';
executeSQL with @1:int <=> x,
                @2:varchar =  y,
                @3:varchar => z; // z will be the same as y
println 'x = ', x;  // Prints: x = -1000
println 'z = ', z;  // Prints: z = abcd

Conclusion

We are at the conclusion of Part 1 of this series. You have learned that the idea of using Java for practical uses by the general public—not just Java developers—has led to the inception of the JudoScript language. I have detailed JudoScript's JDBC scripting support, which helps to illustrate the fact that good domain-specific support within a general-purpose language greatly helps users focus on the problems, resulting in code that is more intuitive, productive, and elegant. The topic of JDBC scripting also demonstrates the usefulness and resourcefulness of the Java platform for the general public, and shows that a good tool like JudoScript can unlock Java's huge potentials and possibly extend Java's territory beyond Java programmers.

In Part 2 of this series, I briefly introduce JudoScript's general programming and Java scripting, and list the language's features. After that, I present a case study that uses JDBC scripting, Java scripting, XML scripting, and other goodies to address a real-world problem in a J2EE environment. Lastly, I discuss a few general programming language aspects and compare JudoScript to other languages.

James Jianbo Huang is the author of the JudoScript language. Huang holds an MS in electronics, and favors creating software and solutions. He enjoys music, teaching, and sports but does not practice judo.

Learn more about this topic

Join the discussion
Be the first to comment on this article. Our Commenting Policies