Java Tip 137: Manage distributed JTables

Efficiently display huge JTables using distributed caching TableModels

In this information-driven age, displaying tables with thousands or tens of thousands of rows is standard practice. The data that populates these tables is generally retrieved from remote databases, which presents a challenging dilemma: Should your program download all table data at once, or should it only download the first chunk of requested data? Downloading all data initially can result in a significant time delay before users can use it, which is especially annoying if they are only interested in the first few results. Downloading a chunk of data is common in Web search results, for example on Google, where results paginate into groups of ten. However, in certain circumstances, you might need to show this data in a table but still minimize the initial hit for users. You might, for instance, require a spreadsheet-type user interface with column sorting, cell selection, and data editing. Thanks to a neat design feature of JTable, we can maintain JTable's rich user interface while optimizing the required data download.

Note: You can download this article's source code from Resources.

A distributed caching TableModel

JTable implements the Model-View-Controller design pattern. In this pattern, the component's data source (the model), its graphical representation (the view), and its input handling (the controller) are separated into three distinct modules (see Resources). In JTable's case, the model is defined by the interface TableModel. This makes JTable very adaptable because you can wrap any data resource in a TableModel implementation, set the TableModel on the JTable, and display the table on screen. For example, you can create a table that displays data from a flat file, a database, or even a message queue using this technique. We take advantage of that adaptability in this tip.

The JTable loads data from its TableModel using a lazy loading process: it reads data from the model only when necessary, rather than stepping through all rows before displaying. When a JTable needs to render a particular table cell on the screen, it calls the method getValueAt(int row, int column) on its TableModel to retrieve the data value. The model itself can be just as lazy, because the data required to service this method can be derived by any means. For example, if we want the data to reside on the server side, we can request a URL that returns the data for this cell as XML. getValueAt() can then parse this XML and return the value, which ultimately displays in the JTable.
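To make the lazy contract concrete, here is a minimal sketch of a model that stores nothing and computes each value only when asked. The class names LazySquaresModel and LazySquaresDemo and the lookup counter are illustrative assumptions, not part of this tip's source code; the computed value stands in for whatever a real data source would return.

```java
import javax.swing.table.AbstractTableModel;

// Hypothetical model: every value is computed on request, nothing is preloaded.
class LazySquaresModel extends AbstractTableModel {
    private int lookups = 0; // how many cells have actually been requested

    @Override
    public int getRowCount() { return 200_000; }

    @Override
    public int getColumnCount() { return 2; }

    @Override
    public Object getValueAt(int row, int column) {
        lookups++;
        // Stand-in for any data source: a file, a database, or a remote call.
        if (column == 0) {
            return Integer.valueOf(row);
        }
        return Long.valueOf((long) row * row);
    }

    int getLookups() { return lookups; }
}

class LazySquaresDemo {
    public static void main(String[] args) {
        LazySquaresModel model = new LazySquaresModel();
        // A JTable issues calls like this one only for the cells it has to paint.
        Object value = model.getValueAt(3, 1);
        System.out.println(value + " after " + model.getLookups() + " lookup(s)");
    }
}
```

Although the model reports 200,000 rows, only the cells that are requested cost any work, which is the property the rest of this tip builds on.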

However, remote calls are time-consuming, so why not bundle the request for many data cells into the same remote call and cache those extra values until required? If the cell value at row 0, column 0 is required, we can call a servlet that returns all data for the cells between rows 0 and 50. This means we will have all data at hand to service the getValueAt() calls for all table cells currently in view. More data is retrieved from the servlet only when the user scrolls down the table. Furthermore, you can implement a client-side cache so that only n rows (e.g., 1,000) are held on the client side. When the user scrolls down the table past the 1,000th row, the data for the first set of rows can be overwritten since it's not currently required. If the user scrolls back up the table, data can be reread from the server as necessary.
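The chunk-and-cache idea can be sketched independently of Swing. The following is one possible arrangement under stated assumptions, not the code from this tip's download: RowFetcher, SyntheticFetcher, and ChunkedRowCache are hypothetical names, the fetch range is taken to be half-open (from inclusive, to exclusive), and the oldest-loaded chunk is the one discarded when the cache is full.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for one remote call returning rows from..to-1.
interface RowFetcher {
    Object[][] fetch(int from, int to);
}

// Fake "server" that fabricates cell values so the sketch runs offline.
class SyntheticFetcher implements RowFetcher {
    private final int columns;

    SyntheticFetcher(int columns) { this.columns = columns; }

    @Override
    public Object[][] fetch(int from, int to) {
        Object[][] rows = new Object[to - from][columns];
        for (int r = 0; r < rows.length; r++) {
            for (int c = 0; c < columns; c++) {
                rows[r][c] = "r" + (from + r) + "c" + c;
            }
        }
        return rows;
    }
}

class ChunkedRowCache {
    private final RowFetcher fetcher;
    private final int chunkSize;   // rows fetched per remote call
    private final int maxChunks;   // chunks kept on the client
    private final Map<Integer, Object[][]> chunks = new HashMap<>();
    private final Deque<Integer> loadOrder = new ArrayDeque<>();
    private int remoteCalls = 0;

    ChunkedRowCache(RowFetcher fetcher, int chunkSize, int maxChunks) {
        this.fetcher = fetcher;
        this.chunkSize = chunkSize;
        this.maxChunks = maxChunks;
    }

    Object getValueAt(int row, int column) {
        int chunkIndex = row / chunkSize;
        int firstRow = chunkIndex * chunkSize;
        Object[][] chunk = chunks.get(chunkIndex);
        if (chunk == null) {
            // Cache miss: one remote call brings in the whole chunk.
            chunk = fetcher.fetch(firstRow, firstRow + chunkSize);
            remoteCalls++;
            chunks.put(chunkIndex, chunk);
            loadOrder.addLast(chunkIndex);
            if (loadOrder.size() > maxChunks) {
                // Cache full: discard the chunk that was loaded first.
                chunks.remove(loadOrder.removeFirst());
            }
        }
        return chunk[row - firstRow][column];
    }

    int getRemoteCalls() { return remoteCalls; }
}

class ChunkedRowCacheDemo {
    public static void main(String[] args) {
        // 50-row chunks, at most 20 chunks (1,000 rows) held on the client.
        ChunkedRowCache cache = new ChunkedRowCache(new SyntheticFetcher(3), 50, 20);
        cache.getValueAt(0, 0);
        cache.getValueAt(49, 2); // same chunk, served from the cache
        System.out.println("remote calls: " + cache.getRemoteCalls());
    }
}
```

A real implementation would also clamp the requested range to the table's row count; the synthetic fetcher here simply fabricates whatever range it is asked for.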

The implementation

A number of classes and interfaces were defined to implement this distributed caching JTable (Figure 1).

Figure 1. Class diagram of the classes implementing the distributed caching table model.

A specialized TableModel called DistributedTableModel was written. This implementation satisfies the TableModel interface's methods by delegating all the hard work to the class DistributedTableClientCache, which retrieves data in bulk from the data source and caches it on the client side. DistributedTableClientCache retrieves all data from an object that implements the interface DistributedTableDataSource. This interface isolates all distributed data retrieval logic necessary to efficiently populate a TableModel in the manner we require. For example, the method getTableDescription() returns an object containing the table's descriptive elements—the number of rows and columns, and the column names and class types. The method Object[][] retrieveRows(int from, int to) throws Exception handles the actual data retrieval. Whenever the TableModel requires a row not in the cache at that time, the method retrieveRows() is called, and a set number of rows is retrieved.

To create an instance of a DistributedTableModel, the constructor is called with three parameters:

  • tableDataSource: An implementation of DistributedTableDataSource. It allows you to write a specialized version that, for example, retrieves its data from a Remote Method Invocation (RMI) or CORBA service.
  • chunkSize: The number of rows that should be retrieved at once from the server.
  • maximumCacheSize: The number of rows the cache should store before overwriting rows not required at the time.
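The following sketch shows how these pieces might fit together. It is a reconstruction under assumptions rather than the downloadable source: the interface and method names getTableDescription() and retrieveRows() come from the text above, but the TableDescription fields, the half-open row range, the InMemoryDataSource stand-in, and the SketchDistributedTableModel class (which folds a simple row cache into the model instead of delegating to DistributedTableClientCache) are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.swing.table.AbstractTableModel;

// Assumed shape of the description object; the field names are hypothetical.
class TableDescription {
    final int rowCount;
    final String[] columnNames;
    final Class<?>[] columnClasses;

    TableDescription(int rowCount, String[] columnNames, Class<?>[] columnClasses) {
        this.rowCount = rowCount;
        this.columnNames = columnNames;
        this.columnClasses = columnClasses;
    }
}

// Method names follow the article; treating "to" as exclusive is an assumption.
interface DistributedTableDataSource {
    TableDescription getTableDescription() throws Exception;
    Object[][] retrieveRows(int from, int to) throws Exception;
}

// Hypothetical local stand-in for an RMI, CORBA, or servlet-backed source.
class InMemoryDataSource implements DistributedTableDataSource {
    private final int rowCount;
    private int calls = 0;

    InMemoryDataSource(int rowCount) { this.rowCount = rowCount; }

    public TableDescription getTableDescription() {
        return new TableDescription(rowCount, new String[] {"Id", "Label"},
                new Class<?>[] {Integer.class, String.class});
    }

    public Object[][] retrieveRows(int from, int to) {
        calls++;
        Object[][] rows = new Object[to - from][];
        for (int i = 0; i < rows.length; i++) {
            rows[i] = new Object[] {from + i, "row " + (from + i)};
        }
        return rows;
    }

    int getCalls() { return calls; }
}

// Simplified sketch: the cache lives inside the model to keep the example short.
class SketchDistributedTableModel extends AbstractTableModel {
    private final DistributedTableDataSource source;
    private final int chunkSize;
    private final TableDescription description;
    private final Map<Integer, Object[]> rowCache;

    SketchDistributedTableModel(DistributedTableDataSource tableDataSource,
                                int chunkSize, final int maximumCacheSize) {
        this.source = tableDataSource;
        this.chunkSize = chunkSize;
        try {
            this.description = tableDataSource.getTableDescription();
        } catch (Exception e) {
            throw new IllegalStateException("Could not read table description", e);
        }
        // Insertion-ordered map that drops its oldest row once the cap is exceeded.
        this.rowCache = new LinkedHashMap<Integer, Object[]>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Object[]> eldest) {
                return size() > maximumCacheSize;
            }
        };
    }

    @Override public int getRowCount() { return description.rowCount; }
    @Override public int getColumnCount() { return description.columnNames.length; }
    @Override public String getColumnName(int c) { return description.columnNames[c]; }
    @Override public Class<?> getColumnClass(int c) { return description.columnClasses[c]; }

    @Override
    public Object getValueAt(int row, int column) {
        Object[] cached = rowCache.get(row);
        if (cached == null) {
            // Cache miss: fetch chunkSize rows starting at the requested row.
            int to = Math.min(row + chunkSize, description.rowCount);
            try {
                Object[][] rows = source.retrieveRows(row, to);
                for (int i = 0; i < rows.length; i++) {
                    rowCache.put(row + i, rows[i]);
                }
                cached = rows[0];
            } catch (Exception e) {
                throw new IllegalStateException("Could not retrieve rows from " + row, e);
            }
        }
        return cached[column];
    }
}

class SketchDistributedTableModelDemo {
    public static void main(String[] args) {
        InMemoryDataSource source = new InMemoryDataSource(200_000);
        SketchDistributedTableModel model = new SketchDistributedTableModel(source, 50, 1000);
        System.out.println(model.getValueAt(0, 1) + ", calls: " + source.getCalls());
    }
}
```

This sketch always fetches forward from the missing row, which suits downward scrolling; a fuller version would probably align fetches to chunk boundaries so that scrolling back up is equally cheap.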

Figure 2 shows a sequence diagram for this data retrieval process.

Figure 2. A sequence diagram shows DistributedTableModel's distributed data retrieval process when a JTable displays.

Sorting

One common table requirement is the ability to sort columns in ascending or descending order. The JTable component does not provide built-in sorting functionality; it must be implemented in the TableModel implementation (see Resources). You must have all data on hand to implement a sort. Because all data is on the server side, the sort must occur there. This is easily done by sending a message to the server through the DistributedTableDataSource interface to sort the data on a particular column in ascending or descending order. You can implement a manual sort, or a database can do your dirty work via the SQL ORDER BY clause. After the sort completes, the client-side data cache is invalidated, and the table updates. The JTable will immediately ask for the data to fill the part of the table currently in view. This will trigger a server-side fetch of the newly sorted data.

One problem presents itself, however, during the sorting process. What happens to users' row selections? Of course it's not acceptable to lose those selections. Selection is implemented in a JTable by the interface ListSelectionModel, which registers the selected row indexes. If you change the underlying TableModel data by carrying out a sort, the ListSelectionModel will have no knowledge of the new selection indexes, and the selections will remain set to the old ones. However, you can restore the selections manually if you know the selected rows' new indexes. Before sorting, note the selected rows. After the sort, clear the table selections and then set the selected rows' new indexes on the ListSelectionModel. Thus, the sort method in the interface DistributedTableDataSource takes the following form:

int[] sort(int sortColumn, boolean ascending, int[] selectedRows) throws Exception

Along with the sort column and ascending flag, you must feed in the selected row indexes and then return the sorted data's corresponding indexes. Three more methods are required to implement sorting fully: setSelectedRowsAndColumns(), getSelectedRows(), and getSelectedColumns(). These methods keep the selections in sync between the client and server.
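As an illustration of the index mapping, the sketch below sorts rows in memory and returns where each previously selected row ended up. The sort() signature mirrors the one shown above (minus the checked exception), but everything else is an assumption: SortableRows stands in for the server side, and the reselect() helper is a hypothetical way to reapply the result to a ListSelectionModel.

```java
import java.util.Arrays;
import java.util.Comparator;
import javax.swing.DefaultListSelectionModel;
import javax.swing.ListSelectionModel;

// Hypothetical in-memory stand-in for the server-side data and its sort.
class SortableRows {
    private Object[][] rows;

    SortableRows(Object[][] rows) { this.rows = rows; }

    // Sorts the rows and returns the new indexes of the rows in selectedRows.
    @SuppressWarnings("unchecked")
    int[] sort(final int sortColumn, boolean ascending, int[] selectedRows) {
        // order[newIndex] = oldIndex, established by sorting the index array.
        Integer[] order = new Integer[rows.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Comparator<Integer> byColumn = (a, b) ->
                ((Comparable<Object>) rows[a][sortColumn]).compareTo(rows[b][sortColumn]);
        Arrays.sort(order, ascending ? byColumn : byColumn.reversed());

        // Rearrange the data and record where each old row went.
        Object[][] sorted = new Object[rows.length][];
        int[] newIndexOf = new int[rows.length];
        for (int n = 0; n < order.length; n++) {
            sorted[n] = rows[order[n]];
            newIndexOf[order[n]] = n;
        }
        rows = sorted;

        int[] result = new int[selectedRows.length];
        for (int i = 0; i < selectedRows.length; i++) {
            result[i] = newIndexOf[selectedRows[i]];
        }
        return result;
    }

    Object getValueAt(int row, int column) { return rows[row][column]; }

    // Clears the old selection and selects the rows at their new positions.
    static void reselect(ListSelectionModel selection, int[] newRows) {
        selection.clearSelection();
        for (int row : newRows) {
            selection.addSelectionInterval(row, row);
        }
    }
}

class SortSelectionDemo {
    public static void main(String[] args) {
        SortableRows data = new SortableRows(new Object[][] {
            {"pear", 3}, {"apple", 1}, {"fig", 2}
        });
        ListSelectionModel selection = new DefaultListSelectionModel();
        selection.addSelectionInterval(0, 0); // "pear" is selected

        int[] moved = data.sort(0, true, new int[] {0});
        SortableRows.reselect(selection, moved);
        System.out.println("selected row now holds: " + data.getValueAt(moved[0], 0));
    }
}
```

In the distributed setting, the client would also discard its cached rows after the call returns, since every cached index may now refer to different data.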

Figure 3 shows a sequence diagram for the sorting and selection process.

Figure 3. A sequence diagram shows a distributed JTable's sorting process.

Simplified table management

I've described a method that allows a large table to quickly display without waiting for a bulk data download. If a user scrolls down the whole table, all the data must download sooner or later. But there is always a balance between immediately downloading all data and taking the performance hit later. This tip's method provides the following advantages:

  • The table component displays quickly
  • The client program does not use much memory since the client side only stores a small amount of data
  • A table or spreadsheet's rich user interface is maintained
  • Resources are not wasted on data downloads when the user is uninterested in the results below a certain row

The associated source code provides this process's proof-of-concept implementation. In that demo, a client containing a read-only sortable JTable of 200,000 rows retrieves data from a servlet as XML and stores it in a DistributedTableModel. You can easily extend this demo to implement single cell selection, data editing, and more.

Jeremy Dickson has been writing Java code for more than five years in the domain of bioinformatics and life science. He has worked on everything from JavaBean components for displaying genetic maps to Enterprise JavaBean-based server-side platforms for project management.

Learn more about this topic