Survival of the fittest Jini services, Part 2

Understanding reliability in distributed transactions

public class BookStoreImpl implements BookStore, TransactionParticipant {
  ....
   /**
    * Calling this method initiates the purchase transaction.
    * Since the purchase itself might be part of a larger transaction, 
    * we allow a Transaction object to be passed in the 
    * method call. An example would be a transactional email service 
    * that guarantees the delivery of the confirmation: If the delivery 
    * fails, the transaction is aborted. In that case, the email service 
    * will be the transaction client. If null is passed 
    * in as the Transaction object, a new transaction will be created, 
    * and the BookStoreImpl becomes the transaction's client.
    */
    public OrderConfirmation buyBook(Book book, 
                                     Account creditCard, 
                                     Customer customer,
                                     Address shipTo, 
                                     int daysToDelivery, 
                                     Transaction txn) 
        throws NoSuchBookException, CreditCardException, ShippingException, 
            BookStoreException, RemoteException, TransactionException {
    
    boolean client = false;
    TransactionManager txMan = null;
    ServerTransaction sTxn = null;
    Lease txnLease = null;
    //If the transaction is null, we'll be the client, and we also 
    //need to create a new Transaction.
    if (txn == null) {
        client = true;
        txMan = discManager();
        //This object is a bundle of the ServerTransaction + its lease.
        //We specify a lease of 3 minutes for the transaction.
        Transaction.Created ct = TransactionFactory.create(txMan, 180 * 1000);
        sTxn = (ServerTransaction)ct.transaction;
        txnLease = (Lease)ct.lease;
      
        //Manage the transaction's lease. The implementation is not shown.
        manageTrLease(txnLease);
    } else {
        //We can only handle ServerTransaction.
        if (txn instanceof ServerTransaction) 
            sTxn = (ServerTransaction)txn;
        else 
            throw new TransactionException("Unknown transaction semantics.");
    }
    //Everything from here will be performed under the transaction. 
    //////////////////////////////////////////////////////////////
    try {
        //Since we need to ensure that we save or print the confirmation, 
        //we also will have to join the transaction.
        sTxn.join(this, 0);
        //Call each method of the other services, passing the transaction object
        //as parameter. Each service must join the transaction as well. 
        //Decrement the inventory count for this book.
        //If no more books are available, a NoSuchBookException will be
        //thrown. We catch the exception and, if we are the transaction client, 
        //cause the transaction to abort before rethrowing the exception
        //to the caller.
        int remaining = bookDatabase.decrementInventory(book, sTxn);
          
        //Obtain the best delivery price for the book. The implementation
        //is not shown, but is available in the full example.
        //This might throw a ShippingException, which we catch 
        //and handle similarly to the NoSuchBookException.
        PackageDesc packDesc = PackageDesc.createDescription(book);
        PickupConfirmation pickupConf = 
            ShippingSelector.schedulePickup(wareHouseAddress, 
                                            shipTo,
                                            packDesc,
                                            daysToDelivery, 
                                            sTxn);
        //Finally, we attempt to charge the credit card. TotalPrice
        //includes the book's price, local tax, and shipping charges.
        //TotalPrice is an implementation of Price.
        //The system determines tax, based on location.
        //If the charge attempt fails, the exception will be handled 
        //similarly to the other service-specific exceptions.
        //Here, card is assumed to be the discovered credit card service
        //proxy (its discovery is not shown).
        TotalPrice price = CashRegister.computePrice(book, pickupConf);
        Charge chg = new Charge(price);
        ChargeConfirmation chgConf = card.debit(creditCard, chg, sTxn);
        //Now that we have succeeded in all the operations with other
        //services, produce and save the OrderConfirmation. This must 
        //succeed before the transaction can be committed. 
        //saveConfirmation may throw a CannotSaveException.
        OrderConfirmation orderConf = 
            new OrderConfirmation(pickupConf, chgConf, book);
        saveConfirmation(orderConf);
        //If we are the client, commit the transaction.
        if (client) 
            ///////////////////////////////////////////
            //Transaction ends here, if we're the client.
            ///////////////////////////////////////////
               sTxn.commit();
        
        //Return orderConf.
        return orderConf;
        } catch (Exception e) {
            ///////////////////////////////////////////
            //Transaction ends here if we abort.
            ///////////////////////////////////////////
            if (client)
               abortTransaction(sTxn);
            if ((e instanceof NoSuchBookException) ||
                (e instanceof ShippingException) ||
                (e instanceof CreditCardException) || 
                (e instanceof TransactionException)) 
                throw e;
            else
                throw new BookStoreException(e.getMessage());
        }
   }
     
 
    /**
     * Discover TransactionManager. This method should really
     * declare more specific exceptions.
     */
    private TransactionManager discManager() throws Exception {
       ServiceDiscoveryManager serviceDiscoveryManager;
       ...
       Class[] trTypes = {TransactionManager.class};
       ServiceTemplate tmpl = new ServiceTemplate(null, trTypes, null);
       ServiceItem item = serviceDiscoveryManager.lookup(tmpl, null);
       TransactionManager tm = (TransactionManager)item.service;
       return tm;
    }
    /**
     * We are not using a transaction for this method.
     */
    public Collection findBooks(Book template) throws RemoteException {
         //Find books matching non-null fields of the specified template.
           ...
    }
  ...
   
}

The following list explains the code in more detail:

  • The bookstore service's buyBook() method is invoked. It consumes the selected Book, an object representing the customer's credit card account (which might be obtained from a smart card or some other portable storage device), some information about the customer (the Customer object; again, this could come from a smart card's onboard storage), the shipping address, and an integer denoting the desired number of delivery days (these last two are probably input via a GUI, such as a service UI). Most important, it also takes a Transaction instance as a parameter.
  • If null is passed in as the transaction parameter, a new transaction is created. The bookstore service discovers a TransactionManager service, then obtains a new ServerTransaction object from the TransactionFactory, passing the TransactionManager and a lease time of three minutes as arguments. Essentially, it becomes the transaction's client. If an existing transaction was passed in, the book purchase becomes part of that transaction. In that case, the bookstore service does not act as the transaction's client.
  • Since the bookstore service must ensure that it prints or saves a purchase confirmation, it joins the transaction as a participant (note that it implements TransactionParticipant).
  • Next, the bookstore implementation removes the desired book from inventory, and discovers a credit card service, as well as several shipping services. For the latter, it tries to find all the shipping services that can deliver the package to the specified address within the desired timeframe. It then selects the service that delivers for the least amount of money. This selection is delegated to a helper object inside the bookstore service implementation.
  • The bookstore service now performs method calls on the selected CreditCard and ShippingCompany proxies, passing in the Transaction object as a parameter. These services are then obligated to join the transaction (calling ServerTransaction's join() method).
  • If all goes well, the bookstore service receives confirmations from both the credit card charge and the scheduled package pickup. It then creates the OrderConfirmation object. Finally, it saves the OrderConfirmation persistently, and possibly even displays it to the user.
  • If the bookstore is also the transaction client, it calls commit() on the transaction; if it is not the client, it should not attempt to finish the transaction. When commit() is called, the 2PC protocol starts: the transaction manager calls prepare() on all the participants, expecting each to vote PREPARED, NOTCHANGED, or ABORTED. If all vote either PREPARED or NOTCHANGED, the transaction manager calls commit() on all the participants. At that point, the transaction is officially completed, and all the services can release the resources and locks held during the transaction. If any participant votes ABORTED, the transaction manager instead invokes each service's abort() method, instructing them to undo all changes made under the transaction.
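
The two-phase commit sequence in the last step can be sketched in plain Java. This is a simplified, in-process model and not the Jini API: the Vote enum, the Participant interface, and the TwoPhaseCoordinator and ScriptedParticipant classes are hypothetical names introduced only for illustration. A real transaction manager would also have to deal with remote failures, leases, and crash recovery.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical vote values, named after the three votes described above.
enum Vote { PREPARED, NOTCHANGED, ABORTED }

// Hypothetical, simplified stand-in for a transaction participant.
interface Participant {
    Vote prepare();  // phase one: vote on the outcome
    void commit();   // phase two: make the changes permanent
    void abort();    // phase two: undo the changes
}

// Minimal in-process coordinator; a real manager works over the network.
class TwoPhaseCoordinator {
    private final List<Participant> participants = new ArrayList<>();

    void join(Participant p) {
        participants.add(p);
    }

    // Returns true if the transaction committed, false if it aborted.
    boolean commit() {
        // Phase one: collect a vote from every participant.
        for (Participant p : participants) {
            if (p.prepare() == Vote.ABORTED) {
                // A single negative vote aborts every participant.
                for (Participant q : participants) {
                    q.abort();
                }
                return false;
            }
        }
        // Phase two: everyone voted PREPARED or NOTCHANGED.
        for (Participant p : participants) {
            p.commit();
        }
        return true;
    }
}

// Illustrative participant that votes as scripted and records the outcome.
class ScriptedParticipant implements Participant {
    private final Vote vote;
    String outcome = "pending";

    ScriptedParticipant(Vote vote) {
        this.vote = vote;
    }

    public Vote prepare() { return vote; }
    public void commit() { outcome = "committed"; }
    public void abort() { outcome = "aborted"; }
}
```

In this sketch, joining three ScriptedParticipant objects that vote PREPARED, NOTCHANGED, and ABORTED causes commit() to return false and leaves all three with the outcome "aborted", including the ones that were willing to commit.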

You might have noticed an interesting twist here: In some situations, you want to ensure that the customer can actually print or display confirmation. For instance, if the printer or display fails, you'd rather the transaction be aborted. This might also be the case for an airline ticket sale or the filing of a tax return. The challenge with these real-world activities is that it is very difficult to undo them. If, after the confirmation has printed, the credit card service decides to abort the transaction, then the printed confirmation becomes invalid. But it's already in the customer's hands!

The only solution here is to ensure, as much as possible, the success of online activities first, and only then perform the offline actions associated with the transaction. That is why we only saved the confirmation during the transaction, and left it up to the customer to print it at his convenience. When you need physical proof to be part of the transaction, you probably need to print a cancellation note as well when you abort it. Of course, printing that note can fail, too.

This is one area where Jini-enabled devices will simplify life: printers, cell phones, email systems, and storage devices can all become transaction participants along with business-specific services. If you need that ticket to print out, that confirmation number to display on your cell phone, or that email message to be delivered, the transaction will not complete until these physical actions succeed. (Of course, this can also backfire: if you ask your coffee machine and toaster to transactionally prepare a breakfast, when your toaster burns the bread, the coffee machine might feel obligated to undo your coffee. That's an example of a situation in which you shouldn't use transactions!)

In the final part of this article, we'll look at some of the failure conditions that plague real-life networks, and what transactions can do about them.

Undecided voters, deadlocks, and other partial failure evils

Undecided voters are a problem not only for presidential candidates, but also for the 2PC protocol. When the transaction manager calls prepare() on a participant, it expects to receive a PREPARED, ABORTED, or NOTCHANGED vote. However, distributed transaction messages must travel through the network, which is inherently unreliable. Thus, the transaction manager might never receive a vote from one or more participants. In addition, one or more of the services might crash during the transaction. For these reasons, the 2PC protocol cannot completely guarantee a transaction's commitment (it's sometimes called a weak commitment protocol).

Jini solves the problem of weak commitment with leases. Because a Transaction is a leased resource in Jini, its lease sooner or later expires. When that happens, the transaction manager causes the transaction to abort, calling abort() on all participants it can still contact.
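
The effect of a lease on a stalled transaction can be sketched as follows. This is a minimal model under stated assumptions, not Jini's lease or transaction classes: LeasedTransaction, its State enum, and the explicit "now" parameters (used instead of the system clock so the behavior is easy to follow) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a transaction whose lifetime is bounded by a lease.
class LeasedTransaction {
    enum State { ACTIVE, COMMITTED, ABORTED }

    private State state = State.ACTIVE;
    private long expiration;  // absolute time in milliseconds
    private final List<Runnable> abortHandlers = new ArrayList<>();

    LeasedTransaction(long now, long leaseMillis) {
        this.expiration = now + leaseMillis;
    }

    // Stand-in for the abort() call a participant would receive.
    void onAbort(Runnable handler) {
        abortHandlers.add(handler);
    }

    // The client can extend the lease only while the transaction is
    // still active and the current lease has not yet run out.
    boolean renew(long now, long leaseMillis) {
        if (state != State.ACTIVE || now >= expiration) {
            return false;
        }
        expiration = now + leaseMillis;
        return true;
    }

    // Called periodically by the manager: an expired lease forces an abort.
    State checkLease(long now) {
        if (state == State.ACTIVE && now >= expiration) {
            state = State.ABORTED;
            for (Runnable handler : abortHandlers) {
                handler.run();
            }
        }
        return state;
    }

    // Committing is possible only while the lease is still valid.
    boolean commit(long now) {
        if (checkLease(now) != State.ACTIVE) {
            return false;
        }
        state = State.COMMITTED;
        return true;
    }
}
```

The point of the design is that no participant has to be reachable for the abort decision to be made: the passage of time alone is enough, and the abort handlers run on a best-effort basis.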

Orphaned transactions are those that are guaranteed to abort. When a participant has already returned a vote and is waiting for the manager's call to commit() or abort(), it can inquire about the transaction's current state by calling getState() on the transaction. If the reported state is PREPARED or COMMITTED, the participant can then commit the work done during the transaction. If the reported state is ABORTED, the participant must instead exit the transaction by calling abort(). If the participant cannot contact the transaction manager for a while, it might decide that the manager has crashed, and abort the transaction as well.
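
The decision rule a waiting participant follows in the paragraph above can be written down as a small sketch. The TxnState and NextStep enums and the WaitingParticipant class are hypothetical, as is the give-up threshold; the mapping from state to action simply follows the rules as stated in this article.

```java
// Hypothetical state names, modeled on the states mentioned in the text.
enum TxnState { ACTIVE, VOTING, PREPARED, NOTCHANGED, COMMITTED, ABORTED }

// What a participant that has already voted should do next.
enum NextStep { COMMIT, ABORT, KEEP_WAITING }

class WaitingParticipant {
    private final long giveUpAfterMillis;  // assumed patience threshold

    WaitingParticipant(long giveUpAfterMillis) {
        this.giveUpAfterMillis = giveUpAfterMillis;
    }

    // The manager answered the state inquiry.
    NextStep onStateReported(TxnState state) {
        switch (state) {
            case PREPARED:
            case COMMITTED:
                return NextStep.COMMIT;
            case ABORTED:
                return NextStep.ABORT;
            default:
                // Any other state: the outcome is not yet decided.
                return NextStep.KEEP_WAITING;
        }
    }

    // The manager could not be reached; silentMillis is the time
    // elapsed since the last successful contact.
    NextStep onManagerUnreachable(long silentMillis) {
        return silentMillis >= giveUpAfterMillis
                ? NextStep.ABORT
                : NextStep.KEEP_WAITING;
    }
}
```

How long a participant should wait before giving up is a policy choice; a value on the order of the transaction's lease time is one plausible starting point, since the manager would abort an expired transaction anyway.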

When several participants in a transaction compete for the same resources, deadlocks might occur. Recall that during the transaction, all participants must ensure the transaction guarantees (ACID). For example, the transaction isolation requirement mandates that the credit card service should place a lock on the credit card account for the transaction's duration. During this time no other services can access the account. If several services inside that transaction need access to the credit card account, then they need to somehow coordinate their activities so they are not all waiting for each other.

Thus the problems of concurrency control are magnified by the service-oriented Web. The more services that interact on the Web this way, the more chances there are for serious deadlocks. Without lease expirations causing deadlocked Jini transactions to eventually abort, deadlocks could bring the whole service-oriented Web to a halt.
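
The same idea, giving up after a bounded wait instead of blocking forever, can be illustrated locally with standard Java concurrency utilities. This is only an analogy to lease expiration, not Jini code: the TimedLocking class and its tryWithBoth method are hypothetical, and a binary Semaphore stands in for a lock on a resource such as an account or an inventory record.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: acquire two resources with a time limit each,
// and back out completely if either one cannot be obtained in time.
class TimedLocking {
    static boolean tryWithBoth(Semaphore first, Semaphore second,
                               long timeoutMillis, Runnable work) {
        boolean gotFirst = false;
        boolean gotSecond = false;
        try {
            gotFirst = first.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
            if (!gotFirst) {
                return false;  // timed out: treat as an abort
            }
            gotSecond = second.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
            if (!gotSecond) {
                return false;  // timed out: treat as an abort
            }
            work.run();
            return true;
        } catch (InterruptedException e) {
            // Preserve the interrupt status and give up.
            Thread.currentThread().interrupt();
            return false;
        } finally {
            // Release whatever was acquired so others can proceed.
            if (gotSecond) {
                second.release();
            }
            if (gotFirst) {
                first.release();
            }
        }
    }
}
```

Because a caller that times out releases everything it holds, two callers that request the same resources in opposite order cannot wait on each other indefinitely; at worst, both back out and retry.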

Figure 6. Deadlocked services

In the absence of a central concurrency-control mechanism, one way for transactions to avoid deadlocks is to relax the isolation level, allowing some changes to become visible outside the transaction while the transaction is still in progress. For instance, by being able to read the account balance, other services can possibly determine whether a charge on the account will succeed when they eventually receive a "write" lock on it. Many real-life transaction-processing systems operate with less than full isolation levels to achieve increased transaction throughput.
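
As a rough illustration of relaxed isolation, the sketch below lets any caller read an account's available balance while debits from unfinished transactions are still pending. The SharedAccount class, its method names, and the use of string transaction IDs are assumptions made for this example; it is not how any particular credit card service is implemented.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical account that exposes uncommitted debits to readers.
class SharedAccount {
    private long committedCents;
    // Tentative debits, keyed by the ID of the transaction that made them.
    private final Map<String, Long> pendingDebits = new HashMap<>();

    SharedAccount(long openingCents) {
        this.committedCents = openingCents;
    }

    // Records a tentative debit; fails if the funds are not available.
    synchronized boolean debit(String txnId, long cents) {
        if (cents <= 0 || cents > peekAvailable()) {
            return false;
        }
        pendingDebits.merge(txnId, cents, Long::sum);
        return true;
    }

    // Relaxed isolation: the result reflects debits made by transactions
    // that are still in progress and might yet abort.
    synchronized long peekAvailable() {
        long pending = 0;
        for (long cents : pendingDebits.values()) {
            pending += cents;
        }
        return committedCents - pending;
    }

    // Makes the transaction's tentative debits permanent.
    synchronized void commit(String txnId) {
        Long cents = pendingDebits.remove(txnId);
        if (cents != null) {
            committedCents -= cents;
        }
    }

    // Discards the transaction's tentative debits.
    synchronized void abort(String txnId) {
        pendingDebits.remove(txnId);
    }
}
```

The trade-off is visible in the comments: a reader may base a decision on a pending debit that is later rolled back, which is exactly the guarantee that full isolation would have provided and that this design gives up in exchange for not blocking readers.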

The data management community has developed an entire repertoire of techniques and tricks to deal with this and related issues. Transactional services teach us that, increasingly, data management problems are becoming problems of distributed computing, and, likewise, distributed computing problems are becoming those of data management. This realization invites us to pursue a more interdisciplinary approach so as to bring about better-informed solutions to these exciting challenges ahead of us.

Words of caution

Let me conclude this article with two notes of caution. First, while transactions are a useful tool to make a computation reliable, there is no magic to their effectiveness. Each service must ensure that it abides by its part of the guarantees the transaction is supposed to provide. How a service might do that is the subject of my next installment in this series.

Second, distributed transactions are expensive in terms of their computational resources. They involve a manager and many messages to facilitate the two-phase commit protocol. In addition, implementing a transaction participant that conforms to the default semantics is a significant undertaking, as you will see next month. However, when you do need guaranteed reliability for a distributed computation, there is no alternative to transactions.

Frank Sommers is founder and CEO of Autospaces, a company focused on bringing Jini technology to the automotive software market. He also serves as VP of technology at Los Angeles-based Nowcom Corp., an outsourcing firm. He has been programming in Java since 1995, after attending the first public demonstration of the language on the Sun Microsystems campus in November of that year. His interests include parallel and distributed computing, the discovery and representation of knowledge in databases, and the philosophical foundations of computing. When not thinking about computers, he composes and plays piano, studies the symphonies of Gustav Mahler, or explores the writings of Aristotle and Ayn Rand. Frank would like to thank Bob Scheifler, a Sun Microsystems distinguished engineer and member of Sun's Jini team, for his comments and clarifications on Jini transactions.