The precise mechanism for triggering the rollback is unimportant; several are available. The important point is that the commit or rollback happens in the reverse order of the business ordering in the resources. In the sample application, the messaging transaction must commit last because the instructions for the business process are contained in that resource. This is important because of the (rare) failure case in which the first commit succeeds and the second one fails. Because by design all business processing has already completed at this point, the only cause for such a partial failure would be an infrastructural problem with the messaging middleware.
Note that if the commit of the database resource fails, then the net effect is still a rollback. So the only nonatomic failure
mode is one where the first transaction commits and the second rolls back. More generally, if there are n resources in the transaction, there are n-1 such failure modes leaving some of the resources in an inconsistent (committed) state after a rollback. In the message-database
use case, the result of this failure mode is that the message is rolled back and comes back in another transaction, even though
it was already successfully processed. So you can safely assume that the worst thing that can happen is that duplicate messages
can be delivered. In the more general case, because the earlier resources in the transaction are considered to be potentially
carrying information about how to carry out processing on the later resources, the net result of the failure mode can generically
be referred to as a duplicate message.
Some people take the risk that duplicate messages will happen infrequently enough that they don't bother trying to anticipate them. To be more confident about the correctness and consistency of your business data, though, you need to be aware of them in the business logic. If the business processing is aware that duplicate messages might arrive, all it has to do (usually at some extra cost, but not as much as the 2PC) is check whether it has processed that data before and do nothing if it has. This specialization is sometimes referred to as the Idempotent Business Service pattern.
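To make the idea concrete, here is a minimal sketch of an idempotent handler, assuming a hypothetical business key carried in the message and a hypothetical T_PROCESSED table used to detect duplicates (neither is part of the sample code):

import org.springframework.jdbc.core.JdbcTemplate;

public class IdempotentHandler {

    private final JdbcTemplate jdbcTemplate;

    public IdempotentHandler(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    public void handle(String businessKey, String payload) {
        // Hypothetical duplicate check: has a message with this key been processed already?
        int seen = jdbcTemplate.queryForObject(
                "SELECT count(*) FROM T_PROCESSED WHERE business_key = ?",
                new Object[] { businessKey }, Integer.class);
        if (seen > 0) {
            return; // duplicate delivery: nothing more to do
        }
        // Record the key and do the real work in the same local transaction,
        // so a rollback also forgets that the message was seen.
        jdbcTemplate.update("INSERT INTO T_PROCESSED (business_key) values (?)", businessKey);
        // ... business processing using payload goes here ...
    }
}

If the transaction rolls back after the duplicate check, the insert into T_PROCESSED rolls back with it, so a redelivered message is processed again from scratch.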
The sample code includes two examples of synchronizing transactional resources using this pattern. I'll discuss each in turn and then examine some other options.
In the sample code's best-jms-db project, the participants are set up using mainstream configuration options so that the Best Efforts 1PC pattern is followed. The
idea is that messages sent to a queue are picked up by an asynchronous listener and used to insert data into a table in the
database.
The TransactionAwareConnectionFactoryProxy -- a stock component in Spring designed to be used in this pattern -- is the key ingredient. Instead of using the raw vendor-provided
ConnectionFactory, the configuration wraps it in a decorator that handles the transaction synchronization. This happens in jms-context.xml, as shown in Listing 6:
Listing 6. TransactionAwareConnectionFactoryProxy to wrap a vendor-provided JMS ConnectionFactory

<bean id="connectionFactory"
    class="org.springframework.jms.connection.TransactionAwareConnectionFactoryProxy">
  <property name="targetConnectionFactory">
    <bean class="org.apache.activemq.ActiveMQConnectionFactory" depends-on="brokerService">
      <property name="brokerURL" value="vm://localhost"/>
    </bean>
  </property>
  <property name="synchedLocalTransactionAllowed" value="true" />
</bean>
There is no need for the ConnectionFactory to know which transaction manager to synchronize with, because only one transaction will be active at the time it is needed, and Spring can handle that internally. The driving transaction is handled by a normal DataSourceTransactionManager configured in data-source-context.xml. The component that needs to be aware of the transaction manager is the JMS listener container that will poll and receive
messages:
<jms:listener-container transaction-manager="transactionManager" >
<jms:listener destination="async" ref="fooHandler" method="handle"/>
</jms:listener-container>
The ref="fooHandler" and method="handle" attributes tell the listener container which method on which component to call when a message arrives on the "async" queue. The handler
is implemented like this, accepting a String as the incoming message and using it to insert a record:
public void handle(String msg) {
    // count is an atomic counter used to generate the primary key for each row
    jdbcTemplate.update(
            "INSERT INTO T_FOOS (ID, name, foo_date) values (?, ?, ?)",
            count.getAndIncrement(), msg, new Date());
}
To simulate failures, the code uses a FailureSimulator aspect. It checks the message content to see if the message is supposed to fail, and in what way. The maybeFail() method, shown in Listing 7, is called after the FooHandler handles the message, but before the transaction has ended, so that it can affect the transaction's outcome:
Listing 7. The maybeFail() method
@AfterReturning("execution(* *..*Handler+.handle(String)) && args(msg)")
public void maybeFail(String msg) {
    if (msg.contains("fail")) {
        if (msg.contains("partial")) {
            simulateMessageSystemFailure();
        } else {
            simulateBusinessProcessingFailure();
        }
    }
}
The simulateBusinessProcessingFailure() method just throws a DataAccessException as if the database access had failed. When this method is triggered, you expect a full rollback of all database and message
transactions. This scenario is tested in the sample project's AsynchronousMessageTriggerAndRollbackTests unit test.
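For reference, a plausible (but hypothetical) version of such a method is shown below; the sample's actual FailureSimulator may differ in detail:

import org.springframework.dao.DataRetrievalFailureException;

public class BusinessFailureSimulator {

    public void simulateBusinessProcessingFailure() {
        // DataAccessException is abstract, so throw a concrete Spring subclass,
        // as if a database call inside the business logic had failed.
        throw new DataRetrievalFailureException("Planned failure to test rollback");
    }
}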
The simulateMessageSystemFailure() method simulates a failure in the messaging system by crippling the underlying JMS Session. The expected outcome here is a partial commit: the database work stays committed but the messages roll back. This is tested
in the AsynchronousMessageTriggerAndPartialRollbackTests unit test.
The sample package also includes a unit test for the successful commit of all transactional work, in the AsynchronousMessageTriggerSunnyDayTests class.
The same JMS configuration and the same business logic can also be used in a synchronous setting, where the messages are received
in a blocking call inside the business logic instead of delegating to a listener container. This approach is also demonstrated
in the best-jms-db sample project. The sunny-day case and the full rollback are tested in SynchronousMessageTriggerSunnyDayTests and SynchronousMessageTriggerAndRollbackTests, respectively.
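As a rough illustration of the synchronous variant (not taken from the sample project; the class name and wiring are assumptions), the sketch below receives the message with a blocking call on a JmsTemplate built on the transaction-aware connection factory, inside a transaction driven by the DataSourceTransactionManager:

import java.util.Date;
import java.util.concurrent.atomic.AtomicInteger;

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jms.core.JmsTemplate;
import org.springframework.transaction.annotation.Transactional;

public class SynchronousFooService {

    private final JmsTemplate jmsTemplate;   // built on the TransactionAwareConnectionFactoryProxy
    private final JdbcTemplate jdbcTemplate; // built on the transactional DataSource
    private final AtomicInteger count = new AtomicInteger();

    public SynchronousFooService(JmsTemplate jmsTemplate, JdbcTemplate jdbcTemplate) {
        this.jmsTemplate = jmsTemplate;
        this.jdbcTemplate = jdbcTemplate;
    }

    @Transactional
    public void processNextMessage() {
        // Blocking receive inside the database transaction; the JMS session is
        // synchronized with it, so a rollback also returns the message to the queue.
        String msg = (String) jmsTemplate.receiveAndConvert("async");
        if (msg != null) {
            jdbcTemplate.update("INSERT INTO T_FOOS (ID, name, foo_date) values (?, ?, ?)",
                    count.getAndIncrement(), msg, new Date());
        }
    }
}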
In the other sample of the Best Efforts 1PC pattern (the best-db-db project) a crude implementation of a transaction manager just links together a list of other transaction managers to implement
the transaction synchronization. If the business processing is successful they all commit, and if not they all roll back.
The implementation is in ChainedTransactionManager, which accepts a list of other transaction managers as an injected property; the configuration is shown in Listing 8:
<bean id="transactionManager" class="com.springsource.open.db.ChainedTransactionManager">
<property name="transactionManagers">
<list>
<bean
class="org.springframework.jdbc.datasource.DataSourceTransactionManager">
<property name="dataSource" ref="dataSource" />
</bean>
<bean
class="org.springframework.jdbc.datasource.DataSourceTransactionManager">
<property name="dataSource" ref="otherDataSource" />
</bean>
</list>
</property>
</bean>
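To show the shape of the idea (this is a deliberately naive sketch, not the sample's actual ChainedTransactionManager), a chained transaction manager can be written as a PlatformTransactionManager that starts its delegates in configuration order and completes them in reverse order; propagation settings, suspension, synchronization, and error handling are all ignored here:

import java.util.ArrayList;
import java.util.List;

import org.springframework.transaction.PlatformTransactionManager;
import org.springframework.transaction.TransactionDefinition;
import org.springframework.transaction.TransactionStatus;
import org.springframework.transaction.support.SimpleTransactionStatus;

public class NaiveChainedTransactionManager implements PlatformTransactionManager {

    private List<PlatformTransactionManager> transactionManagers =
            new ArrayList<PlatformTransactionManager>();

    // Per-thread record of the delegate transactions, in configuration order.
    private final ThreadLocal<List<TransactionStatus>> statuses =
            new ThreadLocal<List<TransactionStatus>>();

    public void setTransactionManagers(List<PlatformTransactionManager> transactionManagers) {
        this.transactionManagers = transactionManagers;
    }

    public TransactionStatus getTransaction(TransactionDefinition definition) {
        List<TransactionStatus> list = new ArrayList<TransactionStatus>();
        // Start a transaction on every delegate, in configuration order.
        for (PlatformTransactionManager transactionManager : transactionManagers) {
            list.add(transactionManager.getTransaction(definition));
        }
        statuses.set(list);
        return new SimpleTransactionStatus(true);
    }

    public void commit(TransactionStatus status) {
        List<TransactionStatus> list = statuses.get();
        statuses.remove();
        // Reverse order: the first resource configured is the outermost and commits last.
        for (int i = list.size() - 1; i >= 0; i--) {
            transactionManagers.get(i).commit(list.get(i));
        }
    }

    public void rollback(TransactionStatus status) {
        List<TransactionStatus> list = statuses.get();
        statuses.remove();
        for (int i = list.size() - 1; i >= 0; i--) {
            transactionManagers.get(i).rollback(list.get(i));
        }
    }
}

Because the delegates are completed in reverse order, the first transaction manager in the configuration is the outermost resource and the last to commit, which is exactly the ordering the Best Efforts 1PC pattern relies on.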
The simplest test for this configuration is just to insert something in both databases, roll back, and check that both operations
left no trace. This is implemented as a unit test in MulipleDataSourceTests, the same as in the XA sample's atomikos-db project. The test fails if the rollback was not synchronized but the commit happened to work out.
Remember that the order of the resources is significant. They are nested, and the commit or rollback happens in reverse order
to the order they are enlisted (which is the order in the configuration). This makes one of the resources special: the outermost
resource always rolls back if there is a problem, even if the only problem is a failure of that resource. Also, the testInsertWithCheckForDuplicates() test method shows an idempotent business process protecting the system from partial failures. It is implemented as a defensive
check in the business operation on the inner resource (the otherDataSource in this case):
int count = otherJdbcTemplate.update("UPDATE T_AUDITS ... WHERE id=?", ...);
if (count == 0) {
    count = otherJdbcTemplate.update("INSERT into T_AUDITS ...", ...);
}
The update is tried first, with a where clause. If it affects no rows, the data the update expected to find is inserted instead. The cost of the extra protection from the
idempotent process in this case is one extra query (the update) in the sunny-day case. This cost would be quite low in a more
complicated business process in which many queries are executed per transaction.
The ChainedTransactionManager in the sample has the virtue of simplicity; it doesn't bother with many extensions and optimizations that are available.
An alternative approach is to use the TransactionSynchronization API in Spring to register a callback for the current transaction when the second resource joins. This is the approach in
the best-jms-db sample, where the key feature is the combination of TransactionAwareConnectionFactoryProxy with a DataSourceTransactionManager. This special case could be expanded on and generalized to include non-JMS resources using the TransactionSynchronizationManager. The advantage would be that in principle only those resources that joined the transaction would be enlisted, instead of
all resources in the chain. However, the configuration would still need to be aware of which participants in a potential transaction
correspond to which resources.
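As a rough sketch of that idea (not code from the sample), the second resource could be enlisted by registering a synchronization with the current transaction; the OtherResource type and its commit()/rollback() methods are hypothetical, standing in for whatever the marginal resource actually is:

import org.springframework.transaction.support.TransactionSynchronizationAdapter;
import org.springframework.transaction.support.TransactionSynchronizationManager;

public class OtherResourceSynchronizer {

    // Hypothetical abstraction of the second transactional resource.
    public interface OtherResource {
        void commit();
        void rollback();
    }

    public void enlist(final OtherResource resource) {
        // Requires an active transaction driven by a Spring transaction manager.
        TransactionSynchronizationManager.registerSynchronization(
                new TransactionSynchronizationAdapter() {
                    @Override
                    public void afterCommit() {
                        // Commit the second resource only after the driving
                        // transaction has committed (the Best Efforts 1PC ordering).
                        resource.commit();
                    }
                    @Override
                    public void afterCompletion(int status) {
                        if (status != STATUS_COMMITTED) {
                            resource.rollback();
                        }
                    }
                });
    }
}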
Also, the Spring engineering team is considering a "Best Efforts 1PC transaction manager" feature for the Spring Core. You can vote for the JIRA issue if you like the pattern and want to see explicit and more transparent support for it in Spring.
The Nontransactional Access pattern needs a special kind of business process in order to make sense. The idea is that sometimes one of the resources that you need to access is marginal and doesn't need to be in the transaction at all. For instance, you might need to insert a row into an audit table that's independent of whether the business transaction is successful or not; it just records the attempt to do something. More commonly, people overestimate how much they need to make read-write changes to one of the resources, and quite often read-only access is fine. Or else the write operations can be carefully controlled, so that if anything goes wrong it can be accounted for or ignored.
In these cases the resource that stays outside the transaction probably actually has its own transaction, but it is not synchronized
with anything else that is happening. If you are using Spring, the main transaction is driven by a PlatformTransactionManager, and the marginal resource might be a database Connection obtained from a DataSource not controlled by the transaction manager. All that happens is that each access to the marginal resource has the default
setting of autoCommit=true. Read operations won't see updates that are happening concurrently in another uncommitted transaction (assuming reasonable
default isolation levels), but the effect of write operations will normally be seen immediately by other participants.
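A minimal sketch of this arrangement, assuming hypothetical class and table names, might look like the following: business data goes through the transactional DataSource, while the audit row is written through a separate DataSource that is not controlled by the transaction manager, so each audit statement auto-commits on its own:

import javax.sql.DataSource;

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.transaction.annotation.Transactional;

public class AuditedFooService {

    private final JdbcTemplate businessTemplate; // DataSource driven by the transaction manager
    private final JdbcTemplate auditTemplate;    // plain DataSource with autoCommit=true

    public AuditedFooService(DataSource businessDataSource, DataSource auditDataSource) {
        this.businessTemplate = new JdbcTemplate(businessDataSource);
        this.auditTemplate = new JdbcTemplate(auditDataSource);
    }

    @Transactional
    public void process(long id, String name) {
        // Recorded immediately, whether or not the business transaction commits
        // (T_AUDIT_LOG is a hypothetical table for this illustration).
        auditTemplate.update("INSERT INTO T_AUDIT_LOG (id, description) values (?, ?)",
                id, "attempt: " + name);
        // Rolled back with the rest of the transaction on failure.
        businessTemplate.update("INSERT INTO T_FOOS (ID, name) values (?, ?)", id, name);
    }
}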
This pattern requires more careful analysis, and more confidence in designing the business processes, but it isn't all that different from the Best Efforts 1PC. A generic service that provides compensating transactions when anything goes wrong is too ambitious a goal for most projects. But simple use cases involving services that are idempotent and execute only one write operation (and possibly many reads) are not that uncommon. These are the ideal situations for a nontransactional gambit.
The last pattern is really an antipattern. It tends to occur when developers don't understand distributed transactions or
don't realize that they have one. Without an explicit call to the underlying resource's transaction API, you can't just assume
that all the resources will join a transaction. If you are using a Spring transaction manager other than JtaTransactionManager, it will have one transactional resource attached to it. That transaction manager will be the one that is used to intercept
method executions using Spring declarative transaction management features like @Transactional. No other resources can be expected to be enlisted in the same transaction. The usual outcome is that everything works just
fine on a sunny day, but as soon as there is an exception the user finds that one of the resources didn't roll back. A typical
mistake leading to this problem is using a DataSourceTransactionManager and a repository implemented with Hibernate.
I'll conclude by analyzing the pros and cons of the patterns introduced, to help you see how to decide between them. The first step is to recognize that you have a system requiring distributed transactions. A necessary (but not sufficient) condition is that there is a single process with more than one transactional resource. A sufficient condition is that those resources are used together in a single use case, normally driven by a call into the service level in your architecture.
If you haven't recognized the distributed transaction, you have probably implemented the Wing-and-a-Prayer pattern. Sooner or later you will see data that should have been rolled back but wasn't. Probably when you see the effect it will be a long way downstream from the actual failure, and quite hard to trace back. The Wing-and-a-Prayer can also be inadvertently used by developers who believe they are protected by XA but haven't configured the underlying resources to participate in the transaction. I worked on a project once where the database had been installed by another group, and they had switched off the XA support in the installation process. Everything ran just fine for a few months and then strange failures started to creep into the business process. It took a long time to diagnose the problem.
If your use cases with mixed resources are simple enough and you can afford to do the analysis and perhaps some refactoring, then the Nontransactional Resource pattern might be an option. This works best when one of the resources is read-mostly, and the write operations can be guarded with checks for duplicates. The data in the nontransactional resource must make sense in business terms even after a failure. Audit, versioning, and logging information typically fits into this category. Failures will be relatively common (any time anything in the real transaction rolls back), but you can be confident that there are no side effects.
Best Efforts 1PC is for systems that need more protection from common failures but don't want the overhead of 2PC. Performance improvements can be significant. It is more tricky to set up than a Nontransactional Resource, but it shouldn't require as much analysis and is used for more generic data types. Complete certainty about data consistency requires that business processing is idempotent for "outer" resources (any but the first to commit). Message-driven database updates are a perfect example and have quite good support already in Spring. More unusual scenarios require some additional framework code (which may eventually be part of Spring).
The Shared Resource pattern is perfect for special cases, normally involving two resources of a particular type and platform (such as ActiveMQ with any RDBMS, or Oracle AQ co-located with an Oracle database). The benefits are extreme robustness and excellent performance.
The sample code provided with this article will inevitably show its age as new versions of Spring and other components are released. See the Spring Community Site to access the author's up-to-date code, as well as current versions of the Spring Framework and related components.
Full XA with 2PC is generic and will always give the highest confidence and greatest protection against failures where multiple, diverse resources are being used. The downside is that it is expensive because of the additional I/O prescribed by the protocol (but don't write it off until you try it) and requires special-purpose platforms. There are open source JTA implementations that can provide a way to break free of the application server, but many developers still consider them second best. It is certainly the case that more people use JTA and XA than would need to if they spent more time thinking about the transaction boundaries in their systems. At least if they use Spring, their business logic doesn't need to be aware of how the transactions are handled, so platform choices can be deferred.
Dr. David Syer is a Principal Consultant with SpringSource, based in the UK. He is a founder and lead engineer on the Spring Batch project, an open source framework for building and configuring offline and batch-processing applications. He is a frequent presenter at conferences on Enterprise Java and commentator on the industry. Recent publications appeared in The Server Side, InfoQ and the SpringSource blog.