Designing Enterprise Applications with the J2EE™ Platform, Second Edition
A transaction is a logical unit of work that either modifies some state, performs a set of operations, or both. An individual transaction may involve multiple data and logical operations, but these operations always occur as an indivisible atomic unit, or they do not occur at all. For example, enrolling a patient in a health care plan may involve first acquiring release forms from the patient, verifying the patient's employment, checking her health and insurance history against remote data sources, and so on. All of the activities described can be subtasks of a single transaction, because failure of any one of these subtasks should cause the entire transaction to fail.
This section provides a brief introduction to basic concepts in conventional and distributed transactional systems. See "References and Resources" on page 277 for references to in-depth treatment of these topics.
Enterprise transactions share the properties of atomicity, consistency, isolation, and durability, denoted by the acronym ACID. These properties are necessary to ensure safe data sharing.
Atomicity means that a transaction is considered complete if and only if all of its operations were performed successfully. If any operation in a transaction fails, the transaction fails. In the health care example described above, a patient can be enrolled only if all required procedures complete successfully, so enrollment is atomic.
Consistency means that a transaction must transition data from one consistent state to another, preserving the data's semantic and referential integrity. For example, if every health care policy in a database requires both a patient covered by the policy and a plan describing the coverage, every transaction in the health insurance application must enforce this consistency rule. While applications should always preserve data consistency, many databases provide ways to specify integrity and value constraints so that transactions that attempt to violate consistency will automatically fail.
Isolation means that any changes made to data by a transaction are invisible to other concurrent transactions until the transaction commits. Isolation requires that concurrent transactions produce the same results in the data as those same transactions executed serially, in some (unspecified) order. In the health plan enrollment example, isolation ensures that updates made to a patient record will not be globally visible until those updates are committed.
Durability means that committed updates are permanent. Failures that occur after a commit cause no loss of data. Durability also implies that data for all committed transactions can be recovered after a system or media failure.
An ACID transaction ensures that persistent data always conform to their schema, that a series of operations can assume a stable set of inputs and working data, and that persistent data changes are recoverable after system failure.
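The following is a minimal sketch of three of these properties, not a description of how any real resource manager is built. TinyStore, Tx, and the invariant predicate are hypothetical names introduced only for illustration. Writes are buffered inside the transaction (so other readers cannot see them), a commit applies all of the buffered writes or none of them, and a commit that would break the invariant is rejected. Durability and true serializable isolation are deliberately out of scope.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical in-memory store; illustrates atomicity, consistency, and write visibility only. */
class TinyStore {
    private final Map<String, Integer> committed = new HashMap<>();
    private final Predicate<Map<String, Integer>> invariant;

    /** The invariant plays the role of a database integrity constraint. */
    TinyStore(Predicate<Map<String, Integer>> invariant) {
        this.invariant = invariant;
    }

    /** Readers see committed data only. */
    synchronized Integer read(String key) {
        return committed.get(key);
    }

    Tx begin() {
        return new Tx();
    }

    /** One unit of work; its writes stay private until commit. */
    class Tx {
        private final Map<String, Integer> pending = new HashMap<>();
        private boolean active = true;

        void write(String key, int value) {
            if (!active) {
                throw new IllegalStateException("transaction is not active");
            }
            pending.put(key, value);
        }

        /** Applies every buffered write, or none if the invariant would be violated. */
        boolean commit() {
            if (!active) {
                throw new IllegalStateException("transaction is not active");
            }
            active = false;
            synchronized (TinyStore.this) {
                Map<String, Integer> candidate = new HashMap<>(committed);
                candidate.putAll(pending);
                if (!invariant.test(candidate)) {
                    return false; // rolled back: committed state is untouched
                }
                committed.putAll(pending);
                return true;
            }
        }

        /** Discards every buffered write. */
        void rollback() {
            active = false;
            pending.clear();
        }
    }
}
```

For example, with an invariant that forbids negative values, a transaction that writes one valid value and one negative value fails to commit, and neither write becomes visible.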
An application that uses transactions is called a transactional application. In a J2EE application, a transactional application may consist of multiple servlets, JSP pages, and enterprise beans. A resource manager is an external system accessed by an application. A resource manager provides and enforces the ACID transaction properties for specific data and operations. Examples of resource managers include a relational database (which supports persistent storage of relational data), an enterprise information system, or EIS (which manages transactional, external functionality and data), and a Java Message Service (JMS) provider (which manages transactional message delivery). A transactional application accesses a resource manager through a transactional resource object. For example, a JDBC java.sql.Connection object is used to access a relational database. A resource adapter is a system library that makes the API of a resource manager available to an application server. A Connector is a resource adapter that has an API conforming to the J2EE Connector architecture, the standard architecture for integrating J2EE applications with EISes.
Transactional programs must be able to start and end transactions, and be able to indicate whether data changes are to be made permanent or discarded. Indicating transaction boundaries for a program is called transaction demarcation.
A program starts a transaction by executing a begin operation. The program may then read or modify data within the scope of the new active transaction. When the program is ready to make its data changes permanent, it executes a commit operation, causing the transaction to persist any data modified or created during the active state. Successful completion of the commit operation results in a permanent change to the transactional resource. If a commit operation fails (for example, due to inadequate resources or data consistency violations), the resource manager executes a rollback, discarding any changes made since the transaction began. An application may also explicitly request a rollback during an active transaction.
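As an illustration of demarcation against a single resource manager, the sketch below wraps a unit of work in a local JDBC transaction. The java.sql.Connection calls (setAutoCommit, commit, rollback) are standard JDBC; the JdbcDemarcation class and the SqlWork callback are hypothetical helpers introduced here. JDBC has no explicit begin call, so turning off auto-commit is what opens the transaction scope. Inside a J2EE container, demarcation is more commonly done through JTA or declaratively, so treat this as a sketch of the begin/commit/rollback pattern rather than recommended practice.

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical helper; only the java.sql.Connection calls are standard JDBC API. */
final class JdbcDemarcation {

    /** Work to perform inside one transaction (hypothetical callback type). */
    interface SqlWork {
        void run(Connection conn) throws SQLException;
    }

    static void inTransaction(Connection conn, SqlWork work) throws SQLException {
        // Begin: with auto-commit off, subsequent statements join one transaction.
        conn.setAutoCommit(false);
        try {
            work.run(conn);
            // Commit: make every change since the begin permanent.
            conn.commit();
        } catch (SQLException | RuntimeException e) {
            // Rollback: discard every change since the begin.
            try {
                conn.rollback();
            } catch (SQLException rollbackFailure) {
                e.addSuppressed(rollbackFailure);
            }
            throw e;
        }
    }
}
```

A caller would pass a connection obtained from its own data source together with a lambda that issues the statements belonging to the unit of work.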
Distributed enterprise systems often need to access and update multiple transactional resources in order to accomplish some business goal. Consider, for example, a travel agency application. Creating a typical business travel itinerary with a confirmed and paid plane ticket requires successful completion of user authentication, credit card processing, and flight reservation, as well as local creation of the itinerary itself. Such a transaction, involving independent, cooperating transactional systems, is called a distributed transaction.
Distributed transactions are more complex than non-distributed transactions because of latency, potential failure of one or more resource managers, and interoperability concerns. On a network, a failed transaction can be difficult to distinguish from one that is merely slow. Resource managers that do not "know" about each other cannot coordinate transactions by themselves. A transactional application could itself handle rollbacks and commits for multiple distributed resources, but only at the cost of a great deal of complex, non-reusable logic.
The most common solution to the problem of coordinating distributed transactions is to introduce a third participant, called a transaction manager, into the design. The transaction manager acts as a mediator between applications and the multiple resources the applications use. Figure 8.1 shows the three participants in a distributed transaction: the transactional application, the resource manager, and the transaction manager, which coordinates the transactions of multiple resource managers, providing the application with ACID transactions across multiple resources. In many cases, the transaction manager uses the X/Open XA protocol to communicate with multiple resource managers. In the J2EE platform, the XA protocol is encapsulated by the JTA XAResource interface. Please refer to "References and Resources" on page 277 for more information on the X/Open XA protocol.
At any time during a distributed transaction, the transaction manager maintains an association between each transaction (which has a unique global ID), application threads, and connections to the resource managers. For example, a transaction manager may associate a single transaction ID with a thread of an application, an SQL connection that has updated a table, a JMS provider waiting to transmit a message, and a resource adapter or Connector executing an external business function. A transaction context is the association of a transaction with an application component or a resource manager. The transparent forwarding of a transaction context from one component to another component or from a component to a resource manager is called transaction context propagation.
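The association described above can be pictured with a small sketch. TxContext and its methods are hypothetical names, not part of JTA or any application server. The sketch binds a global transaction ID to the calling thread and records which resources have been enlisted under that ID. Any component invoked on the same thread sees the same ID, which is the simplest form of context propagation; propagation across processes is not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry; a real transaction manager tracks far more state. */
final class TxContext {
    // Transaction ID associated with the current application thread.
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();
    // Resources enlisted under each global transaction ID.
    private static final Map<String, List<String>> ENLISTED = new ConcurrentHashMap<>();

    private TxContext() { }

    /** Starts a transaction and associates it with the calling thread. */
    static String begin() {
        if (CURRENT.get() != null) {
            throw new IllegalStateException("thread already has a transaction");
        }
        String id = UUID.randomUUID().toString();
        CURRENT.set(id);
        ENLISTED.put(id, new ArrayList<>());
        return id;
    }

    /** Returns the transaction ID bound to this thread, or null if there is none. */
    static String currentId() {
        return CURRENT.get();
    }

    /** Records that a resource is taking part in this thread's transaction. */
    static void enlist(String resourceName) {
        String id = CURRENT.get();
        if (id == null) {
            throw new IllegalStateException("no transaction on this thread");
        }
        ENLISTED.get(id).add(resourceName);
    }

    /** Ends the association and returns the resources that were enlisted. */
    static List<String> end() {
        String id = CURRENT.get();
        if (id == null) {
            throw new IllegalStateException("no transaction on this thread");
        }
        CURRENT.remove();
        return ENLISTED.remove(id);
    }
}
```

Keeping the ID in a thread-local means components do not have to pass it explicitly, which is why the forwarding appears transparent to application code.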
Resource managers that do not "know" about one another cannot cooperate directly in distributed transactions; instead, the transaction manager controls the transaction, indicating to each resource manager whether and when to commit or roll back, based on the global state of the transaction. A transaction manager coordinates transactions between resource managers using a two-phase commit protocol. The two-phase commit protocol provides the ACID properties of transactions across multiple resources.
In the first phase of two-phase commit, the transaction manager tells each resource manager to "prepare" to commit; that is, to perform all operations for a commit and be ready either to make the changes permanent or to undo all changes. Each resource manager responds, indicating whether or not the prepare operation succeeded. In the second phase, if all prepare operations succeed, the transaction manager tells all resource managers to commit their changes; otherwise, it tells them all to roll back and indicates transaction failure to the application.
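The two phases can be sketched in a few lines. Participant, TwoPhaseCoordinator, and LoggingParticipant are hypothetical types written for this illustration; they loosely mirror the prepare, commit, and rollback steps but are not the JTA XAResource interface. The sketch also omits what makes real two-phase commit hard: durable logging of the decision, timeouts, and recovery after a coordinator or participant failure.

```java
import java.util.List;

/** Hypothetical participant contract for one resource manager. */
interface Participant {
    boolean prepare();   // phase 1: vote on whether a commit can succeed
    void commit();       // phase 2: make the prepared changes permanent
    void rollback();     // phase 2: undo the changes
}

/** Hypothetical coordinator; no logging, timeouts, or failure recovery. */
final class TwoPhaseCoordinator {
    private TwoPhaseCoordinator() { }

    /** Returns true if every participant committed, false if all were rolled back. */
    static boolean complete(List<Participant> participants) {
        // Phase 1: ask each participant to prepare; stop at the first "no" vote.
        boolean allPrepared = true;
        for (Participant p : participants) {
            boolean vote;
            try {
                vote = p.prepare();
            } catch (RuntimeException e) {
                vote = false; // a failed prepare counts as a "no" vote
            }
            if (!vote) {
                allPrepared = false;
                break;
            }
        }
        // Phase 2: deliver the same global decision to every participant.
        for (Participant p : participants) {
            if (allPrepared) {
                p.commit();
            } else {
                p.rollback();
            }
        }
        return allPrepared;
    }
}

/** Demo participant that records each call in a shared log. */
final class LoggingParticipant implements Participant {
    private final String name;
    private final boolean vote;
    private final List<String> log;

    LoggingParticipant(String name, boolean vote, List<String> log) {
        this.name = name;
        this.vote = vote;
        this.log = log;
    }

    @Override public boolean prepare() { log.add(name + ":prepare"); return vote; }
    @Override public void commit() { log.add(name + ":commit"); }
    @Override public void rollback() { log.add(name + ":rollback"); }
}
```

Running the coordinator with two participants that both vote yes yields two prepare calls followed by two commit calls. If either one votes no, every participant receives a rollback and the method returns false, which corresponds to reporting transaction failure to the application.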
A particular resource manager may participate in multiple simultaneous distributed transactions. The ACID properties apply for all resource managers involved in a particular distributed transaction, as well as for all pending transactions within a particular resource manager.