Executive Summary: The Tests and Results at a Glance
The focus of our research was high availability for the Java 2 Platform, Enterprise Edition. We applied a series of benchmark tests to four server configurations, using an industry-standard workload for the J2EE platform. Based on our data, the J2EE platform-based server was able to recover from a single node failure in approximately 1 minute, and the Web server recovered in less than 30 seconds. In a clustered J2EE platform-based environment, some clients had a disruption of service after server failure, but this was minimized by using a multitiered approach for the Web and J2EE platform-based server(s). To achieve high availability, failover of the Web server was required.
Best Practices for Developers Using J2EE Technology
Middleware plays a crucial role in delivering multitiered enterprise applications. It is critical that the middleware used is both highly scalable and highly available. This Developer's Notebook describes experiments conducted by Sun engineers to provide best practices information on J2EE platform availability for the architects and developers who use that platform.
The J2EE platform defines a standard for developing portable, multitiered enterprise applications. The platform can simplify enterprise application development and deployment by basing applications on standardized, modular components; by providing a complete set of services to those components; and by handling many details of application behavior automatically without complex programming.
Vendors of J2EE platform-based servers add value by providing services that are not part of that standard. While some of these services promote vendor lock-in and reduce portability, others, like clustering, can add value. High-availability features, such as clustering, are becoming increasingly important. Most vendors of products based on J2EE technology provide some support for clustering J2EE platform-based servers.
The following questions arise:
First, we present an overview of the availability features provided by leading J2EE platform-based servers. We then discuss the goals of this research and the strategy used for testing. Finally, we present the findings of the preliminary testing on J2EE platform-based servers.
Availability for the J2EE Platform
No official definition exists for a J2EE platform-based cluster, and each vendor of products for this platform has a different implementation. In this article, the term refers to a set of J2EE platform-based application servers working together to provide high availability and scalability for enterprise applications.
Categorizing Clusters
Generally, clusters can be categorized as "shared nothing" or as "shared disk." In a shared nothing cluster, all nodes are independent, which adds manageability overhead. Shared disk clusters have a single storage device that all J2EE platform-based servers in the cluster use to load applications. This reduces maintenance, but making the shared file system highly available requires the use of devices such as RAID, storage area networks (SANs), or network-attached storage (NAS).
Clustering Methodologies
In most J2EE platform-based servers, clustering is provided at three levels:
In the J2EE 1.2 specification, JMS is not mandatory; thus, our investigation focused on the implementations of
Clustering of EJB Components
The clustering of EJB components is usually implemented by replica-aware stubs that are generated at deployment or runtime. The stubs are aware of all the servers in the cluster and may use a load-balancing algorithm to determine where to retrieve the objects. The way that a stub works depends on the type of EJB technology. Due to the stateless nature of stateless session beans, replica-aware objects are generally free to route requests to any server in the cluster. Some application servers allow automatic failover of calls to stateless session beans, but only when the effect is the same whether a method is called once or multiple times. In this scenario, the methods in the stateless session beans must be marked as being idempotent at deployment.
For entity beans and stateful session beans, failover and load balancing are usually supported at the "EJBHome" level. Due to the nature of entity beans, state need not be replicated across nodes; it is read from the database at the beginning of each transaction and is written at the end of the transaction.
Availability Test Harness
EJB components from ECperf software were used as the basis of the benchmark application that would stress most components of the J2EE platform and place a heavy load on the system. ECperf software provided a complete workload for testing the scalability and availability of EJB technology-based containers. It has a set of interoperating EJB components and a Java application to drive them. It also has a Web interface to the EJB components by means of JavaServer Pages (JSP) technology. As most EJB technology-based applications are accessed through JSP framework-based components and servlets, the workload for the ECperf software was driven through sample JSP pages. To do this, an HTTP load generator was used. "MDELoad" is a Java platform-based HTTP load generator.
Through the use of scripts, MDELoad enabled static HTTP loads to be generated, although it did not allow the response from one request to be used as the basis of the next request. To make the load generator more dynamic, we added the following functionality:
We further enhanced MDELoad through the use of shell scripts, to perform end-to-end availability testing for the J2EE platform. As shown in Figure 1, the functions performed by the shell scripts included:
Methodology
A summary of the methodology used while conducting experiments follows:
Our main objective was to determine how to deploy applications based on J2EE technology in the simplest yet most highly available fashion. To begin, we configured a single server. We ran the system under heavy load and recorded average response times, throughput, CPU utilization, and so on. We then introduced failures to assess the implications on all tiers.
Results
This section discusses the hardware and software configurations tested. First, we present an overview of the hardware platform used for testing. We then discuss each experiment and its results.
For the first three experiments, all tiers were deployed on a single 24-CPU Sun Enterprise 6500 (E6500) server, running the Solaris 8 Operating Environment and Java 2 SDK, version 1.3.0. The E6500 was configured such that the J2EE platform-based server(s) ran in a processor set separate from the database and other processes. In the final experiment, four systems were used for the Web/J2EE platform tier. The database was run on a separate system. Each server was dedicated to running only one Web server or J2EE platform-based server. The following configurations were tested:
A Single J2EE Platform-Based Server
Configuration
The load generator was configured to simulate 60 interactive users for 30 minutes and to fail the J2EE platform-based server after approximately 15 minutes.
Observations: When the J2EE platform-based server failed, all in-memory state was lost. This included HTTP session state and stateful session beans. Transactions in progress failed to complete, so their changes were not committed to the database. Once the server failed, clients could no longer access the services it provided. A client with a session in progress received a "connection refused" or similar error. This did not necessarily mean that the transaction was unsuccessful: in some cases, the transaction was committed before the failure, so the client may not have been aware that an order had succeeded.
Recovery: Using shell scripts, we were able to fully automate the recovery of the J2EE platform-based server. We wrote a simple watchdog script to monitor the process ID of the server. When the process was killed, the script restarted it. This ensured that the server was fully functional within 1 minute of failure. Client recovery was completely manual. Because transactions might have been committed to the database, clients had to log back in to the Web site and check the status of any orders submitted immediately prior to the failure. As session state was lost, orders not committed to the database were lost completely. (Shopping carts had to be manually recreated.)
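The restart-on-failure behavior of the watchdog can also be illustrated in Java. The sketch below is not the shell script used in the tests: instead of polling a process ID, it blocks until the server process exits and then starts it again. The names Watchdog, Launcher, and ManagedProcess are invented for this illustration, and the command to run is whatever starts the server at a given site.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical watchdog: restarts a server process every time it exits. */
final class Watchdog {
    /** Handle on a running server; waitForExit returns once the process has died. */
    interface ManagedProcess {
        void waitForExit() throws InterruptedException;
    }

    /** Starts one server instance. */
    interface Launcher {
        ManagedProcess launch() throws IOException;
    }

    private final Launcher launcher;

    Watchdog(Launcher launcher) {
        this.launcher = launcher;
    }

    /** Launcher that runs an operating-system command supplied by the caller. */
    static Launcher forCommand(List<String> command) {
        return () -> {
            Process process = new ProcessBuilder(command).inheritIO().start();
            return process::waitFor; // blocks until the child process exits
        };
    }

    /** Starts the server and restarts it after each exit, up to maxRestarts times. */
    int supervise(int maxRestarts) throws IOException, InterruptedException {
        int restarts = 0;
        ManagedProcess server = launcher.launch();
        while (true) {
            server.waitForExit();            // the server has failed or been killed
            if (restarts == maxRestarts) {
                return restarts;             // give up after the configured limit
            }
            restarts++;
            server = launcher.launch();      // bring the service back
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java Watchdog <server start command...>");
            return;
        }
        new Watchdog(forCommand(Arrays.asList(args))).supervise(Integer.MAX_VALUE);
    }
}
```

A restart limit is included so that a server that cannot start at all does not loop forever; a production watchdog would probably also add a delay between restarts and log each failure.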
Discussion
Separate Tiers for JSP and EJB Technology-Based Components
We deployed the JSP technology-based components into the Web server and the EJB components into the EJB container. The Web server and EJB technology-based containers ran on the same machine, along with the database. The load generator was configured to run with 30 users for 15 minutes. Failures were introduced approximately 8 minutes into the run. For each run, either the Web server or the EJB technology-based container was killed.
EJB Technology-Based Container Failure
Unlike the configuration for a single J2EE platform-based server, clients still had access to the site, but only to static content. Like the single-server configuration, transactions submitted from the client immediately prior to the failure may have been successful without the client knowing.
Recovery: The watchdog script recovered the service automatically in less than 1 minute. Clients had to log in to the Web site to check the status of orders submitted just prior to service disruption. An advantage of this configuration is that shopping carts stored in HTTP session state were maintained. However, state stored in session beans was destroyed and had to be reentered.
Web Server Failure
Clients behaved much as they did in the configuration for a single J2EE platform-based server: they were no longer able to access the service. It seemed more likely, however, that transactions submitted immediately prior to the failure would be committed to the database. Clients could not know whether a transaction had completed successfully until they checked after the service restarted.
Recovery: The recovery process for the Web server is the same as for the EJB technology-based container and can be automated with the watchdog script. Recovery of the Web server took about 30 seconds, one-half the time required to recover an EJB technology-based container. As in the single-server configuration, shopping carts stored in HTTP session state were lost; state stored in session beans survived, although extra programming was needed to retrieve it. Clients had to log in to the Web site and redo any uncompleted transactions.
Discussion
This setup requires a Java technology-enabled Web server or servlet engine that can handle JSP pages, such as Tomcat, and the Web server must be configured to work with the EJB technology-based container. Also, deploying the application is a bit more time-consuming. In essence, two applications are deployed:
A Load-Balanced Cluster of J2EE Platform-Based Servers
The load generator was configured to run 60 interactive users for about 30 minutes and to fail one of the cluster members approximately 15 minutes into the run.
Observations: J2EE platform-based servers in the cluster replicated session state, for both HTTP sessions and stateful session beans, to at least one other member of the cluster. Replication was performed using IP multicast and/or direct socket connections. Thus, in the case of a single server failure, client session state survived. However, transactions in process on the failed cluster member were lost, as there was no automatic failover. Clients may or may not experience problems with the service. If client transactions are in process at the time of the failure, clients will receive error responses. If transactions are received just as a failure occurs, our data suggest that approximately 30 percent of clients will notice a disruption of service. Because session state has been replicated between J2EE platform-based servers, clients may be able to continue as if nothing happened. Occasionally a transaction will be successfully committed before the client receives a response from the server.
Recovery: The watchdog script recovers the failed cluster member automatically in less than 1 minute. If a client experiences a service disruption, the client must check the status of the last transaction. Session state is maintained, and in most cases no action is required by the client.
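To make the replication behavior concrete, here is a minimal in-memory sketch of session state being copied to one backup member so that it survives the loss of the member that wrote it. It is an illustration only, not any vendor's implementation: the class ClusterMember and its methods are hypothetical, a direct method call stands in for the IP multicast or socket transfer, and the code is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical cluster member that copies session attributes to one backup. */
final class ClusterMember {
    private final String name;
    private final Map<String, Map<String, String>> sessions = new HashMap<>();
    private ClusterMember backup;
    private boolean alive = true;

    ClusterMember(String name) {
        this.name = name;
    }

    void setBackup(ClusterMember backup) {
        this.backup = backup;
    }

    /** Writes a session attribute locally and replicates it to the backup member. */
    void setAttribute(String sessionId, String key, String value) {
        requireAlive();
        store(sessionId, key, value);
        if (backup != null && backup.alive) {
            backup.store(sessionId, key, value); // stands in for a network send
        }
    }

    /** Returns the attribute value, or null if this member holds no copy. */
    String getAttribute(String sessionId, String key) {
        requireAlive();
        Map<String, String> session = sessions.get(sessionId);
        return session == null ? null : session.get(key);
    }

    /** Simulates a crash: the member stops serving and its in-memory state is gone. */
    void fail() {
        alive = false;
        sessions.clear();
    }

    private void store(String sessionId, String key, String value) {
        sessions.computeIfAbsent(sessionId, id -> new HashMap<>()).put(key, value);
    }

    private void requireAlive() {
        if (!alive) {
            throw new IllegalStateException(name + " is down");
        }
    }

    public static void main(String[] args) {
        ClusterMember primary = new ClusterMember("node1");
        ClusterMember secondary = new ClusterMember("node2");
        primary.setBackup(secondary);

        primary.setAttribute("session-1", "cart", "2 widgets");
        primary.fail(); // the member that took the write is lost

        // The backup still holds the shopping cart.
        System.out.println(secondary.getAttribute("session-1", "cart"));
    }
}
```

Note that the sketch replicates only session attributes. A request that was being processed on the failed member is still lost, which matches the observation that transactions in process were not failed over.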
Discussion
One problem exists, however: The load balancer is a single point of failure. If the load balancer fails, access to the back-end J2EE platform-based servers is not possible. The next configuration fixes this problem.
Web Servers Load Balancing to a Cluster of J2EE Platform-Based Servers
We deployed the JSP pages and EJB components into the cluster of J2EE platform-based servers. We configured the Web servers, with a proxy plug-in, to redirect requests for JSP pages to the back-end cluster of J2EE platform-based servers. The load generator was configured to run for 10 minutes, with failures introduced 5 minutes into testing for both the Web and J2EE platform-based servers.
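A proxy of this kind has to pick a back-end server for each request and avoid members that are down. The following sketch shows one simple way to do that, assuming round-robin selection; it is not the plug-in used in the tests, and ProxyRouter and Backend are hypothetical names. A request that hits a failing server is reported to the client as an error rather than retried, since retrying could repeat a transaction that had already been committed.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical proxy-side router: round-robin over back ends, skipping failed ones. */
final class ProxyRouter {
    /** Stand-in for forwarding one request to one back-end server. */
    interface Backend {
        String handle(String request) throws IOException;
    }

    private final List<Backend> backends;
    private final Set<Backend> down = new HashSet<>();
    private int next = 0;

    ProxyRouter(List<Backend> backends) {
        this.backends = new ArrayList<>(backends);
    }

    /** Forwards the request to the next live back end; a failure is not retried. */
    String route(String request) throws IOException {
        Backend target = pick();
        try {
            return target.handle(request);
        } catch (IOException e) {
            markDown(target); // later requests will skip this server
            throw e;          // the in-flight request is reported as failed
        }
    }

    /** Call when a failed server has been restarted, for example by a watchdog. */
    synchronized void markUp(Backend backend) {
        down.remove(backend);
    }

    private synchronized void markDown(Backend backend) {
        down.add(backend);
    }

    private synchronized Backend pick() throws IOException {
        for (int tried = 0; tried < backends.size(); tried++) {
            Backend candidate = backends.get(next);
            next = (next + 1) % backends.size();
            if (!down.contains(candidate)) {
                return candidate;
            }
        }
        throw new IOException("no back-end server available");
    }

    public static void main(String[] args) {
        Backend up = request -> "served " + request;
        Backend crashed = request -> { throw new IOException("connection refused"); };
        ProxyRouter router = new ProxyRouter(Arrays.asList(up, crashed));
        for (String request : Arrays.asList("/a.jsp", "/b.jsp", "/c.jsp")) {
            try {
                System.out.println(router.route(request));
            } catch (IOException e) {
                System.out.println("error for " + request + ": " + e.getMessage());
            }
        }
    }
}
```

In the demo, only the request that lands on the crashed server sees an error; requests before and after it are served by the surviving server.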
Failure of the J2EE Platform-Based Server
Recovery: The recovery process was the same as for the previous configuration.
Web Server Failure
Recovery: The failed Web server could be recovered automatically with scripts, in less than 30 seconds.
Discussion
Conclusion
With the J2EE technology standard being widely adopted for developing Web-based enterprise applications, it has become increasingly important that J2EE platform-based servers be capable of delivering a reliable service. Most vendors of products based on J2EE technology have added value by providing high-availability features, such as clustering. We have presented our observations from testing a number of topologies. Results of our experiments thus far suggest that a multitiered approach produces the highest availability. To increase availability, automatic failover of Web servers is required.
Table of Results
Watchdog Script