by Phelim O'Doherty
August 2003
1 Introduction
This document outlines the performance considerations and expectations of
the JAIN SLEE (JSLEE) specification [1] developed through the Java Community
Process [2]. The most important items to highlight when calculating
performance for the JSLEE are:
- The type of transactions being carried out in the performance test.
All performance tests should ensure that each transaction executes a read/write
operation whose state is committed to disk and replicated, as opposed to
a simple read transaction. The read/write operation is where most of the
performance overhead arises on any platform. The new state resulting from the
write operation should also be replicated over more than one node. State keeps
track of the progression of a session, and replicating it maintains
availability when a specific node fails.
- The latency, which is the delay required to complete an application defined
unit of work. For the purposes of JSLEE, latency is the time from when an event
enters the system until that event has been processed. Processing of an
event implies executing the application logic associated with the event, then
replicating and committing the state changes performed by that application
logic.
- The fault tolerance guarantee, that is, in the event that a component
fails, a backup component or procedure can immediately take its place with
no loss of service.
- The high availability guarantee, that is, the system or component is
continuously operational for a desirably long length of time, measured
relative to “100%”.
- The hardware environment of the platform upon which the performance
test is executed.
2 Execution Environment Requirements
The requirements for a high performing execution platform within the core
communications network can be categorized under the following sections.
2.1 Throughput Requirements
Throughput requirements define the number of operations a platform must
service in a specific timeframe.
The typical operator requirement for throughput is the ability to service
‘one million busy hour call attempts’. A busy hour call attempt typically
consists of between two and ten events, each requiring its own unique
transaction; for the purpose of this document we will take ‘a mean of five
events per call’. One million call attempts at five events each is five
million transactions per hour, so the mean throughput requirement translates
to ‘1389 transactions per second WITH state updates that are transacted
and replicated to survive single node failures’, depending on the service.
It is expected that this throughput could be handled on ‘three '2 to 4-way'
boxes, i.e. up to 12 CPUs’. These mean transaction rate and CPU count figures
should be used when assessing the throughput performance of any JSLEE
implementation.
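The throughput arithmetic above can be reproduced with a few lines of Java.
This is only a sketch: the class and method names are hypothetical, and
rounding up to the next whole transaction is an assumption made here so that
the target is never understated.

```java
/** Sketch: derive a mean transactions-per-second target from a BHCA figure. */
final class ThroughputBudget {
    private static final int SECONDS_PER_HOUR = 3600;

    /** Mean transactions per second, rounded up to a whole transaction. */
    static long transactionsPerSecond(long busyHourCallAttempts, int eventsPerCall) {
        // One transaction per event, as stated in the text.
        long transactionsPerHour = busyHourCallAttempts * eventsPerCall;
        // Ceiling division (assumption: round up rather than truncate).
        return (transactionsPerHour + SECONDS_PER_HOUR - 1) / SECONDS_PER_HOUR;
    }

    /** Mean transactions per second that each CPU must sustain. */
    static double transactionsPerSecondPerCpu(long tps, int cpus) {
        return (double) tps / cpus;
    }

    public static void main(String[] args) {
        long tps = transactionsPerSecond(1_000_000L, 5);
        System.out.println("Mean TPS: " + tps); // prints "Mean TPS: 1389"
        System.out.println("Per CPU on 12 CPUs: " + transactionsPerSecondPerCpu(tps, 12));
    }
}
```

With the two and ten events per call that bound the range given above, the
same calculation yields 556 and 2778 transactions per second respectively.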
2.2 Latency Requirements
Latency requirements define
the length of time it takes for an event to enter the system, be processed
and update the state changed by that event. Latency is usually qualified with
a percentile, because individual measurements vary with load and a small share
of events can take much longer than the rest. The most common measure is the
'ninety five percent latency', that is, the latency that is met ninety five
percent of the time. The typical
operator requirement for latency is the ability ‘to set up a call in under
five hundred milliseconds’. Setting up a call typically consists of traversing
two to five network nodes with a minimum of two events per node, each with
a unique transaction; hence between four and ten transactions must be processed
in under five hundred milliseconds. Taking a mean of ‘seven transactions’,
the average latency requirement should be ‘roughly 71.4 milliseconds round trip
time per transaction (500 milliseconds divided by seven) WITH state updates
that are transacted and replicated to survive single node failures at the
75 percentile’, depending on the service.
Taking all variables into consideration, the ‘95 percentile requirement
for latency should not exceed 200 milliseconds round trip time per transaction’;
that is, only 5 percent of the latency results should exceed 200 milliseconds.
These latency statistics should be used when assessing the network delay of
any JSLEE implementation.
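As an illustration of how these latency figures might be checked against
measured results, the sketch below computes the per-transaction budget and a
percentile over a set of latency samples. The class and method names are
hypothetical, and the nearest-rank percentile definition is an assumption;
this document does not prescribe a particular percentile method.

```java
import java.util.Arrays;

/** Sketch: per-transaction latency budget and percentile check. */
final class LatencyBudget {
    /** Call setup budget divided evenly across the transactions in a setup. */
    static double perTransactionMillis(double callSetupMillis, int transactions) {
        return callSetupMillis / transactions;
    }

    /** Nearest-rank percentile (assumed method); percentile must be 1..100. */
    static double percentile(double[] latenciesMillis, int percentile) {
        if (latenciesMillis.length == 0 || percentile < 1 || percentile > 100) {
            throw new IllegalArgumentException("need samples and 1 <= percentile <= 100");
        }
        double[] sorted = latenciesMillis.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        // Integer ceiling of percentile/100 * n, avoiding floating point rounding.
        int rank = (percentile * sorted.length + 99) / 100;
        return sorted[rank - 1];
    }

    /** True if no more than 5 percent of samples exceed 200 ms. */
    static boolean meets95PercentileRequirement(double[] latenciesMillis) {
        return percentile(latenciesMillis, 95) <= 200.0;
    }

    public static void main(String[] args) {
        System.out.println("Budget per transaction (ms): " + perTransactionMillis(500.0, 7));
        double[] samples = {40, 55, 62, 70, 71, 80, 95, 120, 180, 260}; // made-up data
        System.out.println("Meets 95 percentile target: " + meets95PercentileRequirement(samples));
    }
}
```

The sample values in `main` are invented purely to exercise the methods; with
ten samples the 95 percentile under nearest rank is the largest sample, so a
single 260 ms outlier fails the check.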
2.3 High Availability Requirements
High availability refers to a system or
component that is continuously operational for a desirably long length of
time. Availability can be measured relative to ‘100% operational’ or ‘never
failing’. The typical operator requirement for high availability for a system
or product is ‘five 9s’, that is, ‘the system must be available 99.999%
of the time’. This translates to less than ‘six minutes down-time
per year’ (0.001% of a year is roughly 5.26 minutes).
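The relationship between an availability percentage and the permitted
downtime can be sketched as follows. The class and method names are
hypothetical, and a 365 day year is assumed.

```java
/** Sketch: convert an availability percentage into allowed downtime per year. */
final class AvailabilityBudget {
    // Assumption: a non-leap year of 365 days, i.e. 525,600 minutes.
    private static final double MINUTES_PER_YEAR = 365 * 24 * 60;

    /** Minutes per year the system may be unavailable at the given availability. */
    static double allowedDowntimeMinutesPerYear(double availabilityPercent) {
        if (availabilityPercent < 0.0 || availabilityPercent > 100.0) {
            throw new IllegalArgumentException("availability must be within 0..100");
        }
        return (100.0 - availabilityPercent) / 100.0 * MINUTES_PER_YEAR;
    }

    public static void main(String[] args) {
        // Five 9s: approximately 5.256 minutes, consistent with "less than six".
        System.out.println(allowedDowntimeMinutesPerYear(99.999));
    }
}
```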
2.4 Hardware Requirements
JSLEE solutions scale horizontally,
therefore cost is expected to scale close to linearly with message handling
capacity, regardless of the number of nodes. JSLEE is designed to be deployed
on high volume, lower cost symmetric multiprocessor (SMP) systems, for example
2 to 4-way systems. This enables more machines to be added to the system as
more capacity is needed, without the upfront investment in more expensive
multiprocessor systems such as 8-way systems. More nodes in the system also
enable more state replicas, thus increasing system availability; conversely,
in a system with fewer, larger nodes, a single node failure takes out a larger
share of capacity. Symmetric multiprocessor systems were chosen over single
processor systems because the increased throughput per node counterbalances
the communications overhead associated with messaging systems. In summary,
‘2 to 4-way multiprocessor systems’ are the most price competitive systems for
JSLEE architectures, taking into consideration communications overhead and
processing power per node.
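The effect of node size on failure impact can be illustrated with some simple
arithmetic. The sketch below assumes equally sized nodes and load spread evenly
across them, neither of which this document states; the class and method names
are hypothetical.

```java
/** Sketch: capacity lost, and load shifted, when one node in a cluster fails. */
final class NodeFailureImpact {
    /** Fraction of total capacity lost when one of the equal nodes fails. */
    static double capacityLostFraction(int nodes) {
        if (nodes < 1) {
            throw new IllegalArgumentException("nodes must be >= 1");
        }
        return 1.0 / nodes;
    }

    /** Load each surviving node carries after a single node failure. */
    static double loadPerSurvivingNode(double totalTps, int nodes) {
        if (nodes < 2) {
            throw new IllegalArgumentException("need at least 2 nodes to survive a failure");
        }
        return totalTps / (nodes - 1);
    }

    public static void main(String[] args) {
        double tps = 1389; // mean throughput target from section 2.1
        // The same 12 CPUs arranged as three 4-way or six 2-way boxes.
        for (int nodes : new int[] {3, 6}) {
            System.out.println(nodes + " nodes: capacity lost fraction "
                    + capacityLostFraction(nodes)
                    + ", TPS per surviving node " + loadPerSurvivingNode(tps, nodes));
        }
    }
}
```

Under these assumptions, three 4-way boxes lose a third of their capacity on a
single failure, while six 2-way boxes lose a sixth.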
3 References
[1] Sun Microsystems, "JAIN SLEE Specification - JSR 22", 2002.
See http://jcp.org/en/jsr/detail?id=22.
[2] Sun Microsystems, "Java Community Process Program - JSR 171", 2002.
See http://jcp.org/en/jsr/detail?id=171.