By Steve Meloan
(July 2000)
The notion of a legacy system typically brings to mind an information
facility that was developed long ago, often using antiquated technology,
and needing to be brought kicking and screaming into the 21st Century.
But
with Java technology having now been on the scene for over five
years, it's
increasingly possible for developers to find themselves re-architecting
systems that were written from their inception in the Java programming
language.
This installment of "Backstage at the JDC" examines the issues
that led to re-architecting the Java Developer Connection (JDC)
site itself, along with the solutions implemented and the advantages
they brought. Future installments will drill down into the various
enhancements and changes that were made to the site, and will provide
sample code to illustrate these changes.
The Early Days
As with many sites, simple load issues first signaled the need for a
change
at the JDC. "The original membership numbers were in the 10,000 range,"
reports Will Snow, manager of the JDC site's current development team,
"but
we're now over 1.5 million." Snow and his team inherited the site from
an
original group of developers. "Serious problems began to appear at
about
the one-million member range," says Snow. "After that, we were spending
more
and more time supporting the site, fighting these little fires, but
never
really getting to the core of the problem--that the load on the original
design had simply exceeded its capabilities."
Database Problems
Analyzing the site's internal architecture in detail, Snow and his team
found several functionalities in need of redesign. "We looked at the
database code," says Joe Mocker, member of the JDC development team,
"and realized
that everything was synching around a single connection. You might have
registration code that was creating users, and then you might have some
survey, and all of it was using that one connection. The upshot was, if
a
survey function was taking a long time to complete, no one could log
on."
The architecture the group encountered was based on the original JDK 1.x
platform, and the bare-bones database access design may well have been
driven by feared instabilities in certain ODBC/JDBC drivers of the time.
Of course, given an early JDC membership in the mere tens of thousands,
there had been no real problem at the time in going this single
connection
route. "At that level," says Mocker, "queries were not going to take
much
time, because there were only maybe 100,000 records in the database."
But
it was clearly not an architecture that would translate well into a
membership load in the millions.
Class Hierarchy Issues
Meanwhile, the Solaris Developer Connection (SolDC) had come into
being,
and this second site had essentially been modeled directly after the
JDC.
The SolDC had thus inherited the same capacity limitations as the JDC
site.
But beyond being similar in their architectures, the two sites were also
functionally linked. "They were using one database on the back end to
track both sets of customers," says Mocker. "There can be members of
the
JDC, the SolDC, or both."
The team had thus inherited two similar but distinct member-driven
systems, each with its own needs and functionality, and both with
capacity issues rooted in their original architectures. That made
significant changes to the systems a matter of even greater urgency.
"Our
goal was to go back and re-invent things," says Snow, "to build a better
stage that you could more easily add features onto for either system."
And
any new development stage also needed to anticipate future developer
connection sites. "We're now the Developer Connection Engineering
group,"
says Snow, "so we'll not only be working on the JDC and SolDC sites, but
others as they show up."
"[We] basically wanted to move to a more flexible architecture," adds Greg
Layer, of the Developer Connection team, "one where we could easily
incorporate other new sites, while using the same code base."
Instrumentation Issues
In the process of supporting two systems that were experiencing regular
capacity problems, it had also become increasingly apparent that the
sites
lacked any significant degree of "instrumentation" features--the ability
to view the inner workings of the systems in order to properly gauge load,
status, and functionality. This, too, became a primary focal point in
the
new group's re-architecting of the sites.
The original level of instrumentation found in the JDC and SolDC sites
was
limited, to say the least. "The only thing I can recall is that we were
able to peek into the active session cache," says Mocker. "There was a
facility to cause the session servlet to dump out all of its sessions."
"But there was no systematic way to go in and get data from each
component
of the system," says Layer. "It was a matter of looking at log files
and
scanning through the session cache output."
New Site Requirements
To review, the Developer Connection team sought the following in their
redesign of the JDC and SolDC sites:
- An easy-to-work-with class hierarchy, to better facilitate
making changes and enhancements to the JDC and SolDC sites (as well as
whatever new sites might arise in the future).
- A revamped database access scheme, to reduce performance
bottlenecks that had arisen with the increased memberships now found on
the
JDC and SolDC sites.
- A better-defined, more flexible, and more informative instrumentation
facility, to better assess the inner workings of the sites,
as well as to aid in debugging.
Solutions
The re-architected JDC and SolDC sites are now running on an
Apache/JRun Web server/servlet runner combination. "It's one of many such
combinations
that are currently out there," says Snow.
"But the architecture we have now is not really tied to any given
combination," adds Rama Roberts of the Developer Connection team.
Class Hierarchy
One of the major infrastructure changes to the architecture of the
Developer Connection sites was in the redesign of the basic class
hierarchy. And what better name than "Barstow"--any-town USA--to
specify
the generic functionalities of both sites?
"That was a big part of redesigning the system," says Roberts. "We
pulled
things out to the 'Barstow' level--where everything had previously been
at
the JDC and SolDC level." The Developer Connection team effectively
abstracted out the common functional elements of the JDC and SolDC
sites,
and reduced them to a single, common core. "With Barstow," explains
Roberts, "we took the common functionality of both sites--the ability to
have restricted areas of content, to have session management, with
people
logging in, to have people registering--and pulled them out to the
highest
level."
From there, additional functionality for a given site could be added as
needed. "Instead of re-writing everything," says Roberts, "you can
still change things or add things, but you get the benefit of having
this core automatically provided."
"That was one of the key design goals of Barstow," adds Layer, "to make
it easy to implement these services and features for other sites."
And the revamped class hierarchy has already begun to pay off--even
beyond
the Developer Connection sites. "Just recently," says Roberts, "I was
approached by another group at Java Software that had content they
wanted to
offer to a select number of people who were defining a new API." Using
the
Developer Connection's revamped class hierarchy, such restricted access
functionality was relatively easy to implement. "They've now got a
tailor-made access-restricted group," says Roberts, "but with no major
new
development required."
The actual class hierarchy is now arranged as follows:
core/
    barstow/      -- the generic stubs and impl.
        registry/
        user
        access
        session
    jdc/          -- JDC specific impl.
    soldc/        -- SolDC specific impl.
apps/
    barstow/      -- apps that can work in any impl.
    jdc/          -- apps specific to the JDC
    soldc/        -- apps specific to the SolDC
"We wanted a clean way of maintaining different variants of the code for
each web site," explains Mocker. "The logical way of doing this in the
Java language is to use generic interfaces and abstract classes, along
with
package naming, to maintain class hierarchies for the individual
variants."
The Barstow package houses a generic implementation--the interface and
abstract class definitions, as well as generic services (those that can
be run by, or can use classes from, any other variant). Meanwhile, the
variant packages, such as jdc/, implement the generic interfaces and
extend the abstract classes to provide the behavior best suited to their
individual needs. They may also use generic services as needed.
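As a rough illustration of this layering, the sketch below shows a generic interface, a generic abstract class, and two variant implementations. All of the names here (AccessPolicy, AbstractAccessPolicy, JdcAccessPolicy, SolDcAccessPolicy) and the access rules themselves are hypothetical and do not come from the JDC code base. The classes sit in one file only for brevity; in a layout like the one above they would presumably live in separate generic and variant packages.

```java
// Generic interface: would belong to the generic (barstow) package.
// Hypothetical name and method, for illustration only.
interface AccessPolicy {
    boolean mayView(String memberId, String contentArea);
}

// Generic abstract class: holds logic every site shares and leaves
// one hook for each variant to fill in.
abstract class AbstractAccessPolicy implements AccessPolicy {
    @Override
    public final boolean mayView(String memberId, String contentArea) {
        // Shared rule (assumed): anonymous visitors see no restricted content.
        if (memberId == null || memberId.isEmpty()) {
            return false;
        }
        return isAreaOpenTo(memberId, contentArea);
    }

    // Variant-specific decision.
    protected abstract boolean isAreaOpenTo(String memberId, String contentArea);
}

// Variant implementation: would belong to a jdc package.
class JdcAccessPolicy extends AbstractAccessPolicy {
    @Override
    protected boolean isAreaOpenTo(String memberId, String contentArea) {
        return true; // hypothetical rule: members may view every area
    }
}

// Variant implementation: would belong to a soldc package.
class SolDcAccessPolicy extends AbstractAccessPolicy {
    @Override
    protected boolean isAreaOpenTo(String memberId, String contentArea) {
        // Hypothetical rule: areas under "beta/" stay closed.
        return !contentArea.startsWith("beta/");
    }
}
```

Code written against AccessPolicy alone never needs to know which variant it is running with, which is the property that lets generic services work with any site.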
Finally, the Developer Connection group wanted to establish a firm
delineation between what they saw as "core" functionality--services
essential to any site's implementation (session management, access
control,
registration and logon services), and "applications," which are the
unique
meat of each site, and tend to be site-specific. This was achieved by
establishing separate subdirectories--core/ and
apps/--for the respective sets of code.
The above architecture is more efficient both in terms of maintaining
and
enhancing the current developer sites, and in the development of future
sites. "Now, if we go to develop 'Foo Developer Connection,'" says
Mocker,
"we can more easily grab only the core, and know that we are not
carrying
along any baggage from some other variant. We typically just create a
new
package--say, xdc, start with the generic base components,
and
then fill in new variant components as needed. Meanwhile, if we make
bug
fixes to a piece of the code base, we clearly know what servers need to
be
updated. If we change something in a barstow package, all
the
servers need to be updated. But if it's something in a jdc
package, then only the JDC servers need to be updated."
Database Pooling
A database pool is simply a cache of open connections that can be used
and
reused, thus cutting down on the overhead of creating and destroying
database connections. Such a pool helps to mitigate many of the
database
performance bottlenecks the Developer Connection group had found in the
original JDC and SolDC implementations. "It provides for better
multi-threaded performance," explains Mocker. "As one thread needs to
do
something with the database, it can simply go grab a connection.
Meanwhile, if another thread comes along and needs to do something else,
rather than waiting for the first connection to finish, it can grab its
own
connection and go do its thing. And then when they finish," he adds,
"they
can return those connections to the pool. That saves the overhead of
making the connections again each time you need them."
An alternative to the database pool would have been to simply let each
of
the site's various servlets establish their own separate connections to
the
database. But this would have solved only part of the problem. "Since
there can be multiple clients accessing the same servlet at the same
time,"
says Mocker, "you'd still have problems--because if one customer gets
the
connection before the other, then the second one is locked out. But if
they're going through a pool, the first one grabs its connection, the
second one comes in and grabs its connection--they're both doing their
thing--and then they free themselves up."
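The grab-and-return behavior Mocker describes can be sketched in a few lines of Java. This is a minimal illustration and not the JDC implementation: the class name SimplePool and its methods are assumptions, the pool is generic so that it can be exercised without a database (a real pool would hold java.sql.Connection objects), and it omits size limits, timeouts, and validation of stale connections.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical minimal pool: reuses idle resources, creates new ones on demand.
class SimplePool<T> {
    private final BlockingQueue<T> idle = new LinkedBlockingQueue<>();
    private final Supplier<T> factory;
    private final AtomicInteger created = new AtomicInteger();

    SimplePool(Supplier<T> factory) {
        this.factory = factory;
    }

    // Grab an idle resource if one exists; otherwise open a new one.
    // A thread never waits for another thread's resource to be released.
    T acquire() {
        T resource = idle.poll();
        if (resource == null) {
            resource = factory.get();
            created.incrementAndGet();
        }
        return resource;
    }

    // Hand the resource back so that a later caller can reuse it.
    void release(T resource) {
        idle.offer(resource);
    }

    // How many resources were created from scratch (useful for monitoring).
    int createdCount() {
        return created.get();
    }
}
```

Two callers that acquire at the same time each get their own resource, and a caller that arrives after a release reuses the returned one instead of paying the cost of opening a new connection.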
The group explored a number of off-the-shelf database pool
implementations,
but none seemed to offer the flexibility they were looking for. A
primary
design goal for the pool facility was to be able to turn it on and off
at
will. But most off-the-shelf pools made this difficult to do, requiring
that one either use the pool all the time, or not at all. "I wanted to
be
able to use the connection pool or not, just by plugging in a different
database URL," notes Mocker.
Ultimately, the group included in their database pool implementation a
custom driver set to recognize URLs that indicated a need for pooling.
Using JDBC, the team set the rules for a pool-enabled URL. "I decided
that
URLs that started with jdbc:dpool: would be what my driver
recognized," says Mocker. "Everything after that prefix would be the
real
database URL, specifying where you wanted to connect."
Thus:
jdbc:oracle:thin:user/password@localhost:1521:sid
could be turned into a database pool-enabled URL by instead specifying:
jdbc:dpool:oracle:thin:user/password@localhost:1521:sid
"If the driver sees jdbc:dpool in the URL," says Mocker,
"it
takes the rest of the line as the actual URL, looks it up to see whether
there is already a pool connection available, and if not, gets a new
connection, which is then ultimately returned back to the pool when it
is
finished."
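One way such a driver might look is sketched below, using the standard java.sql.Driver interface. Only the jdbc:dpool: prefix comes from the article; the class name DPoolDriver, the realUrl helper, the per-URL idle queues, and the use of a dynamic proxy to intercept close() are all assumptions made for illustration. The sketch keys idle connections by URL alone and does no validation or size limiting.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.DriverPropertyInfo;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.Properties;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.logging.Logger;

// Hypothetical pooling driver: handles only URLs that start with jdbc:dpool:.
class DPoolDriver implements Driver {
    static final String PREFIX = "jdbc:dpool:";
    private final ConcurrentMap<String, Queue<Connection>> idleByUrl = new ConcurrentHashMap<>();

    // "jdbc:dpool:oracle:thin:..." becomes "jdbc:oracle:thin:...".
    static String realUrl(String poolUrl) {
        return "jdbc:" + poolUrl.substring(PREFIX.length());
    }

    @Override
    public boolean acceptsURL(String url) {
        return url != null && url.startsWith(PREFIX);
    }

    @Override
    public Connection connect(String url, Properties info) throws SQLException {
        if (!acceptsURL(url)) {
            return null; // JDBC convention: not our URL, let another driver try
        }
        String real = realUrl(url);
        Queue<Connection> idle = idleByUrl.computeIfAbsent(real, k -> new ConcurrentLinkedQueue<>());
        Connection raw = idle.poll(); // reuse an idle connection if one exists
        if (raw == null) {
            Properties props = (info != null) ? info : new Properties();
            raw = DriverManager.getConnection(real, props); // delegate to the real driver
        }
        return pooledView(raw, idle);
    }

    // Wraps the real connection so that close() returns it to the idle queue.
    private static Connection pooledView(Connection raw, Queue<Connection> idle) {
        AtomicBoolean returned = new AtomicBoolean(false);
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("close")) {
                if (returned.compareAndSet(false, true)) {
                    idle.offer(raw); // return to the pool once, even if closed twice
                }
                return null;
            }
            try {
                return method.invoke(raw, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // surface the real driver's exception
            }
        };
        return (Connection) Proxy.newProxyInstance(
                DPoolDriver.class.getClassLoader(), new Class<?>[] {Connection.class}, handler);
    }

    @Override
    public DriverPropertyInfo[] getPropertyInfo(String url, Properties info) {
        return new DriverPropertyInfo[0];
    }

    @Override
    public int getMajorVersion() { return 0; }

    @Override
    public int getMinorVersion() { return 1; }

    @Override
    public boolean jdbcCompliant() { return false; }

    @Override
    public Logger getParentLogger() throws SQLFeatureNotSupportedException {
        throw new SQLFeatureNotSupportedException("logging not supported in this sketch");
    }
}
```

After DriverManager.registerDriver(new DPoolDriver()), application code would switch pooling on or off purely by changing the URL string it passes to DriverManager.getConnection, which matches the design goal Mocker states.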
Further details of the JDC/SolDC database pool implementation will be
explored in a future article installment of "Backstage at the JDC."
Instrumentation
The Developer Connection team strove in its redesign to provide not only
a
greater degree of instrumentation information, but also to make access
to
the instrumentation data more centralized. "We came up with a design
for
an abstract class," says Layer, "that each of the components in our
system
could implement. That way, our management class could request
instrumentation data from any registered object in the system."
The redesign took the responsibility of providing instrumentation data,
and embedded it in the actual components themselves. "We've taken
instrumentation to a more generic place," says Roberts. "It isn't a
central location for storing your data, it's a central location
for
knowing how to obtain the data."
"Exactly," adds Layer. "The data is now maintained and produced by the
components themselves. What's centralized is the facility to say, 'show
me your data.' It's really a small distinction, but it's an important
one--the fact that there's really nothing in the system that knows about
every piece, or even knows what it's keeping track of. It simply knows
that the components are keeping track of something, and they all
implement, say, the get_data method, and then they all
return
some piece of information when you call that method."
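The division of labor described here can be sketched as follows. This is an illustration under stated assumptions, not the team's code: the class names Instrumented, InstrumentationManager, and SessionCache are hypothetical, the method is spelled getData in Java style where the article says get_data, and returning a Map of named values is an assumed data format.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical abstract class each component extends to expose its own data.
abstract class Instrumented {
    abstract Map<String, Object> getData();
}

// Central facility: knows how to ask for data, not what the data means.
class InstrumentationManager {
    private final Map<String, Instrumented> registry = new ConcurrentHashMap<>();

    void register(String name, Instrumented component) {
        registry.put(name, component);
    }

    // Asks every registered component to "show me your data".
    Map<String, Map<String, Object>> report() {
        Map<String, Map<String, Object>> out = new TreeMap<>();
        registry.forEach((name, component) -> out.put(name, component.getData()));
        return out;
    }
}

// Example component that maintains and produces its own figures.
class SessionCache extends Instrumented {
    private final AtomicInteger activeSessions = new AtomicInteger();

    void login() {
        activeSessions.incrementAndGet();
    }

    @Override
    Map<String, Object> getData() {
        Map<String, Object> data = new LinkedHashMap<>();
        data.put("activeSessions", activeSessions.get());
        return data;
    }
}
```

The manager holds no counters of its own. Adding a new component to the report requires only that the component extend Instrumented and register itself.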
Summary
The fruits of the re-architecting of the JDC and SolDC sites are now
seen
on a daily basis. In spite of the membership of the JDC having risen
from
1.1 million to over 1.5 million since the site was redesigned, the
Developer
Connection team has seen a significant decrease in both the load on the
system's servers, and in time spent servicing the site. "Our members
don't
see anything different," says Roberts, "they just experience a more
reliable system!"
Future installments of "Backstage at the JDC" will more fully
explore the changes that were made in redesigning the JDC and SolDC
sites.
About the author
Steve Meloan, a frequent contributor to the JDC and java.sun.com, is a journalist and former software developer. His work has appeared in Wired, Rolling Stone, BUZZ, San Francisco Examiner, ZDTV's "The Site," and American Cybercast's "The Pyramid."