Backstage at the JDC: Redesigning the Architecture

 
 



The notion of a legacy system typically brings to mind an information facility that was developed long ago, often using antiquated technology, and needing to be brought kicking and screaming into the 21st Century.

But with Java technology having now been on the scene for over five years, it's increasingly possible for developers to find themselves re-architecting systems that were written from their inception in the Java programming language.

This installment of "Backstage at the JDC" examines the issues that led to re-architecting the Java Developer Connection (JDC) site itself, the solutions that were implemented, and the advantages they brought. Future installments will drill down into the various enhancements and changes that were made to the site, and will provide sample code to illustrate those changes.

The Early Days

As with many sites, simple load issues first signaled the need for a change at the JDC. "The original membership numbers were in the 10,000 range," reports Will Snow, manager of the JDC site's current development team, "but we're now over 1.5 million." Snow and his team inherited the site from an original group of developers. "Serious problems began to appear at about the one-million member range," says Snow. "After that, we were spending more and more time supporting the site, fighting these little fires, but never really getting to the core of the problem--that the load on the original design had simply exceeded its capabilities."

Database Problems

Analyzing the site's internal architecture in detail, Snow and his team found several areas in need of redesign. "We looked at the database code," says Joe Mocker, member of the JDC development team, "and realized that everything was synching around a single connection. You might have registration code that was creating users, and then you might have some survey, and all of it was using that one connection. The upshot was, if a survey function was taking a long time to complete, no one could log on."

The architecture the group encountered was based on the original JDK 1.x platform, and the bare-bones database access design was likely driven by feared instabilities in certain ODBC/JDBC drivers of the time. Of course, given an early JDC membership in the mere tens of thousands, the single-connection route had posed no real problem at the time. "At that level," says Mocker, "queries were not going to take much time, because there were only maybe 100,000 records in the database." But it was clearly not an architecture that would translate well to a membership load in the millions.

Class Hierarchy Issues

Meanwhile, the Solaris Developer Connection (SolDC) had come into being, and this second site had essentially been modeled directly after the JDC. The SolDC had thus inherited the same capacity limitations as the JDC site. But beyond being similar in their architectures, the two sites were also functionally linked. "They were using one database on the back end to track both sets of customers," says Mocker. "There can be members of the JDC, the SolDC, or both."

The team had thus inherited two similar but distinct member-driven systems, each with its own needs and functionality, and both suffering capacity problems rooted in their original architectures. That made significant changes to the systems a matter of even greater urgency. "Our goal was to go back and re-invent things," says Snow, "to build a better stage that you could more easily add features onto for either system." And any new development stage also needed to anticipate future developer connection sites. "We're now the Developer Connection Engineering group," says Snow, "so we'll not only be working on the JDC and SolDC sites, but others as they show up."

"Basically, [we] wanted to move to a more flexible architecture," adds Greg Layer, of the Developer Connection team, "one where we could easily incorporate other new sites, while using the same code base."

Instrumentation Issues

In the process of supporting two systems that were experiencing regular capacity problems, it had also become increasingly apparent that the sites lacked any significant degree of "instrumentation": the ability to view the inner workings of the systems in order to properly gauge load, status, and functionality. This, too, became a primary focal point in the new group's re-architecting of the sites.

The original level of instrumentation found in the JDC and SolDC sites was limited, to say the least. "The only thing I can recall is that we were able to peek into the active session cache," says Mocker. "There was a facility to cause the session servlet to dump out all of its sessions."

"But there was no systematic way to go in and get data from each component of the system," says Layer. "It was a matter of looking at log files and scanning through the session cache output."

New Site Requirements

To review, the Developer Connection team sought the following in their redesign of the JDC and SolDC sites:

  • An easy-to-work-with class hierarchy, to better facilitate making changes and enhancements to the JDC and SolDC sites (as well as whatever new sites might arise in the future).
  • A revamped database access scheme, to reduce performance bottlenecks that had arisen with the increased memberships now found on the JDC and SolDC sites.
  • A more well-defined, flexible, and informative instrumentation functionality, to better assess the inner workings of the sites, as well as to aid in debugging.

Solutions

The re-architected JDC and SolDC sites are now running on Apache and JRun, a Web server and servlet runner combination. "It's one of many such combinations that are currently out there," says Snow.

"But the architecture we have now is not really tied to any given combination," adds Rama Roberts of the Developer Connection team.

Class Hierarchy

One of the major infrastructure changes to the architecture of the Developer Connection sites was in the redesign of the basic class hierarchy. And what better name than "Barstow"--any-town USA--to specify the generic functionalities of both sites?

"That was a big part of redesigning the system," says Roberts. "We pulled things out to the 'Barstow' level--where everything had previously been at the JDC and SolDC level." The Developer Connection team effectively abstracted out the common functional elements of the JDC and SolDC sites, and reduced them to a single, common core. "With Barstow," explains Roberts, "we took the common functionality of both sites--the ability to have restricted areas of content, to have session management, with people logging in, to have people registering--and pulled them out to the highest level."

From there, additional functionality for a given site could be added as needed. "Instead of re-writing everything," says Roberts, "you can still change things or add things, but you get the benefit of having this core automatically provided." "That was one of the key design goals of Barstow," adds Layer, "to make it easy to implement these services and features for other sites."

And the revamped class hierarchy has already begun to pay off--even beyond the Developer Connection sites. "Just recently," says Roberts, "I was approached by another group at Java Software that had content they wanted to offer to a select number of people who were defining a new API." Using the Developer Connection's revamped class hierarchy, such restricted access functionality was relatively easy to implement. "They've now got a tailor-made access-restricted group," says Roberts, "but with no major new development required."

The actual class hierarchy is now arranged as follows:

core/
     barstow/   -- the generic stubs and impl.
             registry/
                      user
                      access
                      session

     jdc/       -- JDC specific impl.

     soldc/     -- SolDC specific impl.

apps/
     barstow/   -- apps that can work in any impl.

     jdc/       -- apps specific to the JDC

     soldc/     -- apps specific to the SolDC

"We wanted a clean way of maintaining different variants of the code for each web site," explains Mocker. "The logical way of doing this in the Java language is to use generic interfaces and abstract classes, along with package naming, to maintain class hierarchies for the individual variants."

The Barstow package houses a generic implementation: the interface and abstract class definitions, as well as generic services (those that can be run by, or can use classes from, any other variant). Meanwhile, the variant packages, such as jdc/, implement the generic interfaces and extend the abstract classes to supply the behavior best suited to their individual needs. They may also use generic services as needed.
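The split between a generic layer and per-site variants can be illustrated with a minimal sketch. Every name below (AccessPolicy, AbstractAccessPolicy, JdcAccessPolicy, SolDcAccessPolicy, and the content-area strings) is hypothetical and does not come from the Developer Connection code base; the types are nested in one class only so the example compiles as a single file, where the real layout would place them in separate packages.

```java
// Sketch only: in the layout above these types would sit in separate packages
// (for example a barstow package and a jdc package). All names are hypothetical.
class VariantSketch {

    /** Generic contract, as it might appear in the barstow package. */
    interface AccessPolicy {
        boolean mayView(String memberSite, String contentArea);
    }

    /** Generic abstract class: shared rules plus one hook for each variant. */
    abstract static class AbstractAccessPolicy implements AccessPolicy {
        @Override
        public final boolean mayView(String memberSite, String contentArea) {
            // Rule shared by every site in this sketch: public content is open.
            if (contentArea.startsWith("public/")) {
                return true;
            }
            // Anything else is decided by the site-specific variant.
            return siteSpecificCheck(memberSite, contentArea);
        }

        protected abstract boolean siteSpecificCheck(String memberSite, String contentArea);
    }

    /** Variant as it might appear in the jdc package. */
    static class JdcAccessPolicy extends AbstractAccessPolicy {
        @Override
        protected boolean siteSpecificCheck(String memberSite, String contentArea) {
            return "jdc".equals(memberSite) && contentArea.startsWith("jdc/");
        }
    }

    /** Variant as it might appear in the soldc package. */
    static class SolDcAccessPolicy extends AbstractAccessPolicy {
        @Override
        protected boolean siteSpecificCheck(String memberSite, String contentArea) {
            return "soldc".equals(memberSite) && contentArea.startsWith("soldc/");
        }
    }

    public static void main(String[] args) {
        AccessPolicy jdc = new JdcAccessPolicy();
        AccessPolicy soldc = new SolDcAccessPolicy();
        System.out.println(jdc.mayView("jdc", "jdc/early-access"));   // true
        System.out.println(jdc.mayView("soldc", "jdc/early-access")); // false
        System.out.println(soldc.mayView("jdc", "public/faq"));       // true
    }
}
```

In a sketch like this, a new site would add one more subclass and leave the generic layer untouched, which is the property the team describes wanting.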

Finally, the Developer Connection group wanted to establish a firm delineation between what they saw as "core" functionality (services essential to any site's implementation, such as session management, access control, registration, and logon services) and "applications," which are the unique meat of each site and tend to be site-specific. This was achieved by establishing separate subdirectories, core/ and apps/, for the respective sets of code.

The above architecture is more efficient both in terms of maintaining and enhancing the current developer sites, and in the development of future sites. "Now, if we go to develop 'Foo Developer Connection,'" says Mocker, "we can more easily grab only the core, and know that we are not carrying along any baggage from some other variant. We typically just create a new package--say, xdc, start with the generic base components, and then fill in new variant components as needed. Meanwhile, if we make bug fixes to a piece of the code base, we clearly know what servers need to be updated. If we change something in a barstow package, all the servers need to be updated. But if it's something in a jdc package, then only the JDC servers need to be updated."

Database Pooling

A database pool is simply a cache of open connections that can be used and reused, thus cutting down on the overhead of creating and destroying database connections. Such a pool helps to mitigate many of the database performance bottlenecks the Developer Connection group had found in the original JDC and SolDC implementations. "It provides for better multi-threaded performance," explains Mocker. "As one thread needs to do something with the database, it can simply go grab a connection. Meanwhile, if another thread comes along and needs to do something else, rather than waiting for the first connection to finish, it can grab its own connection and go do its thing. And then when they finish," he adds, "they can return those connections to the pool. That saves the overhead of making the connections again each time you need them."

An alternative to the database pool would have been to simply let each of the site's various servlets establish their own separate connections to the database. But this would have solved only part of the problem. "Since there can be multiple clients accessing the same servlet at the same time," says Mocker, "you'd still have problems--because if one customer gets the connection before the other, then the second one is locked out. But if they're going through a pool, the first one grabs its connection, the second one comes in and grabs its connection--they're both doing their thing--and then they free themselves up."
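The grab-and-return cycle Mocker describes can be sketched in a few lines. This is not the Developer Connection pool: SimplePool, its methods, and the fixed pool size are assumptions made for illustration, and the type parameter C stands in for java.sql.Connection so that the sketch runs without a database.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Sketch only: a deliberately tiny pool with hypothetical names.
class PoolSketch {

    /** C stands in for java.sql.Connection in this illustration. */
    static class SimplePool<C> {
        private final BlockingQueue<C> idle;

        SimplePool(int size, Supplier<C> factory) {
            idle = new ArrayBlockingQueue<>(size);
            for (int i = 0; i < size; i++) {
                idle.add(factory.get()); // open every connection once, up front
            }
        }

        /** Takes an idle connection; waits only if all of them are in use. */
        C acquire() {
            try {
                return idle.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for a connection", e);
            }
        }

        /** Returns a connection for reuse instead of closing it. */
        void release(C connection) {
            idle.add(connection);
        }

        int idleCount() {
            return idle.size();
        }
    }

    public static void main(String[] args) {
        int[] opened = {0}; // counts how many "connections" were ever created
        SimplePool<String> pool = new SimplePool<>(2, () -> "conn-" + (++opened[0]));

        String first = pool.acquire();  // one caller grabs a connection
        String second = pool.acquire(); // a second caller does not wait for the first
        pool.release(first);            // finished work goes back to the pool
        String reused = pool.acquire(); // the returned connection is handed out again

        System.out.println(first + " " + second + " " + reused);
        System.out.println("connections created: " + opened[0]);
    }
}
```

Two callers proceed side by side here, and the third request reuses an existing connection rather than opening a new one. A production pool would also need to validate, time out, and close connections, none of which this sketch attempts.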

The group explored a number of off-the-shelf database pool implementations, but none seemed to offer the flexibility they were looking for. A primary design goal for the pool facility was to be able to turn it on and off at will. But most off-the-shelf pools made this difficult to do, requiring that one either use the pool all the time, or not at all. "I wanted to be able to use the connection pool or not, just by plugging in a different database URL," notes Mocker.

Ultimately, the group included in their database pool implementation a custom driver designed to recognize URLs that indicated a need for pooling. Using JDBC, the team set the rules for a pool-enabled URL. "I decided that URLs that started with jdbc:dpool: would be what my driver recognized," says Mocker. "Everything after that prefix would be the real database URL, specifying where you wanted to connect."

Thus:

jdbc:oracle:thin:user/password@localhost:1521:sid

could be turned into a database pool-enabled URL by instead specifying:

jdbc:dpool:oracle:thin:user/password@localhost:1521:sid

"If the driver sees jdbc:dpool in the URL," says Mocker, "it takes the rest of the line as the actual URL, looks it up to see whether there is already a pool connection available, and if not, gets a new connection, which is then ultimately returned back to the pool when it is finished."

Further details of the JDC/SolDC database pool implementation will be explored in a future article installment of "Backstage at the JDC."
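Until then, the URL convention described above can be sketched as a small helper. The class and method names are hypothetical, and the assumption that the driver re-attaches the jdbc: scheme to the remainder is inferred only from the two example URLs shown earlier. A real driver would implement the java.sql.Driver interface, whose acceptsURL and connect methods are where checks like these would live.

```java
// Sketch only: hypothetical helper showing how a pool-enabled URL might be
// recognized and mapped back to the underlying database URL.
class DpoolUrlSketch {

    static final String JDBC_PREFIX = "jdbc:";
    static final String POOL_PREFIX = "jdbc:dpool:";

    /** True when the URL asks for pooling via the jdbc:dpool: prefix. */
    static boolean wantsPool(String url) {
        return url != null && url.startsWith(POOL_PREFIX);
    }

    /**
     * Strips the pooling marker. Assumption: the text after the prefix is
     * turned back into a normal JDBC URL by putting "jdbc:" in front of it.
     * URLs without the marker are returned unchanged.
     */
    static String realUrl(String url) {
        if (!wantsPool(url)) {
            return url;
        }
        return JDBC_PREFIX + url.substring(POOL_PREFIX.length());
    }

    public static void main(String[] args) {
        String pooled = "jdbc:dpool:oracle:thin:user/password@localhost:1521:sid";
        System.out.println(wantsPool(pooled)); // true
        System.out.println(realUrl(pooled));   // jdbc:oracle:thin:user/password@localhost:1521:sid
    }
}
```

Because the decision rests entirely on the URL string, switching pooling on or off would be a configuration change rather than a code change, which matches the design goal Mocker states.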

Instrumentation

The Developer Connection team strove in its redesign to provide not only a greater degree of instrumentation information, but also to make access to the instrumentation data more centralized. "We came up with a design for an abstract class," says Layer, "that each of the components in our system could implement. That way, our management class could request instrumentation data from any registered object in the system."
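A minimal sketch of that arrangement follows. The names Instrumented, Manager, SessionCache, and getData are assumptions made for illustration, as is the choice of a map for the returned data; the source does not show the team's actual class or method signatures.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: hypothetical names throughout.
class InstrumentationSketch {

    /** Abstract class each component extends to expose its own data. */
    abstract static class Instrumented {
        private final String name;

        Instrumented(String name) {
            this.name = name;
        }

        String name() {
            return name;
        }

        /** Each component decides what it reports. */
        abstract Map<String, Object> getData();
    }

    /** Knows only which objects are registered, not what they track. */
    static class Manager {
        private final List<Instrumented> registered = new ArrayList<>();

        void register(Instrumented component) {
            registered.add(component);
        }

        /** Asks every registered component for its data and gathers the answers. */
        Map<String, Map<String, Object>> collect() {
            Map<String, Map<String, Object>> report = new LinkedHashMap<>();
            for (Instrumented component : registered) {
                report.put(component.name(), component.getData());
            }
            return report;
        }
    }

    /** Example component that keeps and reports its own counter. */
    static class SessionCache extends Instrumented {
        private int activeSessions;

        SessionCache() {
            super("sessionCache");
        }

        void login() {
            activeSessions++;
        }

        @Override
        Map<String, Object> getData() {
            Map<String, Object> data = new LinkedHashMap<>();
            data.put("activeSessions", activeSessions);
            return data;
        }
    }

    public static void main(String[] args) {
        Manager manager = new Manager();
        SessionCache cache = new SessionCache();
        manager.register(cache);
        cache.login();
        cache.login();
        System.out.println(manager.collect()); // {sessionCache={activeSessions=2}}
    }
}
```

The manager in this sketch stores no measurements of its own. It only holds references to the components and asks each one to report.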

The redesign placed the responsibility for providing instrumentation data in the components themselves. "We've taken instrumentation to a more generic place," says Roberts. "It isn't a central location for storing your data, it's a central location for knowing how to obtain the data."

"Exactly," adds Layer. "The data is now maintained and produced by the components themselves. What's centralized is the facility to say, 'show me your data.' It's really a small distinction, but it's an important one--the fact that there's really nothing in the system that knows about every piece, or even knows what it's keeping track of. It simply knows that the components are keeping track of something, and they all implement, say, the get_data method, and then they all return some piece of information when you call that method."

Summary

The fruits of the re-architecting of the JDC and SolDC sites are now seen on a daily basis. In spite of the membership of the JDC having risen from 1.1 million to over 1.5 million since the site was redesigned, the Developer Connection team has seen a significant decrease in both the load on the system's servers, and in time spent servicing the site. "Our members don't see anything different," says Roberts, "they just experience a more reliable system!"

Future installments of "Backstage at the JDC" will more fully explore the changes that were made in redesigning the JDC and SolDC sites.


About the author

Steve Meloan, frequent contributor to the JDC, and java.sun.com, is a journalist, and former software developer. His work has appeared in Wired, Rolling Stone, BUZZ, San Francisco Examiner, ZDTV's "The Site," and American Cybercast's "The Pyramid."