Clip2.com Gets Supercharged With Java Technology

 
 


The crew at Clip2.com hope to reinvent the way you connect with interesting and informative web content. Part of their success formula is Java technology. This article tracks Clip2's transition from a Microsoft-based architecture to the open world of JavaServer Pages (JSP) and the Apache Web server. It follows their porting effort and looks into why they chose JSP. Then the article traces the flow of information through their new architecture. At the end of the ride, it checks out some JSP code with embedded method calls.

Clip2.com's concept differs from the typical laundry list of links based on Boolean searches. Clip2 offers recommendation "topic guides," which are commentary and links on specific subjects ranging from Next Generation Technologies to Britney Spears. According to Clip2's Marketing VP Anthony Lee, guides are rated using Clip2's proprietary guide engine. Each guide is assigned a daily numeric value, providing a snapshot of how popular a guide has become with site visitors. Competition for ratings leads publishers to constantly upgrade their guides with links to photos, MP3 files, downloads, or whatever may be appropriate for the subject. Because these ratings are based on click-throughs, they're much more difficult to spam than a traditional five-star rating system. Clip2 users can also save or "subscribe" to guides, thereby constructing a community around various interests. Subscribed guides are automatically updated when the publisher modifies or adds to the information.

A Clip2.com guide to Java resources is shown below. The green up-arrow next to the 113 rating indicates that the guide's popularity is on the rise.

Click here to see the complete set of Java links in the guide.

In the Beginning

Clip2 started out on the Microsoft NT platform, using the architecture depicted below:

  • Microsoft's Internet Information Server (IIS)
  • Active Server Pages (ASP)
  • ActiveX Data Objects (ADO)
  • SQL Server database back end

    The C++ code is the guide engine, which enables users to create guides and calculates ratings based on page views and click-throughs to links listed in the guide.

    Motivation for Change

    Early in the development effort, Engineering Director Ziad El Kurjie began to have concerns about the scalability and reliability of the Microsoft-based platform. Then the development testbed began to exhibit deadlock problems that resulted in a complete system freeze.

    "We started off with a generic web database application architecture. During beta testing, we had an interaction between ADO libraries and the database, which created locks on tables," El Kurjie says. "These locks would persist, and started locking out incoming users, especially in the situation where users were requesting big collections of data (large XML sets). It created a serious impact on the database, involving a variety of table lock types. One would think that locks should only affect specific users, or user accounts, but it seems that under this architecture, SQL server locks can propagate and escalate to become a global lock."

    El Kurjie explained that their C++ code implements mutex locks for managing multithreaded access. The interaction of ActiveX Data Objects (ADO), mutex locks, and the SQL Server locking scheme created a situation that eventually led to a system crash.
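The failure mode El Kurjie describes, where an in-process mutex and a database lock end up waiting on each other, can be illustrated with a minimal Java sketch. It is not Clip2's code: the two `ReentrantLock` objects merely stand in for the C++ mutex and a database table lock, and the class and method names are hypothetical. Each worker takes one lock and then tries for the other with a timeout, so the program reports the cycle instead of freezing.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration: two workers acquire two locks in opposite order.
class LockCycleSketch {

    /** Runs both workers and returns how many of them timed out on their second lock. */
    static int countBlockedWorkers() {
        ReentrantLock appMutex = new ReentrantLock();   // stand-in for the C++ mutex
        ReentrantLock tableLock = new ReentrantLock();  // stand-in for a database table lock
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        AtomicInteger blocked = new AtomicInteger();

        Thread a = new Thread(() -> work(appMutex, tableLock, bothHoldFirst, bothTried, blocked));
        Thread b = new Thread(() -> work(tableLock, appMutex, bothHoldFirst, bothTried, blocked));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return blocked.get();
    }

    private static void work(ReentrantLock first, ReentrantLock second,
                             CountDownLatch bothHoldFirst, CountDownLatch bothTried,
                             AtomicInteger blocked) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();              // wait until the other worker holds its first lock
            if (second.tryLock(200, TimeUnit.MILLISECONDS)) {
                second.unlock();
            } else {
                blocked.incrementAndGet();      // the other worker never let go: a lock cycle
            }
            bothTried.countDown();
            bothTried.await();                  // keep the first lock until both attempts finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }
}
```

With a plain `lock()` call in place of the timed `tryLock`, both workers would wait on each other indefinitely, which is the kind of freeze the testbed exhibited.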

    These problems, in addition to concerns about scalability, paved the road to Java technology. "We all had familiarity and positive experiences with the Unix environment. And we were also concerned about reliance on a single vendor. A major selling point for Java technology is that it is being developed by many vendors, and there's a very supportive developer community. There is more adoption of the technology every day. So we decided to make the transition to Java," says El Kurjie.

    The New Configuration


    Clip2 Software Architecture

    Clip2 decided on Solaris, JavaServer Pages (JSP), and the Apache web server, currently the most popular web server on the Internet. Apache JServ's implementation is compliant with the Java Servlet API 2.0. Clip2 also uses GNUJSP 1.0, which is compliant with the JSP 1.1 specification.

    The Apache web server is coupled with JServ, which runs in a Java Virtual Machine¹ (JVM). Java code running in JServ calls the C++ guide engine through the Java Native Interface (JNI), and the guide engine uses the Oracle Call Interface (OCI) to interact with the Oracle back end.

    This architecture can be expanded flexibly. Web servers and JServ machines can be added whenever traffic or application demands dictate that more processing power is needed.

    Hardware Configuration:

  • Sun Ultra 250, running Solaris 2.7, for the Apache and JServ machines
  • Sun Ultra 450, running Solaris 2.7, for the Oracle database machines

    Apache Facts

    Apache is a product of the Apache Software Foundation (ASF), a nonprofit corporation that provides an infrastructure for open, collaborative software development.

    Apache JServ is a servlet engine implemented as two separate components, one written in C and the other in the Java programming language. The two sides communicate via the Apache JServ Protocol (AJP).

    The Java side is called the Servlet Engine, where most of Apache JServ's functionality resides. It implements the Servlet API, so that any web server can be attached, provided a communication module is written using AJP.

    The C side, called mod_jserv, passes requests and information between the Apache web server and the Java side, and can also be used to automatically start and stop the JVM in which the Java side is executed.

    Oracle Call Interface

    "There are many options to choose from for interfacing C++ code with a database," says El Kurjie. He chose Oracle Call Interface (OCI). "It's the fastest and most efficient, but not necessarily the easiest to code." XML data is stored using character large objects (CLOBs), which puts large demands on the database. These operations are the heart of Clip2's system, so efficiency is key.

    The Porting Effort

    "We thought setting up the communication between our C++ code and the Java environment would be difficult, but it actually turned out to be quite easy," says El Kurjie. "There were some good resources on the Web regarding the use of Java Native Interface (JNI)."

    Sun provided telephone support during setup and configuration of the Solaris machines, and the team downloaded a wide array of GNU tools and utilities from the Web.

    The entire porting effort took four engineers six weeks, including the work with C++ code, ASP, and the Oracle database. Another two weeks were added for system integration and testing. All in all, according to El Kurjie, things went very smoothly.

    Pitfalls

    El Kurjie and his team did encounter some problems configuring the Apache server. The difficulties were mainly related to ensuring that various machines had equivalent JSP and HTML directory structures.

    "We discovered that the directory structure on the Apache web server machine must be duplicated, in an absolute sense, on the JServ machine (see previous system block diagram). The documentation in this area was very sketchy. We searched the Web and looked at various news groups, and finally had to figure it out ourselves," El Kurjie says.

    Why JavaServer Pages?

    A typical Web application generates large amounts of user interface content in the form of HTML or XML. In general, some of this content is dynamic, depending on the results of computations or database access, but most is static. The basic servlet mechanism can be used to generate interactive web page content including both the static and dynamic aspects. Specifying the static part of a response page within a servlet, however, can be cumbersome. The user interface design and the program logic should ideally be separated, and evolve as separate developer skill sets. In addition, it is difficult to understand and maintain servlet code that generates complex user interfaces. The solution is JavaServer Pages.

    JSP technology is an extension of servlet technology. Not only are JSP pages actually compiled into servlets, but JSP pages can include or forward to servlets, and servlets, in turn, can include or forward to JavaServer Pages.

    If you're already using servlets in web-based applications, JSP technology can provide faster deployment, easier maintenance, and better page authoring support. By separating the interactive logic from the page design, JSP pages allow for a division of labor between maintaining individual pages and core application logic components. Using customized tag libraries, application functionality can easily be distributed to a wide range of page authors, freeing up more time to focus on application architecture and new applications.

    In practice, JSPs are HTML or XML pages extended with a number of mechanisms to allow dynamic content to be added and then sent on to the client. All processing of JSP XML or HTML extensions is done on the server, and each extension is either removed or replaced before the page is sent to the client.

    JSP supports the following extensions:

  • Include Directives--These tags can be used to include fragments of HTML/XML that need to be shared across a number of response pages, such as standard prologues and epilogues, or standard notices such as copyright and legal disclaimers. Isolating such shared content in an include file allows for a simple, single point of update. The directive takes the form
    <%@ include file="filename" %>
  • JavaServer Includes--HTML/XML tags that allow a servlet to be called during the processing of the page. The servlet is invoked as if it were the target of the page's HTTP request, including all execution context. Includes differ from top-level servlets in that they produce only a fragment of an HTML/XML page rather than a whole page. During page processing, the HTML/XML fragment is substituted for the tag. Example syntax follows:
    <jsp:include.../>

  • JavaServer Page Scripting--While JavaServer Includes offer a very simple way to separate dynamic and static content, they require writing servlets. JSP scripting provides a way to add server-side scripting logic directly into the HTML/XML of a response page. JSP can be used for fine-grained control of the content, such as conditionally making a word plural based on a dynamic value like the number of items added to a shopping basket. JSP also provides a simple mechanism for writing a smaller amount of program logic than a full-blown servlet, because the page author does not have to perform standard programming steps such as compilation. JSP provides the full power of Java in the simple form of scripting within an HTML/XML page. In fact, developers can implement an entire application using only JSP, without ever writing a single explicit servlet, since JSP files are automatically translated into Java servlets when executed.
    The basic JSP methodology is simple. A set of HTML/XML tags is defined that allows the developer to indicate inline Java code. The scripted page is converted into a Java servlet on the first reference, so JSP is very efficient: it does not require the runtime parsing that server-side includes do. It is also open-ended, because the scripter has complete access to the Java execution environment, including the execution context, database access, and so on. This concept also preserves the integrity of the Java environment, as it does not require a new set of object libraries or a new execution model just to support scripting.
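To make the translation step concrete, the following is a simplified sketch of the kind of Java a JSP container might generate for a page fragment that pluralizes a word based on a shopping-basket count. It is not actual GNUJSP output: the class name `BasketPageSketch`, its methods, and the use of a plain `Writer` in place of the servlet response writer are assumptions made so that the example runs without a servlet container.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical stand-in for a servlet generated from a JSP fragment such as:
//   <p>Your basket holds <%= count %> item<% if (count != 1) { %>s<% } %>.</p>
class BasketPageSketch {

    /** Writes the fragment; static template text and scriptlet logic are interleaved. */
    static void render(int count, Writer out) throws IOException {
        out.write("<p>Your basket holds ");   // static template text
        out.write(String.valueOf(count));     // the <%= count %> expression
        out.write(" item");                   // static template text
        if (count != 1) {                     // the <% if (count != 1) { %> scriptlet
            out.write("s");
        }
        out.write(".</p>");                   // static template text
    }

    /** Convenience wrapper that renders the fragment to a String. */
    static String renderToString(int count) {
        StringWriter buffer = new StringWriter();
        try {
            render(count, buffer);
        } catch (IOException e) {
            throw new IllegalStateException(e); // a StringWriter does not fail in practice
        }
        return buffer.toString();
    }
}
```

The point of the sketch is that the page author writes only the template with its embedded tags; the interleaved `write` calls are produced by the container rather than by hand.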

    The JSP Advantage

    Although you can certainly create dynamic, interactive pages with servlets alone, JSP technology makes the process easier.

  • With JSP pages, it is simple to combine static templates, including HTML or XML fragments, with the code to generate dynamic content.
  • JSP pages compile dynamically into servlets when requested. JSP pages can also be precompiled into servlets.
  • The JSP tags for invoking JavaBeans components manage these components completely, shielding the page author from complexity.
  • The JSP page structure also supports authoring tools, which are now becoming more and more available.
  • JSP authors do not have to know the Java language or be able to write servlets.

    By separating content generation from page layout and design tasks, JSP technology supports a tiered development approach that delivers faster and more efficient application development and deployment. A few web developers can concentrate on the application functionality, creating cross-platform components and customized tag libraries for page authors, who in turn design and maintain the individual pages.

    With this model, JSP technology provides many advantages:

  • Developers can offer customized JSP tag libraries that page authors access using an XML-like syntax. Because the JSP container takes care of compiling the page on demand, the page author can handle updates easily.
  • JSP pages can also access JavaBeans and Enterprise JavaBeans components that encapsulate sophisticated application logic, access to legacy data sources, and so on. These components, once written, are portable across platforms and servers. Re-using existing or off-the-shelf components (such as beans or customized tag libraries) speeds new application development.
  • Page authors can change and edit the fixed template portions of pages as frequently as needed--without affecting the application logic. Similarly, developers can make logic changes at the component level without editing the individual pages that use the logic.
  • Overall, using JSP technology along with servlets provides the tools to increase productivity by delegating page authoring tasks to a wide range of individuals with different skill sets.
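As a sketch of the component side of this division of labor, here is a minimal JavaBeans-style class of the kind a page author could reference through the standard `<jsp:useBean>`, `<jsp:setProperty>`, and `<jsp:getProperty>` tags. The class name `GuideBean` and its properties are hypothetical, chosen only to echo the guide ratings described earlier; they are not taken from Clip2's code.

```java
import java.io.Serializable;

// Hypothetical bean. It is package-private here so the sketch compiles in any file;
// a bean referenced from a JSP page would be declared public in its own source file.
class GuideBean implements Serializable {
    private static final long serialVersionUID = 1L;

    private String title = "";
    private int rating;

    // JavaBeans components need a no-argument constructor.
    public GuideBean() { }

    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }

    public int getRating() { return rating; }
    public void setRating(int rating) { this.rating = rating; }

    /** Derived, read-only property: the formatting logic lives in the bean, not in the page. */
    public String getSummary() {
        return title + " (rating " + rating + ")";
    }
}
```

A developer can change how `getSummary` formats its result without touching any page that reads the `summary` property, which is the separation the list above describes.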

    Simple JSP Implementation

    The simple JSP architecture illustrated below is a generalized implementation suitable for relatively low-volume applications. Clip2.com's architecture reflects the need for greater flexibility and scalability.


    Simple JSP Architecture



    Functional Flow Through Clip2's Processing Architecture

    The Clip2 system diagram is presented again below:


    Clip2 Software Architecture



    A Typical Client Request Transaction

    Take a look at how information moves through the various environments of the system.

    In the Web Server (Apache)

  • An HTTP request for a JSP page, such as load.jsp, is sent to the Apache web server.
  • Apache passes the call to the mod_jserv module.
  • mod_jserv "translates" the HTTP request into an AJP servlet call addressed to a jserv-url, following the ajpv11 protocol, which is set in the Apache JServ configuration file as ApJServDefaultProtocol.
  • The translated, path-specific call is then directed to the JServ's address, that is, the hostname and port number where the JServ runs and listens.

    In the Servlet Engine (Apache JServ)

  • Next, the ajpv11 call for the JSP page is mapped to a Java class. The servlet class loader checks whether the class needs recompiling (for example, if load.jsp has been changed).
  • gnujsp (JSP10Compiler) takes care of translating the page into a .java file and places the resulting class file in the servlets_scratch directory.
  • The .java file is then compiled, and the resulting .class file is run in the servlet environment. The HTTP response is sent back to the requesting client via the servlet's open socket.
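
The recompilation check in the steps above can be sketched as a timestamp comparison between the JSP source and its compiled class file. This is an assumption about how such a check might work, not a description of GNUJSP's internals; the class `JspFreshnessSketch` and the method `needsRecompile` are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: decides whether a JSP page must be translated and compiled again.
class JspFreshnessSketch {

    /** Returns true if the class file is missing or older than the JSP source. */
    static boolean needsRecompile(Path jspSource, Path classFile) {
        try {
            if (!Files.exists(classFile)) {
                return true;                       // never compiled
            }
            // Recompile only when the source was modified after the class file was written.
            return Files.getLastModifiedTime(jspSource)
                        .compareTo(Files.getLastModifiedTime(classFile)) > 0;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Under this scheme an unchanged page pays the translation and compilation cost only once; later requests go straight to the already compiled class.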

    A JSP Page

    Recall that Java method calls are embedded in the JSP code. For example, load.jsp would contain:

    
    <%@ page import = "com.clip2.jsp.*" %>
    
    
    <%-- The page directive above imports the classes in
    com.clip2.jsp, including UserMgr, so the methods of
    those classes can be called from this page. --%>
    
     
    <%-- status is assumed to be a boolean set earlier in the page; its declaration is not shown. --%>
    <% if (status) { %> 
    <html> 
    <head> 
    <title>load user</title> 
    </head> 
    <body> 
    <% 
    String uname = "mytest"; 
    String email = UserMgr.getUserInfo(uname); 
    %> 
    </body> 
    </html> 
    <% } 
    else{ 
    response.sendRedirect("Error.html"); 
    } 
    %> 
    

    As the generated Java class is executed, UserMgr.getUserInfo calls a native Java method. The native Java method is implemented in C++ using the JNI API. The C++ code is then invoked and the user information is retrieved from the database and sent back to the application.
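On the Java side, a native method is declared with the `native` keyword and bound at run time to a shared library loaded with `System.loadLibrary`. The sketch below shows that pattern. The library name `clip2guide` is a placeholder, and the method signature is inferred from the `load.jsp` call above rather than taken from Clip2's source.

```java
// Hypothetical Java-side declaration; every name other than getUserInfo is an assumption.
final class UserMgr {

    // True only if the native library was found on java.library.path.
    private static final boolean NATIVE_LOADED = loadNativeLibrary("clip2guide");

    private UserMgr() { }

    /** Implemented in C++ through JNI; throws UnsatisfiedLinkError if the library is absent. */
    static native String getUserInfo(String uname);

    /** Lets callers check for the library before invoking the native method. */
    static boolean isNativeLoaded() {
        return NATIVE_LOADED;
    }

    private static boolean loadNativeLibrary(String name) {
        try {
            System.loadLibrary(name);   // resolves to a platform file name such as libclip2guide.so
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false;               // library not installed on this machine
        }
    }
}
```

Without the shared library installed, `isNativeLoaded()` returns false and a call to `getUserInfo` fails with `UnsatisfiedLinkError`, so the sketch compiles and loads anywhere but does real work only where the native code is present.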

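    In the C++ code, the JNI/C method implementation would take this form:

    extern "C" 
    JNIEXPORT jboolean JNICALL
    Java_UserMgr_getUserInfo(JNIEnv *env, jobject obj, jstring email) { 
        // Look up the C++ peer object associated with this Java object.
        UserObj* peer = (UserObj *)JavaPeer::getCPeer(env, obj); 
        ... 
        // Delegate to the guide engine, which performs the database fetch.
        return peer->getUInfo(email); 
    } 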
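
The exported function name in such a listing follows the JNI naming convention: the prefix `Java_`, the fully qualified class name, an underscore, and the method name, with certain characters escaped. The following Java sketch computes that name for non-overloaded methods. The helper class `JniNames` is hypothetical and handles only the escapes listed in its comments.

```java
// Hypothetical helper that derives the C symbol name JNI looks up for a native method.
class JniNames {

    /** Builds the short (non-overloaded) JNI name, e.g. Java_pkg_Class_method. */
    static String exportedName(String fullyQualifiedClass, String methodName) {
        return "Java_" + mangle(fullyQualifiedClass) + "_" + mangle(methodName);
    }

    private static String mangle(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == '.') {
                sb.append('_');                                // package separator
            } else if (c == '_') {
                sb.append("_1");                               // literal underscore
            } else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')) {
                sb.append(c);                                  // ASCII letters and digits pass through
            } else {
                sb.append(String.format("_0%04x", (int) c));   // any other character as _0xxxx
            }
        }
        return sb.toString();
    }
}
```

By this convention, a `UserMgr` class in the `com.clip2.jsp` package would export `Java_com_clip2_jsp_UserMgr_getUserInfo`, while the shorter `Java_UserMgr_getUserInfo` corresponds to a class in the default package.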
    

    getUInfo executes the fetch from the database as well as other application logic. A connection to the database is opened if none is available, and an OCI method is executed to retrieve the user information, including the Large Object XML data.

    OCI functions such as OCIStmtExecute, OCILobRead, and OCILobGetLength are used to iterate and stream through the XML and operate on it in the application-specific logic.
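
Clip2's engine does this streaming in C++ with OCI. As an analogous illustration in Java, piecewise reading of a large character object looks like the sketch below, where a `StringReader` stands in for a CLOB stream (for example, one obtained from `java.sql.Clob.getCharacterStream()`). The class `LobStreamSketch` and its method are hypothetical, and counting `<` characters is only a placeholder for application-specific processing.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

// Hypothetical illustration of reading a large character object one chunk at a time.
class LobStreamSketch {

    /** Streams through the data in fixed-size chunks and counts occurrences of target. */
    static int countChar(Reader lob, char target, int chunkSize) {
        char[] buffer = new char[chunkSize];
        int count = 0;
        try {
            int read;
            // Each read fills at most one chunk, so the whole object is never held in memory.
            while ((read = lob.read(buffer, 0, buffer.length)) != -1) {
                for (int i = 0; i < read; i++) {
                    if (buffer[i] == target) {
                        count++;
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }
}
```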

    Here is a guide created by Ziad El Kurjie containing resources and references on the Web for Apache JServ. It contains FAQs, installation and configuration instructions, scalability and load-balancing information, and so on.

    Conclusion

    "By some estimates, there are 500,000 niche topical communities on the Internet," says Anthony Lee. "We believe the real experts in those areas are the enthusiasts themselves. At Clip2, we're providing the easiest way for these people to publish and share their expertise, coupled with a rating system that helps cut through the noise."

    Lee also points out that the Clip2 concept is constantly evolving. Communities are springing up on the site around various interests, generating mailing lists, subscribed guides, and private guides with password access. This requires very flexible software and hardware resources.

    "Sometimes our users come up with an innovation before we think of it. For instance, affiliate links within guides developed spontaneously," Lee observes. "Affiliate links can be established with a vendor, so that the guide publisher gets a percentage of sales coming through his or her guide. We never know where the next innovation will come from, but we have to be prepared to handle it."

    "Our challenge is to build out capability as quickly as possible," says Lee. "We need a system capable of scaling up to a million impressions per day, that's why we chose Java technology. Our key concerns were reliability, scalability, and maturity. We're quite happy with the outcome."



    About the Author

    Michael Meloan, a frequent contributor to the Java Developer Connection, began his professional career writing IBM mainframe and DEC PDP-11 assembly languages. He went on to code in PL/I, APL, C and Java. In addition, his fiction has appeared in WIRED, BUZZ, Chic, L.A. Weekly, and on National Public Radio.

    _______
    1 As used on this web site, the terms "Java virtual machine" or "JVM" mean a virtual machine for the Java platform.