
Java Technology and XML-Part Two

 
 



Neither Java technology nor XML needs an introduction, and neither does the synergy between the two: "Portable Code and Portable Data". With the growing interest in web services and e-business platforms, XML is joining Java in the developer's toolbox. As of today, no fewer than six extensions to the Java Platform empower the Java developer when building XML-based applications:

  • Java API for XML Processing (JAXP)
  • Java API for XML/Java Binding (JAXB)
  • Long Term JavaBeans Persistence
  • Java API for XML Messaging (JAXM)
  • Java API for XML RPC (JAX RPC)
  • Java API for XML Registry (JAXR)

This article is the second in a series of three. The first one gave an overview of the Java API for XML Processing (JAXP), and the technologies that it directly or indirectly provides to the Java developer or technologies that rely on it in order to process XML documents. It illustrated the use of different APIs with some sample programs. This second article focuses on the relative performance of these APIs as obtained by running the sample programs presented in the first article. This series will conclude with a third article which gives tips on how to improve the performance of XML-based applications from a programmatic and architectural point of view.

The purpose of the tests presented in this paper is primarily to highlight the relative performance of different XML processing techniques (SAX, DOM and XSLT) and the impact of validation against a DTD or an XML Schema. The performance of different API implementations (Xerces, Crimson, Xalan, Saxon, XSLTC, and so on) when run on different Java runtimes (JDK 1.2 and JDK 1.3, Client and Server) is also compared. The results presented here do not claim to cover all the API implementations available today, but they underline that the tradeoff between ease of use and performance of a chosen processing model can be biased by the implementation of the underlying parser, document builder or style sheet engine.

Methodology

The sample programs presented in the first article demonstrated the use of the different APIs available to the Java developer. So that they could be compared, we applied them to the same problem and had them produce an identical solution. The problem was kept simple to accommodate the different capabilities of those technologies, and therefore it may not have revealed the richness and power of the more sophisticated ones like XSLT and XPath. In this respect, the performance tests presented here may be considered micro-benchmarks.

The sample programs are based on different XML processing APIs:

  • SAX, the Simple API for XML
  • DOM, the Document Object Model API from W3C
  • XSLT, the XML Style Sheet Language Transformations from W3C
  • XPath, the XML Path Language from W3C
  • JDOM, the "Java optimized" document object model API from jdom.org

applied to the same set of documents to provide the exact same outputs. The XML documents processed using those different techniques conform to the same Document Type Definition or XML Schema. Those schemas specify the representation of a set of chessboard configurations.

illustration 1

Figure 1: A chessboard configuration

In the schemas, the number of chessboard configurations is unbounded, allowing any number of chessboard configurations to be defined in a single document. In order to measure the performance of the different sample programs, documents with 10, 100, 200, 300, 400, 500, 1000, 2000, 3000, 4000 and 5000 chessboard configurations have been created.

XML documents conforming to the Chessboards.dtd and Chessboards.xsd schema have identical structures and contents, the only exception being the replacement of the Document Type Declaration by a reference to the XML Schema. For the test involving XML Schemas, only a document with 1000 chessboard configurations was used.

Code Sample 1: An XML document conforming to the DTD (Chessboards-[10-5000].xml)

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE CHESSBOARDS SYSTEM "dtd/Chessboards.dtd">

<CHESSBOARDS>
 <CHESSBOARD>
  <WHITEPIECES>
   <KING><POSITION COLUMN="G" ROW="1" /></KING>
   <BISHOP><POSITION COLUMN="D" ROW="6" /></BISHOP>
   <ROOK><POSITION COLUMN="E" ROW="1" /></ROOK>
   <PAWN><POSITION COLUMN="A" ROW="4" /></PAWN>
   <PAWN><POSITION COLUMN="B" ROW="3" /></PAWN>
   <PAWN><POSITION COLUMN="C" ROW="2" /></PAWN>
   <PAWN><POSITION COLUMN="F" ROW="2" /></PAWN>
   <PAWN><POSITION COLUMN="G" ROW="2" /></PAWN>
   <PAWN><POSITION COLUMN="H" ROW="5" /></PAWN>
  </WHITEPIECES>
  <BLACKPIECES>
   <KING><POSITION COLUMN="B" ROW="6" /></KING>
   <QUEEN><POSITION COLUMN="A" ROW="7" /></QUEEN>
   <PAWN><POSITION COLUMN="A" ROW="5" /></PAWN>
   <PAWN><POSITION COLUMN="D" ROW="4" /></PAWN>
  </BLACKPIECES>
 </CHESSBOARD>
 <CHESSBOARD>
  ...
 </CHESSBOARD>
</CHESSBOARDS>

The different implementations based on SAX, DOM, XSLT, XPath, and so on, when applied to the same input documents, provide the same outputs: a simple human-readable text format representation of a chessboard configuration.

Code Sample 2: A simple human-readable text format representation of a chessboard configuration

White king: G1
White bishop: D6
White rook: E1
White pawn: A4
White pawn: B3
White pawn: C2
White pawn: F2
White pawn: G2
White pawn: H5
Black king: B6
Black queen: A7
Black pawn: A5
Black pawn: D4
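As an illustration of how the document in Code Sample 1 maps to this text format, here is a minimal SAX sketch. It is not the ChessboardSAXPrinter program from the first article: the class name SaxOutputSketch, the render method and the shortened inline document (which omits the DOCTYPE so that no external DTD is needed) are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.Locale;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxOutputSketch {
    // Hypothetical inline document, shortened from Code Sample 1.
    static final String XML =
        "<CHESSBOARDS><CHESSBOARD>"
        + "<WHITEPIECES><KING><POSITION COLUMN=\"G\" ROW=\"1\"/></KING></WHITEPIECES>"
        + "<BLACKPIECES><QUEEN><POSITION COLUMN=\"A\" ROW=\"7\"/></QUEEN></BLACKPIECES>"
        + "</CHESSBOARD></CHESSBOARDS>";

    // Parses the document and returns one "Color piece: square" line per POSITION.
    static String render(String xml) throws Exception {
        StringBuilder out = new StringBuilder();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            private String color = ""; // "White" or "Black"
            private String piece = ""; // most recent enclosing element, e.g. "KING"

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (qName.equals("WHITEPIECES")) {
                    color = "White";
                } else if (qName.equals("BLACKPIECES")) {
                    color = "Black";
                } else if (qName.equals("POSITION")) {
                    out.append(color).append(' ')
                       .append(piece.toLowerCase(Locale.ROOT)).append(": ")
                       .append(atts.getValue("COLUMN"))
                       .append(atts.getValue("ROW")).append('\n');
                } else {
                    piece = qName; // remember the piece for the POSITION that follows
                }
            }
        });
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(render(XML));
    }
}
```

The handler keeps only two pieces of state (the current color and the current piece), which is typical of SAX processing: nothing is retained once an element has been handled.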

All the sample programs use the Java API for XML Processing (JAXP) to interface with different underlying implementations of the SAX parser, DOM document builder and XSL Transformation engines, and they share a common structure that allows their respective performance to be compared. They all follow the typical steps of using JAXP, and additionally include two loops to process the same document multiple times and in multiple runs. For each run, a factory is used to create a parser, a style sheet processor, and so on, which is in turn used to process an XML source document several times. The validation of the source document against its declared DTDs or XML Schemas may be specified when invoking the program and is implemented by configuring the factory through its setValidating method so that it creates a validating or non-validating parser.

In the code sample below (ChessboardSAXPrinter.java), the average elapsed time to process a document is printed after each run. The system and user times may be measured by launching the Java virtual machine (JVM, see note 1) with the ptime command.

Code Sample 3: Example of the common structure of the sample programs (ChessboardSAXPrinter.java)

import java.io.IOException;
import java.io.PrintStream;
import org.xml.sax.*;
import org.xml.sax.helpers.*;
import javax.xml.parsers.*;

public class ChessboardSAXPrinter {
 private SAXParser parser;

 public ChessboardSAXPrinter(boolean validating)
   throws Exception {
  // The factory decides whether the parsers it
  // creates are validating or non-validating
  SAXParserFactory factory 
    = SAXParserFactory.newInstance();
  factory.setValidating(validating);
  parser = factory.newSAXParser();
  ...
 }

 public void print(String fileName, PrintStream out)
   throws SAXException, IOException {
  ...
  parser.parse(fileName, ...);
 }
  
 public static void main(String[] args)
   throws Exception {
  ...   
  for (int k = 0; k < r; k++) {
    // r: number of runs
   // a new parser is created for each run
   ChessboardSAXPrinter saxPrinter
     = new ChessboardSAXPrinter(validating);
   long time = System.currentTimeMillis();
   for (int i = 0; i < n; i++) {
     // n: number of documents processed per run
    saxPrinter.print(args[0], out);
   }
   // print out the average time (s) 
   // to process a document during the current run
   System.err.print( 
     (((double) (System.currentTimeMillis()
                 - time)) / 1000 / n) + "\t");
  }
  ...
 }
}

Depending on the performance tested, two different measurements are performed:

  • The average elapsed time per processed document, as measured for each run by the program itself with two calls to the System.currentTimeMillis method. Only the last few runs may be taken into account for this measurement, which allows all the classes to be loaded and the VM to compile and optimize the byte code before the measurement starts. This measurement is typically used to highlight the differences between the VMs, especially the optimization improvement over time. It is sensitive to external factors, especially unrelated activity on the system.

  • The average system+user time per processed document, based on the output of the command ptime. The sum of the system and user times for the complete execution including the startup of the VM is measured and averaged per processed document. This measurement is more reliable and becomes significant when the startup time is negligible compared to the intended workload.
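The first measurement can be sketched as follows. This is only an illustration of the idea of discarding warm-up runs, not the code used for the tests: the class TimingSketch and the methods measure and steadyStateAverage are hypothetical names introduced here.

```java
class TimingSketch {
    // Runs the task n times per run for r runs and returns, for each run,
    // the average elapsed time in seconds per call.
    static double[] measure(Runnable task, int r, int n) {
        double[] perRun = new double[r];
        for (int k = 0; k < r; k++) {
            long start = System.currentTimeMillis();
            for (int i = 0; i < n; i++) {
                task.run();
            }
            perRun[k] = (System.currentTimeMillis() - start) / 1000.0 / n;
        }
        return perRun;
    }

    // Averages only the last `keep` runs, so that class loading and
    // early compilation by the VM do not distort the result.
    static double steadyStateAverage(double[] perRun, int keep) {
        int from = Math.max(0, perRun.length - keep);
        double sum = 0;
        for (int i = from; i < perRun.length; i++) {
            sum += perRun[i];
        }
        return sum / (perRun.length - from);
    }

    public static void main(String[] args) {
        // Placeholder workload; a real test would parse a document here.
        double[] perRun = measure(() -> Math.sqrt(42.0), 60, 10);
        System.out.println(steadyStateAverage(perRun, 30));
    }
}
```

With 60 runs and keep set to 30, only the second half of the runs contributes to the reported average.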

The code fragment from a makefile (below) shows that for the target perf1 the ChessboardSAXPrinter program will be run for each of the source documents (../samples/Chessboards-{10, 100, 1000, 2000, 3000, 4000, 5000}.xml) as specified by the variable SAMPLING and each of these documents will be processed 10 times (variable PROCESSINGS) for each of the 10 runs (variable RUNS). The system and user times will be measured by the ptime command and properly averaged and formatted by an awk script (variable FORMATTER). The OPTIONS variable allows the passing of properties selecting the XML parser, DOM builder or XSLT engine implementations to be used. The XML.validation property is used by the sample program itself to turn on or off the validation. The output from the sample program is redirected to /dev/null in order to minimize the impact of disk IO on the performance results.

Code Sample 4: Code fragment from a makefile

SAMPLING=10 100 1000 2000 3000 4000 5000
MAXTIME=3000
RUNS=10
PROCESSINGS=10
TIMER=ptime

OPTIONS= \
  -Djavax.xml.parsers.SAXParserFactory=org.apache.xerces.jaxp.SAXParserFactoryImpl \
  -Djavax.xml.parsers.DocumentBuilderFactory=org.apache.xerces.jaxp.DocumentBuilderFactoryImpl \
  -Djavax.xml.transform.TransformerFactory=org.apache.xalan.processor.TransformerFactoryImpl

FORMATTER= \
  /usr/xpg4/bin/awk -v runs=$${RUNS:=$(RUNS)} \
    -v processings=$${PROCESSINGS:=$(PROCESSINGS)} ' \
      BEGIN { \
        count=0; m=split("1:60:3600", f, ":"); \
      } { \
        if ($$1=="sys" || $$1=="user") { \
          n=split($$2, a, ":"); \
          for (i = 1; i <= n; i++) { \
            count += a[n - i + 1] * f[i]; \
          } \
        } \
      } END { \
        printf ":%.4f", count / runs / processings; \
        if (count > $(MAXTIME)) exit 1; \
      }'

perf1:
        echo "SAX/Validating (Xerces)\t\c"; \
        for i in $(SAMPLING); \
        do \
          $(TIMER) $(JAVA) -DXML.validation=true -Xms128m \
            -Xmx128m -classpath $(CLASSPATH) $(OPTIONS) \
            ChessboardSAXPrinter ../samples/Chessboards-$$i.xml \
            /dev/null $(PROCESSINGS) $(RUNS) 2>&1 \
          | tee -a $(LOGS) \
          | $(FORMATTER) || break; \
        done; \
        echo
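For readers less familiar with awk, the averaging done by the formatter can be re-expressed in Java. This is a sketch under stated assumptions, not part of the test harness: the class PtimeFormatterSketch and its methods are hypothetical, and the sample lines in main are invented to show the shape of the timer output (a label followed by a time written as seconds, m:s or h:m:s).

```java
class PtimeFormatterSketch {
    // Converts "s", "m:s" or "h:m:s" to seconds, reading the fields from
    // right to left with factors 1, 60 and 3600 as the awk script does.
    static double toSeconds(String field) {
        String[] parts = field.split(":");
        double[] factor = {1, 60, 3600};
        double seconds = 0;
        for (int i = 0; i < parts.length && i < factor.length; i++) {
            seconds += Double.parseDouble(parts[parts.length - 1 - i]) * factor[i];
        }
        return seconds;
    }

    // Sums the "sys" and "user" times and averages them per processed document.
    static double averagePerDocument(String[] lines, int runs, int processings) {
        double count = 0;
        for (String line : lines) {
            String[] f = line.trim().split("\\s+");
            if (f.length >= 2 && (f[0].equals("sys") || f[0].equals("user"))) {
                count += toSeconds(f[1]);
            }
        }
        return count / runs / processings;
    }

    public static void main(String[] args) {
        // Invented timer output, for illustration only.
        String[] lines = {"real 1:40.0", "user 1:30.0", "sys 0.5"};
        System.out.printf("%.4f%n", averagePerDocument(lines, 10, 10));
    }
}
```

The "real" line is ignored, so the result reflects CPU time only, which is why this measurement is less sensitive to other activity on the system than the elapsed time.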

To select the XML parser, DOM builder or XSLT engine implementation to be tested, the following JAXP properties were passed accordingly to the JVM (OPTIONS variable in the code sample above):

API   Property                                  Value                                                  Implementation
SAX   javax.xml.parsers.SAXParserFactory        org.apache.crimson.jaxp.SAXParserFactoryImpl           Crimson
                                                org.apache.xerces.jaxp.SAXParserFactoryImpl            Xerces
DOM   javax.xml.parsers.DocumentBuilderFactory  org.apache.crimson.jaxp.DocumentBuilderFactoryImpl     Crimson
                                                org.apache.xerces.jaxp.DocumentBuilderFactoryImpl      Xerces
XSLT  javax.xml.transform.TransformerFactory    org.apache.xalan.processor.TransformerFactoryImpl      Xalan
                                                com.icl.saxon.TransformerFactoryImpl                   Saxon
                                                org.apache.xalan.xsltc.runtime.TransformerFactoryImpl  Xalan-XSLTC
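A quick way to check which implementations JAXP actually resolved is to print the class names of the factories it returns. The sketch below assumes nothing beyond the standard JAXP factory classes; the class name JaxpImplementationSketch is hypothetical. The system property names it prints are the standard JAXP lookup keys.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

class JaxpImplementationSketch {
    // Returns the implementation class names of the SAX, DOM and XSLT
    // factories that JAXP resolved for this JVM, in that order.
    static String[] resolvedImplementations() {
        return new String[] {
            SAXParserFactory.newInstance().getClass().getName(),
            DocumentBuilderFactory.newInstance().getClass().getName(),
            TransformerFactory.newInstance().getClass().getName()
        };
    }

    public static void main(String[] args) {
        String[] properties = {
            "javax.xml.parsers.SAXParserFactory",
            "javax.xml.parsers.DocumentBuilderFactory",
            "javax.xml.transform.TransformerFactory"
        };
        String[] resolved = resolvedImplementations();
        for (int i = 0; i < properties.length; i++) {
            // System.getProperty returns null when the property was not set
            // and JAXP fell back to its default lookup.
            System.out.println(properties[i] + " (set to "
                + System.getProperty(properties[i]) + ") -> " + resolved[i]);
        }
    }
}
```

Launching it with one of the -D options from the table only works if the named implementation is on the class path; otherwise newInstance fails with a configuration error.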

Configuration

Each sample program is tested in different configurations: with different sizes of processed documents conforming to either a DTD or an XML Schema, with or without validation, with different underlying parser or style sheet engine implementations and with different JVM versions.

Products and Versions

Workstation                Solaris Operating Environment (SPARC Platform Edition), Ultra-30
Java runtimes              JDK 1.2.2_06, JDK 1.3.1
XML parsers, XSLT engines  Xerces 1.3.1 and 1.4.4 (for XML Schemas)
                           Crimson 1.1 (as included in JAXP 1.1)
                           Xalan-Java 2.1 (including XSLTC)
                           Saxon 6.3
                           JAXP 1.1
                           JDOM beta6
Sample programs / API      ChessboardSAXPrinter.java (SAX)
                           ChessboardDOMPrinter.java (DOM)
                           XSLTTransformation.java and ChessboardPrinter.xsl (XSLT)
                           ChessboardXPathPrinter.java (XPath)
                           ChessboardJDOMPrinter.java (JDOM)

Table 1: Configurations tested

Performance Measurements

Comparing SAX, DOM and XSLT Performances

This test compares the performance of SAX, DOM and XSLT when applied to process the XML documents defined earlier. Two different SAX and DOM parser implementations have been used: Xerces and Crimson.

This test measured the time to process XML documents describing 1000, 2000, 3000, 4000 and 5000 chessboard configurations. Each document has been processed 10 times for each of the 10 runs. The measured time was the sum of the user and system times, as returned by the ptime command, divided by the total number of processed documents.

Measurements

illustration 2

Figure 2: Time to process different sizes of XML documents using SAX, DOM and XSLT with Crimson and Xerces (JDK 1.2)

Analysis

We can make three observations from these measurements:

  1. As expected, the ranking, from the fastest to the slowest, is: SAX, DOM, then XSLT. This can be explained by the fact that an XSLT engine typically has to compile the style sheet and then internally build the entire input source tree before starting the transformations. It may use a DOM builder to do so. The DOM builder itself may be using a SAX parser to parse both the input document and the style sheet. As those techniques stack up, they add more functionality, ease of use and maintainability, but also a greater performance penalty.

  2. Crimson performs better than Xerces. Crimson is a straightforward implementation of an XML parser and has a small footprint: around 200KB (jar file size), while Xerces is more sophisticated and includes many additional features like XML Schema support. Xerces also comes with support for WML and HTML DOMs, which significantly increases the size of the jar file to 1.5MB. The Apache organization is in the process of re-factoring Xerces and addressing performance in Xerces2.

  3. The time to process a document increases linearly with the size of the document. The linearity of the performance demonstrates that the test did not exhaust any system resource, especially the memory. Taking this linearity into account, for most of the next charts we will only present results for the processing of 1000-Chessboard XML documents as follows:

illustration 3

Figure 3: Time to process 1000-Chessboard XML documents using SAX, DOM and XSLT with Crimson and Xerces (JDK 1.2)

Comparing XSLT and XSLTC Performances

This second test compares the performance of regular XSLT engines and the XSLTC engine which compiles the style sheets into Java byte code. XSLTC is now part of Xalan and can be invoked through JAXP. No modification of the sample program was necessary in order to use XSLTC.
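The JAXP calls involved can be sketched as below. This is not the XSLTTransformation.java program or the ChessboardPrinter.xsl style sheet from the first article: the class XsltTemplatesSketch, its methods, the inline style sheet and the shortened document are assumptions made for illustration. The sketch uses the JAXP Templates interface, which represents a style sheet that has been processed once and can then be applied to many documents; which engine does the work depends on the TransformerFactory that JAXP resolves.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltTemplatesSketch {
    // Hypothetical style sheet producing the text format of Code Sample 2.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:apply-templates select='//POSITION'/>"
        + "</xsl:template>"
        + "<xsl:template match='POSITION'>"
        + "<xsl:choose>"
        + "<xsl:when test='ancestor::WHITEPIECES'><xsl:text>White </xsl:text></xsl:when>"
        + "<xsl:otherwise><xsl:text>Black </xsl:text></xsl:otherwise>"
        + "</xsl:choose>"
        + "<xsl:value-of select=\"translate(name(..),"
        + " 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz')\"/>"
        + "<xsl:text>: </xsl:text>"
        + "<xsl:value-of select='concat(@COLUMN, @ROW)'/>"
        + "<xsl:text>&#10;</xsl:text>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Hypothetical inline document, shortened from Code Sample 1.
    static final String XML =
        "<CHESSBOARDS><CHESSBOARD>"
        + "<WHITEPIECES><KING><POSITION COLUMN='G' ROW='1'/></KING></WHITEPIECES>"
        + "<BLACKPIECES><QUEEN><POSITION COLUMN='A' ROW='7'/></QUEEN></BLACKPIECES>"
        + "</CHESSBOARD></CHESSBOARDS>";

    // Processes the style sheet once; the result can be reused across documents.
    static Templates compile(String xsl) throws Exception {
        return TransformerFactory.newInstance()
            .newTemplates(new StreamSource(new StringReader(xsl)));
    }

    // Applies an already processed style sheet to one document.
    static String transform(Templates templates, String xml) throws Exception {
        StringWriter out = new StringWriter();
        templates.newTransformer().transform(
            new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Templates templates = compile(XSL);
        // The same Templates object serves every document in the loop.
        for (int i = 0; i < 3; i++) {
            System.out.print(transform(templates, XML));
        }
    }
}
```

Separating compile from transform mirrors the structure of the tests: the cost of processing the style sheet is paid once per run, while the transformation cost is paid once per document.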

This test measured the time to process XML documents describing 1000, 2000, 3000, 4000 and 5000 chessboard configurations (though only the result for a 1000-Chessboard XML document is presented here). Each document has been processed 10 times for each of the 10 runs. The measured time was the sum of the user and system times, as returned by the ptime command, divided by the total number of processed documents.

Measurements

illustration 4

Figure 4: Time to process 1000-Chessboard XML documents using XSLT and XSLTC (JDK 1.2)

Analysis

This test shows that Xalan and Saxon perform equally and that compiling style sheets to Java bytecode may improve performance by up to 40% compared with the classic implementations, such as Xalan or Saxon on Crimson. Compared with the previous tests, compiled style sheets may perform better than DOM and close to SAX.

Comparing Validation and Non-Validation Performances

This benchmark compares the performance of SAX and DOM with and without validation. It uses both the Crimson and the Xerces SAX and DOM implementations. The intent here is to measure the impact of validation on the processing time. Per the XML specification, a non-validating parser is not required to read external entities (including the external DTD); therefore external entities referenced in the document may not be expanded and attributes may not have their default value substituted. Thus, the information passed to the invoking application may not be equivalent when using a validating parser and a non-validating parser. In the context of this document, we are only considering parsers which, even when non-validating, load and parse the DTD and the entities referenced in the document. This allows, for example, the entities to be substituted, the attribute values to be normalized and their default values properly substituted, so that the application can run unchanged when switching from validation to non-validation of the input document.
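The difference between the two modes can be observed with a small sketch. It is an illustration only: the class ValidationSketch, the Result holder and the inline document are hypothetical, and the DTD is placed in the internal subset (rather than in an external file as in the tests) so that the example is self-contained. The document is well-formed but deliberately invalid, and it relies on a default value for the ROW attribute.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class ValidationSketch {
    // Hypothetical document: EXTRA is not declared, and ROW is left to its default.
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE KING [\n"
        + "  <!ELEMENT KING (POSITION)>\n"
        + "  <!ELEMENT POSITION EMPTY>\n"
        + "  <!ATTLIST POSITION COLUMN CDATA #REQUIRED ROW CDATA \"1\">\n"
        + "]>\n"
        + "<KING><POSITION COLUMN=\"G\"/><EXTRA/></KING>";

    // What one parse observed: validity errors and the ROW value delivered.
    static final class Result {
        int errors;
        String row;
    }

    static Result parse(boolean validating) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(validating);
        Result result = new Result();
        factory.newSAXParser().parse(
            new InputSource(new StringReader(XML)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    if (qName.equals("POSITION")) {
                        result.row = atts.getValue("ROW");
                    }
                }

                @Override
                public void error(SAXParseException e) {
                    result.errors++; // validity errors are recoverable
                }
            });
        return result;
    }

    public static void main(String[] args) throws Exception {
        for (boolean validating : new boolean[] {false, true}) {
            Result r = parse(validating);
            System.out.println("validating=" + validating
                + " errors=" + r.errors + " ROW=" + r.row);
        }
    }
}
```

In both modes the application receives the defaulted ROW value, because the declarations are read either way; only the validating parse reports the undeclared element. This is the situation the tests assume, where switching validation on or off changes the work done by the parser but not the data seen by the application.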

This test measured the time to process XML documents describing 1000, 2000, 3000, 4000 and 5000 chessboard configurations (though only the result for a 1000-Chessboard XML document is presented here). Each document has been processed 10 times for each of the 10 runs. The measured time was the sum of the user and system times, as returned by the ptime command, divided by the total number of processed documents.

Measurements

illustration 5

Figure 5: Time to process a 1000-Chessboard XML document using Crimson and Xerces' DOM and SAX implementations with and without validation (JDK 1.2)

Analysis

SAX performs better than DOM and the validation of the XML input documents against their DTD incurs an overhead of up to 12% in the case of Xerces' SAX implementation. With the exception of Xerces' SAX implementation, the cost of validation is relatively independent of the API used. The validation is performed internally by the parser. If a DOM builder is implemented on a SAX parser as a custom DocumentHandler (ContentHandler in SAX 2), then the difference between the DOM and SAX processing performance can mostly be accounted for in the construction of the DOM tree from the SAX events fired by the parser.

Comparing Alternative API Performances (JDOM, XPath, YAXA)

This test compares the performances of DOM to JDOM and XPath. The JDOM sample program uses a SAXBuilder to parse and load the XML source document in memory. It walks through the resulting JDOM document and outputs the result in a text format. The XPath sample program uses the DOM API to parse and load an XML source document in memory. It then evaluates XPath expressions to locate the elements to be processed and outputs the result in text format.
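The XPath approach can be sketched as follows. This is not the ChessboardXPathPrinter.java program: it uses the javax.xml.xpath package, which was added to JAXP after the version tested here, whereas the tests relied on the XPath implementation from Xalan. The class XPathSketch, the expressions and the shortened document are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSketch {
    // Hypothetical inline document, shortened from Code Sample 1.
    static final String XML =
        "<CHESSBOARDS><CHESSBOARD>"
        + "<WHITEPIECES><KING><POSITION COLUMN=\"G\" ROW=\"1\"/></KING></WHITEPIECES>"
        + "<BLACKPIECES><QUEEN><POSITION COLUMN=\"A\" ROW=\"7\"/></QUEEN></BLACKPIECES>"
        + "</CHESSBOARD></CHESSBOARDS>";

    static String render(String xml) throws Exception {
        // Step 1: load the whole document in memory with the DOM API.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        // Step 2: locate the elements to process with XPath expressions.
        XPath xpath = XPathFactory.newInstance().newXPath();
        String[][] sides = {{"WHITEPIECES", "White"}, {"BLACKPIECES", "Black"}};
        StringBuilder out = new StringBuilder();
        NodeList boards = (NodeList) xpath.evaluate(
            "/CHESSBOARDS/CHESSBOARD", doc, XPathConstants.NODESET);
        for (int b = 0; b < boards.getLength(); b++) {
            for (String[] side : sides) {
                // Relative expression, evaluated with the board as context node.
                NodeList positions = (NodeList) xpath.evaluate(
                    side[0] + "/*/POSITION", boards.item(b), XPathConstants.NODESET);
                for (int i = 0; i < positions.getLength(); i++) {
                    Element position = (Element) positions.item(i);
                    String piece = position.getParentNode().getNodeName();
                    out.append(side[1]).append(' ')
                       .append(piece.toLowerCase(Locale.ROOT)).append(": ")
                       .append(position.getAttribute("COLUMN"))
                       .append(position.getAttribute("ROW")).append('\n');
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(render(XML));
    }
}
```

Each call to evaluate searches the in-memory tree again, which is where most of the processing time goes in this style of program.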

Additionally, this test compares YAXA to SAX and DOM. YAXA is an experimental API which works on top of SAX and adds the classical Java Event/Source/Listener paradigm. The YAXA sample program uses a custom SAX2 DefaultHandler to register event listeners which catch the events fired by the underlying SAX parser. YAXA also provides an event stream editor based on the previous model which directly edits the flow of events fired by a SAX parser. The YAXA-SE sample program uses this stream editor to apply an editing script to the flow of events and generates the same output as the other sample programs. YAXA's stream editor was inspired by the sed UNIX command.

This test measured the time to process XML documents describing 1000, 2000, 3000, 4000 and 5000 chessboard configurations (though only the result for a 1000-Chessboard XML document is presented here). Each document has been processed 10 times for each of the 10 runs. The measured time was the sum of the user and system times, as returned by the ptime command, divided by the total number of processed documents.

Note: Other alternative APIs available to the Java developer could be listed here, including DOM4J, which is comparable to JDOM in terms of functionality.

Measurements

illustration 6

Figure 6: Time to process a 1000-Chessboard XML document with DOM, JDOM and XPath, using Crimson and Xerces with validation (JDK 1.2)



illustration 7

Figure 7: Time to process a 1000-Chessboard XML document with SAX, DOM, YAXA and YAXA-SE (event stream editor), using Crimson with validation (JDK 1.2)

Analysis

JDOM is slightly faster (around 10%) than DOM using the same underlying parser implementation. XPath, on the other hand, performs almost identically regardless of the underlying parser implementation. This is because most of the time goes into locating the relevant elements to be processed with the XPath expressions. Since we are using the same XPath implementation (from Xalan), the overall processing time is the same.

YAXA barely performs better than DOM. Because the elements to be processed constitute 90% of the overall document, it ends up creating a large number of event objects in much the same way as DOM creates node objects. It also keeps track of the current context with a stack which ultimately may be equivalent to building the document object model tree. On the other hand, with a simpler but still comparable (in the given example) pattern-matching mechanism, YAXA and YAXA's stream editor perform better than XPath and XSLT for this simple task.

Comparing Different JVM Versions

This test compares the performances of three different Java runtime implementations/configurations, JDK 1.2, JDK 1.3 server, and JDK 1.3 client. They were used to process the XML documents defined earlier with two different SAX and DOM implementations (Xerces and Crimson).

This test measured the time to process XML documents describing 1000, 2000, 3000, 4000 and 5000 chessboard configurations (though only the result for a 1000-Chessboard XML document is presented here). Each document has been processed 10 times for each of the 60 runs. The measured time was the average of the elapsed times per document for the last 30 runs.

Measurements

illustration 8

Figure 8: Time to process 1000-Chessboard XML documents using SAX and DOM with Crimson and Xerces on JDK 1.2, JDK 1.3 Client, and JDK 1.3 Server.



illustration 9

Figure 9: The JDK 1.3 Server runtime performance improvement over the 60 runs of the DOM sample program with Xerces. A significant optimization was performed by the end of the test that could not be shown in the previous chart.

Analysis

The JDK 1.2 and JDK 1.3 Client runtimes perform quite similarly. With the exception of the DOM sample program using Xerces, the JDK 1.3 Server runtime improves performance over JDK 1.2 by 33 to 38%. For the DOM sample program using Xerces, on the other hand, performance was down 5%; but as shown on the curve chart, by the end of the run period the JDK 1.3 Server runtime was finally able to optimize it and gain 17% (from 6 to 5).

Comparing DOM Access Method Performance

The DOM API offers different methods to access an element of an input document:

  1. By name (using getElementsByTagName), all the sub-elements with a given tag name will be returned in the order in which they are encountered in a pre-order traversal from the current document or element node
  2. By walking down the tree structure (using getChildNodes), all the immediate children of the current document or element node will be returned.

The first method is more expensive because the sub-tree is totally traversed for each call to the getElementsByTagName method, but it may require less knowledge of the document structure since an exhaustive search is done. Searching for different element names may require traversing the tree several times. The second method is less expensive because the traversal of the tree is controlled: only relevant parts of the sub-tree may be traversed and they may be traversed only once; but on the other hand, it requires a deeper knowledge of the document structure.
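The two access methods can be put side by side in a short sketch. It is an illustration rather than the ChessboardDOMPrinter.java program: the class DomAccessSketch, its helper methods and the shortened inline document are hypothetical. The walking variant encodes the known structure of the document (boards, sides, pieces, positions), while the by-name variant needs to know only the tag name.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomAccessSketch {
    // Hypothetical inline document, shortened from Code Sample 1.
    static final String XML =
        "<CHESSBOARDS><CHESSBOARD>"
        + "<WHITEPIECES><KING><POSITION COLUMN=\"G\" ROW=\"1\"/></KING></WHITEPIECES>"
        + "<BLACKPIECES><QUEEN><POSITION COLUMN=\"A\" ROW=\"7\"/></QUEEN></BLACKPIECES>"
        + "</CHESSBOARD></CHESSBOARDS>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    // Method 1: by name; the implementation searches the whole sub-tree.
    static List<String> byTagName(Document doc) {
        List<String> squares = new ArrayList<>();
        NodeList positions = doc.getElementsByTagName("POSITION");
        for (int i = 0; i < positions.getLength(); i++) {
            Element position = (Element) positions.item(i);
            squares.add(position.getAttribute("COLUMN") + position.getAttribute("ROW"));
        }
        return squares;
    }

    // Helper for method 2: the immediate element children of a node.
    static List<Element> childElements(Node parent) {
        List<Element> result = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i) instanceof Element) {
                result.add((Element) children.item(i));
            }
        }
        return result;
    }

    // Method 2: walking down the tree, level by level, visiting each node once.
    static List<String> byWalking(Document doc) {
        List<String> squares = new ArrayList<>();
        for (Element board : childElements(doc.getDocumentElement())) {
            for (Element side : childElements(board)) {
                for (Element piece : childElements(side)) {
                    for (Element position : childElements(piece)) {
                        squares.add(position.getAttribute("COLUMN")
                            + position.getAttribute("ROW"));
                    }
                }
            }
        }
        return squares;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        System.out.println(byTagName(doc));
        System.out.println(byWalking(doc));
    }
}
```

Both methods return the same squares for this document; they differ in how much of the tree is searched and in how much the code must know about the document structure.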

So far, the DOM test was only using the second method. This test measures how well different DOM implementations deal with this problem.

Measurements

illustration 10

Figure 10: Time to process different sizes of XML documents using different methods from the DOM API and different implementations (JDK 1.2)

Analysis

As expected, the method relying on the getElementsByTagName method is slower than the others. But depending on the implementation (specifically the implementation of the tree walker algorithm), the difference may be small (Xerces) or dramatically large/explosive (Crimson).

Comparing DTD and XML Schema Validation Performance

This test compares the performance of validations against DTD and XML Schema. It uses Xerces's SAX implementation (the only one of the two implementations that supports XML Schemas).

This test measured the time to process XML documents describing 1000, 2000, 3000, 4000 and 5000 chessboard configurations (though only the result for a 1000-Chessboard XML document is presented here). Each document has been processed 10 times for each of the 10 runs. The measured time was the sum of the user and system times, as returned by the ptime command, divided by the total number of processed documents.

Measurements

illustration 11

Figure 11: Time to process a 1000-Chessboard XML document using Xerces's SAX implementation with validation against a DTD and an XML Schema (using Xerces 1.4.4).

Analysis

With Xerces, validating against an XML Schema is more expensive than validating against an equivalent DTD. When no validation is performed, using XML Schema appears to be more performant than using DTD. This is due to the fact that in that particular implementation of Xerces (1.4.4), the XML Schema referenced in the input document is not even read (as revealed by monitoring the calls to the EntityResolver). The XML Schema instance document is therefore treated as a DTD-less document (since there is not even a Document Type Declaration - DOCTYPE).

When the benchmark is run with a more recent version of Xerces (2.0b4) and no validation is performed, using XML Schema is slower than using DTD, almost mirroring the results obtained when validating. In that particular instance, monitoring the calls to the EntityResolver showed that the XML Schema was indeed loaded and parsed. Incidentally, we can also notice a noticeable improvement in performance with this newer version of Xerces.

illustration 12

Figure 12: Time to process a 1000-Chessboard XML document using Xerces's SAX implementation with validation against a DTD and an XML Schema (using Xerces 2.0b4).

Conclusion

In this second article, we have tested the different sample programs presented in the first article and analyzed their respective performance when run in different configurations: with different sizes of processed documents conforming to either a DTD or an XML Schema, with or without validation, with different underlying parser or style sheet engine implementations and with different JVM versions. Taking into account the results presented in this document, the next article will attempt to give tips on how to improve the performance of XML-based applications from a programmatic and architectural point of view.

Resources

Java Technology & XML - Part 1 -- An Introduction to APIs for XML Processing
Java Technology & XML
Java APIs for XML Processing (JAXP)
The Simple API for XML (SAX)
Document Object Model (DOM)
Extensible Stylesheet Language (XSL)
XML Path Language (XPath)
Xerces - Apache XML Parser for Java
Crimson - Apache XML Parser for Java
Xalan - Apache XSLT Style Sheet Engine
JDOM
YAXA
eMobile End-to-End Application using the Java 2, Enterprise Edition - Part II

About the Author

Thierry Violleau is a staff engineer at Sun Microsystems where he works on the J2EE BluePrints program. Previously, he worked in Market Development Engineering - Enabling Technologies group where he helped ISVs integrate Java and XML technologies in their products and solutions.

1 As used on this web site, the terms Java virtual machine or Java VM mean a virtual machine for the Java platform.
