The Java Developer Connection welcomes Espresso Man and Little Grasshopper, whose Q & A sessions have been a long-running feature in Sun publications. In this session, they discuss the more important implications of the sudden emergence of the eXtensible Markup Language (XML) and the tremendous opportunities this provides to developers of Java applications.
Q. XML? Isn't that an extension to HTML?

A. Well, Little Grasshopper, the two languages certainly do have the same "look and feel". Actually, HTML is an application of Standard Generalized Markup Language (SGML) technology to a specific problem space (document presentation), whereas XML is a subset (or simplification) of SGML itself, adopted for the Web. Aside from its obvious document markup capabilities, XML is also fast becoming the standard for specifying business-to-business (B2B) data interchange across the Internet.

Q. Why XML and not HTML?

A. As a result of sharing a common heritage, both XML and HTML use tag names to "tag" text strings, and both surround these tag names with angle brackets (<...>). However, HTML tag names are limited to a predefined set and are primarily used to indicate how the enclosed text data is to be DISPLAYED. The set of XML tag names, on the other hand, is unbounded, and they are used to indicate what the enclosed data MEANS. So for example, while a typical HTML tag might be a command to the presentation layer to "display the enclosed data in bold font":

    <b>John Jones 1234</b>

the equivalent XML tags might be commands to the message recipient to "treat this data as the customer name and ID":

    <cname>John Jones</cname>

As a result, XML tag names define and identify data fields in an XML message in the same way that a schema does for a database, except that with XML, these tag names are carried along as part of the message data itself.

Q. So XML messages are self-descriptive! But what good does transmitting a tag name to identify a data field do, unless the recipient was already expecting to receive this data?

A. You raise an important point. Before a pair of applications can exchange and correctly interpret a set of XML data messages, they must first agree on the type of data that the message will contain, and the tag names used to identify this data.
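
To make the idea of agreed tag names concrete, here is a minimal sketch of a recipient pulling fields out of a message by tag name, using the standard JAXP DOM parser. The `customer` wrapper element, the content given to `cid`, and the class and method names are assumptions made for illustration only; they are not part of any standard discussed in this column.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: reads a data field from an XML message by its tag name.
class CustomerMessageReader {

    /** Returns the text of the first element named tagName, or null if absent. */
    static String readField(String xml, String tagName) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        NodeList matches = doc.getElementsByTagName(tagName);
        return matches.getLength() == 0 ? null : matches.item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Assumed message layout: a <customer> wrapper around the two fields.
        String message =
                "<customer><cname>John Jones</cname><cid>1234</cid></customer>";
        System.out.println("Customer name: " + readField(message, "cname"));
        System.out.println("Customer ID:   " + readField(message, "cid"));
    }
}
```

The recipient can locate each field only because it already knows the names `cname` and `cid`; a tag name it was not expecting simply yields `null` in this sketch, which is why sender and recipient need to agree on the vocabulary in advance.
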
In an exactly analogous manner, two CORBA programs must first agree on the interface to a service before the client can invoke methods on the server object that implements that interface. In the case of an XML message, transmitter/recipient agreement is achieved by publishing a message schema, typically defined according to one of two widely recognized standards: the Document Type Definition (DTD) or XML Schema.
Either schema can be located at the front of the XML message, or referenced from an external location commonly identified via a URL embedded within the message itself. This allows the recipient to automatically "validate" that the actual contents of the XML message are as promised. In the case of DTDs, for example, the following lines would specify the presence of the XML tag names above:

    <!ELEMENT cname (#PCDATA)>
    <!ELEMENT cid (#PCDATA)>
    <!ATTLIST cid DataType CDATA #REQUIRED>

XML Schema is an improvement in that:
The important point is that whether an XML Schema or a DTD is used to specify the data fields in an XML message, the actual contents of the resulting message are identical.

Q. Sounds confusing! You have to have separate schemas defined for each kind of XML message exchanged by two applications, and the schemas can be scattered all over the web. Is this really useful?

A. Extremely useful, oh dubious one! Such a collection of schemas might define the complete set of messages exchanged by subsystems within a typical application space (example: Retail, Education, Hospitality), and thus constitute an XML standard for an entire industry. Rather than being scattered, therefore, the entire collection of message schemas is usually gathered together and then published and maintained on a web site owned by the industry trade group that developed the standard. There are two other places such a standard might be published:
Q. So an industry standards body simply defines a collection of XML message schemas, registers them with, say, xml.org, and then any two industry applications can use this new standard to intercommunicate, regardless of the language they are written in and the operating system and hardware they are deployed on. Wow!

A. Ah ... it's not quite that simple. While an XML message is universally interpretable, taken by itself it doesn't guarantee a "platform-neutral" wire. To do that, XML needs to be coupled with a platform-neutral transport protocol, for the same reason that HTML has to be coupled with HTTP before it can be received and interpreted by any browser connected to the Internet.

Q. A "platform-neutral wire"?

A. A platform-neutral wire is one that does not impose proprietary restrictions on those applications that use it to intercommunicate. To create such a wire, an industry XML standard must specify that:
Only by specifying BOTH the XML message formats AND the underlying transport protocol can the standard ensure that all compliant applications will share information seamlessly, despite (as you noted) possibly having been written in two different languages and running on top of two different operating systems. This in turn guarantees end users who deploy such standard-conformant applications across a "platform-neutral wire" that they:
Q. So what are the logical choices for the transport protocol of such a "platform-neutral wire" industry XML standard?

A. There are actually several candidates for such a protocol. As we shall see, while each has its strengths and weaknesses, one clear choice does emerge.

1. IIOP

This is the transport protocol underlying the CORBA architecture. It is independent of both the language a communicating application is written in, and the operating systems that such an application runs on. Requiring all XML messages to be sent over IIOP will create a level playing field for competing system suppliers, thereby increasing the end-user's choice of suppliers. IIOP is, however, an object transport protocol and, as such, imposes a fair amount of design conformance between communicating applications. In addition, IIOP suffers from the following restrictions:
Additionally, the goal of an XML standard is to provide "loose coupling" between applications developed and deployed by different organizations, particularly where these organizations are communicating across the Internet. Therefore, using the richness of an object transport to convey what is essentially a series of text messages does not represent an optimal solution (see Table 1).

2. Proprietary Message Service (MOM)

Incorporating the use of a commercially available message-oriented middleware (MOM) product such as JMQ, MSMQ, MQSeries or objectEvents in a standard can provide significant additional XML messaging functionality for the developers of the standard. However, such a solution suffers from the following restrictions:
Specifying a MOM product as the transport protocol for XML data messages thus destroys the level playing field that is essential for wide adoption of any standard, because it anoints one vendor as being "more equal than others" (see Table 2). Such a situation sets the stage for the fragmentation of the market between conformant (but incompatible) solutions based around different message-service providers. This is EXACTLY what most XML industry standards are designed to prevent!

3. HTTP/HTTPS

HTTP is a message-oriented rather than an object-oriented transport protocol, and so does not dictate any design constraints on the applications that use it. HTTP is also:
As a result, requiring XML to be sent over HTTP/HTTPS offers clear advantages if the primary goal of the standard is the interoperability of all compliant applications, because it enforces a truly platform-neutral wire between them. The Schools Interoperability Framework (SIF) XML standard for Education, grades K-12, now mandates the use of HTTPS for just these reasons. See: Schools Interoperability Framework

Q. But doesn't HTTP provide a "lower level" protocol than either of the other two? How is the missing functionality supplied?

A. It is true that such messaging features as encryption, guaranteed error-free delivery, authentication, event publish/subscribe, automatic message queueing, and support for disconnected operations are all important to provide for an XML-based standard, and an application developer should be shielded from as much of the detail as possible. When HTTP/HTTPS is used as the underlying XML transport, it must either provide these features directly or they must somehow be supplied by data fields within the XML, usually in a "header" placed in front of all messages defined by the standard. The following functionality can be provided by HTTPS:
The remaining messaging functionality is normally provided by a MOM product located at the destination, which relies on the "header" XML fields in the arriving message. In other words, once the XML data arrives over the platform-neutral wire, it is immediately passed along to platform-specific messaging middleware, which supplies the missing functionality. As an example, the single XML element:

    <Authentication Type="X.509">Shhhhhh</Authentication>

is used by the SIF standard to optionally convey a digital signature for the XML message, which can provide the receiver with absolute assurances of:
Though proprietary middleware at the destination may be supplying key messaging functionality, this fact remains TRANSPARENT to the sender, because the wire itself is platform-neutral.

Q. So XML over HTTP/HTTPS can provide a feature-rich platform-neutral wire. But you are Espresso Man! How can we have made our way through an entire "Ask Espresso Man" column without once mentioning "Java"?

A. We can't and we won't. But before explaining why the Java platform is the best one on which to implement XML, we had to provide some background on XML. Now consider how far we have come. We have demonstrated how defining an industry XML standard around a platform-neutral wire (XML/HTTP) allows any two compliant subsystems to interconnect regardless of the OS and messaging middleware they use. Now let's consider the advantages from the developer's viewpoint. Implementing such a subsystem using Visual Basic or C++ components constricts the eventual product deployment to a single OS and a single messaging service. Writing an equivalent set of components in the Java programming language provides the ability to deploy the resulting product on any OS. Further, by conforming to the Java Message Service (JMS) API, these components will be able to transparently utilize a wide range of existing MOM products, providing a major competitive advantage for the Java-based products. See: Vendors

Finally, packaging these Java components as Enterprise JavaBeans (EJBs) enables the developer to reduce complexity by simplifying most multi-threaded, transactional and persistence issues. So an additional benefit for the EJB developer is a reduction in "time to market".

Q. What about the end user? What's the value proposition there?

A. The end user gets to deploy simplified platform-neutral EJB components across a platform-neutral wire.
This represents the first true realization of "plug & play" components for Business to Business (B2B) Internet applications, and is the ONLY way to deliver on XML's promise of universal interoperability. Therefore you can expect to see such EJB components appearing soon, for use in a variety of vertical industries. Several industries. Very soon.

Table 1:
Table 2:
About the Author

Ron Kleinman is the Chief Technical Evangelist for Sun Developer Relations, and serves as Sun's representative on multiple industry-wide Java and XML standards committees. He has extensive experience consulting with developers who are trying to "java-tize" their existing applications. He has prepared and delivered numerous presentations on Java technologies both in the U.S. and overseas. His particular areas of expertise include Java on the Server (EJBs and server-side APIs), Jini, Java-based device access control and management, and more recently, XML.