Designing Enterprise Applications with the J2EE™ Platform, Second Edition



10.3 Web Tier Internationalization

This section presents some design considerations for internationalizing Web-tier components.


10.3.1 Tracking Locales and Encodings

An internationalized Web application must be able to determine the encoding of an incoming request and ensure that the response is properly encoded. This section discusses locale and encoding management for Web-tier components.

10.3.1.1 Determining HTTP Request Locale and Encoding

Runtime locale determination is simple and automatic in J2SE applications. An application developer can use J2SE internationalization APIs to set an application's locale programmatically.

Locale semantics for J2EE applications are more complex than for J2SE applications. For example, the system default locale for a Web component is the Web container's default locale. In a distributed environment, this default locale may differ among containers, making the locale dependent on the container servicing the request. Using Locale.getDefault in Web applications is not recommended, because the value returned represents the Web container's locale, not the client's locale.
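The difference between the container's locale and the client's locale can be sketched in plain J2SE code. The example below is a minimal sketch, not the sample application's code: it assumes the client's language preferences arrive as an Accept-Language style string (an assumption, since the text above does not name the header), and the class name ClientLocaleResolver and its resolve method are hypothetical. It uses Locale.LanguageRange and Locale.lookup, which are available in Java 8 and later.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: picks a client locale from an Accept-Language style value. */
final class ClientLocaleResolver {

    private ClientLocaleResolver() {}

    /**
     * Returns the supported locale that best matches the header value, or the
     * given fallback when the header is missing, malformed, or matches nothing.
     */
    static Locale resolve(String acceptLanguage, Collection<Locale> supported, Locale fallback) {
        if (acceptLanguage == null || acceptLanguage.trim().isEmpty()) {
            return fallback;
        }
        try {
            // parse() orders the language ranges by descending quality value (q=...).
            List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(acceptLanguage);
            Locale match = Locale.lookup(ranges, supported);
            return match != null ? match : fallback;
        } catch (IllegalArgumentException malformedHeader) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        List<Locale> supported = Arrays.asList(Locale.ENGLISH, Locale.JAPANESE);
        // Locale chosen from the client's stated preferences.
        System.out.println(resolve("ja-JP,ja;q=0.8,en;q=0.5", supported, Locale.ENGLISH));
        // Locale of the JVM (the Web container), which ignores the client entirely.
        System.out.println(Locale.getDefault());
    }
}
```

The second line printed depends only on how the server's JVM is configured, which is why Locale.getDefault is a poor choice for per-request localization.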

An internationalized enterprise application's Web tier must somehow determine the encoding of incoming request parameters. As explained previously in this chapter, an encoding defines the relationship between a character set's code points and a data stream's unit size and serialization rules. Correct data interpretation requires knowing how the data are encoded. Unfortunately, there's no standard way to determine HTTP request parameter encoding. An HTML browser encodes each request using the encoding of the page that was the source of the request, but that knowledge is only useful if the original page's encoding is known.
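The following sketch shows why the encoding must be known before request parameters are interpreted. It uses java.net.URLDecoder as a stand-in for the decoding a Web container performs on request parameters; the class name ParameterDecodingDemo and the sample value are illustrative assumptions.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

/** Illustrative only: decodes the same raw parameter value with two encodings. */
final class ParameterDecodingDemo {

    private ParameterDecodingDemo() {}

    /** Decodes a percent-encoded parameter value using the named encoding. */
    static String decode(String rawValue, String encoding) {
        try {
            return URLDecoder.decode(rawValue, encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported encoding: " + encoding, e);
        }
    }

    public static void main(String[] args) {
        // "%C3%A9" is the letter e-acute (U+00E9) as submitted from a UTF-8 encoded page.
        String raw = "caf%C3%A9";
        // Decoded with the encoding the browser actually used: the intended text.
        System.out.println(decode(raw, "UTF-8"));
        // Decoded with the wrong encoding: each byte becomes a separate, wrong character.
        System.out.println(decode(raw, "ISO-8859-1"));
    }
}
```

The bytes on the wire are identical in both calls; only the assumed encoding differs, and only one assumption yields the text the user typed.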

There are several approaches to determining and tracking HTTP request locale:

The method ServletRequest.setCharacterEncoding (Java Servlet specification version 2.3 and above only) overrides a servlet request's default encoding with a given encoding, which thereafter is used to interpret request parameters. This method must be called before any data is retrieved from the request object.

In summary, the BluePrints recommendation is to standardize on a single encoding, preferably UTF-8, to provide consistent request encoding, efficient data transmission, broad character set coverage, and wide browser support. When a consistent encoding cannot be used (because of noncompliant browsers, for example), consider storing locale and encoding in session state, or use separate URLs for each locale or encoding as described in the next section.

10.3.1.2 Storing Locale and Encoding at Runtime

Instead of determining locale and encoding for each request, locale can be stored for use by subsequent requests. There are several ways to accomplish this:

10.3.1.3 Setting HTTP Response Locale and Encoding

Response encoding of JSP pages and servlets determines both the format of characters in the response and the request encoding of any subsequent request from the served page.

An HTTP server indicates content encoding using part of the Content-Type HTTP header. This header's value is either TYPE or TYPE;charset=CHARSET where TYPE is the content type (RFC 1049) and CHARSET is the name of the encoding as registered with the Internet Assigned Numbers Authority (IANA). The default value for TYPE is text/html; the default value for CHARSET is ISO-8859-1. A reference to the IANA registry of values for charset is listed in Section 10.9 on page 345.
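The header format described above can be illustrated with a small parser. This is a minimal sketch under stated assumptions: the class ContentTypeHeader and its methods are hypothetical, and the parsing handles only the simple TYPE;charset=CHARSET form rather than every legal header variation.

```java
import java.util.Locale;

/** Hypothetical helper for the TYPE;charset=CHARSET header form. */
final class ContentTypeHeader {

    /** Encoding assumed when the header carries no charset parameter. */
    static final String DEFAULT_CHARSET = "ISO-8859-1";

    private ContentTypeHeader() {}

    /** Builds a header value such as "text/html;charset=UTF-8". */
    static String format(String type, String charset) {
        return type + ";charset=" + charset;
    }

    /** Extracts the charset parameter, or returns the default when it is absent. */
    static String charsetOf(String headerValue) {
        String[] parts = headerValue.split(";");
        // parts[0] is the content type; any remaining parts are parameters.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // The value may be quoted; strip one pair of surrounding quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                if (!value.isEmpty()) {
                    return value;
                }
            }
        }
        return DEFAULT_CHARSET;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf(format("text/html", "UTF-8")));
        System.out.println(charsetOf("text/html"));
    }
}
```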

There are two ways to set encoding of a servlet's HTTP response:

Set the locale or content type before calling ServletResponse.getWriter to ensure that the resulting Writer is configured for the correct encoding.

Two attributes of a JSP page's page directive can control encoding:

A JSP container may issue a runtime error if the encoding for the page is inappropriate for the content type. It may produce a translation-time error when a JSP page specifies an unsupported encoding.

The content type and encoding of a JSP page are fixed at page translation time when they are set using a directive. Use either a custom tag or a servlet to set encoding at runtime.

An application can use a servlet filter to set response encoding to a single value before a servlet or JSP page receives the request. This technique provides a single point of control for enforcing standardized encoding and ensures that encoding is correct before a servlet uses its response object. The sample application enforces response encoding with a servlet filter. The servlet filter can also serve as a guard, logging an error message if any client makes a request using an unsupported encoding.

Automatic selection of language, character set, and encoding makes things easy for users. But it's important always to provide a way for users to change languages manually as well. Page headers or footers are a good place for hyperlinks or dropdown boxes for manual language selection. When you offer users a choice of languages, the name of each language should be in the language to be chosen, rather than the language of the current page.
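One way to produce language names written in their own language is to pass each locale to its own getDisplayLanguage method. The sketch below assumes a hypothetical LanguageMenu helper that builds the labels for a language selector; the exact strings returned for non-English locales depend on the locale data shipped with the Java runtime.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: builds labels for a manual language selector. */
final class LanguageMenu {

    private LanguageMenu() {}

    /** Maps each language tag to the language's name written in that language. */
    static Map<String, String> selfNamedLanguages(List<Locale> locales) {
        Map<String, String> labels = new LinkedHashMap<>();
        for (Locale locale : locales) {
            // Passing the locale to itself asks for the name in its own language,
            // rather than in the language of the current page or the server.
            labels.put(locale.toLanguageTag(), locale.getDisplayLanguage(locale));
        }
        return labels;
    }

    public static void main(String[] args) {
        List<Locale> offered = Arrays.asList(Locale.ENGLISH, Locale.GERMAN, Locale.JAPANESE);
        for (Map.Entry<String, String> entry : selfNamedLanguages(offered).entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```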


10.3.2 Presentation Component Design

Internationalization and localization are important concerns when designing presentation components such as JSP pages, JavaBeans helper components, and custom tags. Examples of localizable Web-tier components include:

All of these components may use the J2SE internationalization APIs. Remember always to consider internationalization when designing presentation components.

10.3.2.1 Example

This example from the sample application presents a localizable custom tag that displays currency values in a format appropriate for a locale.

The sample application includes a presentation component called a list tag, which formats a list of items from a java.util.Iterator. The list tag evaluates and outputs its body text for each value the iterator returns. Each value is a JavaBeans component that exposes its values as get and set property accessors.

The example presentation component, a listItem tag, formats and displays the current item within a list tag's body text. A sample usage of this tag looks like this:

<waf:listItem property="unitCost" formatText="currency" 
locale="ja_JP" precision="0"/>

The tag's attributes control its behavior in the following ways:

Another example usage of the listItem tag might look like this:

<waf:listItem property="unitCost" formatText="currency" 
locale="en_US" precision="2"/>

The locale in this example is en_US, so the CurrencyFormatter will use appropriate localization for United States English. Because this currency amount is in dollars, the precision is two (to display cents).

Note that this tag does not actually convert currency between Yen and dollars. Rather, it simply formats the value that getUnitCost returns for the specified locale.
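The formatting step that such a tag performs can be approximated with java.text.NumberFormat. The sketch below is not the sample application's CurrencyFormatter; the class CurrencyFormatSketch and its format method are hypothetical, and the locale and precision parameters simply mirror the tag attributes shown above.

```java
import java.text.NumberFormat;
import java.util.Locale;

/** Hypothetical stand-in for the formatting done behind the listItem tag. */
final class CurrencyFormatSketch {

    private CurrencyFormatSketch() {}

    /**
     * Formats an amount as currency for a locale given in the "ja_JP" style.
     * The amount is formatted only; no currency conversion takes place.
     */
    static String format(double amount, String localeName, int precision) {
        // Convert the underscore form used in the tag attribute to a language tag.
        Locale locale = Locale.forLanguageTag(localeName.replace('_', '-'));
        NumberFormat formatter = NumberFormat.getCurrencyInstance(locale);
        // Show exactly 'precision' digits after the decimal separator.
        formatter.setMinimumFractionDigits(precision);
        formatter.setMaximumFractionDigits(precision);
        return formatter.format(amount);
    }

    public static void main(String[] args) {
        System.out.println(format(1200, "ja_JP", 0));
        System.out.println(format(1234.5, "en_US", 2)); // prints "$1,234.50"
    }
}
```

The same numeric value passed with two different locales yields two differently formatted strings, which illustrates that the tag formats rather than converts.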

Components other than custom tags can also be internationalized. For example, unitCost in the above example is a property of a JavaBeans component, which itself could be localized. The component could return one price, in Yen, for locale ja_JP, and a different price, in dollars, for locale en_US. In such a case, the JavaBeans component (part of the application MVC model) would produce a unit cost appropriate to the locale, while the presentation tag (part of the application MVC view) would format the value in a way appropriate for the locale. (This scenario is hypothetical, as the sample application does not provide this functionality in quite this way.)

The sample application contains other examples of locale-aware presentation components. Localizable presentation components greatly simplify internationalization.


10.3.3 Internationalizing and Localizing JSP Pages

Because locale is primarily about how to present data, localization is most appropriately implemented in MVC views. In the Web tier, MVC views are usually JSP pages. J2EE Web applications should localize content in the Web tier with JSP pages.

Two common approaches exist for localizing JSP pages: creating JSP pages that can be localized with resource bundles or maintaining separate JSP pages for each locale. Each approach has strengths and drawbacks. The next two sections discuss the tradeoffs between these two options.

Use separate JSP pages for each locale when content structure and display logic differ greatly between locales or when messages depend on the target locale. Resource bundles are recommended for error and logging messages (see Section 10.7 on page 341), and when content varies between locales only in data values and not in structure.

10.3.3.1 Localizing JSP Pages with Resource Bundles

A common way to localize JSP pages is to assemble chunks of localized text using locale-aware custom tags. Each time the page is served, the custom tags select text from a resource bundle for the current locale.

Figure 10.2 Localizing JSP Pages with Resource Bundles

Figure 10.2 shows a single internationalized JSP page that is localized with resource bundles for several locales. Benefits of this approach include:

The consistency provided by this approach is also a major drawback. While changing the content of this page is easy, customizing its structure to locales is harder, because one JSP page produces content for all locales.

This approach shares a single JSP page across locales, so the page encoding must be compatible with the encodings of all application character sets. The contentType attribute of the JSP page directive specifies the content type and the encoding for the page at page translation time, so all content produced by a page that uses this directive must use the same encoding. For reasons explained earlier in this chapter, standardizing on UTF-8 encoding is recommended.

The recommended way to implement a single JSP page customized to multiple locales is to use resource bundles. Access resource bundles from custom tags in the pages instead of using resource bundles from scriptlet or expression code. Custom tags improve the readability and maintainability of JSP pages, and reduce duplicated code.
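The lookup that a locale-aware custom tag would perform can be sketched with java.util.ResourceBundle. The example below is an assumption-laden sketch: the bundle classes, keys, and the BundleLookupSketch class are hypothetical, and the bundles are defined as ListResourceBundle classes (rather than properties files) only to keep the example in one file.

```java
import java.util.ListResourceBundle;
import java.util.Locale;
import java.util.ResourceBundle;

/** Hypothetical sketch of the lookup a locale-aware custom tag might perform. */
final class BundleLookupSketch {

    private BundleLookupSketch() {}

    /** Base bundle: supplies every key and serves as the fallback. */
    public static class Messages extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                {"greeting", "Hello"},
                {"cart.title", "Shopping Cart"},
            };
        }
    }

    /** Japanese bundle: overrides only the keys it translates. */
    public static class Messages_ja extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                // "konnichiwa", written with Unicode escapes to avoid source-encoding issues
                {"greeting", "\u3053\u3093\u306b\u3061\u306f"},
            };
        }
    }

    /** Returns the text for a key, using the most specific bundle for the locale. */
    static String message(String key, Locale locale) {
        ResourceBundle bundle = ResourceBundle.getBundle(Messages.class.getName(), locale);
        return bundle.getString(key);
    }

    public static void main(String[] args) {
        System.out.println(message("greeting", Locale.JAPAN));   // from Messages_ja
        System.out.println(message("cart.title", Locale.JAPAN)); // falls back to Messages
    }
}
```

Keeping this lookup inside a custom tag, rather than repeating it in scriptlets, is what lets the JSP page itself stay free of localization code.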

10.3.3.2 Locale-Specific JSP Pages

Another approach to localizing JSP pages is to provide a separate JSP page for each locale.

Figure 10.3 Localizing by Creating Separate JSP Pages for Each Locale

Figure 10.3 shows a directory tree of an internationalized JSP page, index.jsp. There are actually four separate versions of the file, each in a separate directory in the server's namespace. In this approach, a servlet or servlet filter forwards each request for a JSP page to the appropriate file based on the requesting client's locale. The names of the directories that separate the different file versions use the standard resource bundle suffix naming convention. An alternative is to use file naming conventions instead of directories. For example, the file for the default locale would be called index.jsp, the file localized for Japanese as used in Japan would be called index_ja_JP.jsp, the Swiss German file would be called index_de_CH.jsp, and so on. While this approach works, applications with a large number of files and locales can easily become difficult to manage.

Grouping JSP pages, static pages, and other resources such as graphics files in one directory per locale is a BluePrints best practice. Note that JSP pages can be localized selectively with this scheme. The logic for determining which file to forward to is in a dispatching servlet or servlet filter, which can follow the same naming convention that resource bundles use. The forwarding component can always choose the most specific file available and use a default file (with no localization suffix) as a fallback.
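The "most specific file available" rule can be sketched by borrowing the candidate ordering that resource bundles use. The example below is a sketch under assumptions: the LocalizedPageResolver class is hypothetical, it uses the file-suffix naming variant (index_ja_JP.jsp) rather than directories, and the set of available files is passed in directly instead of being read from the Web application.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.ResourceBundle;
import java.util.Set;

/** Hypothetical sketch of the lookup a dispatching servlet or filter might use. */
final class LocalizedPageResolver {

    // Reuses the resource bundle naming rules instead of reimplementing them.
    private static final ResourceBundle.Control NAMING =
            ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_DEFAULT);

    private LocalizedPageResolver() {}

    /** Lists candidate file names from most specific to the unsuffixed default. */
    static List<String> candidates(String baseName, String extension, Locale locale) {
        List<String> names = new ArrayList<>();
        for (Locale candidate : NAMING.getCandidateLocales(baseName, locale)) {
            names.add(NAMING.toBundleName(baseName, candidate) + extension);
        }
        return names;
    }

    /** Returns the first candidate that exists, or null if not even the default exists. */
    static String resolve(String baseName, String extension, Locale locale, Set<String> available) {
        for (String name : candidates(baseName, extension, locale)) {
            if (available.contains(name)) {
                return name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Set<String> files = new HashSet<>(Arrays.asList("index.jsp", "index_de_CH.jsp"));
        System.out.println(candidates("index", ".jsp", Locale.JAPAN));
        System.out.println(resolve("index", ".jsp", Locale.forLanguageTag("de-CH"), files));
        System.out.println(resolve("index", ".jsp", Locale.JAPAN, files));
    }
}
```

Because only some pages need a localized version, the fallback to the unsuffixed file is what makes selective localization possible.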

The page-per-locale approach has the following benefits:

At the same time, this approach has some drawbacks. Maintaining a consistent look-and-feel between locales is more difficult with separate JSP pages than with resource bundles. Separate files must be created and kept consistent for every locale, which requires more maintenance than the resource bundle approach.

The Web-tier framework and tools you select for creating your application may influence your decision in how to support internationalized content.

The sample application uses a templating mechanism, providing both structural consistency between locales and the flexibility of page-per-locale localization. The templating mechanism uses an XML "screen definitions file" for each locale to assemble localized JSP pages into a single page. The screen definitions file for a locale specifies a template file, and maps localized JSP pages to symbolic names such as "header," "footer," and so on. The template file defines the overall structure for a page, and uses custom tags to include localized JSP pages, which it references by symbolic name. Because the screen definitions file specifies the template, both page layout and "look and feel" can be unified across locales (by using a single template) or customized for particular locales (by using separate templates).

Regardless of which option you choose, setting the JSP page response encoding correctly is crucial. The sample application standardizes all page encoding to UTF-8, and enforces this encoding with a servlet filter for all JSP pages it serves.



Copyright © 2002 Sun Microsystems, Inc. All Rights Reserved.