J2EE BluePrints



4.5 Internationalization and Localization

Internationalization is sometimes overlooked when developing a Web application targeted at a particular enterprise or localized market. However, if a Web application may be used in more than one country or region, it is increasingly important to consider internationalization from the outset. This section presents approaches to developing an internationalized Web application.

Internationalization is the process of preparing an application to support various languages, while localization is the process of adapting an internationalized application to support a specific language or locale. A locale is a language or subset of a language that includes both regional and language-specific information. Internationalization involves identifying and isolating portions of the application that present strings of data to the user so that the strings can be acquired from a single source such as a file. Localization involves translating these strings into a specific language and assembling them in a file that the application can access. Thus internationalizing an application allows it to be easily adapted to new languages and markets while localization provides the adaptation of an internationalized application to a particular country or region. Neither the Web nor EJB container need be running in the same locale as the client's Web browser.

Internationalization should not be an afterthought when developing a Web application. It is easier to design an application that is capable of being internationalized than to retrofit an existing application, which can be both costly and time consuming. A great deal of time and money can be saved by planning for internationalization and localization at the outset of a project.

An application written in the Java programming language is not automatically internationalized and localizable. Though a developer of a Web application can deal with many different character sets by using the J2SE platform, the platform's support for Unicode 2.0 is only as good as the data that is input into the application.

With a Web application, the presentation layer is the focus of internationalization and localization efforts. This includes the JSP pages and supporting helper JavaBeans components.


4.5.1 Internationalization

Data handling is one part of a Web application most affected by internationalization, with impact in three areas: data input, data storage, and locale-independent data presentation.

4.5.1.1 Data Input

Data is typically input to a Web application by posts from a form on an HTML page. We assume that the client's platform will provide a means for inputting the data.

The browser running in the client's native locale is responsible for encoding the form parameter data in the HTTP request so that it reaches a Web application in a readable format. By the time the application receives the data it is in Unicode format and a developer should not have to worry about character set issues. If you need to do any type of word breaking or parsing it is recommended that you look at the BreakIterator class in the java.text package.
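As a minimal sketch of the word breaking mentioned above, the following class uses BreakIterator to split text into words for a given locale. The class name WordSplitter and the rule of keeping only tokens that start with a letter or digit are assumptions made for illustration; they are not part of the sample application.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of the sample application.
public class WordSplitter {

    // Returns the words in text, using the word-boundary rules of the given locale.
    static List<String> words(String text, Locale locale) {
        BreakIterator boundaries = BreakIterator.getWordInstance(locale);
        boundaries.setText(text);
        List<String> result = new ArrayList<>();
        int start = boundaries.first();
        for (int end = boundaries.next(); end != BreakIterator.DONE;
                start = end, end = boundaries.next()) {
            String token = text.substring(start, end);
            // Skip tokens made of whitespace or punctuation (an assumption of this sketch).
            if (Character.isLetterOrDigit(token.codePointAt(0))) {
                result.add(token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(words("Hello, world!", Locale.US));
    }
}
```

Because the iterator is obtained for a specific locale, the same helper can be reused for languages whose word boundaries are not marked by spaces, subject to the boundary rules the platform provides for that locale.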

4.5.1.2 Data Storage

Setting your database to a Unicode 2.0 character encoding (such as UTF-8 or UTF-16) allows data to be saved correctly in many different languages. The content you are saving must be entered properly from the Web tier, and the JDBC drivers must also support the encoding you choose. Refer to your data storage vendor for the best means of providing data persistence.

4.5.1.3 Enabling Locale-Independent Data Formatting

An application must be designed to present localized data appropriately for a target locale. For example, you must ensure that locale-sensitive text such as dates, times, currency, and numbers is presented in a locale-dependent way. If you design your text-related classes in a locale-independent way, they can be reused throughout an application. The following methods are used to format currency in locale-specific and locale-independent ways.

Code Example 4.3 illustrates how to format currency in a locale-specific manner. The NumberFormat instance obtained is the one for the system's default locale. Note that the string pattern contains a "$" character, so this method will only display correctly for countries that use dollars. This approach has little value because it is tied to a specific locale.


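// Requires java.text.NumberFormat and java.text.DecimalFormat
public static String formatCurrency(double amount){
	// Currency formatter for the system's default locale
	NumberFormat nf = NumberFormat.getCurrencyInstance();
	DecimalFormat df = (DecimalFormat)nf;
	df.setMinimumFractionDigits(2);
	df.setMaximumFractionDigits(2);
	df.setDecimalSeparatorAlwaysShown(true);
	// The hard-coded "$" ties the output to dollar currencies
	String pattern = "$###,###.00";
	df.applyPattern(pattern);
	return df.format(amount);
}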
Code Example 4.3 Locale-Specific Currency Formatting

Code Example 4.4 shows how to format currency in a locale-independent manner. The user can specify any supported locale and the resulting String will be formatted for that locale. For best results, the string pattern should be obtained from a resource bundle.


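// Requires java.text.NumberFormat, java.text.DecimalFormat, and java.util.Locale
public static String formatCurrency(double amount, Locale locale){
	// Currency formatter for the locale supplied by the caller
	NumberFormat nf = NumberFormat.getCurrencyInstance(locale);
	DecimalFormat df = (DecimalFormat)nf;
	df.setMinimumFractionDigits(2);
	df.setMaximumFractionDigits(2);
	df.setDecimalSeparatorAlwaysShown(true);
	// No pattern is applied here. If a custom pattern such as "###,###.00"
	// is needed, load it from a resource bundle and call df.applyPattern(pattern).
	return df.format(amount);
}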
Code Example 4.4 Locale-Independent Currency Formatting

In a JSP page, the functions described in Code Example 4.3 and Code Example 4.4 for formatting currency can be used by including the following code:


<%=JSPUtil.formatCurrency(cart.getTotal(), Locale.JAPAN)%>

This expression uses the method formatCurrency which is located in a class named JSPUtil. The total that is returned from the cart.getTotal method is a double. Note that when using this code you will need to import the java.util.Locale and com.sun.estore.util.JSPUtil classes.
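To see the effect of the locale argument outside a JSP page, the following sketch formats one amount for several locales. The class name CurrencyDemo is hypothetical, and the method is a simplified stand-in for the JSPUtil method rather than the sample application's code. The exact symbols and separators printed depend on the locale data shipped with the Java runtime.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical demo class, a simplified stand-in for JSPUtil.
public class CurrencyDemo {

    // Formats amount as currency for the given locale with two fraction digits.
    static String formatCurrency(double amount, Locale locale) {
        NumberFormat nf = NumberFormat.getCurrencyInstance(locale);
        nf.setMinimumFractionDigits(2);
        nf.setMaximumFractionDigits(2);
        return nf.format(amount);
    }

    public static void main(String[] args) {
        Locale[] locales = { Locale.US, Locale.GERMANY, Locale.JAPAN };
        for (Locale locale : locales) {
            // The same amount is rendered with each locale's symbol and separators.
            System.out.println(locale + ": " + formatCurrency(1234.5, locale));
        }
    }
}
```

Note that the locale controls only the presentation. The amount itself is not converted between currencies.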


4.5.2 Localization

Once an application has been internationalized it can be localized. This section focuses on techniques for delivering localized content to clients. It also reviews techniques for delivering localized content through the use of resource bundles and language-specific JSP files.

4.5.2.1 Delivering Localized Content

Care must be taken to ensure that the application being developed handles data in character encodings other than the default ISO 8859-1 (Latin-1). Many Java virtual machines support encodings beyond Latin-1. A detailed listing of character sets supported by Sun's Java virtual machine can be found at:


http://java.sun.com/j2se/1.3/docs/guide/intl/encoding.doc.html

Depending on what content is delivered to the users, localization can be done in a few different ways. Web applications can be designed to deliver localized content based on a user preference or to automatically deliver localized content based on information in the HTTP request.

When an application allows users to select a language, the preferred language can be stored in the session. The selection can occur through a URL selection or a form post that sets an application-level language preference. The posted preference data can be maintained as part of a user profile, either in a cookie on the client's system or in a persistent data store on the server. Giving users the ability to select a language ensures that they get the content they expect.

Applications can also automatically deliver localized content by using the Accept-Language header of the HTTP request and mapping it to a supported locale. The Accept-Language value is set in the user's Web browser, and the way it is configured differs slightly between browsers. When using automatic application-level locale selection, it is prudent to also provide a mechanism that lets the user override the automatic selection and choose a preferred language. Automatic locale selection also depends on application support for different locales. Care needs to be taken to ensure that unsupported languages are handled properly.
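One way to map an Accept-Language value to a supported locale is sketched below. It assumes a Java 8 or later runtime, where Locale.LanguageRange.parse and Locale.lookup are available. The class name LocaleNegotiator, the list of supported locales, and the English fallback are all hypothetical choices made for illustration. In a servlet, the header value would typically come from request.getHeader("Accept-Language").

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of the sample application.
public class LocaleNegotiator {

    // Assumed set of locales for which localized content exists.
    static final List<Locale> SUPPORTED =
            Arrays.asList(Locale.ENGLISH, Locale.JAPANESE, Locale.FRENCH);

    // Assumed locale to use when nothing in the header is supported.
    static final Locale FALLBACK = Locale.ENGLISH;

    // Maps an Accept-Language header value to a supported locale.
    static Locale choose(String acceptLanguage) {
        if (acceptLanguage == null || acceptLanguage.trim().isEmpty()) {
            return FALLBACK;
        }
        try {
            // Ranges are ordered by their q weights, highest first.
            List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(acceptLanguage);
            Locale match = Locale.lookup(ranges, SUPPORTED);
            return match != null ? match : FALLBACK;
        } catch (IllegalArgumentException e) {
            // Malformed header values fall back to the default.
            return FALLBACK;
        }
    }

    public static void main(String[] args) {
        System.out.println(choose("ja-JP,en;q=0.5"));
        System.out.println(choose("de"));
    }
}
```

Returning a fixed fallback is one way to handle unsupported languages. An explicit user preference stored in the session or profile, where present, would normally take precedence over this automatic choice.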

4.5.2.2 Localized Messages

The Java programming language provides facilities for localization. This section discusses methods of providing localized data in a Web application.

In some cases an application may need to support multiple languages on the same JSP page. Resource bundles are useful in such cases, and list resource bundles are also useful when using servlets. Code Example 4.5 shows how to deliver content from a user-specified locale using a ListResourceBundle.


public class WebMessages extends java.util.ListResourceBundle{
	public Object[][] getContents(){
		return contents;
	}
	// Default (English) messages, keyed by message name
	static final Object[][] contents = {
		//Messages
		{"com.sunw.messages.welcome",
			"Welcome to Java(TM) Pet Store Demo"},
		{"com.sunw.messages.any_message",
			"Untranslated message"},
		{"com.sunw.messages.come_back_soon", "Come Back Soon"}
	};
}
Code Example 4.5 English Resource Bundle

In this example, localized content for messages in each supported language is contained in separate files. Code Example 4.6 demonstrates a similar resource bundle file that contains Japanese messages.


public class WebMessages_ja extends java.util.ListResourceBundle{
	public Object[][] getContents(){
		return contents;
	}

	// Japanese messages; keys not listed here fall back to WebMessages
	static final Object[][] contents = {
		//Messages
		{"com.sunw.messages.welcome",
			"Japanese welcome Java(TM) Pet Store Demo"},
		{"com.sunw.messages.come_back_soon",
			"Japanese Come Back Soon"}
	};
}
Code Example 4.6 Japanese Resource Bundle

Inside a servlet or JSP page, the messages contained in a resource bundle can be obtained with the code shown in Code Example 4.7.


// set the user's desired locale
session.setAttribute("preferredLocale", Locale.JAPAN);
// load the resource bundle for the preferred locale
ResourceBundle messages = ResourceBundle.getBundle("WebMessages",
		(Locale)session.getAttribute("preferredLocale"));
Code Example 4.7 Getting Messages From a Resource Bundle

Note that the Japanese resource bundle's class file name ends with "_ja". When loading resources, the Japanese version of the resource bundle will be loaded if Locale.JAPAN is specified in the request or if the application is running with a Japanese default locale. Also note that this file contains only the messages that you want to appear in translation. Any message not defined in this file is taken from the default file, which has no locale suffix following its name.
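The naming rule can be inspected directly. The sketch below asks the default ResourceBundle.Control which bundle names are candidates for a requested locale, from most specific to least specific. The class name BundleLookupOrder is hypothetical. The list covers only the requested locale: when nothing more specific than the base bundle is found, getBundle also tries the default locale before settling on the base bundle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.ResourceBundle;

// Hypothetical helper that lists candidate bundle names for a locale.
public class BundleLookupOrder {

    // Returns candidate bundle names, most specific first.
    static List<String> candidateNames(String baseName, Locale locale) {
        ResourceBundle.Control control =
                ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_DEFAULT);
        List<String> names = new ArrayList<>();
        for (Locale candidate : control.getCandidateLocales(baseName, locale)) {
            // toBundleName appends the locale suffix; the root locale adds none.
            names.add(control.toBundleName(baseName, candidate));
        }
        return names;
    }

    public static void main(String[] args) {
        // prints [WebMessages_ja_JP, WebMessages_ja, WebMessages]
        System.out.println(candidateNames("WebMessages", Locale.JAPAN));
    }
}
```

With only WebMessages and WebMessages_ja present, a request for Locale.JAPAN finds no WebMessages_ja_JP class and therefore resolves to WebMessages_ja, with WebMessages as its parent for any keys that are missing.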

This example shows how to specify and store a user's preferred target language and load messages for that language. Once the resource bundle is loaded a message can be obtained by using the command:


messages.getString("com.sunw.messages.welcome");

In this example, messages is the variable that refers to the loaded resource bundle and com.sunw.messages.welcome is the key of the message that you would like to load. You need to ensure that the contentType of the page is set to an encoding that supports multiple languages (the next section provides details on setting the contentType). UTF-8 encoding allows you to display multiple languages on a single Web page. Moreover, UTF-8 encoding is supported by the most commonly used Web browsers.
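The difference between the default encoding and UTF-8 can be checked with a short sketch. The class name EncodingCheck and the sample string are assumptions made for illustration. The Japanese text is written with Unicode escapes so that the source file itself can be saved in any encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the sample application.
public class EncodingCheck {

    // Reports whether every character of text can be represented in charset.
    static boolean canEncode(Charset charset, String text) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // English text followed by a Japanese greeting
        String mixed = "Welcome / \u3088\u3046\u3053\u305d";
        System.out.println("ISO-8859-1: " + canEncode(StandardCharsets.ISO_8859_1, mixed));
        System.out.println("UTF-8: " + canEncode(StandardCharsets.UTF_8, mixed));
    }
}
```

A page that mixes the two languages therefore cannot be delivered intact in ISO-8859-1, while UTF-8 can carry both.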

To save resources, it may be useful to create a JavaBeans component that loads and manages the messages for an application. The details of how to create this type of component aren't covered in this document.

Resource bundles are useful for providing localized content as long as the logic for displaying internationalized text is not going to be greatly changed by the target locale. If the logic changes, it is recommended to use separate JSP files for the content, as described in the following section.


Localized Content in JSP Pages

Where you need to provide messages that vary depending on the target locale, or where the content and display logic are drastically different, it is better to use a completely different JSP file.

Since JSP pages are responsible for the presentation of a Web application's user interface, they provide an ideal place to put locale-specific information. It is important that the JSP pages and the supporting JavaBeans components and tag libraries be able to deal with localized content. This section discusses how to design a localized page and how to integrate this page into a Web application.

The encoding of a JSP page must be specified in order for the Web container to process it. An Application Component Provider sets the encoding of a JSP page using the contentType attribute of the page directive. This attribute sets the encoding for both the JSP page and the subsequent output stream. The value of contentType should be either "TYPE" or "TYPE;charset=CHARSET", where CHARSET is a valid IANA registry value. The default value for TYPE is text/html; the default value for the character encoding is ISO-8859-1. The IANA registry values can be found at:


http://www.isi.edu/in-notes/iana/assignments/character-sets

If you are using the contentType attribute of the page directive, the resulting output stream should not be a problem; otherwise, you will need to ensure the output stream is set properly. Keep in mind that when using the page directive you can only set the content type once, because a page directive is set at page compile time. If it is necessary to change the content type dynamically, you can do so with a servlet.

When using servlets it is important to set the response encoding correctly. The ServletResponse interface contains a setLocale method which should be used to ensure that data is set to the proper locale. The Servlet specification indicates that the locale should be set before calling the getWriter method. For more details, refer to the Servlet specification.

To prepare an application for localization, you should follow these steps:

  1. Separate the display logic from the content in the presentation layer (JSP) of the Web application. This makes localizing content easier and prevents integration errors which could occur if portions of the display logic were localized by accident.
  2. Deliver locale-specific files that follow the naming convention used by resource bundles, as the J2EE programming model recommends. This naming convention is the base file name followed by an underscore (_) and the language code. A country and a variant can also be used:
    1. Language
      	jsp + _ + language
      	
    2. Language and country
      	jsp + _ + language + _ + country
      	
    3. Language with country and a variant
      	jsp + _ + language + _ + country + _ + variant
      	
  3. Ensure that the character encoding of the localized files is supported by the Java virtual machine of the system running the Web application. Also, be sure that the correct encoding is listed in the contentType attribute of the page directive of the JSP page.
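The naming convention in step 2 can be sketched as a small helper that builds candidate JSP file names for a locale and picks the most specific one that exists. The class name LocalizedPageResolver, the .jsp suffix placement, and the set of available pages are hypothetical. A real application would check the files actually deployed in the Web application.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the sample application.
public class LocalizedPageResolver {

    // Builds candidate file names, most specific first, ending with the base page.
    static List<String> candidates(String baseName, Locale locale) {
        String language = locale.getLanguage();
        String country = locale.getCountry();
        String variant = locale.getVariant();
        List<String> names = new ArrayList<>();
        if (!language.isEmpty() && !country.isEmpty() && !variant.isEmpty()) {
            names.add(baseName + "_" + language + "_" + country + "_" + variant + ".jsp");
        }
        if (!language.isEmpty() && !country.isEmpty()) {
            names.add(baseName + "_" + language + "_" + country + ".jsp");
        }
        if (!language.isEmpty()) {
            names.add(baseName + "_" + language + ".jsp");
        }
        names.add(baseName + ".jsp");
        return names;
    }

    // Returns the first candidate that is available, or the base page if none is.
    static String resolve(String baseName, Locale locale, Set<String> available) {
        for (String name : candidates(baseName, locale)) {
            if (available.contains(name)) {
                return name;
            }
        }
        return baseName + ".jsp";
    }

    public static void main(String[] args) {
        Set<String> available = new HashSet<>(Arrays.asList("index.jsp", "index_ja.jsp"));
        System.out.println(resolve("index", Locale.JAPAN, available));
    }
}
```

Falling back to the base page mirrors the way resource bundles fall back to the default file when no localized version exists.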

A properly internationalized application can be quickly localized for any number of languages without any modifications to the code of the Web application. It is much easier to internationalize an application at the beginning of a development cycle, when the application design is first specified.



Copyright © 2001 Sun Microsystems, Inc. All Rights Reserved.