Customers expect products to conform to their cultural preferences, especially when it comes to language and data formats. You've probably been involved in creating applications in C, C++, or a 4GL that accommodate those expectations, but do you know how to write great global applications in the Java programming language? Creating a global application isn't particularly difficult, but it does require you to become familiar with the most common international problems and their solutions. The problems associated with creating an international application are basically the same from one computing environment and language to any other. Solutions are roughly equivalent as well, although their implementations obviously differ among the various computing environments and programming languages. This article gives an overview of internationalization topics and concepts in a Java programming environment, and covers the following features available in the Java Development Kit 1.1.
| Language Code | Language |
| en | English |
| fr | French |
| zh | Chinese |
| ja | Japanese |
The Locale's country identifier is also specified by an ISO standard,
ISO
3166, which describes valid two-letter codes for all countries.
ISO 3166 defines these codes in uppercase letters. The following figure
lists a few countries that are part of the standard. Although the Locale
constructor allows lowercase letters, it promptly converts the code to uppercase to
create the correct internal representation. The country code provides more
contextual information for a locale and affects a language's usage, word
spelling, and collation rules.
| Country Code | Country |
| US | United States |
| FR | France |
| CA | Canada |
A variant is an optional extension to a Locale. It identifies a
custom Locale that is not possible to create with just language
and country codes. Variants can be used by anyone to add additional context
for identifying a Locale. The locale en_US represents English
(United States), but en_US_CA represents even more information
and might identify a locale for English (California, U.S.A). OS or software
vendors can use these variants to create more descriptive Locales
for their specific environments.
Locale-sensitive objects have methods that use a Locale parameter.
These objects behave differently depending on Locale, and they
often format information for the user in ways that are culturally sensitive.
These objects try to accommodate the presentation preferences of the various
locales defined in the system. For example, a DateFormat class
would format a date differently depending upon locale. Also, text and other
user interface (UI) elements can be searched and applied in a locale-sensitive
manner. Locale objects are used throughout a properly
internationalized Java application; they are used by all other classes that
have adaptable behavior or representation based on cultural, language, or
geographic preferences.
Locales are defined in the java.util package and have numerous
constructors and access methods. Each of the following methods returns a
String:
getLanguage
getCountry
getVariant
toString
The Locale class has a couple of constructors:
Locale(String language, String country)
Locale(String language, String country, String variant)
You can use either of the constructors to create a Locale object:
Locale myLocale = new Locale("en", "US");
Locale myVariantLocale = new Locale("en", "US", "VENTURA");
The en represents English, and US is an abbreviation for United States. The
second line shows how to create a Locale with an optional variant,
which can be used to create a more specific Locale than what's
possible with just language and country codes.
Although the Java compiler and run-time environment won't complain if you make up your own language and country identifiers, you should use the valid codes defined by ISO standards. By constraining yourself to the ISO definitions, you'll ensure compatibility with other Java applications and coding standards.
Once created, the Locale provides access to its individual
components. getLanguage() and getCountry() return
the ISO language and country codes that comprise a Locale object.
These codes, however, aren't exactly user-friendly. They probably won't
mean a lot to your customers, so if you want to display language and
country information in the application, you should probably use other
methods.
Locale myLocale;
String language;
String country;
myLocale = new Locale("en", "US");
language = myLocale.getLanguage();
country = myLocale.getCountry();
System.out.println(language);
System.out.println(country);
OUTPUT:
en
US
getDisplayLanguage() and getDisplayCountry() will
return String objects that are suitable for display to the customer.
These methods are locale-sensitive, meaning that you can provide a
Locale parameter to ask for a language or country string
in a target language.
Locale myLocale = Locale.getDefault();
System.out.println(myLocale.getDisplayLanguage());
System.out.println(myLocale.getDisplayCountry());
System.out.println(myLocale.getDisplayLanguage(Locale.FRENCH));
System.out.println(myLocale.getDisplayCountry(Locale.FRENCH));
OUTPUT:
English
United States
anglais
États-Unis
The Locale class provides some static final
Locales that are commonly used. If you don't provide an
overriding Locale parameter, both getDisplayLanguage
and getDisplayCountry will return their information in the
language of the default locale. Some other examples are provided in the
following figure.
| Locale Identifier | Meaning |
| en_US | English (U.S.) |
| fr_CA | French (Canadian) |
| fr_FR | French (France) |
| ja_JP | Japanese (Japan) |
| en_US_MAC | English (U.S., Macintosh) |
The string representation of a Locale can be created with the
following:
String strLocale = myLocale.toString();
The method toString() will return a String in the form
<language code>_<country code>[_<variant code>]. In the
above example, toString() will return en_US. Notice
that an underscore character separates each Locale component.
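The following sketch ties the accessor methods and toString() back to the case rules described earlier. The class name LocaleComponentsDemo is made up for this illustration, and the codes are deliberately passed in the wrong case to show the normalization. Note that on recent JDKs the Locale constructors are deprecated, so the sketch suppresses that warning.

```java
import java.util.Locale;

// Hypothetical demo class; not part of the JDK or the original article.
class LocaleComponentsDemo {
    // Builds a locale from codes given in the "wrong" case on purpose.
    @SuppressWarnings("deprecation") // constructors are deprecated on newer JDKs
    static Locale build() {
        return new Locale("EN", "us", "VENTURA");
    }

    public static void main(String[] args) {
        Locale locale = build();
        System.out.println(locale.getLanguage()); // en  (language is lowercased)
        System.out.println(locale.getCountry());  // US  (country is uppercased)
        System.out.println(locale.getVariant());  // VENTURA
        System.out.println(locale.toString());    // en_US_VENTURA
    }
}
```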
When the Java¹ Virtual Machine (JVM) starts up, it queries the underlying OS
for a default-locale setting. You can discover your default locale
programmatically. You can even change the default locale if you want to.
Both of these operations are accomplished via static methods within the
java.util.Locale class.
myLocale = Locale.getDefault();
System.out.println(myLocale.toString());
Locale.setDefault(Locale.GERMANY);
myLocale = Locale.getDefault();
System.out.println(myLocale.toString());
OUTPUT:
en_US
de_DE
Note: As recently as JDK 1.1.6, Locale.setDefault() causes a security exception in applets, so you might want to avoid this call in applets. As a workaround, instead of relying on the default locale, you can explicitly pass a Locale object to every locale-sensitive object you use. It's inconvenient, but it's a relatively easy fix to implement, especially if you're creating applets. You don't need to worry about the problem in applications, because you have more security rights on the local machine.
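The workaround in the note might look like the following minimal sketch. ExplicitLocaleDemo and formatFor are hypothetical names chosen for this example; the point is only that the Locale travels as a parameter and Locale.setDefault() is never called.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical demo class illustrating the "pass the Locale explicitly" workaround.
class ExplicitLocaleDemo {
    // Formats a number for the given locale without touching the JVM default.
    static String formatFor(Locale locale, double value) {
        NumberFormat format = NumberFormat.getInstance(locale);
        return format.format(value);
    }

    public static void main(String[] args) {
        Locale before = Locale.getDefault();
        System.out.println(formatFor(Locale.GERMANY, 1234.5)); // 1.234,5
        System.out.println(formatFor(Locale.US, 1234.5));      // 1,234.5
        // The default locale is unchanged because setDefault was never called.
        System.out.println(before.equals(Locale.getDefault())); // true
    }
}
```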
There are two additional methods that might interest you. They are
getISO3Language and getISO3Country. When
creating Locales, you always use the two-letter ISO codes,
but if you need ISO's three-letter codes for the same language and
country, you can use these methods to retrieve them.
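As a quick illustration, this sketch prints the two-letter and three-letter codes side by side for a few of the predefined locales. The class and method names are invented for the example. Be aware that both ISO3 methods can throw a MissingResourceException if the locale was built from codes that have no three-letter equivalent.

```java
import java.util.Locale;

// Hypothetical demo class comparing two-letter and three-letter ISO codes.
class Iso3Demo {
    // Returns "<lang2>/<country2> -> <lang3>/<country3>" for the locale.
    static String describe(Locale locale) {
        return locale.getLanguage() + "/" + locale.getCountry()
                + " -> " + locale.getISO3Language() + "/" + locale.getISO3Country();
    }

    public static void main(String[] args) {
        System.out.println(describe(Locale.US));     // en/US -> eng/USA
        System.out.println(describe(Locale.FRANCE)); // fr/FR -> fra/FRA
        System.out.println(describe(Locale.JAPAN));  // ja/JP -> jpn/JPN
    }
}
```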
After declaring the Locale as the core of Java
internationalization, it might sound contradictory to say that this class
doesn't do a lot on its own. A Locale's power comes from the
classes that use it. In a Java application, each locale-sensitive object is
responsible for its own locale-dependent behavior. A Locale
object doesn't enforce this behavior; it simply acts as an indicator to
other objects. Those objects are then responsible for using the
Locale appropriately. By design, locale-sensitive classes are
independent of each other. That is, the set of supported Locales
in one class does not need to be the same as the set in another class.
In practice, however, the current JDK 1.1.6 supports a single,
shared set of locales.
In traditional operating systems and localization models, one locale
setting is active at a time. You programmatically set the locale. Thereafter,
all locale-sensitive functions use the specified locale selection. The
specified locale is active throughout the application as a global locale.
It changes when there is another global locale activation via a
setlocale or similar call. Java technology, however, treats
locales a little differently. A Java application can have multiple
locales active at the same time. That is, it's possible to use a French
date format and a U.S. number format in the same application. Nothing
limits you from creating truly multicultural and multilingual Java
applications.
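To make the "French date format and U.S. number format" case concrete, here is a small sketch in which two formatters with different locales coexist. The class name and the sample date are assumptions for the illustration; the factory methods themselves are the standard java.text ones.

```java
import java.text.DateFormat;
import java.text.NumberFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

// Hypothetical demo class: two locales active in the same program.
class MixedLocalesDemo {
    // Formats a date the way a French reader would expect.
    static String frenchDate(Date date) {
        return DateFormat.getDateInstance(DateFormat.LONG, Locale.FRANCE).format(date);
    }

    // Formats a number the way a U.S. reader would expect.
    static String usNumber(double value) {
        return NumberFormat.getInstance(Locale.US).format(value);
    }

    public static void main(String[] args) {
        Date date = new GregorianCalendar(1998, Calendar.JULY, 14).getTime();
        System.out.println(frenchDate(date));  // a long French date containing "juillet"
        System.out.println(usNumber(1234.5));  // 1,234.5
    }
}
```

Neither call changes any global state, so the two formats never interfere with each other.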
What number does 1,234 represent? Of course, the answer depends on locale. In the U.S., this string of digits represents one thousand two hundred thirty-four. However, in France it represents one and two hundred thirty-four thousandths. Significant difference? Absolutely! Imagine you're a chemical manufacturer that just received an order for 1,234 kilograms of a certain chemical. Your interpretation of this number will definitely affect your sales quotas for the month.
Numbers are represented differently around the globe. When an application shows a number to the user, it must represent that number in a way that is sensitive to the cultural expectations regarding decimal point symbol, group separators, number of digits after the decimal, and leading zeros.
The java.text.NumberFormat class performs locale-specific
formatting for both general-purpose numbers and currency amounts. To instantiate a
NumberFormat object, use the factory method
getInstance, which returns a NumberFormat object
suitable for your default locale. You can, of course, ask for an object with a
specific locale in mind. To specify a locale other than your default, use
getInstance(Locale locale).
If you are curious about what locales are supported, you can use the class
method getAvailableLocales. This method returns an array of
Locales.
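If you want to check support before you rely on it, a sketch like the following could help. The class and its two helper methods are hypothetical; only NumberFormat.getAvailableLocales is the real API. The number of locales you see depends on the JDK you run.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical demo class that inspects the locales NumberFormat supports.
class AvailableLocalesDemo {
    // Number of locales for which NumberFormat has localized data.
    static int count() {
        return NumberFormat.getAvailableLocales().length;
    }

    // True if the target locale appears in the supported list.
    static boolean supports(Locale target) {
        for (Locale candidate : NumberFormat.getAvailableLocales()) {
            if (candidate.equals(target)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("Supported locales: " + count());
        System.out.println("en_US supported: " + supports(Locale.US));
    }
}
```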
Formatting a number couldn't be easier. Call the instance methods
format(long number) or format(double number) to
produce a String object that's suitable for displaying to the
user. Other methods allow you to customize the format by turning various
options on or off.
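Here is a minimal sketch of formatting with a couple of those options switched on and off. NumberFormatDemo and its format helper are names invented for this example; the setters shown (setMinimumFractionDigits, setMaximumFractionDigits, setGroupingUsed) are part of NumberFormat.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical demo class showing a few NumberFormat options.
class NumberFormatDemo {
    // Formats value for the locale with a fixed number of fraction digits
    // and with grouping separators either enabled or disabled.
    static String format(Locale locale, double value, int fractionDigits, boolean grouping) {
        NumberFormat nf = NumberFormat.getInstance(locale);
        nf.setMinimumFractionDigits(fractionDigits);
        nf.setMaximumFractionDigits(fractionDigits);
        nf.setGroupingUsed(grouping);
        return nf.format(value);
    }

    public static void main(String[] args) {
        System.out.println(format(Locale.US, 1234.5, 2, true));      // 1,234.50
        System.out.println(format(Locale.GERMANY, 1234.5, 2, true)); // 1.234,50
        System.out.println(format(Locale.US, 1234.5, 2, false));     // 1234.50
    }
}
```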
Each locale has its own preferences for currency symbols, negative amount
format, leading zeros, group separators, decimal point symbol, and currency
symbol position. Currency and numbers have a lot in common. In fact, they even
use the same basic format class, NumberFormat, to instantiate
new objects.
Although you still use NumberFormat, you call a different factory
method to get a currency format object, getCurrencyInstance. This
method will return a currency format object for the default locale. You can use
this factory method just like you used the number factory method; call
getCurrencyInstance(Locale locale) to request a specific locale.
Again, use the format method to produce a user-visible
String object. The currency formatter will handle all the details
of selecting the correct currency symbol, placing that symbol in the string,
and applying grouping rules. Also, like the number formatter, you can override
several options to customize the format.
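The currency case can be sketched the same way. The class name is hypothetical, and only the U.S. result is shown as exact output, because the spacing around the symbol for other locales depends on the locale data shipped with your JDK.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical demo class for locale-specific currency formatting.
class CurrencyFormatDemo {
    // Formats an amount with the currency conventions of the given locale.
    static String currency(Locale locale, double amount) {
        return NumberFormat.getCurrencyInstance(locale).format(amount);
    }

    public static void main(String[] args) {
        System.out.println(currency(Locale.US, 1234.56));      // $1,234.56
        // German output puts the euro sign after the amount; the exact
        // separator character before it varies with the JDK's locale data.
        System.out.println(currency(Locale.GERMANY, 1234.56));
    }
}
```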
A date helps to uniquely identify a point in time. Like other locale-sensitive structures, dates have many representation details. You must consider long and short date formats as well as date separator symbols. You have to worry about whether the year is displayed before the day and month or after. Again, the Java class libraries accommodate these needs.
The java.text.DateFormat class provides the getDateInstance method that creates a formatter for your default locale. The format method works in the same way as the other format methods covered so far, and applies the specific format rules for your chosen locale.
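The sketch below formats one date in short and long styles for several locales so you can compare the results on your own system. DateFormatDemo is a made-up name and the sample date is arbitrary. No exact output is listed because the patterns differ slightly between JDK releases.

```java
import java.text.DateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

// Hypothetical demo class comparing date styles across locales.
class DateFormatDemo {
    // Formats the date using the given style (for example DateFormat.SHORT) and locale.
    static String format(Locale locale, int style, Date date) {
        return DateFormat.getDateInstance(style, locale).format(date);
    }

    public static void main(String[] args) {
        Date date = new GregorianCalendar(1998, Calendar.JUNE, 15).getTime();
        Locale[] locales = { Locale.US, Locale.FRANCE, Locale.JAPAN };
        for (Locale locale : locales) {
            System.out.println(locale + " short: " + format(locale, DateFormat.SHORT, date));
            System.out.println(locale + " long:  " + format(locale, DateFormat.LONG, date));
        }
    }
}
```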
The java.util.Calendar class is closely related to
Date, and lets you extract year, month, week, and day information
from a Date. Calendar is abstract, so you don't instantiate it directly.
Instead, call Calendar.getInstance to get a calendar object for your
locale. The Gregorian style calendar is the only one provided at this time;
however, you can create your own by subclassing Calendar.
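As a rough sketch of working with a calendar, the following example obtains one through the Calendar.getInstance factory in java.util, reads individual fields, and shows one locale-dependent setting, the first day of the week. CalendarDemo is a hypothetical name.

```java
import java.util.Calendar;
import java.util.Locale;

// Hypothetical demo class for reading calendar fields.
class CalendarDemo {
    // Returns the locale's first day of the week as a Calendar constant.
    static int firstDayOfWeek(Locale locale) {
        return Calendar.getInstance(locale).getFirstDayOfWeek();
    }

    public static void main(String[] args) {
        Calendar cal = Calendar.getInstance(Locale.US);
        cal.clear();                       // drop the current time fields
        cal.set(1998, Calendar.JUNE, 15);  // months are zero-based constants
        System.out.println(cal.get(Calendar.YEAR));         // 1998
        System.out.println(cal.get(Calendar.MONTH));        // 5 (June)
        System.out.println(cal.get(Calendar.DAY_OF_MONTH)); // 15

        System.out.println(firstDayOfWeek(Locale.US) == Calendar.SUNDAY);     // true
        System.out.println(firstDayOfWeek(Locale.FRANCE) == Calendar.MONDAY); // true
    }
}
```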
This internationalization feature of the JDK provides a mechanism for separating user interface (UI) elements and other locale-sensitive data from the application logic in a program. Separating locale-sensitive elements from other code allows easy translation. It allows you to create a single code base for an application even though you may provide 30 different language versions. Although you might be predisposed to think of text only, remember that any localizable element is a resource, including buttons, icons, and menus.
The JDK uses resource bundles to isolate localizable elements from the rest of the application. The resource bundle contains either the resource itself or a reference to it. With all resources separated into a bundle, the Java application simply loads the appropriate bundle for the active locale. If the user switches locales, the application just loads a different bundle.
Resource bundle names have two parts: a base name and a locale suffix. For
example, suppose you create a resource bundle named MyBundle.
Imagine that you have translated MyBundle for two different
locales, ja_JP and fr_FR. The original MyBundle will be your
default bundle, the one used when others cannot be found, or when no other
locale-specific bundles exist. However, in addition to the default bundle,
you'll create two more bundles. In the example these bundles would be named
MyBundle_ja_JP and MyBundle_fr_FR. The
ResourceBundle.getBundle method relies on this naming convention
to search for the bundle used for the active locale.
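To see how the naming convention drives the search, here is a simplified sketch that lists the names a lookup would try, from most to least specific. It is not the JDK's implementation: BundleNameDemo and candidateNames are invented for this illustration, and the real getBundle also tries names based on the default locale before it falls back to the base bundle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class; a simplified model of the bundle-name search.
class BundleNameDemo {
    // Returns candidate bundle names from most specific to least specific.
    static List<String> candidateNames(String baseName, Locale locale) {
        List<String> names = new ArrayList<>();
        String language = locale.getLanguage();
        String country = locale.getCountry();
        String variant = locale.getVariant();
        if (!variant.isEmpty()) {
            names.add(baseName + "_" + language + "_" + country + "_" + variant);
        }
        if (!country.isEmpty()) {
            names.add(baseName + "_" + language + "_" + country);
        }
        if (!language.isEmpty()) {
            names.add(baseName + "_" + language);
        }
        names.add(baseName); // the default bundle is always the last resort
        return names;
    }

    public static void main(String[] args) {
        System.out.println(candidateNames("MyBundle", Locale.JAPAN));
        // [MyBundle_ja_JP, MyBundle_ja, MyBundle]
        System.out.println(candidateNames("MyBundle", Locale.FRENCH));
        // [MyBundle_fr, MyBundle]
    }
}
```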
The java.util.ResourceBundle class is abstract, which means you
must use a subclass of ResourceBundle. The JDK provides two
subclasses: PropertyResourceBundle and
ListResourceBundle. If these don't meet your needs, you can create
your own subclass of ResourceBundle.
PropertyResourceBundle
The PropertyResourceBundle is the most convenient bundle to use.
To use this bundle, create a property file that contains key/value pairs
in the form <key>=<value>. Put each key/value pair on its own
line of the file, so that a new-line character separates one pair from the next. The
following figure shows an example property file for a PropertyResourceBundle.
# MyResource.properties
# <key>=<value>
TEXT_NOT_FOUND=The file could not be found.
TEXT_HELLO=Hello, world!
TEXT_WARNING=There are {0} warnings in the file {1}.
TEXT_INSERT_PAPER=Please insert more paper.
TEXT_DISREGARD=Please disregard the man behind the {0}.
Place these key/value pairs into a file with a .properties extension.
For example, you might name the file MyResource.properties. You'd
load this bundle by calling ResourceBundle.getBundle("MyResource") and retrieve
individual elements with the getString method. getBundle first
searches for a .class file with the bundle's name; if it finds none,
it falls back to the .properties file.
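The following sketch shows one way to read values from a property bundle and fill in the {0} and {1} placeholders with java.text.MessageFormat. To stay self-contained it loads the key/value text from an in-memory string, which is an assumption made for this example; in a real application you would normally call ResourceBundle.getBundle("MyResource") and let it find the file. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.text.MessageFormat;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Hypothetical demo class; the properties text stands in for MyResource.properties.
class PropertyBundleDemo {
    static final String PROPERTIES =
            "TEXT_HELLO=Hello, world!\n"
          + "TEXT_WARNING=There are {0} warnings in the file {1}.\n";

    // Builds a bundle from the in-memory text instead of a file on disk.
    static ResourceBundle load() {
        try {
            return new PropertyResourceBundle(new StringReader(PROPERTIES));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Looks up the warning pattern and substitutes the two arguments.
    static String warning(int count, String fileName) {
        String pattern = load().getString("TEXT_WARNING");
        MessageFormat format = new MessageFormat(pattern, Locale.US);
        return format.format(new Object[] { count, fileName });
    }

    public static void main(String[] args) {
        System.out.println(load().getString("TEXT_HELLO")); // Hello, world!
        System.out.println(warning(3, "report.txt"));
        // There are 3 warnings in the file report.txt.
    }
}
```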
A PropertyResourceBundle is quite easy to create and use. However,
it has one significant limitation. All values are limited to string objects.
In other words, you can only place text strings in a PropertyResourceBundle.
This may not be important to you, but if it is, you must use a different
type of bundle. The ListResourceBundle may be more appropriate if you need
more complex key/value pairs.
ListResourceBundle
The ListResourceBundle is a little more complex than
PropertyResourceBundle, but offers more features. For example,
although a PropertyResourceBundle can only store text, a
ListResourceBundle can contain any type of Java object.
ListResourceBundle is abstract, so you must subclass it to create
a usable class. See the following figure.
Like a PropertyResourceBundle, your ListResourceBundle
contains a list of key/value pairs. However, these pairs are arranged as
elements in a two-dimensional array of java.lang.Object. Your
subclass must provide a single method getContents, as well as an
Object array that lists your key/value pairs.
// MyResource.java
import java.util.ListResourceBundle;

public class MyResource extends ListResourceBundle {
    // Returns the key/value table that the bundle's lookup methods search.
    public Object[][] getContents() {
        return contents;
    }

    // Each row is a { key, value } pair.
    public static Object[][] contents = {
        { "TEXT_NOT_FOUND", "The file could not be found." },
        { "TEXT_HELLO", "Hello, world!" },
        { "TEXT_WARNING", "There are {0} warnings in the file {1}." },
        { "TEXT_INSERT_PAPER", "Please insert more paper." },
        { "TEXT_DISREGARD", "Please disregard the man behind the {0}." },
    };
}
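Because a ListResourceBundle can hold any object, a bundle might mix strings with other types. The sketch below is a hypothetical illustration, not taken from the article: the Settings bundle, its keys, and its values are all invented, and the bundle is instantiated directly rather than through getBundle to keep the example in one file.

```java
import java.util.ListResourceBundle;
import java.util.ResourceBundle;

// Hypothetical demo class showing non-String values in a ListResourceBundle.
class ListBundleDemo {
    // Invented bundle that stores a String, an Integer, and a String array.
    static class Settings extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                { "TEXT_HELLO", "Hello, world!" },
                { "MAX_COPIES", Integer.valueOf(99) },
                { "PAPER_SIZES", new String[] { "Letter", "Legal" } },
            };
        }
    }

    // Fetches a value of any type by key.
    static Object lookup(String key) {
        ResourceBundle bundle = new Settings();
        return bundle.getObject(key);
    }

    public static void main(String[] args) {
        System.out.println(lookup("TEXT_HELLO"));                // Hello, world!
        System.out.println(lookup("MAX_COPIES"));                // 99
        System.out.println(((String[]) lookup("PAPER_SIZES"))[1]); // Legal
    }
}
```

Use getString when you know the value is text, and getObject with a cast for everything else.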
The Java language has simplified the storage, manipulation, and representation of characters by using Unicode to represent text. Unicode is a 16-bit character set, which simply means that it can define 2^16 (65,536) characters. Each character is uniquely identified within the set. When using regional character sets, you often had to store the character-set identifier along with the character or stream of characters so that you could distinguish among the different characters with the same code point across the various sets. Using Unicode, you no longer need to worry about overlapping code points.
Although you may be unfamiliar with Unicode, you needn't worry too much about
how to use it. It is freely available in Java. If you do nothing at all, your
application will use Unicode to represent text. The String class
uses Unicode so you don't need to do anything special to get support in strings.
However, if you have to maintain legacy data in a regional character set, you
can use the numerous character converters that Java technology provides.
Using the character converters, you can convert your Unicode text to a regional character set. You can also convert from a regional character set to Unicode. So, although the Java language uses Unicode, it also allows you to maintain your older data if necessary.
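A round trip between Unicode text and a regional character set might look like this sketch. It uses ISO 8859-1 (Latin-1) as the legacy encoding and the java.nio.charset.StandardCharsets constants from later JDK releases; both choices, like the class name, are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class converting between Unicode strings and legacy bytes.
class CharsetConversionDemo {
    // Encodes Unicode text into ISO 8859-1 bytes (one byte per character).
    static byte[] toLatin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Decodes ISO 8859-1 bytes back into a Unicode string.
    static String fromLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = "\u00C9tats-Unis"; // "États-Unis", written with an escape
        byte[] legacy = toLatin1(text);
        System.out.println(legacy.length);                                 // 10
        System.out.println(text.getBytes(StandardCharsets.UTF_8).length); // 11
        System.out.println(fromLatin1(legacy).equals(text));               // true
    }
}
```

The same text needs a different number of bytes in each encoding, which is why the bytes are meaningless unless you know which character set produced them.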
Layout managers are important in an international application because they compensate for two frustrating problems associated with translated user interfaces:
First, translated text is often shorter or longer than the original text. Layout managers expand and shrink components depending on the length of the text used for labels.
Second, a layout manager relieves the frustration associated with trying to position components as a result of text length differences. If you usually lay out UI components on an X-Y grid, you have no doubt noticed that those positions must change after translations. However, using a layout manager, you position components relative to each other, not necessarily by hard-coded pixel positions. This means you can write your UI code once and run it anywhere.
The Java Development Kit (JDK) 1.1.6 supplies at least five layout managers, and you can pick up quite a few different ones from the Internet. And of course you can create your own. For more information about layout managers, please see Exploring AWT Layout Managers.
The Java class libraries provide many tools to help you create excellent global applications. By supplying international solutions in the base class libraries, Sun helps developers create reliable, stable products. The solutions are used and tested by everyone who uses the product. Developers are not burdened with the task of solving these problems over and over again.
If you commit to using these features now, you'll save yourself lots of headaches later. In general, these JDK features are easy to use. More importantly, learning and using them is easier than retrofitting or fixing applications that don't attempt to address the issues at all. If you're interested in updating an existing application for an international audience, take a look at A Checklist for Internationalizing an Existing Program. The best way to learn about the JDK's international features is to use them.
So start writing some code, experiment, and have fun.
_______
¹ As used on this web site, the terms "Java virtual machine" or "JVM" mean a virtual machine for the Java platform.
John O'Conner teaches software internationalization topics and consults for global development projects. He also enjoys speaking Japanese, playing softball, and spending time with his family.