Norbert Lindenberg

Japanese: Developing Multilingual Web Applications with JavaServer Pages Technology

JavaServer Pages (JSP) technology has become a favorite tool for developers of web applications. With JSP pages, developers can design dynamic web pages without the need for other programming knowledge. At the same time, web developers can use an extensible tag mechanism to harness the power of underlying software components. An extension developed through the Java Community Process provides enhanced support for the development of multilingual applications. The JavaServer Pages Standard Tag Library defines, among other functionality, a set of tags that enable localization and locale-sensitive formatting.

For context, this article starts with a brief introduction to the JavaServer Pages technologies, so you can better understand how to use them to approach internationalization issues. I then discuss several core problems intrinsic to the development of multilingual web applications, and describe how to solve them using JavaServer Pages technologies: locale determination and localization, character encodings, and formatting and parsing.

The JavaServer Pages Technologies

JavaServer Pages (and several related technologies) form the presentation layer of web applications. With JSP pages, the developer can create dynamic web pages that interact with business logic, databases, and other services available on the network.

JavaServer Pages

Pages developed using JSP technology combine HTML, XML, or other static content with XML-like tags that connect to underlying software libraries, which are typically written in the Java programming language. Java technologies that are particularly important in this context are the JavaBeans components architecture (as a general-purpose interface between JSP and Java classes), the Java Database Connectivity (JDBC) APIs for access to SQL databases, and various libraries for XML processing. JSP pages themselves are compiled to Java code in the form of servlets for execution.

Servlets are web server extensions that are compiled and linked into the server, thus enabling faster execution than scripting languages. Servlets directly programmed in the Java programming language and JSP pages are often used together, with servlets acting as controllers and JSP pages as views of the application. JavaServer Pages and the underlying servlet technology provide extensive support for handling HTTP request and response information as well as for session maintenance using cookies or URL rewriting.

An important reason for using JSP technology is that it allows the work of page authors and application developers to be separated. While it is possible to embed Java statements directly into JSP pages, developers have realized that this is best avoided and now prefer custom tags.

JavaServer Pages Standard Tag Library

The JavaServer Pages Standard Tag Library (JSTL) contains a collection of custom actions covering several areas of functionality commonly used in JSP pages. The library builds upon the experience that many of its contributors have gained from developing their own libraries, and provides a standard interface that applications can rely upon, independent of the servers they run on.

Besides the custom tags, JSTL also introduced an expression language, which further reduces the need to use scripting language expressions on JSP pages, and tag library validators to enforce constraints on the use of scripting and tag libraries on JSP pages. An enhanced version of the expression language, and the ability to suppress scripting, have subsequently been integrated into the JSP 2.0 specification, so JSTL is only required for them when using JSP 1.2.

The main areas covered by the custom actions are:

Locale Determination and Localization

When designing a multilingual web application, you must first decide how to determine the user's language and locale preferences, and how to match those preferences against the set of locales that the application and the underlying Java runtime environment support. This section first describes the external environment and requirements web applications have to deal with. Next, we'll take a look at the functionality provided by the underlying Java 2 Standard Edition (J2SE) platform, and finally see how JavaServer Pages Standard Tag Library tags connect the environment and J2SE.

Determining the User's Preferences

A web application has two ways to determine the user's language preferences: First, it can use
language and locale preferences that are transmitted from the browser to the server using the HTTP
request header field

It's worth noting that the

In many cases, web applications are assembled from several components, which may be localized for different language sets. One particularly interesting component is the Java runtime environment, which in some locale-sensitive areas of functionality (such as date formatting) may support over 100 locales in over 40 languages, far more than typical web applications. Thus, the developer of an application has to decide whether to restrict localized functionality to the languages supported throughout the application, or take advantage of the capabilities of each component. The first approach has the advantage that the user sees pages that use the same language throughout, while the second may result in pages that mix different languages -- one language for most of the text, but a different one for, say, formatted dates.

Localization in the Java 2 Standard Edition Platform

To understand how JSTL determines which locales are supported by an application, let's take a look
at how localization is done in the underlying Java 2 Standard Edition platform. Two classes in the

Localization Approaches for JavaServer Pages Applications

To localize JavaServer Pages-based applications, two approaches are commonly used. The first uses internationalized pages that obtain locale-dependent content through custom tags, often from resource bundles. This approach is generally preferred if the pages have complicated structure that needs to be kept in sync between all locales. The second approach uses separate locale-specific pages, and a servlet that dispatches to the appropriate page, depending on the user's preferred locales. This approach may be preferred if the pages contain mainly text, or if the structure should differ significantly between locales.

Locale Determination and Localization in JSTL

JSTL builds on the facilities of J2SE to provide locale determination and localization. The locale determination capabilities can be used with either JSP localization approach (described earlier), while the localization functionality is intended to support internationalized pages.

JSTL supports both ways of determining the user's locale preferences that were described above. An
application can specify a fixed locale (usually one that the user has explicitly selected from the
list of supported languages), using JSTL's

Here are a few code snippets you could use on the start page of a web application. Together, these
code snippets let the user choose his or her locale in a very simple way. The code assumes that it is
part of a page

    <%-- Interpret user's locale choice --%>
    <c:if test="${param['locale'] != null}">
      <fmt:setLocale value="${param['locale']}" scope="session" />
    </c:if>

    <%-- Offer locale choice to user --%>
    <a href="locale-choice.jsp?locale=en-US">USA</a> -
    <a href="locale-choice.jsp?locale=de-DE">Deutschland</a> -
    <a href="locale-choice.jsp?locale=ja-JP">日本</a>

    <%-- Use URL rewriting to ensure proper session tracking --%>
    <form method="get" action="<c:url value='/locale-choice.jsp' />">
      <input type="submit" value="Stay in session">
    </form>

The first section (which must come before any content for the generated HTML page) interprets the
user's locale choice, which reaches the JSP page as a request parameter. If the

The second section (which is part of the content of the generated HTML page) offers the user links
that return to the same page, but with the

The last section shows how to use the

If the locale has been chosen from a web application's own user interface, and then set using
To decide which locales are supported, JSTL looks at the resource bundles that the application
uses. There are two actions that provide access to resource bundles:

The resource bundle lookup used by the

Here are some examples. Let's assume an application has bundles for

Connoisseurs of the

Now, why are there two separate actions for looking up resource bundles? The difference is in the
way they're used: the

One JSTL tag that takes advantage of the localization context is the

    <fmt:setBundle basename="Errors" var="errorBundle" />
    <fmt:bundle basename="Messages">
      <%-- Localization context established by <fmt:bundle> tag --%>
      <fmt:message key="greeting" />
      <p>
      <%-- Localization context established by <fmt:setBundle> tag --%>
      <fmt:message key="emptyField" bundle="${errorBundle}" />
    </fmt:bundle>

Next, why is there a request locale associated with the localization context? This locale is JSTL's
way of restricting the formatting tags to the languages that the application supports, so that the
pages presented to the user use the same language throughout. Formatting actions that are nested
within a

    <jsp:useBean id="now" class="java.util.Date" />
    <fmt:formatDate value="${now}" timeStyle="long" dateStyle="long" />
    <p>
    <fmt:bundle basename="Messages">
      <fmt:formatDate value="${now}" timeStyle="long" dateStyle="long" />
    </fmt:bundle>

If the HTTP

Finally, why does the localization context use the request locale and not the locale of the resource bundle found? The answer is that it avoids the loss of important information that may be needed for some formatting tags. Many applications don't make a distinction between different variants of the same language, and provide, for example, only English resource bundles, hoping that the text can be understood equally well in England, Australia, and Singapore. For date formatting, however, the country is critical -- "2/6/02" means "2 June 2002" to British readers, but "February 6, 2002" to readers used to US conventions. So, in many cases, if the request locale (rather than the resource bundle locale) is used, country information will be preserved.

Character Encoding

Two distinct models for representing text for storage in computers or for transmission over networks are in use today: the old model of character encodings that are specific to small sets of languages, countries, and/or operating systems (which includes, for example, the ISO 8859 series, Windows code pages, and EUC encodings); and the new model of Unicode-based encodings that can (at least theoretically) represent all languages and be used anywhere.

The old model has significant disadvantages:

Current versions of the main software systems involved in creating, distributing, and interpreting web content favor the new model; they typically use Unicode for internal processing, or at least know how to work with UTF-8, the Unicode-based encoding used on the web. Unicode-based encodings have clear advantages: they allow multilingual pages and cleanly separate the issues of locale handling from character encoding. Also, there is little risk of information loss due to encoding conversion, and Unicode-based encodings fit in well with modern server and client systems.

Despite this, many web developers are still reluctant to use UTF-8. Reasons given may include support for old browser versions that don't work well with this encoding, or the lack of tool support for it.

The JavaServer Pages technologies support both models. We'll now take a look at the various areas where character encoding issues come into play, and see how JSP technology and JSTL handle them.

Handling Source Page Encodings

The encoding of JSP source files is often determined by available editing tools, so a country and operating system-specific encoding may be used. There are a number of ways to communicate the character encoding to the JSP runtime environment (the "container"), and the mechanisms and rules have evolved somewhat over time. It also becomes relevant that there are two syntaxes for JSP source files: the standard syntax and a newer XML-based syntax.

The JSP 2.0 specification distinguishes between the two syntaxes when detecting the character
encoding. For files in XML syntax, the encoding is detected as described in the XML specification;
this means that UTF-8 or UTF-16 are the default and any other encoding must be declared in the XML
declaration at the beginning of the file. For files in standard syntax, containers look at two primary
sources of information: First they look in the deployment descriptor of the application for an element
Here are some simple recommendations for applications based on JSP 2.0: For files in XML syntax,
make sure that files that are not encoded in UTF-8 or UTF-16 properly identify their character
encodings. For files in standard syntax, if you use UTF-8 for all your source files, just use a single
element

    <jsp-property-group>
      <url-pattern>/ko/KR/*</url-pattern>
      <page-encoding>EUC-KR</page-encoding>
    </jsp-property-group>

If source files in your application can't be organized this way, add a

The JSP 1.2 specification didn't clearly distinguish between files in standard syntax and files in
XML syntax with regard to source file character encodings. It also didn't provide a way to identify
character encodings in the deployment descriptor. To ensure correct character encoding detection,
applications designed for JSP 1.2 containers should therefore always identify the character encoding
of each source file using the

JSTL defines a

Handling Web Page Encodings

A web application has to select the character encoding to be used for generated web pages, the
"response character encoding", based on the capabilities of the targeted browsers, the writing
system(s) and language(s) of the page content, and possibly the browser's host operating system.
According to the HTTP specification, the character encoding is specified in the

If all targeted browsers support UTF-8, it is generally best to use this encoding, so that multilingual documents are supported and information loss due to character conversion is avoided. If UTF-8 cannot be used, the application has to be careful to match a character encoding with the language(s) being used, including special characters. To avoid mishaps, it may be necessary to enforce the use of the same language throughout the page, as discussed in the "Locale Determination and Localization" section earlier in this article. It may also be necessary to avoid the use of the "€" character.

Web applications can either specify the character encoding of a page explicitly, or let the JSP technologies determine it implicitly from locale information.
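
The caution about the "€" character can be checked directly on the J2SE side. The following is a minimal sketch, not part of JSP or JSTL: the class name `EncodingCheck`, the helper `canEncode`, and the sample string are made up for illustration. It only uses `java.nio.charset.Charset` and its encoder to ask whether a given text survives conversion to a candidate response encoding.

```java
import java.nio.charset.Charset;

public class EncodingCheck {
    /**
     * Returns true if every character of the text can be represented
     * in the named charset without loss.
     * (Hypothetical helper for illustration only.)
     */
    public static boolean canEncode(String charsetName, String text) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // Sample page content containing the euro sign (U+20AC).
        String price = "Preis: 10 \u20AC";

        // Both charsets below are guaranteed to exist on every Java platform.
        for (String name : new String[] {"ISO-8859-1", "UTF-8"}) {
            System.out.println(name + " can encode the text: " + canEncode(name, price));
        }
    }
}
```

With this sample, ISO-8859-1 reports `false` because it has no code point for the euro sign, while UTF-8 reports `true`. An application that cannot use UTF-8 could run a check along these lines against its localized text before settling on a response character encoding.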
Implicit determination of the character encoding is fine, as long as old character encodings are
acceptable, and the page uses the same language throughout and avoids special characters that may not
be supported in commonly used character encodings. To take advantage of UTF-8, however, explicit
specification is required. Since the Servlet 2.4 specification gives explicit specifications
precedence over implicit specifications, setting the character encoding as part of the

Handling Request Parameter Encodings

JSP pages can not only generate web pages, they can also receive and interpret the parameters that come with an HTTP request -- typically input from a form that was part of a previously generated web page. The character encoding used for these parameters is not specified anywhere, but the de facto standard is that browsers use the same encoding as for the page containing the form. This means that a web application needs to keep track of the encoding used for previously generated pages. One commonly used mechanism is to store the name of the encoding in a hidden field in the form itself, extract it as the first parameter from the next request, and then use it to decode the other parameters. However, JSP pages can also use session management to keep track of information between requests. Applications can use the JSTL custom action

Formatting and Parsing

Presenting data such as numbers and dates in localized formats is a common task in any kind of application, as is the interpretation of input that the user has provided. The formats for different languages and cultures vary widely, so this would be a non-trivial task for developers if they couldn't rely on existing libraries.

Fortunately, such libraries do exist. The Java 2 Standard Edition platform (J2SE) provides a set of
classes for formatting and parsing common data types in the

The JavaServer Pages Standard Tag Library provides custom actions that make this functionality directly available to JSP pages.

Locale Determination for Formatting and Parsing Actions

You can use the formatting and parsing actions for numbers and dates with a predefined localization
context (for example, if the tags are nested inside an

Number Formatting and Parsing

The JSTL custom actions for number formatting and parsing,

Of particular interest is their support for currency formatting. Traditionally, many formatting libraries assumed that the currency symbol can be derived from the locale -- for example, if the locale is China, the currency is the RMB. In a world of cross-border transactions, this doesn't make much sense. If a company is British and calculates its prices in pounds, but the web application then displays them in RMB, there are two problems: first, the RMB is worth much less than the pound; second, the RMB may be difficult to convert back to pounds. Since choice of currency is really a business decision, currency must be treated as part of the value, instead of as part of the format. The

    <fmt:formatNumber type="currency" value="${price.value}" currencyCode="${price.currency}" />

If the JSP page specifies a currency code, the underlying

Date and Time Formatting and Parsing

The JSTL custom actions for date and time formatting and parsing --
One interesting issue is that the displayed date and time depend not only on a locale-specific
format, but also on knowledge about a time zone. The server time zone is generally of no interest to
the user, but on the other hand, there's no simple way to find out which time zone the user is in.
Applications may be able to find out about the offset of the user's current time zone from GMT by
using some client-side JavaScript code, or they may let the user specify the current time zone as part
of a user profile. The JSTL actions don't solve this problem, but they provide two custom actions that
can be used to tell the date and time formatting and parsing about the time zone:

Message Formatting

The

For example, if the JSP page contains the following text:

    <jsp:useBean id="now" class="java.util.Date" />
    <fmt:bundle basename="Messages">
      <fmt:message key="greeting">
        <fmt:param value="${now}" />
      </fmt:message>
    </fmt:bundle>

and the resource bundle found is German and provides the value "Willkommen! Heute ist der
{0,date,long}." for the key

Conclusion

As this article has shown, the JavaServer Pages technologies -- in particular the JavaServer Pages Standard Tag Library -- provide you with a solid foundation on which you can build multilingual applications. There are a few design choices that you should carefully consider: how to determine the user's language and locale preference, how to structure your JSP pages for localization, whether to enforce single-language pages or use existing locale support to the fullest, and which character encoding model to use. The JSP technologies enable you to implement any of these choices, so you can reach your worldwide audience in the most appropriate manner -- and, most importantly, in their own language.

References

R. Fielding et al.: Hypertext Transfer Protocol -- HTTP/1.1. RFC 2616. The Internet Society, 1999.
Dave Raggett et al. (ed.): HTML 4.01 Specification. World Wide Web Consortium, 1999.
Tim Bray et al. (ed.): Extensible Markup Language (XML) 1.0 (Second Edition). World Wide Web Consortium, 2000.
Java 2 Platform, Standard Edition, v 1.4.2 API Specification. Sun Microsystems, 2002.
Danny Coward (ed.): Java Servlet Specification. Version 2.3. Sun Microsystems, 2001.
Danny Coward, Yutaka Yoshida (ed.): Java Servlet Specification. Version 2.4. Sun Microsystems, 2003.
Eduardo Pelegrí-Llopart (ed.): JavaServer Pages Specification. Version 1.2. Sun Microsystems, 2001.
Mark Roth, Eduardo Pelegrí-Llopart (ed.): JavaServer Pages Specification. Version 2.0. Sun Microsystems, 2003.
Pierre Delisle (ed.): JavaServer Pages Standard Tag Library. Version 1.0. Sun Microsystems, 2002.
Pierre Delisle (ed.): JavaServer Pages Standard Tag Library. Version 1.1. Sun Microsystems, 2003.