Designing Enterprise Applications with the J2EE™ Platform, Second Edition
Because data in an enterprise information system can vary by locale, localization issues can reach all the way into the EIS tier. This section discusses some issues regarding persistent data and schema in databases.
A J2EE application requires internationalization support in its persistence layer as well as in component source code. Persistence layer design should always address internationalization concerns.
Both container-managed persistence and bean-managed persistence require that an application's JDBC driver and back-end data store support all character sets and all encodings used to represent persistent data. UTF-8 encoding is advised because it is widely supported by JDBC drivers and databases, and supports many character sets.
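As a rough illustration of the UTF-8 recommendation, the following sketch decodes text received in some other encoding and re-encodes it as UTF-8 before it would be handed to the persistence layer. The class and method names (`Utf8Normalizer`, `toUtf8`) are hypothetical, and the sketch assumes the source encoding of the received bytes is known.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any J2EE API.
final class Utf8Normalizer {

    /** Decodes bytes from a known source encoding and re-encodes them as UTF-8. */
    static byte[] toUtf8(byte[] received, Charset sourceEncoding) {
        // Decoding recovers the character sequence, which is the string's value.
        String value = new String(received, sourceEncoding);
        // Encoding uniformly as UTF-8 gives every stored string the same representation.
        return value.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "cafe" with an acute accent on the e
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = toUtf8(latin1, StandardCharsets.ISO_8859_1);
        // Different byte sequences, same string value.
        System.out.println(latin1.length + " bytes -> " + utf8.length + " bytes");
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(text));
    }
}
```

The byte counts differ because the accented character occupies one byte in ISO-8859-1 and two in UTF-8, while the decoded string is unchanged.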
10.4.1.0.1 Value Conversion, Value Representation, and Information Loss
Uniform value representations in a database simplify database access and application code, but improper localization can cause subtle flaws in application logic. Value representations in a database should be as independent of locale as possible, if the conversion from the original representation can be performed without information loss. Where such a conversion cannot be performed, a data value should include a locale and unit designator. The key distinction to make is between the data value, which usually should not be modified, and the way that the value is represented, which usually should be uniform for all database records.
The following examples illustrate the difference between a data value and the way the value is represented:
- Fixed decimal numbers--English-speaking countries often format decimal numbers as 1,234.56, whereas people in many other countries format the same number as 1.234,56. Rather than maintain the original punctuation, a database attribute for such a value should be a coded decimal type that can later be presented in any format or encoding. Where there is a business reason to do so, a locale should be stored along with the data value.
- Strings--The sequence of characters in a string, not the string's encoding, determines the string's value: For example, any number of different byte sequences can represent the string abc. Saving strings in a database in a variety of encodings, even if the encoding is stored with the value, can complicate processing the strings. The recommended approach for persisting strings received in multiple encodings is to use a universal encoding such as UTF-8 as the database attribute type, and convert from the received encoding to the database encoding before storing the value. The string can later be converted to other encodings for display.
  Where there is a business reason to do so, store a string along with its original encoding and/or locale, so that the original string can be recovered by encoding conversion. For example, a multilingual Customer Relationship Management (CRM) application might use a stored locale to route each customer request to a service representative who speaks that customer's language. The application could use the stored encoding to encode the response to the customer.
- Currency--It is impossible to overemphasize the importance of properly handling currency values. Your organization's business rules, not the user's locale, determine the values of quantities such as prices in a catalog. If your application quotes a price in Yen to a Japanese customer, for example, the application should persist the value in Yen, not a value converted to U.S. dollars. (If business rules mandate conversion to dollars at the time of the quote, then the value should be displayed in dollars to avoid misleading the customer.) The application must always record currency values denominated in the currency mandated by business rules. When currency is converted, an audit trail often also requires storing the conversion rate and the value and denomination before conversion. An application's handling of currency values should always be checked by someone who understands the business's accounting rules. Extensive testing with audits can also uncover currency conversion errors. The J2SE platform version 1.4 class java.util.Currency represents ISO 4217 currency codes, and can be used for currency formatting; see the J2SE javadoc documentation for details.
- Physical properties and dimensions--Some value conversions for physical properties can cause information loss; others do not. For example, the conversion formula from degrees Fahrenheit to degrees Celsius, $C = \frac{5}{9}(F - 32)$, can introduce rounding errors that may or may not be significant for your application. Whether to store the original value with a dimension or to store a converted value with an implied dimension depends on the application's precision requirements.
- Time and date--Some global distributed applications standardize on Coordinated Universal Time (UTC) for all representations of dates and times, plus (optionally) an indication of time zone. Because UTC can be determined from the local date and time for any geographic point, given that point's time zone, no data is lost in the conversion when the time zone is stored alongside the value. As with currency, this choice depends on the organization's business rules.
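The distinction between a value and its representation in the decimal and currency items above can be sketched as follows. This is a minimal illustration, not the sample application's code: the names `LocaleNeutralValues`, `parseDecimal`, and `Money` are hypothetical, and the sketch assumes that `NumberFormat.getNumberInstance` returns a `DecimalFormat` for the locales used, which is typical but not guaranteed.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Currency;
import java.util.Locale;

// Hypothetical helper class for illustration only.
final class LocaleNeutralValues {

    /** Parses locale-formatted input into a locale-independent decimal value. */
    static BigDecimal parseDecimal(String text, Locale locale) {
        NumberFormat nf = NumberFormat.getNumberInstance(locale);
        if (!(nf instanceof DecimalFormat)) {
            throw new IllegalStateException("No DecimalFormat available for " + locale);
        }
        DecimalFormat format = (DecimalFormat) nf;
        format.setParseBigDecimal(true); // avoid binary floating-point rounding
        try {
            return (BigDecimal) format.parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number in " + locale + ": " + text, e);
        }
    }

    /** An amount kept in the denomination in which it was quoted. */
    static final class Money {
        final BigDecimal amount;
        final Currency currency;

        Money(BigDecimal amount, String isoCode) {
            this.amount = amount;
            this.currency = Currency.getInstance(isoCode); // rejects unknown ISO 4217 codes
        }

        /** Locale-neutral storage form: plain decimal plus ISO 4217 code. */
        String toStorageString() {
            return amount.toPlainString() + " " + currency.getCurrencyCode();
        }
    }

    public static void main(String[] args) {
        BigDecimal us = parseDecimal("1,234.56", Locale.US);
        BigDecimal de = parseDecimal("1.234,56", Locale.GERMANY);
        System.out.println(us.compareTo(de) == 0); // same value, different punctuation

        Money quote = new Money(new BigDecimal("12800"), "JPY"); // quoted in yen, stored in yen
        System.out.println(quote.toStorageString());
    }
}
```

Presentation formatting for a particular user would happen only at display time, for example with a `NumberFormat` for that user's locale, leaving the stored value untouched.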
There are many more situations where data value and data representation may vary by locale. Uniform value representation in a database simplifies application coding, but should never cause information loss.
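The difference between a lossy and a lossless conversion can be made concrete with a small sketch. It assumes a Java 8 or later runtime for the `java.time` classes, which postdate the J2SE 1.4 platform discussed here, and it assumes temperatures are stored to one decimal place; the class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

// Hypothetical illustration of lossy versus lossless conversions.
final class ConversionLoss {
    private static final BigDecimal THIRTY_TWO = BigDecimal.valueOf(32);
    private static final BigDecimal FIVE = BigDecimal.valueOf(5);
    private static final BigDecimal NINE = BigDecimal.valueOf(9);

    /** C = 5/9 (F - 32), rounded to one decimal place (an assumed storage precision). */
    static BigDecimal toCelsius(BigDecimal fahrenheit) {
        return fahrenheit.subtract(THIRTY_TWO).multiply(FIVE)
                .divide(NINE, 1, RoundingMode.HALF_UP);
    }

    /** F = 9/5 C + 32, rounded to one decimal place. */
    static BigDecimal toFahrenheit(BigDecimal celsius) {
        return celsius.multiply(NINE).divide(FIVE, 1, RoundingMode.HALF_UP).add(THIRTY_TWO);
    }

    /** Converts a local date and time in a known zone to a UTC instant. */
    static Instant toUtc(LocalDateTime local, ZoneId zone) {
        return local.atZone(zone).toInstant();
    }

    /** Recovers the local date and time from a UTC instant and the stored zone. */
    static LocalDateTime toLocal(Instant utc, ZoneId zone) {
        return LocalDateTime.ofInstant(utc, zone);
    }

    public static void main(String[] args) {
        // Lossy: 98.7 F -> 37.1 C -> 98.8 F at one decimal place.
        BigDecimal original = new BigDecimal("98.7");
        System.out.println(toFahrenheit(toCelsius(original)));

        // Lossless: the stored zone lets the original local time be recovered.
        // Asia/Tokyo is used because it has no daylight-saving transitions.
        ZoneId tokyo = ZoneId.of("Asia/Tokyo");
        LocalDateTime local = LocalDateTime.of(2002, 3, 15, 9, 30);
        Instant stored = toUtc(local, tokyo);
        System.out.println(stored);
        System.out.println(toLocal(stored, tokyo).equals(local));
    }
}
```

In the temperature case the original reading cannot be recovered from the stored value, which is the situation in which storing the original value with its unit is the safer choice.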
The effect of internationalization on an application's data model is one of the more important reasons to consider internationalization in an application's design phase. Many internationalized data sets cannot be represented reasonably as resource bundles or as static JSP pages because the data set is too large, the data change too fast, or both. Such data sets are usually stored in and accessed from databases.
Data model entities often include locale-dependent attributes such as descriptive text, images, or resource references. In an internationalized application, an entity has a one-to-many relationship with these items. For example, each item in a non-internationalized catalog has a single descriptive text string, whereas an internationalized catalog item requires a descriptive text string for each supported locale.
Consider the example of internationalizing the description of a catalog item. Three alternative ways to model an internationalized attribute appear in Figure 10.4.
Figure 10.4 Internationalized Attribute Modeling Alternatives
One way to internationalize an attribute is to add a new attribute to the entity for every supported locale. The leftmost example in Figure 10.4 shows an item table with a description column for each locale. But that approach would require both code changes and the addition of a column to every internationalized table each time a new locale is added.
Another approach is to place the attribute in a separate entity for each locale. The middle example in Figure 10.4 shows an item table that has no descriptive text but joins to a separate catalog description table for each locale. But this approach still requires schema and code modifications to add a locale to the application.
The third option (recommended) is to include locale in the data model, making it part of the identity of the entity representing the localized resource. The rightmost example in Figure 10.4 shows an Item table that joins with an ItemDetails table. The primary key of the ItemDetails table includes both the ID of the item being described and the locale for the description and other resources. The application code for this approach contains no hard-coded locale information, so adding a new locale is as simple as adding localized data to the table.
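The idea of making locale part of the identity can be sketched in memory, without a database. In this hypothetical model the map key plays the role of the ItemDetails primary key; the class names, the item ID, and the description strings are illustrative assumptions and are not taken from the sample application.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Objects;

// Hypothetical in-memory stand-in for an ItemDetails table.
final class ItemDetailsModel {

    /** Composite identity: item ID plus locale, mirroring the ItemDetails primary key. */
    static final class DetailsKey {
        final String itemId;
        final Locale locale;

        DetailsKey(String itemId, Locale locale) {
            this.itemId = Objects.requireNonNull(itemId);
            this.locale = Objects.requireNonNull(locale);
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof DetailsKey)) {
                return false;
            }
            DetailsKey that = (DetailsKey) other;
            return itemId.equals(that.itemId) && locale.equals(that.locale);
        }

        @Override
        public int hashCode() {
            return Objects.hash(itemId, locale);
        }
    }

    private final Map<DetailsKey, String> descriptions = new HashMap<>();

    void addDescription(String itemId, Locale locale, String text) {
        descriptions.put(new DetailsKey(itemId, locale), text);
    }

    /** Returns the description for the item in the given locale, or null if none is stored. */
    String description(String itemId, Locale locale) {
        return descriptions.get(new DetailsKey(itemId, locale));
    }

    public static void main(String[] args) {
        ItemDetailsModel details = new ItemDetailsModel();
        details.addDescription("ITEM-1", Locale.US, "Large angelfish");
        details.addDescription("ITEM-1", Locale.FRANCE, "Grand poisson-ange");

        // Supporting another locale later is only a data change, not a code change.
        details.addDescription("ITEM-1", Locale.GERMANY, "Grosser Kaiserfisch");

        System.out.println(details.description("ITEM-1", Locale.FRANCE));
    }
}
```

Nothing in the lookup code names a specific locale, which is the property that lets new locales be added as data.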
The sample application models internationalized data in exactly this way. Figure 10.5 shows a part of the sample application's data model. It contains a hierarchical categorization of items by product, and products by category. The category, product, and item tables each have an associated details table that contains locale-specific data. Application code retrieves localized resources from these details tables, looking up descriptive text, images, names, and so on, by both locale and ID.
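A lookup by both locale and ID might look like the following JDBC sketch. The table name `item_details`, the column names `item_id`, `locale`, and `description`, and the storage of the locale as a string such as `ja_JP` are all assumptions made for illustration; the sample application's actual schema and data access code may differ.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Locale;

// Hypothetical data access helper; table and column names are assumed.
final class ItemDetailsDao {

    // Both parts of the details table's primary key appear in the WHERE clause.
    static final String FIND_DESCRIPTION =
            "SELECT description FROM item_details WHERE item_id = ? AND locale = ?";

    /** Assumed storage form of a locale, for example "ja_JP" or "en_US". */
    static String localeKey(Locale locale) {
        return locale.toString();
    }

    /** Returns the localized description, or null if no row exists for the locale. */
    static String findDescription(Connection connection, String itemId, Locale locale)
            throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(FIND_DESCRIPTION)) {
            statement.setString(1, itemId);
            statement.setString(2, localeKey(locale));
            try (ResultSet rows = statement.executeQuery()) {
                return rows.next() ? rows.getString(1) : null;
            }
        }
    }

    public static void main(String[] args) {
        // No database is opened here; this only shows the query and the locale key.
        System.out.println(FIND_DESCRIPTION);
        System.out.println(localeKey(Locale.JAPAN));
    }
}
```

Because the locale is a query parameter rather than part of a table or column name, the same statement serves every supported locale.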
Note that the details tables contain all localized data. The primary key (that is, the identity) of every details table is the combination of a locale and an ID. A properly designed schema will support future language additions with no changes to either code or database schema. Adding a new locale in this design is as simple as adding new localized data to the details tables. An internationalized database schema requires more up-front design work, but provides a great deal of flexibility for supporting localized content later on.