Tech Tips Archive
January 9, 2001
WELCOME to the Java Developer Connection(sm) (JDC) Tech Tips, January 9, 2001. This issue covers using the java.lang.Character class and handling uncaught exceptions.
These tips were developed using Java 2 SDK, Standard Edition, v 1.3.
This issue of the JDC Tech Tips is written by Glen McCluskey.
USING THE JAVA.LANG.CHARACTER CLASS
java.lang.Character is a wrapper class for the primitive type
char. Like other wrappers such as Integer, it is used to
represent primitive values in object form, so that collection
classes that only know about Object references can manipulate
char values. The Character class is also used to group together
methods and constants used in handling Unicode characters.
This tip looks at some of the ways you can use Character.
The first example shows how the class is used as a wrapper:
import java.util.*;
public class CharDemo1 {
    public static void main(String args[]) {
        // wrap each char in a Character so the list can hold it
        List list = new ArrayList();
        list.add(new Character('a'));
        list.add(new Character('b'));
        list.add(new Character('c'));
        for (int i = 0; i < list.size(); i++) {
            System.out.println(list.get(i));
        }
    }
}
In this example, three Character objects representing the letters
a, b, and c are added to an ArrayList. Then the contents of the
list are displayed.
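The round trip from char to Character and back can be sketched as follows. This is a minimal illustration, not part of the original tip: the class name CharUnwrapSketch and the method roundTrip are hypothetical, and the generic List<Character> and Character.valueOf come from releases later than the v 1.3 SDK used elsewhere in this issue.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: wrap chars for storage in a list, then unwrap them.
class CharUnwrapSketch {
    static String roundTrip(String input) {
        List<Character> list = new ArrayList<>();
        for (int i = 0; i < input.length(); i++) {
            // wrap the primitive char in a Character object
            list.add(Character.valueOf(input.charAt(i)));
        }
        StringBuilder sb = new StringBuilder();
        for (Character wrapped : list) {
            // charValue() recovers the primitive char from the wrapper
            sb.append(wrapped.charValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("abc")); // prints "abc"
    }
}
```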
The Character class contains a lot of "isX" methods, such as
"isDigit". You might think that these methods aren't really
necessary, because it's simpler to say something like:
if (c >= '0' && c <= '9')
...
if you want to test whether a character is a digit. This code
actually works in some contexts, but it has a big problem. It
doesn't account for the fact that Java uses the Unicode
character set rather than the ASCII character set. For example,
if you run this program:
public class CharDemo2 {
    public static void main(String args[]) {
        int dig_count = 0;
        int def_count = 0;
        // scan every 16-bit char value
        for (int i = 0; i <= 0xffff; i++) {
            if (Character.isDigit((char)i)) {
                dig_count++;
            }
            if (Character.isDefined((char)i)) {
                def_count++;
            }
        }
        System.out.println("number of digits = " + dig_count);
        System.out.println("number of defined = " + def_count);
    }
}
it reports that the Unicode character set contains 159 characters
that are classified as digits.
This example also illustrates another interesting point: not all
possible Unicode character values have meaning. The program
reports that Character.isDefined returns true for 47400 of 65536
characters.
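To make the difference concrete, the sketch below compares the ASCII range test with Character.isDigit for one non-ASCII digit. The class name DigitCheckSketch and the helper isAsciiDigit are hypothetical names chosen for this illustration, as is the sample character \u0663 (ARABIC-INDIC DIGIT THREE).

```java
// Hypothetical sketch: an ASCII-only range test misses Unicode digits.
class DigitCheckSketch {
    // the hand-written test from the text above
    static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        char arabicIndicThree = '\u0663';
        System.out.println(isAsciiDigit('7'));                     // true
        System.out.println(isAsciiDigit(arabicIndicThree));        // false
        System.out.println(Character.isDigit(arabicIndicThree));   // true
        System.out.println(Character.digit(arabicIndicThree, 10)); // 3
    }
}
```

Both tests agree on '7', but only Character.isDigit recognizes the Arabic-Indic digit.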
Another place where the Character class is useful is in
converting from upper case characters to lower case characters.
Here's an example:
public class CharDemo3 {
    public static void main(String args[]) {
        char cupper = 'A';
        char clower;
        // convert to lower case using the ASCII convention
        clower = (char)(cupper + 0x20);
        System.out.println("cupper #1 = " + cupper);
        System.out.println("clower #1 = " + clower);
        System.out.println();
        // convert to lower case using Character.toLowerCase()
        clower = Character.toLowerCase(cupper);
        System.out.println("cupper #2 = " + cupper);
        System.out.println("clower #2 = " + clower);
    }
}
If you've used the ASCII character set, it's common to convert to
lower case by adding 0x20 (decimal 32) to an upper case letter.
This approach works in the demo program, but again fails to take
into account the Unicode character set. The key obstacle is that
in Unicode, upper and lower case equivalents aren't guaranteed to
be exactly 0x20 values apart. So in this situation, it's
preferable to use the toLowerCase method of the Character class.
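As an illustration of a case pair that is not 0x20 apart, the sketch below uses \u0100 (LATIN CAPITAL LETTER A WITH MACRON), whose lower case form is the very next code, \u0101. The class name LowerCaseSketch and the helper asciiLower are hypothetical, and the sample character is an assumption chosen for the demonstration.

```java
// Hypothetical sketch: adding 0x20 gives the wrong answer outside ASCII.
class LowerCaseSketch {
    // the ASCII convention described above
    static char asciiLower(char c) {
        return (char)(c + 0x20);
    }

    public static void main(String[] args) {
        char upper = '\u0100';
        // print hex codes so the result does not depend on console encoding
        System.out.println(Integer.toHexString(asciiLower(upper)));            // 120
        System.out.println(Integer.toHexString(Character.toLowerCase(upper))); // 101
    }
}
```

The value produced by adding 0x20, \u0120, is a different upper case letter altogether, while Character.toLowerCase returns the correct \u0101.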
The Character class also contains several methods for converting
between character and integer values. These are used, for example,
in Integer.parseInt, to convert number strings in a specified base
into integers, as is the case in the following statement:
Integer.parseInt("-ff", 16) == -255
Here's a program that illustrates these methods:
public class CharDemo4 {
    public static void main(String args[]) {
        // return the numeric value of 'z' considered as
        // a digit in base 36
        int dig = Character.digit('z', 36);
        System.out.println(dig);
        // return the character value for the
        // specified digit in base 36
        char cdig = Character.forDigit(dig, 36);
        System.out.println(cdig);
        // return the numeric value of \u217c
        int rn50 = Character.getNumericValue('\u217c');
        System.out.println(rn50);
    }
}
Character.digit returns the numeric value of a character
considered as a digit in a given radix. So, for example, in base
36, digits have the values 0-9 and a-z, and thus 'z' has the
value 35.
Character.forDigit reverses the process; the appropriate digit
as a character for the value 35 in base 36 is 'z'.
Character.getNumericValue returns the numeric value of
a character, using the value specified in an internal table
called the Unicode Attribute Table. For example, the Unicode value
\u217c is the small Roman numeral fifty (it looks like a lower case
"l"), which has a value of 50.
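To show how a parser in the style of Integer.parseInt might build on Character.digit, here is a minimal sketch. It is not the actual library implementation: the class name RadixParseSketch and the method parse are hypothetical, and the sketch deliberately omits the overflow checks and the leading '+' handling a real parser would need.

```java
// Hypothetical sketch of a radix parser built on Character.digit.
// Assumption: no overflow checking, unlike a production parser.
class RadixParseSketch {
    static int parse(String s, int radix) {
        if (s == null || s.isEmpty()) {
            throw new NumberFormatException("empty input");
        }
        int i = 0;
        boolean negative = false;
        if (s.charAt(0) == '-') {
            negative = true;
            i = 1;
        }
        if (i == s.length()) {
            throw new NumberFormatException(s); // a lone "-"
        }
        int result = 0;
        for (; i < s.length(); i++) {
            // Character.digit returns -1 if the char is not a digit in this radix
            int d = Character.digit(s.charAt(i), radix);
            if (d < 0) {
                throw new NumberFormatException(s);
            }
            result = result * radix + d;
        }
        return negative ? -result : result;
    }

    public static void main(String[] args) {
        System.out.println(parse("-ff", 16)); // -255
        System.out.println(parse("z", 36));   // 35
    }
}
```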
The Unicode Attribute Table is also used to specify the type of
a Unicode character. Types are categories such as punctuation,
currency symbols, letters, and so on. Here's a simple program that
displays the hexadecimal values of all the characters classified
as currency symbols:
public class CharDemo5 {
    public static void main(String args[]) {
        for (int i = 0; i <= 0xffff; i++) {
            // compare the character's general category
            // against the currency symbol category
            if (Character.getType((char)i) ==
                Character.CURRENCY_SYMBOL) {
                System.out.println(Integer.toHexString(i));
            }
        }
    }
}
There are 27 such symbols. The first one listed, 0x24, corresponds
to the familiar '$' character.
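The same getType call can classify individual characters. The sketch below maps a handful of the category constants to readable names; the class name CharTypeSketch, the method describe, and the choice of categories are all hypothetical and cover only a small part of what Character defines.

```java
// Hypothetical sketch: name a few of the categories returned by getType.
class CharTypeSketch {
    static String describe(char c) {
        switch (Character.getType(c)) {
            case Character.CURRENCY_SYMBOL:
                return "currency symbol";
            case Character.UPPERCASE_LETTER:
                return "upper case letter";
            case Character.LOWERCASE_LETTER:
                return "lower case letter";
            case Character.DECIMAL_DIGIT_NUMBER:
                return "decimal digit";
            case Character.SPACE_SEPARATOR:
                return "space separator";
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        char[] samples = { '$', '\u20ac', 'A', 'a', '5', ' ' };
        for (int i = 0; i < samples.length; i++) {
            System.out.println(Integer.toHexString(samples[i])
                + ": " + describe(samples[i]));
        }
    }
}
```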
A final example of how you can use the Character class has to do
with Unicode character blocks. These blocks are used to group
related characters. Some examples are BASIC_LATIN, ARABIC,
GEORGIAN, ARROWS, and KANBUN. Here's a demo program that prints
all character values in the GREEK character block:
public class CharDemo6 {
    public static void main(String args[]) {
        for (int i = 0; i <= 0xffff; i++) {
            // UnicodeBlock.of returns the block containing
            // the character
            if (Character.UnicodeBlock.of((char)i) ==
                Character.UnicodeBlock.GREEK) {
                System.out.println(Integer.toHexString(i));
            }
        }
    }
}
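Going the other way, from a character to its block, can be sketched as shown here. The class name BlockLookupSketch and the method blockOf are hypothetical, and the four sample characters are assumptions picked to land in blocks named in the text above.

```java
// Hypothetical sketch: look up the Unicode block of individual characters.
class BlockLookupSketch {
    static Character.UnicodeBlock blockOf(char c) {
        return Character.UnicodeBlock.of(c);
    }

    public static void main(String[] args) {
        // Latin 'a', Greek alpha, rightwards arrow, Georgian letter an
        char[] samples = { 'a', '\u03b1', '\u2192', '\u10d0' };
        for (int i = 0; i < samples.length; i++) {
            System.out.println(Integer.toHexString(samples[i])
                + " is in block " + blockOf(samples[i]));
        }
    }
}
```

Because each block is represented by a single constant object, comparing blocks with == (as CharDemo6 does) is sufficient.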
To learn more about java.lang.Character, see section 11.1.3
Character, and Table 7 Unicode Character Blocks in Appendix B
Useful Tables, in "The Java Programming Language Third Edition"
by Arnold, Gosling, and Holmes
(http://java.sun.com/docs/books/javaprog/thirdedition/).
HANDLING UNCAUGHT EXCEPTIONS
If you've done much programming in the Java programming
language, you've probably encountered applications that terminate
abnormally with an uncaught exception. Here's a program that does
just that:
public class ExcDemo1 {
    public static void main(String args[]) {
        int vec[] = new int[10];
        vec[10] = 37;
    }
}
In this example, the program terminates abnormally due to an
uncaught exception. The program throws an
ArrayIndexOutOfBoundsException because of an illegal array access
to vec[10] (the valid indexes for vec are 0 through 9).
Before examining some techniques for handling uncaught exceptions,
let's look at the rules for how the Java Virtual Machine*
terminates a program. The first rule is that an uncaught
exception terminates the thread in which it occurs. The second
rule is that a program terminates when there are no more user
threads available. Here's an example:
class MyThread extends Thread {
    public void run() {
        try {
            Thread.sleep(5 * 1000);
        }
        catch (InterruptedException e) {
            System.err.println(e);
        }
        System.out.println("MyThread thread still alive");
    }
}
public class ExcDemo2 {
    public static void main(String args[]) {
        new MyThread().start();
        int vec[] = new int[10];
        vec[10] = 37;
    }
}
The main thread shuts down almost immediately, due to an unhandled
exception. But there's an instance of MyThread that remains active
for approximately five seconds, and completes normally.
So how do you handle uncaught exceptions? The first approach is
very simple -- you put a try...catch block around the top-level
method that invokes the application:
public class ExcDemo3 {
    static void app() {
        int vec[] = new int[10];
        vec[10] = 37;
    }
    public static void main(String args[]) {
        try {
            app();
        }
        catch (Exception e) {
            System.err.println("uncaught exception: " + e);
        }
    }
}
All the exceptions that an application typically tries to catch
are subclasses of java.lang.Exception, and so are caught by this
technique. This excludes exceptions like OutOfMemoryError,
which are descendants of java.lang.Error. If you really want to
catch everything (not necessarily a good idea), you need to use
a "catch (Throwable e)" clause.
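The difference between the two catch clauses can be sketched as follows. The class name CatchScopeSketch and the method run are hypothetical, the lambda syntax comes from releases later than the v 1.3 SDK, and the StackOverflowError is constructed by hand purely to simulate an Error without exhausting the stack.

```java
// Hypothetical sketch: catch (Exception e) does not see subclasses of Error.
class CatchScopeSketch {
    static String run(Runnable task) {
        try {
            task.run();
            return "completed";
        } catch (Exception e) {
            // reached for java.lang.Exception and its subclasses
            return "caught as Exception: " + e.getClass().getName();
        } catch (Throwable t) {
            // reached for everything else, including java.lang.Error
            return "caught as Throwable: " + t.getClass().getName();
        }
    }

    public static void main(String[] args) {
        // prints "caught as Exception: java.lang.ArrayIndexOutOfBoundsException"
        System.out.println(run(() -> {
            int[] vec = new int[10];
            vec[10] = 37;
        }));
        // prints "caught as Throwable: java.lang.StackOverflowError"
        System.out.println(run(() -> {
            throw new StackOverflowError("simulated");
        }));
    }
}
```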
What if you want to extend this technique to multiple threads?
An obvious approach is to say:
class MyThread extends Thread {
    public void run() {
        int vec[] = new int[10];
        vec[10] = 37;
    }
}
public class ExcDemo4 {
    public static void main(String args[]) {
        try {
            new MyThread().start();
        }
        catch (Exception e) {
            System.err.println("uncaught exception: " + e);
        }
    }
}
Unfortunately, this approach doesn't actually work. The try block
catches only exceptions thrown by the start method itself, such as
IllegalThreadStateException, which is thrown when the thread has
already been started. An exception thrown inside run occurs in the
new thread, not in the thread that called start.
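For reference, the one exception that the try block around start can plausibly catch is easy to provoke. In this sketch the class name RestartSketch and the method secondStartFails are hypothetical, and the lambda syntax comes from releases later than the v 1.3 SDK.

```java
// Hypothetical sketch: starting the same Thread object twice.
class RestartSketch {
    static boolean secondStartFails() {
        Thread t = new Thread(() -> { /* nothing to do */ });
        t.start();
        try {
            t.join(); // wait for the first run to finish
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        try {
            t.start(); // a Thread object can be started only once
            return false;
        } catch (IllegalThreadStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondStartFails()); // true
    }
}
```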
So it's necessary to get a little more sophisticated, and
override the uncaughtException method in the ThreadGroup class.
A ThreadGroup object represents a group of threads. There is
a method in ThreadGroup that is called when a thread within the
group is about to die because of an uncaught exception.
Here's what the code looks like:
class MyThreadGroup extends ThreadGroup {
    public MyThreadGroup(String s) {
        super(s);
    }
    public void uncaughtException(Thread t, Throwable e) {
        System.err.println("uncaught exception: " + e);
        //super.uncaughtException(t, e);
    }
}
class MyThread extends Thread {
    public MyThread(ThreadGroup tg, String n) {
        super(tg, n);
    }
    public void run() {
        int vec[] = new int[10];
        vec[10] = 37;
    }
}
public class ExcDemo5 {
    public static void main(String args[]) {
        ThreadGroup tg = new MyThreadGroup("mygroup");
        Thread t = new MyThread(tg, "mythread");
        t.start();
    }
}
The code example creates a subclass of ThreadGroup, and
overrides the uncaughtException method. This overridden method is
called for a dying thread; the thread object and exception are
passed as parameters to the method.
By default, uncaughtException invokes the uncaughtException method
on the thread group's parent group object. If there is no parent
group, the exception's printStackTrace method is called to display
a stack trace. You can see what the default behavior looks like by
commenting out the "System.err.println" line and uncommenting the
"super.uncaughtException(t, e)" line.
Further reading: sections 10.12 Thread and Exceptions, and
18.3 Shutdown in "The Java Programming Language Third Edition"
by Arnold, Gosling, and Holmes
(http://java.sun.com/docs/books/javaprog/thirdedition/).
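The delegation from a thread group to its parent, described in this tip, can be observed with a nested group. This is a minimal sketch under stated assumptions: the class names RecordingGroup and GroupDelegationSketch and the method runInChildGroup are hypothetical, the child group is a plain ThreadGroup that does not override uncaughtException, and the lambda and @Override syntax come from releases later than the v 1.3 SDK.

```java
// Hypothetical parent group that records the exception it is handed.
class RecordingGroup extends ThreadGroup {
    volatile Throwable last;

    RecordingGroup(String name) {
        super(name);
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        last = e;
        System.err.println("parent group saw: " + e);
    }
}

// Hypothetical sketch: a thread dies in the child group, and the child's
// default uncaughtException passes the exception up to the parent.
class GroupDelegationSketch {
    static Throwable runInChildGroup() {
        RecordingGroup parent = new RecordingGroup("parent");
        ThreadGroup child = new ThreadGroup(parent, "child");
        Thread t = new Thread(child, () -> {
            int[] vec = new int[10];
            vec[10] = 37;
        }, "worker");
        t.start();
        try {
            t.join(); // the handler runs in the dying thread before it exits
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return parent.last;
    }

    public static void main(String[] args) {
        System.out.println(runInChildGroup());
    }
}
```

Overriding uncaughtException only in the outermost group is therefore enough to cover threads in any nested group that keeps the default behavior.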
Archives
You'll find the JDC Tech Tips archives at:
http://java.sun.com/jdc/TechTips/index.html
Copyright
Copyright 2001 Sun Microsystems, Inc. All rights reserved.
901 San Antonio Road, Palo Alto, California 94303 USA.
This document is protected by copyright. For more information, see:
http://java.sun.com/jdc/copyright.html
* As used in this document, the terms "Java virtual machine"
or "JVM" mean a virtual machine for the Java platform.