|
December 16, 1997 This issue presents tips, techniques, and sample code for the following topics:
Performance -- using Object to represent disparate types. The application is one where a very large tree structure, consuming millions of bytes, is built up. Some of the nodes in the tree reference child nodes (non-terminals), while others are leaf nodes (terminals) and have no children, but contain String information. The application involves parsing a large Java program and representing it internally via a tree. One simple approach to this problem is to define a Node class such as the following:
If the node is a leaf node, then info is used. Otherwise, child refers to the children of the node, and child.length to the number of children. This approach works pretty well, but uses a lot of memory. Only one of child and info are used at any one time, meaning that the other field is wasted. child is an array, with attendant overhead, for example, in storing the dimensions of the array for subscript checking. For certain large inputs, the parser program runs out of memory. The first refinement of this approach is to collapse child and info:
In this scheme, info can refer to either a String, for a leaf node, or to a child node array. Object is the root of the Java class hierarchy, so that for example, the following:
implicitly means:
An instance of a subclass of Object, such as String, can be assigned to an Object reference. An array of Nodes can likewise be assigned to an Object. The instanceof operator can be used to determine the actual type of an Object reference. In the parser application, using Object to represent both data types is not good enough because it still takes up too much memory. So a further change has been implemented. After doing some research, it was found that the child array consisted of a single Node element about 95 percent of the time. So it's possible to represent one-child cases directly using an Object reference to the child node, rather than a reference to a one-long array of child nodes. This representation is complicated, and it's useful to define a method for encapsulating the abstraction as in the following example:
getChild returns the i-th child, or null for leaf nodes. If there is exactly one child, then info is of type Node, referencing that child. If there is more than one child, info is of type Node[], and a cast to Node[] is done, followed by a retrieval and return of the child reference. In the parser application, this change is enough to tip the scales, so that the application would not run out of memory. The internal representation in this example is tricky, but it can be hidden via methods such as getChild. In general, it's wise to avoid tricky coding, but useful to know how to do it when the need arises. The example also illustrates the utility of using one Object reference to represent several different data types. In C/C++ similar techniques would use void* pointers or unions. Please see Tech Tips: January 20, 1998 for a followup on this topic.
Reflection. You can use reflection to set object fields or invoke particular object methods by name. For example, given an object instance "obj," and a method name "f5" specified at program execution time, the method can be invoked on the instance. To see how this works, look at this simple example:
For an invocation of:
the output is:
That is, the method names of class java.util.Stack are listed, along with their fully qualified return types and parameter types. This program loads the specified class using class.forName, and then calls getDeclaredMethods to retrieve the list of methods defined in the class. The class java.lang.reflect.Method represents a single class method. The reflection feature may not seem like much at first, but it's impossible to do in other languages such as C, C++, or Fortran. The names and properties of functions in these other languages are gone by the time the program is executed. In technical terms, these other languages have "earlybinding," whereas the Java language has "late binding." | |||||||||
|
| ||||||||||||