mu·ta·ble (myoo′tə bəl), adj. 1. liable or subject to change or alteration
2. given to changing; inconstant
Random House Webster's Dictionary
While most objects are mutable, some are not. For example, any bean that provides a setXXX method is mutable. Immutable objects can be used to define
values or attributes that you don't want to be changed. For example, the class in
Listing 7-1 could be used to define mathematical concepts such as pi or the speed
of light in a vacuum. A simulation might set up these values in a method called
bigBang; once they are set, the immutability of the MathematicalConstant class
prevents them from being modified.
public class MathematicalConstant {
private double value;
public MathematicalConstant(double value) {
this.value = value;
}
public double getValue() {
return value;
}
}
Listing 7-1 Immutable objects
Even though this example is somewhat academic, there are many cases where immutable objects are used in everyday programming. (The primary example of a class with immutable instances is String, which is discussed in Section 7.2.)
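Listing 7-1 is immutable only because it happens to offer no set method. A slightly stricter sketch, assuming you are free to change the class, declares both the class and the field final so that the compiler enforces the guarantee. The name ImmutableConstant and the sample values are illustrative and are not part of the original listing.

```java
// A stricter variant of Listing 7-1: the compiler enforces immutability.
final class ImmutableConstant {
    // final: assigned exactly once, in the constructor
    private final double value;

    ImmutableConstant(double value) {
        this.value = value;
    }

    double getValue() {
        return value;
    }

    public static void main(String[] args) {
        ImmutableConstant pi = new ImmutableConstant(Math.PI);
        ImmutableConstant c = new ImmutableConstant(299792458.0); // m/s
        System.out.println(pi.getValue());
        System.out.println(c.getValue());
        // Neither object can be changed after construction;
        // a different value requires a different object.
    }
}
```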
The choices you make when handling objects must take into account their mutability. With both mutable and immutable objects, it's possible to create numerous, useless, intermediate objects with seemingly benign usage. The allocation, initialization, and collection of these short-lived useless objects can cause major inefficiencies in your software, even when running on an advanced runtime such as the HotSpot VM.
When you allocate a Java object with the keyword new, you are causing many things to happen. First, space is allocated on the heap for the object. Then, the class's constructor is called, and the class's fields are initialized. The object's status is then tracked so the garbage collector can determine if it should remove the object from the heap. (For a more detailed explanation of the lifecycle of an object, see Appendix A, The Truth About Garbage Collection.)
While there are obviously costs associated with creating objects, the situation is improving. Modern JVMs, such as the HotSpot VM, provide much faster object allocation and improved collection mechanisms. However, there will always be costs associated with object allocation.
It is important to note that while creating objects can be an issue, it isn't always a problem. Objects are a key part of the Java programming language. You can't write a program without creating objects. You just want to be cautious when the number of objects you're allocating becomes very high, for example when allocating objects inside loops. As with other optimization decisions, you should let your profiler be your guide. If your profiling tools show that a large amount of time is being spent allocating a particular type of object, then you can use the techniques discussed in this chapter to reduce the number of objects used.
See Section 7.6 for more information about object allocation and collection in the HotSpot VM. For information about the technical details of HotSpot's GC system, see Section B.2.1 in Appendix B.
String processing. The String class
is typically used to represent text and offers many convenient methods that help
with basic text processing tasks. For heavy-duty text processing, however, some
uses of the String class can become major performance bottlenecks.
Most of the problems with using String stem from the fact that String objects are immutable. Once they've been created, they cannot be changed. Operations that might appear to modify String objects actually generate completely new ones.
This is one of the reasons that the java.lang.StringBuffer class exists. The String and StringBuffer classes are meant to be used together. This relationship even extends to the implementation of Java language compilers such as javac. For example, when javac encounters the code snippet
String xyz = "x" + y + "z";
it automatically transforms the code to
String xyz = new StringBuffer().append("x")
.append(y)
.append("z")
.toString();
This gives you an idea how String concatenation actually works. Note that two objects are created to perform the transformation: A new StringBuffer is created explicitly and a new String is returned from toString. Knowing this, the problem with concatenating a number of String objects as shown in Listing 7-2 becomes obvious.
String result = "";
for (int i=0; i < 20; i++) {
result += getNextString();
}
Listing 7-2 Concatenating String objects
The javac compiler would automatically transform this to
String result = "";
for (int i=0; i < 20; i++) {
result = new StringBuffer().append(result)
.append(getNextString())
.toString();
}
This code creates two objects every time through the loop: one StringBuffer and one String (via the call to toString). That's OK if you're only going to iterate over this loop a few times, but if you're going to be executing this code often you might want to handle the String objects differently. The code in Listing 7-3 produces the same results, but does not allocate any objects inside the loop. This approach is much more efficient.
String result = "";
StringBuffer buffer = new StringBuffer();
for (int i=0; i < 20; i++) {
buffer.append(getNextString());
}
result = buffer.toString();
Listing 7-3 Concatenating String objects more efficiently
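To compare the two approaches side by side, here is a small self-contained sketch. The getNextString method is a hypothetical stand-in for whatever produces the pieces, and the class and method names are illustrative. Note that compilers since J2SE 5.0 use the unsynchronized StringBuilder rather than StringBuffer for this transformation, and newer ones may use other strategies, but concatenating with += in a loop still produces a new String on every iteration.

```java
class ConcatDemo {
    private static int counter = 0;

    // Hypothetical source of strings; stands in for getNextString in Listing 7-2.
    static String getNextString() {
        return "s" + (counter++) + " ";
    }

    // Listing 7-2 style: at least one new String per iteration.
    static String concatWithPlus(int n) {
        counter = 0;
        String result = "";
        for (int i = 0; i < n; i++) {
            result += getNextString();
        }
        return result;
    }

    // Listing 7-3 style: one buffer allocated outside the loop.
    static String concatWithBuffer(int n) {
        counter = 0;
        StringBuffer buffer = new StringBuffer();
        for (int i = 0; i < n; i++) {
            buffer.append(getNextString());
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String a = concatWithPlus(20);
        String b = concatWithBuffer(20);
        System.out.println(a.equals(b)); // both approaches build the same text
    }
}
```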
Another important fact to note is that the creation of extra String instances is not limited to occasions where the overloaded mathematical operators are used. There are several methods in the String class that generate new instances, including
StringBuffer.
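As a small illustration of that point, the sketch below calls a few String methods that return a new instance whenever the text actually changes (toUpperCase, trim, replace, concat, and substring). The class name and the variants helper are illustrative; the original string is never modified.

```java
class StringInstancesDemo {
    // Returns several strings derived from the input; the input is left as it was.
    static String[] variants(String original) {
        return new String[] {
            original.toUpperCase(),      // upper-cased copy
            original.trim(),             // copy without surrounding whitespace
            original.replace('l', 'L'),  // copy with characters replaced
            original.concat("!"),        // copy with text appended
            original.substring(2)        // copy of a sub-range
        };
    }

    public static void main(String[] args) {
        String original = "  Hello, World  ";
        String[] derived = variants(original);
        for (int i = 0; i < derived.length; i++) {
            System.out.println("[" + derived[i] + "]");
        }
        // The original is untouched by all of the calls above.
        System.out.println("[" + original + "]");
    }
}
```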
The java.awt package defines several classes that encapsulate geometric information. These geometry classes are shown in Table 7-1.
The java.awt.Component and java.awt.Container classes define methods to access certain geometric information. These methods are shown in Listing 7-4.
public Point getLocation();
public void setLocation(Point loc);
public Dimension getSize();
public void setSize(Dimension size);
public Insets getInsets();
public void setInsets(Insets insets);
public Rectangle getBounds();
public void setBounds(Rectangle bounds);
Listing 7-4 Methods for accessing geometric information
This functionality illustrates the importance of decisions about mutability. What happens when the following code is executed?
Rectangle bounds = button.getBounds();
bounds.x += 10;
Is the component moved? The answer has to be no. AWT needs to prevent this type of operation to avoid inconsistencies. For example, when the setBounds method is called, AWT makes sure that the Component is marked invalid. This ensures
that layout is performed properly. Similarly, many other types of actions and
notifications occur when the geometry-related set methods are called. If you
could directly modify a Component object's internal data structures, you could
easily put it into an inconsistent state.
How does AWT prevent modification of the internal state of a Component? It returns a newly created Rectangle object every time getBounds is called. The actual internal representation of data in the Component remains private and is never passed outside the Component itself. For example, the code in Listing 7-5 actually creates four separate Rectangle objects:
int x = button.getBounds().x;
int y = button.getBounds().y;
int h = button.getBounds().height;
int w = button.getBounds().width;
Listing 7-5 Component mutability
Although several of these objects can be created without having a detectable effect on performance, creating large numbers of temporary objects can negatively impact performance. Profiling tools can help you determine whether or not temporary allocations are affecting your application's performance.
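The copying behavior is easy to observe directly. The sketch below is a hedged illustration rather than code from the chapter: the BoundsDemo class and its helper methods are hypothetical, and a Canvas is used only because it can be constructed without a display. It shows that editing the returned Rectangle leaves the component where it was, that one copy can serve all four reads from Listing 7-5, and that the getBounds(Rectangle) overload of Component lets the caller supply a Rectangle to be filled in instead of having a new one allocated.

```java
import java.awt.Canvas;
import java.awt.Component;
import java.awt.Rectangle;

class BoundsDemo {
    // Editing the Rectangle returned by getBounds changes only the copy.
    static int xAfterEditingCopy(Component comp) {
        Rectangle bounds = comp.getBounds(); // freshly allocated copy
        bounds.x += 10;                      // modifies the copy only
        return comp.getX();                  // the component has not moved
    }

    // One copy, four reads (compare the four copies in Listing 7-5).
    static String describe(Component comp) {
        Rectangle r = comp.getBounds();
        return r.x + "," + r.y + "," + r.width + "," + r.height;
    }

    // Reuses a caller-supplied Rectangle, so no new Rectangle is allocated here.
    static int area(Component comp, Rectangle scratch) {
        comp.getBounds(scratch);
        return scratch.width * scratch.height;
    }

    public static void main(String[] args) {
        Component comp = new Canvas();
        comp.setBounds(20, 30, 100, 50);
        System.out.println(xAfterEditingCopy(comp));
        System.out.println(describe(comp));
        System.out.println(area(comp, new Rectangle()));
    }
}
```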
Swing added methods to provide access to the information in the geometry objects directly, which eliminated the need to copy the objects. For example, the following methods were added to the JComponent class:
public int getX();
public int getY();
public int getHeight();
public int getWidth();
Because these methods return primitive types instead of objects, there is no need to worry about encapsulation being violated. With these methods, rather than writing
int width = comp.getSize().width; // allocates temp object
you can write
int width = comp.getWidth(); // no allocation
The primary drawback to this approach is that it complicates the public API of the class.
Another problem with this solution is that it moves the responsibility for controlling mutability out of the object and into any object that wants to use it. In the previous Swing example, the Rectangle object's mutability is being controlled by JComponent. Any other class that wants to use Rectangle in a similar manner has to duplicate methods that already exist in JComponent.
For Swing, this was really the only solution: the Rectangle class has existed since JDK 1.0, and there were many reasons to reuse the existing class instead of creating one from scratch. However, if you don't have to deal with legacy classes, there are other solutions that can be very effective. Some of these are discussed in the next section.
const keyword defined in the C++ language.
const
objects. In C++, the const keyword allows you to specify that a particular object
is to be treated as immutable. Any attempt to change a const object's state
triggers a compiler error. Although the Java programming language doesn't provide
a direct analog to const, it is fairly easy to structure your classes so that you
can simulate it.
To demonstrate how simulating the behavior of const in a Java program can help minimize temporary allocations, we'll use two versions of a highly simplified physics simulation framework. The first implements the framework using traditional techniques similar to those used in AWT; the second uses the const technique. Both versions provide encapsulation of an object's internal data representation.
This simple physics simulation framework consists of two classes: Body and Location. A Body, as shown in Listing 7-6, has a mass and a location in space. A Location is a three-dimensional point that represents a body's position.
public class Body {
private int mass = 10;
private Location loc = new Location();
public int getMass() {
return mass;
}
public void setMass(int mass) {
this.mass = mass;
}
public Location getLocation() {
return new Location(loc.x, loc.y, loc.z);
}
public void move() {
// we're just moving at random here
// in a real sim we'd have forces and such
loc.x += 1;
loc.y += 2;
loc.z += 3;
}
}
Listing 7-6 Body
Listing 7-7 shows the Location class. Note that the getLocation method in the Body class returns a copy of the internally stored Location object, not a reference to the original. This is done to preserve encapsulation and to prevent the Location fields from being modified by external code.
To analyze the performance of this small framework we can use a Simulation class. This class, shown in Listing 7-8, creates a large number of Body objects and performs various operations on them. This example simulation doesn't actually do any useful work, but it approximates the kind of work that might be performed in a real simulation.
public class Location {
public int x;
public int y;
public int z;
public Location() { }
public Location(int x, int y, int z) {
this.x = x;
this.y = y;
this.z = z;
}
}
Listing 7-7 Location
import java.util.ArrayList;
import java.util.Iterator;
public class Simulation {
static ArrayList bodies = new ArrayList();
static final int NUM_BODIES = 200;
static final int TIME_STEPS = 100000;
public static void main(String[] args) {
for (int i = 0; i < NUM_BODIES; i++) {
bodies.add(new Body());
}
Stopwatch timer = new Stopwatch().start();
for (int i = 0; i < TIME_STEPS ; i++) {
doTimeStep(i);
}
timer.stop();
System.out.println(timer.getElapsedTime());
}
public static void doTimeStep(int timeStep) {
Iterator iter = bodies.iterator();
while (iter.hasNext()) {
Body body = (Body)iter.next();
body.move();
Location loc = body.getLocation();
log(body, loc, timeStep);
}
}
public static void log (Body body, Location loc, int time) {
// log this info to somewhere
}
}
Listing 7-8 A simple simulation
Running this simulation on our test configuration takes about 16 seconds. Using a profiling tool to analyze the simulation gives us a better understanding of where the time is spent.
The profiling results in Table 7-2 show that more than 40 percent of the time it takes to run the simulation is spent in two methods: Body.getLocation and the constructor for the Location class. Almost all of this overhead is related to copying the returned Location objects.
In a real simulation, more work would likely be done in Body.move or elsewhere in the Simulation class, so the percentages might be quite different. However, the overhead of copying the Location objects is still likely to be significant.
Since the profiling results indicate that a significant amount of time is being spent copying the Location objects, this is a good candidate for optimization. There are a number of ways you can improve performance in this situation without sacrificing encapsulation. One solution would be to do what Swing did for its geometry objects and add accessor methods to Body:
public int getX();
public int getY();
public int getZ();
This would improve performance, but there are drawbacks. For example, if the simulation framework were more full-featured there might be many internal objects. This could cause an explosion in the number of these accessor methods. For example, the interface of your Body class might have to change to include
public int getLocationX();
public int getLocationY();
public int getLocationZ();
public int getVelocityX();
public int getVelocityY();
public int getVelocityZ();
// and even more
Adding many methods like this to your public API can needlessly complicate your code. A better alternative would be to move the concept of mutability into the Location object. To do this, you can split the single Location class into two classes: one that is immutable and one that is mutable. Listing 7-9 shows the modified Location class.
Note that two things have changed from the original version in Listing 7-7. First, the fields of the class have been changed from public to protected. This means that these fields can only be accessed by subclasses of Location, or by other classes in the same package. Any client code outside the package that contains this class will be denied access to the fields. Second, because the fields can no longer be accessed directly by client code, get methods have been added for read-only access.
public class Location {
protected int x;
protected int y;
protected int z;
public Location() { }
public Location(int x, int y, int z) {
this.x = x;
this.y = y;
this.z = z;
}
public final int getX() { return x; }
public final int getY() { return y; }
public final int getZ() { return z; }
}
Listing 7-9 The new Location class
There are times when you need a mutable version of the Location class. The MutableLocation class, shown in Listing 7-10, is a subclass of Location. The main purpose of this subclass is to enable modification of the object's internal fields. This is done by adding set methods for each field.
public class MutableLocation extends Location{
public MutableLocation() { }
public MutableLocation(int x, int y, int z) {
super(x,y,z);
}
public final void setX(int x) { this.x=x; }
public final void setY(int y) { this.y=y; }
public final void setZ(int z) { this.z=z; }
}
Listing 7-10 MutableLocation
Once you have the separate Location and MutableLocation classes, you can easily create an approximation of the C++ const facility. Internally, you store a MutableLocation object, but return it typed as a simple Location when you want to allow only read-only access. This is similar to returning a const reference in C++.
Listing 7-11 shows the changes that need to be made to the Body class to implement this behavior. In this version, the internally stored Location becomes a MutableLocation, and the getLocation method is changed to return a direct reference to the loc field, instead of a copy. Note that getLocation still returns a Location.
private MutableLocation loc = new MutableLocation();
public Location getLocation() {
return loc;
}
Listing 7-11 Modifications to the Body class
If the following code is written in a package separate from the Location class, it will now cause compile-time errors:
Location loc = body.getLocation();
loc.x = 5; // field x is not accessible
loc.setX(5); // method setX not found in class Location
Be aware, however, that it is possible to cast the returned Location object to a MutableLocation. The following code will work and is quite dangerous.
Location loc = body.getLocation();
MutableLocation mLoc = (MutableLocation)loc;
mLoc.setX(5);
This is perfectly legal from the compiler's perspective and gives code in any package access to the internals of the Location object. By writing this code, however, you're explicitly asking to do dangerous things. Note that the C++ const keyword is subject to the same limitation. You can "cast away" const-ness, but do so at your own risk.
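One consequence of returning a direct reference is worth keeping in mind: the Location a caller receives is a live, read-only view, not a snapshot. The sketch below condenses the Location, MutableLocation, and Body classes into nested classes; the ConstViewDemo wrapper and the snapshot helper are illustrative additions. Because everything lives in one file, the package-level access protection is not demonstrated here. What the sketch does show is that later calls to move are visible through a previously returned reference, and that a caller who needs a stable value must copy it explicitly.

```java
class ConstViewDemo {
    // Read-only view (condensed from the new Location class)
    static class Location {
        protected int x, y, z;
        Location() { }
        Location(int x, int y, int z) { this.x = x; this.y = y; this.z = z; }
        final int getX() { return x; }
        final int getY() { return y; }
        final int getZ() { return z; }
    }

    // Writable subclass (condensed from MutableLocation)
    static class MutableLocation extends Location {
        final void setX(int x) { this.x = x; }
        final void setY(int y) { this.y = y; }
        final void setZ(int z) { this.z = z; }
    }

    // Body that hands out its internal location typed as Location
    static class Body {
        private final MutableLocation loc = new MutableLocation();
        Location getLocation() { return loc; } // no copy, no allocation
        void move() {
            loc.setX(loc.getX() + 1);
            loc.setY(loc.getY() + 2);
            loc.setZ(loc.getZ() + 3);
        }
    }

    // Illustrative helper: an explicit copy for callers that need a snapshot.
    static Location snapshot(Location l) {
        return new Location(l.getX(), l.getY(), l.getZ());
    }

    public static void main(String[] args) {
        Body body = new Body();
        Location view = body.getLocation();
        Location frozen = snapshot(view);
        body.move();
        System.out.println(view.getX());   // reflects the move
        System.out.println(frozen.getX()); // still the value before the move
        System.out.println(view == body.getLocation()); // same object each call
    }
}
```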
So, how does this new version of the Location code perform? Running the same simulation as before, the code executes in about 8 seconds, almost twice as fast as the previous version. These results are consistent with the profiling data we collected: The profiler indicated that almost half of the execution time was spent copying the Location objects.
java.math package was
rewritten. The java.math package includes the classes BigDecimal and
BigInteger. In older versions of J2SE, these classes were implemented mostly as
C code. For version 1.3, they were ported entirely to use the Java language. (This
project is discussed further in Section 9.3.2 on page 143.)
One of the goals of this project was to improve performance of these classes. BigInteger, much like String, is an immutable object. One of the key performance enhancements in the rewrite was to create a mutable version of BigInteger. A private class called MutableBigInteger was added to the package java.math, and although it isn't exposed as public API, it is used internally to speed up many operations. Mike McCloskey, the engineer at Sun who did most of the work on this project, had the following to say about it:
The original BigInteger is well designed and easy to use, but it has a major performance drawback in its immutability. When you perform multistep operations such as gcd, modInverse, and modPow, you have to create a new immutable number every step you take. Some of these operations take hundreds or thousands of steps, so it was absolutely necessary to make a mutable multiprecision number so the calculations could be done in place. Then you save copying the bits around, initializations of new numbers, allocating memory for new numbers, garbage collection of temporary numbers, etc. That's why we use the MutableBigInteger class behind the scenes.1
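The cost described in this quote is visible through the public API as well. In the sketch below (the BigIntegerDemo class and its factorial method are illustrative), every call to multiply returns a new BigInteger, so the loop produces one temporary result per step. MutableBigInteger is not accessible to application code, so there is no public in-place alternative to show.

```java
import java.math.BigInteger;

class BigIntegerDemo {
    // Each multiply returns a new immutable BigInteger.
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        BigInteger a = BigInteger.valueOf(6);
        BigInteger b = a.add(BigInteger.ONE); // a is unchanged; b is a new object
        System.out.println(a + " " + b);
        System.out.println(factorial(20));
    }
}
```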
Table 7-3 shows results for the simple physics simulation benchmark from the previous section. The Classic VM column shows the execution times under the classic virtual machine implementation with the Symantec JIT. The HotSpot VM column shows the execution times under the HotSpot Client VM.
Interestingly, the penalty for creating a lot of small objects is much greater with the classic VM implementation. Under the classic implementation, the version of the benchmark that creates all the small objects is over four times slower than the version that does not.
Under the HotSpot Client VM, the penalty for creating all of the small objects is significantly reduced. However, there is still an obvious penalty. This means that creating large numbers of small objects can still be an issue, although not as critical an issue as it once was. (For more information about how the HotSpot VM implements garbage collection, see Memory Allocation and Garbage Collection on page 208.)
Vector, Hashtable, or raw array) to store free lists of objects.
Generally, when the program starts, a number of objects are put into the
pool. Then when the program needs a new instance of the object, it simply gets it
from the free list. When the immediate use of this object is over, it is returned to
the free list.
In the past, object pooling was often used successfully. However, with the new generation of JVM implementations that include advanced memory management systems, pooling small objects is often counterproductive. The overhead of managing the object pool is often greater than the small object penalty. Pooling can also increase a program's memory footprint. The need for many of these small objects can often be avoided altogether by having control over object mutability. Pooling small objects is not a recommended tactic when you're working with the new generation of JVMs.
Although pooling small objects isn't recommended with newer JVM implementations, pooling large objects or objects that work with native resources can be useful. For example, large bitmaps or arrays are often good candidates for reuse, in part because of the overhead of clearing all of an array's elements during initialization. Classes like Thread or Graphics that require native resources are also often excellent candidates for caching.
In short, when making decisions about caching or reusing objects, let your profiler be your guide. If you find that you're spending a lot of time creating a particular type of object, and you can't control the creation by manipulating its mutability, then you might want to consider pooling. Be aware, however, that pooling might actually hurt performance when used with small objects on new JVMs; be sure to benchmark so you can compare the different solutions.
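For the large-object case, a free list can be quite small. The following is a minimal, single-threaded sketch, assuming fixed-size byte arrays; the BufferPool class and its acquire and release methods are hypothetical names, not an existing API. Buffers are deliberately not cleared when they are returned, so callers are assumed to overwrite the contents before reading them.

```java
import java.util.ArrayDeque;

// Hypothetical free-list pool for large, fixed-size byte arrays (not thread-safe).
class BufferPool {
    private final ArrayDeque<byte[]> free = new ArrayDeque<byte[]>();
    private final int bufferSize;

    BufferPool(int bufferSize, int initialCount) {
        this.bufferSize = bufferSize;
        for (int i = 0; i < initialCount; i++) {
            free.push(new byte[bufferSize]); // pre-populate the free list
        }
    }

    // Hands out a pooled buffer, or allocates one if the free list is empty.
    byte[] acquire() {
        byte[] buf = free.poll();
        return (buf != null) ? buf : new byte[bufferSize];
    }

    // Returns a buffer to the free list; wrong-sized arrays are ignored.
    // The contents are NOT cleared, so stale data may remain in the buffer.
    void release(byte[] buf) {
        if (buf != null && buf.length == bufferSize) {
            free.push(buf);
        }
    }

    public static void main(String[] args) {
        BufferPool pool = new BufferPool(1 << 20, 2);
        byte[] a = pool.acquire();
        pool.release(a);
        byte[] b = pool.acquire();
        System.out.println(a == b); // the same array is handed out again
    }
}
```

Whether a pool like this helps depends on the buffer size and the JVM, so it should be benchmarked against plain allocation before being adopted.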
Iterator. It is possible to use the Iterator interface in a fashion that provides
you with immutable arrays. The following example is designed to show why
such a construct is needed.
Listing 7-12 shows a fragment of a class that might be part of a program designed to help ship packages.
public class ShippingInfo {
public static final String[] states = {
"AK", "AZ", "CA", "DE", "NV", "NY"};
// more stuff down here
}
Listing 7-12 Simple shipping class
The following code fragment could be used to iterate through the list of states and print them out.
for (int i = 0; i < ShippingInfo.states.length; i++) {
System.out.println(ShippingInfo.states[i]);
}
This is easy enough, but using the final keyword with arrays can be tricky. As you might expect, the following code does not compile.
ShippingInfo.states = new String[50];
It fails because you cannot assign values to final variables. The following code, however, is perfectly legal:
ShippingInfo.states[4] = "Java City";
This code replaces the entry for Nevada ("NV") with "Java City." Obviously, final arrays are not immutable, and passing them around can violate encapsulation.
No syntax in the Java programming language provides a truly immutable array.
This can lead to all kinds of inconsistencies. To avoid these problems, one
solution is to make the following changes:
1. Make the states array private.
2. Add a getStates method.
3. Return a copy of the states array from the getStates method.
This preserves encapsulation, but is likely to cause performance issues, especially if the array is large. The array in the shipping example represents the 50 states and contains a relatively small number of elements, but in another situation the array might contain thousands of elements. For example, the array might represent the part numbers for all of the parts in a new car. One way to avoid copying the array and still maintain encapsulation is to create an Iterator.
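The copying approach described above might look like the following sketch. The class name CopyingShippingInfo is illustrative. Calling clone on an array produces a new array with the same elements, so each call to getStates pays for an allocation proportional to the length of the array.

```java
class CopyingShippingInfo {
    private static final String[] states = {
        "AK", "AZ", "CA", "DE", "NV", "NY"};

    // Returns a fresh copy so callers cannot alter the internal array.
    static String[] getStates() {
        return states.clone();
    }

    public static void main(String[] args) {
        String[] copy = getStates();
        copy[4] = "Java City";              // changes the copy only
        System.out.println(getStates()[4]); // the internal array is intact
    }
}
```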
Listing 7-13 shows a custom class that implements the Iterator interface. This listing creates the Iterator as an inner class, which gives it access to the private states array. Note that the remove method, which is required by the Iterator interface, throws an UnsupportedOperationException. This is how the Collections Framework enables you to create a read-only Iterator.
import java.util.Iterator;
import java.util.NoSuchElementException;
public class ShippingInfo {
private static final String[] states = {
"AK", "AZ", "CA", "DE", "NV", "NY"};
public static Iterator getStates() {
return new StateIterator();
}
public static class StateIterator implements Iterator {
private int current = 0;
/* from Iterator */
public boolean hasNext() {
return current < states.length;
}
/* from Iterator */
public Object next() {
return nextState();
}
/* from Iterator */
public void remove() {
throw new UnsupportedOperationException();
}
/* custom typesafe next */
public String nextState() {
if (current < states.length) {
String state = states[current];
current++;
return state;
} else {
throw new NoSuchElementException();
}
}
}
}
Listing 7-13 An Iterator as a read-only array
The following code snippet can be used to iterate through the array safely, without concern that it could be accidentally damaged.
Iterator iter = ShippingInfo.getStates();
while (iter.hasNext()) {
System.out.println(iter.next());
}
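On J2SE 5.0 and later, generics remove the need for a custom typesafe method such as nextState. The sketch below is a generic variant of the same idea, under the assumption that a newer language level is available; the ReadOnlyArrayIterator and TypedShippingInfo names are illustrative.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Generic read-only iterator over an array; the array itself is never exposed.
class ReadOnlyArrayIterator<T> implements Iterator<T> {
    private final T[] items;
    private int current = 0;

    ReadOnlyArrayIterator(T[] items) {
        this.items = items;
    }

    public boolean hasNext() {
        return current < items.length;
    }

    public T next() {
        if (current >= items.length) {
            throw new NoSuchElementException();
        }
        return items[current++];
    }

    public void remove() {
        throw new UnsupportedOperationException(); // read-only
    }
}

class TypedShippingInfo {
    private static final String[] states = {
        "AK", "AZ", "CA", "DE", "NV", "NY"};

    static Iterator<String> getStates() {
        return new ReadOnlyArrayIterator<String>(states);
    }

    public static void main(String[] args) {
        Iterator<String> iter = getStates();
        while (iter.hasNext()) {
            String state = iter.next(); // no cast needed
            System.out.println(state);
        }
    }
}
```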
Note that the Java 2 Collections Framework provides a great deal of infrastructure
for creating read-only collections. For more information on this feature, see
Section 8.4.10, Immutable Collections.
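For example, java.util.Collections.unmodifiableList wraps a list in a read-only view without copying it. A minimal sketch follows; the class name ReadOnlyStates and the STATES_VIEW field are illustrative.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ReadOnlyStates {
    private static final String[] states = {
        "AK", "AZ", "CA", "DE", "NV", "NY"};

    // A read-only List view backed by the private array; no copy is made.
    private static final List<String> STATES_VIEW =
        Collections.unmodifiableList(Arrays.asList(states));

    static List<String> getStates() {
        return STATES_VIEW;
    }

    public static void main(String[] args) {
        List<String> view = getStates();
        System.out.println(view.get(4));
        try {
            view.set(4, "Java City"); // rejected by the read-only wrapper
        } catch (UnsupportedOperationException e) {
            System.out.println("read-only");
        }
    }
}
```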
The tactic of adding a wrapper object to hide the mutability of an underlying structure isn't unique to arrays. In fact, you can use this general approach to hide the mutability of many types of structures. Doug Lea discusses this idea in Section 2.4.3 of his book Concurrent Programming in Java.2
1 From an email exchange with Mike McCloskey.
2 Doug Lea, Concurrent Programming in Java: Design Principles and Patterns, Second Edition, pp. 132-135. Addison-Wesley, 1999. Chapter 2 provides a good introduction to some of the problems associated with encapsulation in a multithreaded environment and is well worth reading.
Copyright © 2001, Sun Microsystems, Inc. All rights reserved.