This chapter discusses how to select and evaluate algorithms and data structures and addresses how these choices can affect the performance of your Java programs. Section 8.4 provides an introduction to the Java 2 Collections Framework, which contains a variety of high-performance data structures and algorithms that you can use.
A full introduction to algorithms and data structures would be an entire book by itself-in fact, the study of algorithms and data structures has been the subject of many books and papers. If you'd like to read more on this subject, see Section 8.6 for some good references.
For example, let's look at two different ways to calculate the sum of all integers that fall within a specified range. One solution is to simply build a loop around an addition statement, as shown in Listing 8-1.
public class SimpleSummer {
public long sum(int start, int stop){
long acc = 0;
for(int i=start;i<=stop;i++){
acc += i;
return acc;
}
}
Simple sum computer
As long as the number of integers to be summed is small, this simple algorithm works just fine. However, as the number of integers to be summed increases, the amount of time required to execute this algorithm grows linearly. This is what's known as an order-N function, which is abbreviated O(n) in big-oh notation.1
If your program needed this number-summing functionality and profiling showed that this function was a hot spot, you would probably want to speed it up. Your first inclination might be to tune the implementation of the algorithm. While in some cases tuning the existing algorithm might help, this implementation is pretty simple and there isn't much you can do. A more effective approach would be to consider other solutions.
For example, there is another way to sum a series of integers. The sum of any series of integers between zero and n, inclusive, can be calculated with the formula (n(n+1))/2. To calculate the sum of the series between n and m, inclusive, you can use the technique shown in Listing 8-2.
public class FormulaicSummer {
public long sum(int start, int stop){
int bigseries = stop*(stop+1)/2;
start--; // so result is inclusive of start
int littleseries = start*(start+1)/2;
return bigseries-littleseries;
}
}
Smarter sum computer
This method returns the result for any start and stop values in a fixed amount of time, regardless of how many numbers are summed. This is called an order-1 or constant-time function, commonly abbreviated O(1).
Listing 8-3 shows the code for a micro-benchmark that compares the two summing algorithms. (In practice, it's best to test alternative algorithms under different situations on multiple pieces of data-micro-benchmarks don't always show real-world results. For our simple example, however, this micro-benchmark is sufficient.)
public static void main(String args[]) {
System.out.println("Sum integers between 13 and 1000");
System.out.println("SimpleSummer:");
long acc;
SimpleSummer ss = new SimpleSummer();
long time = System.currentTimeMillis();
for (int i=0;i<5000;i++)
acc = ss.sum(13,1000);
long time2 = System.currentTimeMillis();
System.out.println("Took "+(time2-time)+" ms.");
System.out.println("FormulaicSummer:");
FormulaicSummer fs = new FormulaicSummer();
time = System.currentTimeMillis();
for (int i=0;i<5000;i++)
acc = fs.sum(13,1000);
time2 = System.currentTimeMillis();
System.out.println("Took "+(time2-time)+" ms.");
Micro-benchmark
The results of this benchmark show that the constant-time algorithm is much faster than the linear algorithm for the specified inputs-the order-N algorithm was 65 times slower on our test configuration. As this example illustrates, switching to a smarter algorithm can lead to significant performance improvements that might be impossible to realize through tuning alone.
Order-N algorithms are not always a bad choice-in fact, they often provide the best solution. A more common problem is algorithms that have performance characteristics worse than order-N, where the time required to execute the algorithm increases nonlinearly as the amount of data increases. Algorithms that exhibit such super-linear performance characteristics, such as O(n2) algorithms, are a common cause of performance degradation and are a leading cause of poor scalability.
Comparison of algorithm performance
Figure 8-1 shows performance results for three theoretical algorithms that produce identical results. As you can see, it is possible for an order-N or N-squared algorithm to perform better than a constant-time algorithm on low numbers of data points. If you know you're only going be working with small amounts of data, the constant-time algorithm might be a poor choice. Similarly, if you'reworking with large or unpredictable amounts of data, using the N-squared algorithm could lead to significant performance problems.
In contrast, solutions such as the iterative loop in Listing 8-1 are referred to as brute-force solutions. Such solutions only work because computers can do repetitive tasks very quickly. When you find yourself tuning a brute-force solution to remove a bottleneck, consider whether there is a more elegant way to approach the problem. It's very possible that no amount of micro-tuning will significantly improve performance, while changing the algorithm could eliminate the bottleneck entirely.
To illustrate this point, let's look at a common problem in graphics programming: drawing lines. The grid drawing framework shown in Listing 8-4 fills a canvas with a grid of lines. Line drawing is implemented as a general framework based on a LineDrawer interface so we can easily compare the performance of different line drawing algorithms.
// Grid container
public class GridTest extends JFrame {
GridCanvas myCanvas;
TextField status;
public GridTest(LineDrawer ld) {
super(ld.getClass().toString());
getContentPane().setLayout(new BorderLayout());
setSize(500,500);
status = new TextField("TESTING: Do not disturb!");
myCanvas = new GridCanvas(ld,status);
getContentPane().add(myCanvas,BorderLayout.CENTER);
getContentPane().add(status,BorderLayout.SOUTH);
setDefaultCloseOperation(DISPOSE_ON_CLOSE);
show();
myCanvas.test();
}
public void closetest() {
dispose();
}
}
// Grid surface with timing code
public class GridCanvas extends JComponent {
LineDrawer ld;
TextField status;
static long acc=0;
BufferedImage bi;
public GridCanvas(LineDrawer ld, TextField status){
this.ld = ld;
this.status = status;
}
public void test(){
Rectangle r = getBounds();
bi = new BufferedImage(r.width,
r.height,
BufferedImage.TYPE_INT_RGB);
Stopwatch watch = new Stopwatch();
for (int j=0;j<10;j++) { // do 10 tiems for stability
watch.reset().start();
for (int i =0; i< (r.height-1)/2; i++) {
ld.drawLine(bi,0,i*2,r.width-1,i*2);
}
for (int i=0; i<(r.width-1)/2;i++) {
ld.drawLine(bi,i*2,0,i*2,r.height-1);
}
watch.stop();
acc += watch.getElapsedTime();
}
status.setText("Time (ms) to do grid = "+(acc/10));
repaint();
}
public void paint(Graphics g) {
g.drawImage(bi,0,0,this);
}
}
// The LineDrawer interface
public interface LineDrawer {
public void drawLine(BufferedImage i,int x0, int
y0, int x1, int y1);
}
Grid test framework
An object that implements the LineDrawer interface is passed to the framework to draw the lines into the BufferedImage. Listing 8-5 shows a concrete implementation of LineDrawer that's based on Bresenham's algorithm, a well-known, efficient algorithm for general line drawing.
public class BasicBresenham implements LineDrawer {
public void drawLine(BufferedImage bi,
int x0, int y0,
int x1, int y1) {
int dx,dy;
int temp;
// reduce to half the octant cases by always
// drawing + in y
if (y0>y1) {
temp = y0;
y0 = y1;
y1 = temp;
temp = x0;
x0 = x1;
x1 = temp;
}
dx = x1-x0;
dy = y1-y0;
if (dx>0) {
if (dx>dy) {
octant0(bi,x0,y0,dx,dy,1);
} else {
octant1(bi,x0,y0,dx,dy,1);
}
} else {
dx = -dx;
if (dx>dy) {
octant0(bi,x0,y0,dx,dy,-1);
} else {
octant1(bi,x0,y0,dx,dy,-1);
}
}
}
private void octant0(BufferedImage bi,
int x0, int y0,
int dx, int dy,
int xdirection){
int DeltaYx2;
int DeltaYx2MinusDeltaXx2;
int ErrorTerm;
int pix = 0xffffffff;
// set up initial error term and drawing values
DeltaYx2 = dy*2;
DeltaYx2MinusDeltaXx2 = DeltaYx2 - (dx*2);
ErrorTerm = DeltaYx2 - dx;
// draw loop
bi.setRGB(x0,y0,pix); // draws a single point
while ( dx-- > 0) {
// check if we need to advance y
if (ErrorTerm >=0) {
// advance Y and reset ErrorTerm
y0++;
ErrorTerm += DeltaYx2MinusDeltaXx2;
} else {
// add error to ErrorTerm
ErrorTerm += DeltaYx2;
}
x0 += xdirection;
bi.setRGB(x0,y0,pix);
}
}
private void octant1(BufferedImage bi,
int x0, int y0,
int dx, int dy,
int xdirection){
int DeltaXx2;
int DeltaXx2MinusDeltaYx2;
int ErrorTerm;
int pix = 0xffffffff;
// set up initial error term and drawing values
DeltaXx2 = dx * 2;
DeltaXx2MinusDeltaYx2 = DeltaXx2 - (dy*2);
ErrorTerm = DeltaXx2- dy;
// draw loop
bi.setRGB(x0,y0,pix);
while ( dy-- > 0) {
// check if we need to advance x
if (ErrorTerm >= 0) {
// advance X and reset ErrorTerm
x0 += xdirection;
ErrorTerm += DeltaXx2MinusDeltaYx2;
} else {
// add to ErrorTerm
ErrorTerm += DeltaXx2;
}
y0++;
bi.setRGB(x0,y0,pix);
}
}
}
General-purpose line drawing algorithm
Bresenham's2 is a very efficient, general-purpose algorithm for drawing lines. Switching to a different general-purpose line drawing algorithm probably wouldn't gain us anything. However, if you carefully consider this application's problem domain, it is possible to find a better solution. Bresenham's is designed to draw lines of any arbitrary slope, but this application draws a grid that consists of lines that are either completely horizontal or completely vertical. Listing 8-6 shows a line drawer that uses this information about the problem space to provide a simpler solution.
To compare the two algorithms, we'll use the simple benchmark wrapper in Listing 8-7, which can invoke GridTest with either line drawing solution.
public class FlatLiner implements LineDrawer {
static int pix = 0xffffffff;
static int[] pixa = new int[0];
static int pixasz = 0;
public void drawLine(BufferedImage bi, int x0, int y0, int x1, int y1){
if (x0 == x1) {
int h = (y1-y0);
if (h < 0) h = 0-h;
h++;
if (h>pixasz) {
pixasz = h;
pixa = new int[h];
for (int i=0;i<pixasz;i++) pixa[i] = 0xffffffff;
}
bi.setRGB(x0,y0,1,h,pixa,0,1);
} else if (y0 == y1) {
int w = x1-x0;
if (w < 0) w = 0-w;
w++;
if (w>pixasz) {
pixasz = w;
pixa = new int[w];
for (int i=0;i<pixasz;i++) pixa[i] = 0xffffffff;
}
bi.setRGB(x0,y0,w,1,pixa,0,pixasz);
} else {
throw (new RuntimeException("FlatLiner called on "
+ "non-flat line"));
}
}
}
Special-purpose line drawing algorithm
public class GridBenchmark {
public static void main(String args[]) {
if (Array.getLength(args) < 1) {
System.out.println("usage: GridBenchmark b|f");
System.exit(0);
}
if ((args[0].startsWith("b")) ||
(args[0].startsWith("B"))) {
new GridTest(new BasicBreshenham());
} else {
new GridTest(new FlatLiner());
}
}
}
Benchmark wrapper for grid drawing
On our test configuration, the FlatLiner implementation ran approximately 25 percent faster than the BasicBresenham implementation. This result is typical-special-purpose algorithms usually run faster than general-purpose algorithms. When selecting an algorithm, consider your problem space and try constraining your solution to just the necessary operations.
Keep in mind that alternate solutions aren't necessarily mutually exclusive-in many cases it's possible to dynamically select the best algorithm. For example, the FlatLiner and BasicBresenham implementations can be combined by invoking Bresenham's algorithm inside FlatLiner. Instead of throwing an exception for lines with undefined slopes or slopes other than 0, simply insert the code from the BasicBresenham implementation. This results in a general-purpose solution that is just as fast as FlatLiner for simple grids and almost as fast as the general-purpose Bresenham's for all other lines. The cost of the two extra if-statements is negligible. This tactic of testing for easy-to-solve subcases is a mainstay of high-performance programming.
Recursive algorithms usually take the form shown in Listing 8-8, where the algorithm repeatedly calls itself, simplifying the problem at each iteration.
Solveit(problem) {
if (problem is trivial) {
return result;
} else {
simplify problem;
return SolveIt(simplified problem);
}
}
Pseudocode for recursion
For example, Listing 8-9 shows a simple, recursive algorithm for reversing the letters in a String.
public class TrivialApplication {
public static void main(String args[]) {
System.out.println( reverseString("Hello World!"));
}
static public String reverseString(String s){
if (s.length() <= 1)
return s;
else {
char c = s.charAt(0);
eturn reverseString(s.substring(1))+c;
}
}
}
Recursive string reverser
If recursion is so useful, why would you want to avoid it? From a performance perspective, there are two potential problems with using recursion: memory consumption and the overhead of successive function calls.
Each successive call to a function creates a new stack frame. If the function is called too many times, the program could potentially run out of memory. For example, passing in a very long string to the reverseString method in Listing 8-9, might cause a StackOverflowError to be thrown.
The performance of a recursive algorithm can be affected by the overhead of the successive function calls. In languages like C++, the overhead is so great that recoding a naturally recursive algorithm using nonrecursive techniques is almost always a performance win.
In the Java language, the overhead of successive function calls can be less of an issue. For example, let's look at both recursive and nonrecursive implementations of the Towers of Hanoi problem.3 Listing 8-10 shows the recursive solution.
public static void Towers(int numDisks,
char src, char dest, char temp){
if (numDisks == 1) {
System.out.println("Move top disk from pole "+
src+" to pole "+dest);
} else {
Towers(numDisks-1,src,temp,dest);
Towers(1,src,dest,temp);
Towers(numDisks-1,temp,dest,src);
}
}
Recursive Towers of Hanoi
The nonrecursive solution is much more complex. To implement the solution as a linear routine, shown in Listing 8-11, you have to create and manage your own stack and ensure that items are pushed in the right order. Despite the increase in complexity, conventional wisdom is that this version should be faster.
public static void LinearTowers(int orig_numDisks,
char orig_src, char orig_dest,
char orig_temp){
int numDisksStack[] = new int[100];
char srcStack[] = new char[100];
char destStack[] = new char[100];
char tempStack[] = new char[100];
int stacktop = 0;
numDisksStack[0] = orig_numDisks;
srcStack[0] = orig_src;
destStack[0] = orig_dest;
tempStack[0] = orig_temp;
stacktop++;
while (stacktop>0) { // this is the same as having
// call frames on the system stack
stacktop--; // pop current off stack
int numDisks = numDisksStack[stacktop];
char src = srcStack[stacktop];
char dest = destStack[stacktop];
char temp = tempStack[stacktop];
if (numDisks == 1) {
System.out.println("Move top disk from pole "
+ src + "to pole " + dest);
} else {
/* do this after the other two */
/* Towers(numDisks-1,temp,dest,src); */
numDisksStack[stacktop] = numDisks -1;
srcStack[stacktop] = temp;
destStack[stacktop] = dest;
tempStack[stacktop++] = src;
/* do this after the first */
/* Towers(1,src,dest,temp); */
numDisksStack[stacktop] =1;
srcStack[stacktop] = src;
destStack[stacktop] = dest;
tempStack[stacktop++] = temp;
/* do this first */
/* Towers(numDisks-1,src,temp,dest); */
numDisksStack[stacktop] = numDisks -1;
srcStack[stacktop] = src;
destStack[stacktop] = temp;
tempStack[stacktop++] = dest;}
}
}
Nonrecursive Towers of Hanoi
To compare the two algorithms, we'll use the program shown in Listing 8-12. Note that the println call in each implementation is actually much slower than the calculation and needs to be removed so we can get a realistic measurement of the calculation speeds.
public static void main(String args[]) {
Stopwatch watch = new Stopwatch();
watch.reset().start();
Towers(25,'A','B','C');
watch.stop();
System.out.println("Time to solve recursively = "+
watch.getElapsedTime());
watch.reset().start();
LinearTowers(25,'A','B','C');
watch.stop();
System.out.println("Time to solve lineraly = "+
watch.getElapsedTime());
}
Timing harness for the Towers of Hanoi
As it turns out, the recursive version runs about 15 percent faster under our test configuration. Why isn't the nonrecursive version faster, as it probably would be in a C++ program?
Many variables might be affecting the performance of these algorithms, but one possible reason that the linear version is slower is that the JVM performs bounds checking on array access. This means that every time the linear version of the algorithm manipulates its artificial stacks, the JVM has to check the array index to make sure it is valid. In this case, these bounds checks seem to cost more than invoking a static method.
This example illustrates the importance of measuring performance before you begin optimizing your code. Unless you run a benchmark to compare the two algorithms, you might be inclined to linearize this method on the assumption that it would improve performance. In this case, that would just lead to slower, less maintainable code.
A real-life example of this was found in the JComboBox component in early versions of the Swing toolkit. Bugs were filed against JComboBox, noting that it took an extremely long time to add items to a JComboBox object's drop-down list. Further testing showed that performance was fine for small numbers of items, but as the number of items grew, the time it took to add them grew exponentially. It turned out that adding items to a JComboBox exhibited O(n2) behavior. In other words, if it took 100 milliseconds to add 10 items, it might take 10 seconds to add 100 items. For large numbers of items, this behavior obviously wasn't acceptable.
As it turned out, this N-squared behavior wasn't caused by the implementation of a single method. It was the result of interactions between several classes involved in the operation of JComboBox. Once the problem was identified, it was possible to rewrite selected classes to make the behavior more closely resemble a O(n) algorithm. These changes significantly reduced the start-up times of many applications. Understanding this problem also made it possible to document that there were different ways to use JComboBox. One technique allows items to be added in a constant-time fashion, which further improved the performance of many applications. (For more information about JComboBox performance issues, see Chapter 10, Swing Models and Renderers.)
java.util.Vector and java.util.Hashtable classes. Although they have
been used successfully by thousands of developers, these classes have a number of
performance-related problems. The Java 2 platform introduced an expanded set of
data structures known as the Collections Framework. The classes in this framework
address a number of the shortcomings of the Vector and Hashtable classes
and add some high-performance algorithms for tasks such as sorting.
The Collections Framework is a unified architecture for representing and manipulating collections that reduces programming effort while increasing performance. The framework is designed to
By providing high-performance, high-quality implementations of useful data structures and algorithms, the Collections Framework helps you improve the quality and performance of your software. The implementations of each interface are interchangeable, which makes it possible to tune a program by simply switching collection implementations. These ready-to-use data structures free you from the drudgery of writing custom data structures, leaving you with more time to devote to improving the quality and performance of the rest of your program.
Collections Framework class hierarchy
The interfaces in the Collections Framework define the types of data structures the framework provides. Each type of data structure is best suited for a particular type of task. Understanding the purpose of each of these data structures will enable you to select the best data structure for a task. For some of the interfaces, the framework provides multiple, concrete implementations. To achieve optimal speed and memory utilization, you need to use the implementation best suited to the task. The purpose of each interface and the trade-offs involved in using the various implementations are discussed in the following sections.
Collection interface is the foundation of the Collections Framework; it
defines the basic functionality implemented by all collections. A Collection represents
a group of objects, which are known as the collection's elements. Listing
8-13 shows the methods defined by the Collection interface.
public interface Collection {
int size();
boolean isEmpty();
boolean contains(Object o);
Iterator iterator();
Object[] toArray();
Object[] toArray(Object a[]);
boolean add(Object o);
boolean remove(Object o);
boolean containsAll(Collection c);
boolean addAll(Collection c);
boolean removeAll(Collection c);
boolean retainAll(Collection c);
void clear();
boolean equals(Object o);
int hashCode();
}
Collection interface
The Collections Framework doesn't provide any direct implementations of the Collection interface. Instead, it provides implementations of more specific subinterfaces like Set and List. Some of these implementations allow duplicate elements and some don't. Similarly, some implementations are ordered and others are not. Features such as ordering and duplicate elimination have a performance cost, so you should select the collection type with the fewest features that still meets your needs.
List is an ordered collection, sometimes referred to as a sequence. List objects
can contain duplicate elements. If you've used java.util.Vector, you're already
familiar with the general characteristics of List. When using a List, you generally
have precise control over where each element is inserted in the List. You can
access elements in a list by their integer indexes.
Most of the time when you need a List, the ArrayList implementation is the best choice. ArrayList offers constant-time positional access and is quite fast. It doesn't have to allocate a node object for each element in the List and it uses the native method System.arraycopy to move multiple elements simultaneously. An ArrayList is essentially a Vector without the synchronization overhead.
ArrayList has one tuning parameter, its initial capacity. This is the number of elements that the ArrayList can hold before it has to grow. If you have a good idea how big the collection is, you can avoid unnecessary allocations when items are added to the collection by setting this parameter.
You might want to consider using LinkedList if you need to frequently add elements to the beginning of the List or iterate over the List and delete elements from its interior. These are constant-time operations in a LinkedList and linear-time operations in an ArrayList. However, keep in mind that this performance gain might be more than offset by other factors. Positional access can be much slower in a LinkedList-it's a linear-time operation in a LinkedList and a constant-time operation in an ArrayList. The fixed overhead for LinkedList operations is also much greater than for ArrayList operations.
The best way to decide which type of List to use is to create some benchmarks that reflect how the List is used in your program. Since ArrayList and LinkedList share the same interface, it's easy to swap implementations.
Set is a collection that cannot contain duplicate elements. This interface models
the mathematical abstraction set. It is used to represent sets like the cards in a
poker hand, the courses that make up a student's schedule, or the processes running
on a machine.
The two general-purpose Set implementations are HashSet and TreeSet. Deciding which to use is very straightforward. HashSet is much faster4 but offers no ordering guarantees. If you need to use the operations in SortedSet or if in-order iteration is important to you, use TreeSet. Otherwise, use HashSet. You will probably use HashSet most of the time.
If iteration performance matters, it's important to choose an appropriate initial capacity for a HashSet. Choosing a capacity that's too high can waste space as well as time. (Iteration is linear in the sum of the number of entries and the capacity.) The default initial capacity is 101, which is often more than you need. In general, you should set the initial capacity to about twice the size that you expect the Set to grow to. If your estimate is way off, the Set might have to grow or you might waste a bit of space, but it won't have a big impact. If you know a prime number that's about the right size, use it. If not, use an odd number.
The initial capacity can be specified using the int constructor. For example, to construct a HashSet with an initial capacity of 17:
Set s= new HashSet(17);
HashSets have one other tuning parameter, the load factor. This is a measure of
how full the HashSet is allowed to get before its capacity is automatically
increased. When the number of entries exceeds the product of the load factor and
the current capacity, the capacity is doubled. Generally, the default load factor
(.75) offers a good trade-off between time and space costs. Higher values decrease
the space overhead, but increase the time it takes to look up an entry.
Map is an object that maps keys to values. Maps cannot contain duplicate keys:
each key can map to at most one value. If you've used java.util.Hashtable,
you're already familiar with the general characteristics of Map. Using Map is analogous
to using Set: If you need SortedMap operations or in-order collection-view
iteration, use TreeMap; otherwise, use HashMap.
Hashtable and Vector, are fully synchronized.
Programs pay the costs associated with thread synchronization even when they're
used in a single-threaded environment. By default, the classes in the Collections
Framework are not synchronized. This provides the fastest possible access to these
structures when they're used in a single-threaded environment. The Collections
Framework supports synchronized collections through a set of wrapper classes.
These wrapper classes add functionality to the standard Collection implementations, but delegate all of their real work to the standard implementations. Design patterns fans will recognize this as an example of the Decorator pattern.5 The wrapper implementations are anonymous: Rather than exposing additional public classes, the API provides a static factory method6 for each implementation. These factory methods, shown in Listing 8-14, are defined in the Collections class (not to be confused with the Collection interface).
public static Collection synchronizedCollection(Collection c); public static Set synchronizedSet(Set s); public static List synchronizedList(List list); public static Map synchronizedMap(Map m); public static SortedSet synchronizedSortedSet(SortedSet s); public static SortedMap synchronizedSortedMap(SortedMap m);
Factory methods for synchronized collections
Each factory method returns a synchronized (thread-safe) Collection that is backed by an instance of the specified collection type. For example, to create a thread-safe ArrayList object, you call Collections.synchronizedList:
AList list = Collections.synchronizedList(new ArrayList());
Collection created in this fashion is every bit as thread-safe as a "normally" synchronized collection like a Vector. To guarantee serial access, however, it is critical that all access to the backing collection be done through the Collection returned by the factory method. The easiest solution is to avoid keeping a reference to the backing Collection.
When you iterate over a synchronized collection, you have to manually synchronize the iteration operation. This is because iteration is performed through multiple calls into the collection and these calls have to be composed into a single atomic operation. Listing 8-15 shows the proper technique for iterating over a wrapper-synchronized collection. Failing to use this idiom can result in non-deterministic behavior.
Collection c = Collections.synchronizedCollection(myCollection);
synchronized(c) {
Iterator i = c.iterator(); // Must be in synchronized block!
while (i.hasNext())
foo(i.next());
}
Iterating over a thread-safe collection
With these synchronization wrappers, you can get complete thread-safety when you need it without having to incur the overhead if you're working in a single-threaded environment.
Collections. Two are particularly interesting from a performance perspective:
sort and binarySearch. Sorting and searching are common tasks that can take up
a great deal of time. The Collections Framework provides well-tested implementations
of these tasks that have well-documented performance characteristics. For
example, the sort method uses a merge sort7 that provides good performance
across a wide variety of situations.
These algorithm implementations are accessed through static utility methods defined in the Collections class. The method signatures are shown in Listing 8-16.
public static int binarySearch(List list, Object key);
public static int binarySearch(List list, Object key,
Comparator c);
public static void sort(List list) ;
public static void sort(List list, Comparator c);
Sorting and searching utilities
Collection are typed for Object. Many times, however,
you need to work with primitive types such as int, double, and boolean.
One approach is to use the object wrappers for primitive types such as Integer,
Double, and Boolean and then insert them into the Collection. This works, but
in performance-critical situations it's not a good solution. The overhead of allocating
a wrapper for each primitive and then extracting the primitive value from the
wrapper each time it's used is quite high. In performance-critical situations, a
better solution is to work with plain array structures when you're dealing with
collections of primitive types.
While working with plain arrays can be less convenient and more error-prone than using the Collection interface, the Collections Framework does include some utilities to make it easier. The java.util.Arrays class provides similar functionality to the java.util.Collections class. This class gives you access to the same well-tested, high-performance sorting and searching functions that you can use with other collections.
UnsupportedOperationException.
By making the collection returned by a method immutable, you can avoid the overhead of copying the collection while maintaining full encapsulation of your data.
Like the synchronization wrappers, there is one static factory method for each of the six core Collection interfaces, as shown in Listing 8-17. These methods reside in the Collections class.
public static Collection unmodifiableCollection(Collection c); public static Set unmodifiableSet(Set s); public static List unmodifiableList(List list); public static Map unmodifiableMap(Map m); public static SortedSet unmodifiableSortedSet(SortedSet s); public static SortedMap unmodifiableSortedMap(SortedMap m);
Immutable collection wrappers
Collection interface. It demonstrates
just how large an impact the choices you make can have on performance.
Listing 8-18 contains the benchmark we'll use to test the speed of different operations on a Collection.
public static void test(Collection c) {
System.out.println("Testing "+c.getClass());
Stopwatch timer = new Stopwatch().start();
add(c);
timer.stop();
System.out.println("add: "+
timer.getElapsedTime());
timer.reset().start();
iterate(c);
timer.stop();
System.out.println("iter: "+
timer.getElapsedTime());
if (c instanceof List) {
List l = (List)c;
timer.reset().start();
randomAccess(l);
timer.stop();
System.out.println("random: "+
timer.getElapsedTime());
}
timer.reset().start();
remove(c);
timer.stop();
System.out.println("remove: "+
timer.getElapsedTime());
}
Collections benchmark
The collection operations performed by this benchmark are shown in Listing 8-19. These methods test the speed of adding items to the collection, iterating over the collection, and removing items from the collection. For collections that also implement the List interface, a method is provided to test the speed of randomly accessing items. (This function is only tested on List objects because other collections don't provide a get(int) method.)
public static void add(Collection c) {
for (int i = 0; i < NUM_ITEMS; i++) {
c.add(objects[i]);
}
}
public static void iterate(Collection c) {
for (int i = 0; i < 100; i++) {
Iterator iter = c.iterator();
while (iter.hasNext()) {
Object o = iter.next();
}
}
}
public static void remove(Collection c) {
Iterator iter = c.iterator();
while (iter.hasNext()) {
Object o = iter.next();
iter.remove();
}
}
public static void randomAccess(List c) {
for (int i = 0; i < NUM_ITEMS; i++) {
Object o = c.get(random());
}
}
Collection operations
Collection interface. In
this section, we examine the results for five classes:
The randomAccess method in Listing 8-19 contains a call to a method
called random. The initial version of this benchmark used the
java.util.Random class to generate the random index for the item to retrieve.
However, it turned out that generating the random numbers sometimes
took longer than getting the items. Since we wanted to focus on the
performance characteristics of the List, not the random number generator,
this version of the benchmark uses a different approach. A large array
of random numbers is created at start-up time and the random method
simply pulls them out of the array during the timed part of the test. This
helped focus the results of the benchmark on the List behavior.
|
Listing 8-20 shows how instances of these classes are created and passed to the benchmark for testing.
public static void main(String[] args) {
Collection arrayList = new ArrayList();
test(arrayList);
Collection linkedList = new LinkedList();
test(linkedList);
Collection vector = new Vector();
test(vector);
Collection treeSet = new TreeSet();
test(treeSet);
Collection hashSet = new HashSet();
test(hashSet);
}
Testing the collections
Table 8-1 shows how long it took to execute each test on our test configuration. Note that columns that contain 0 indicate that the time it took to execute the test was below the resolution of the System.getCurrentMillis timer. (In other words, these operations were really fast.)
These results illustrate how important the selection of algorithms and data structures can be. Look at the results from the different List implementations. ArrayList and Vector, two similar classes that are both array-based structures, show similar performance characteristics. Since Vector provides synchronization by default, it's slightly slower. If you compare the ArrayList and LinkedList results, however, you'll see that they have very different performance characteristics. While these two structures are capable of performing the same operations, operations that are practically free on the ArrayList are very costly with the LinkedList and vice-versa. The results also show how the set-based structures have very different performance characteristics from the list-based structures.
As you can see from this simple benchmark, you need to understand how your data structures will be used before settling on any particular implementation. Abstractions such as the Collection interface make it fairly easy to try several alternatives, and benchmarking and profiling will help you identify which is best.
Abrash, Michael. Graphics Programming Black Book, Special Edition, The Coriolis Group, Scottsdale, AZ, 1997.
Binstock, Andrew and John Rex. Practical Algorithms for Programmers, Addison-Wesley, Reading, MA, 1995.
Sedgewick, Robert. Algorithms (Second Edition), Addison-Wesley, Reading, MA, 1988.
For more information about big-oh notation, see Andrew Binstock and John Rex, Practical Algorithms for Programmers, p. 2. Addison-Wesley, 1995.
2For more information about Bresenham's algorithm, see Michael Abrash, Graphics Programming Black Book, Special Edition, pp. 657-678. The Coriolis Group, 1997.
3For more information about the Towers of Hanoi problem, see http://www.pangea.ca/kolar/javascript/Hanoi/HTonWEBE.html.
4O(1) instead of log(n) time for most operations.
5For more information about the Decorator pattern, see Erich Gamma, et al., Design Patterns: Elements of Reusable Object-Oriented Software, pp. 127-134. Addison-Wesley, 1995.
6For more information about Factory methods, see Erich Gamma, et al., pp. 107-116.
7For more information about merge sort, see Robert Sedgewick, Algorithms (Second Edition), pp. 163-175. Addison-Wesley, 1988.
Copyright © 2001, Sun Microsystems,Inc.. All rights reserved.