[Contents] [Prev] [Next] [Index]

CHAPTER 8 - Algorithms and
Data Structures

Understanding how to choose the best algorithm or data structure for a particular task is one of the keys to writing high-performance software. If you start with the wrong algorithm, all the micro-level tuning in the world won't produce optimal results.

This chapter discusses how to select and evaluate algorithms and data structures and addresses how these choices can affect the performance of your Java programs. Section 8.4 provides an introduction to the Java 2 Collections Framework, which contains a variety of high-performance data structures and algorithms that you can use.

A full introduction to algorithms and data structures would be an entire book by itself-in fact, the study of algorithms and data structures has been the subject of many books and papers. If you'd like to read more on this subject, see Section 8.6 for some good references.

8.1 Selecting Algorithms

When you choose an algorithm, there are always trade-offs involved. An algorithm might be the fastest solution for one type of data, but perform significantly slower for others. For example, there is no single sorting algorithm that is optimal in all situations. You have to consider how the algorithm is going to be used and select the solution that provides the best results most often.

For example, let's look at two different ways to calculate the sum of all integers that fall within a specified range. One solution is to simply build a loop around an addition statement, as shown in Listing 8-1.

public class SimpleSummer {
    public long sum(int start, int stop){
        long acc = 0;
        for(int i=start;i<=stop;i++){
            acc += i;
        return acc;
    }
}
Simple sum computer

As long as the number of integers to be summed is small, this simple algorithm works just fine. However, as the number of integers to be summed increases, the amount of time required to execute this algorithm grows linearly. This is what's known as an order-N function, which is abbreviated O(n) in big-oh notation.1

If your program needed this number-summing functionality and profiling showed that this function was a hot spot, you would probably want to speed it up. Your first inclination might be to tune the implementation of the algorithm. While in some cases tuning the existing algorithm might help, this implementation is pretty simple and there isn't much you can do. A more effective approach would be to consider other solutions.

For example, there is another way to sum a series of integers. The sum of any series of integers between zero and n, inclusive, can be calculated with the formula (n(n+1))/2. To calculate the sum of the series between n and m, inclusive, you can use the technique shown in Listing 8-2.

public class FormulaicSummer {
    public long sum(int start, int stop){
        int bigseries = stop*(stop+1)/2;
        start--; // so result is inclusive of start
        int littleseries = start*(start+1)/2;
        return bigseries-littleseries;
    }
}
Smarter sum computer

This method returns the result for any start and stop values in a fixed amount of time, regardless of how many numbers are summed. This is called an order-1 or constant-time function, commonly abbreviated O(1).

8.1.1 Comparing Algorithms

Once you have alternate solutions like our two sum computers, you can compare them and choose the one that performs best for your program. The easiest way to compare the algorithms is to write a benchmark.

Listing 8-3 shows the code for a micro-benchmark that compares the two summing algorithms. (In practice, it's best to test alternative algorithms under different situations on multiple pieces of data-micro-benchmarks don't always show real-world results. For our simple example, however, this micro-benchmark is sufficient.)

public static void main(String args[]) {
   System.out.println("Sum integers between 13 and 1000");
   System.out.println("SimpleSummer:");
   long acc;
   SimpleSummer ss = new SimpleSummer();
   long time = System.currentTimeMillis();
   for (int i=0;i<5000;i++)
      acc = ss.sum(13,1000);
   long time2 = System.currentTimeMillis();
   System.out.println("Took "+(time2-time)+" ms.");
   System.out.println("FormulaicSummer:");
   FormulaicSummer fs = new FormulaicSummer();
   time = System.currentTimeMillis();
   for (int i=0;i<5000;i++)
      acc = fs.sum(13,1000);
   time2 = System.currentTimeMillis();
   System.out.println("Took "+(time2-time)+" ms.");

}

Micro-benchmark

The results of this benchmark show that the constant-time algorithm is much faster than the linear algorithm for the specified inputs-the order-N algorithm was 65 times slower on our test configuration. As this example illustrates, switching to a smarter algorithm can lead to significant performance improvements that might be impossible to realize through tuning alone.

Order-N algorithms are not always a bad choice-in fact, they often provide the best solution. A more common problem is algorithms that have performance characteristics worse than order-N, where the time required to execute the algorithm increases nonlinearly as the amount of data increases. Algorithms that exhibit such super-linear performance characteristics, such as O(n2) algorithms, are a common cause of performance degradation and are a leading cause of poor scalability.

Comparison of algorithm performance

Figure 8-1 shows performance results for three theoretical algorithms that produce identical results. As you can see, it is possible for an order-N or N-squared algorithm to perform better than a constant-time algorithm on low numbers of data points. If you know you're only going be working with small amounts of data, the constant-time algorithm might be a poor choice. Similarly, if you'reworking with large or unpredictable amounts of data, using the N-squared algorithm could lead to significant performance problems.

8.1.2 Achieving Elegance

Computer scientists and software engineers sometimes discuss the elegance of a piece of code. Elegant code, like good art, is a bit hard to define. However, an elegant solution is generally one that approaches the problem in a way that is minimally complex and maximally suited to the particular problem being solved. As a result, elegant code is almost always fast code. One rule of thumb is that an elegant solution is one that you wouldn't mind calculating by hand.

In contrast, solutions such as the iterative loop in Listing 8-1 are referred to as brute-force solutions. Such solutions only work because computers can do repetitive tasks very quickly. When you find yourself tuning a brute-force solution to remove a bottleneck, consider whether there is a more elegant way to approach the problem. It's very possible that no amount of micro-tuning will significantly improve performance, while changing the algorithm could eliminate the bottleneck entirely.

8.1.3 Considering the Problem Space

When deciding what algorithm to use, it is important to take the problem space into account. The problem space is the set of problems with which an algorithm will be presented. You can often achieve a more efficient solution by constraining the algorithm to handle only the defined problem space instead of using a more general algorithm. In short, what your program doesn't do is often as important as what it does.

To illustrate this point, let's look at a common problem in graphics programming: drawing lines. The grid drawing framework shown in Listing 8-4 fills a canvas with a grid of lines. Line drawing is implemented as a general framework based on a LineDrawer interface so we can easily compare the performance of different line drawing algorithms.

// Grid container
public class GridTest extends JFrame {
    GridCanvas myCanvas;
    TextField status;
    
	public GridTest(LineDrawer ld) {
		super(ld.getClass().toString());
		getContentPane().setLayout(new BorderLayout());
		setSize(500,500);
		status = new TextField("TESTING: Do not disturb!");
		myCanvas = new GridCanvas(ld,status);
		getContentPane().add(myCanvas,BorderLayout.CENTER);
		getContentPane().add(status,BorderLayout.SOUTH);
		setDefaultCloseOperation(DISPOSE_ON_CLOSE);							
		show();
		myCanvas.test();			
	}
	
	public void closetest() {
		dispose();
	}	
	
}

// Grid surface with timing code
public class GridCanvas extends JComponent {
   LineDrawer ld;
   TextField status;
   static long acc=0;
   BufferedImage bi;

   public GridCanvas(LineDrawer ld, TextField status){
      this.ld = ld;
      this.status = status;   
   }

   public void test(){
      Rectangle r = getBounds();
      bi = new BufferedImage(r.width,
                             r.height,
                             BufferedImage.TYPE_INT_RGB);         
      Stopwatch watch = new Stopwatch();
      for (int j=0;j<10;j++) { // do 10 tiems for stability												
         watch.reset().start();
         for (int i =0; i< (r.height-1)/2; i++) {
            ld.drawLine(bi,0,i*2,r.width-1,i*2);
         }
         for (int  i=0; i<(r.width-1)/2;i++) {
            ld.drawLine(bi,i*2,0,i*2,r.height-1);        
         } 
         watch.stop();
         acc += watch.getElapsedTime();
      }        
      status.setText("Time (ms) to do grid = "+(acc/10));
      repaint();
   }

   public void paint(Graphics g) {
      g.drawImage(bi,0,0,this);
   }    
} 	
// The LineDrawer interface
public interface LineDrawer {
	public void drawLine(BufferedImage i,int x0, int 
                                      y0, int x1, int y1);
}
Grid test framework

An object that implements the LineDrawer interface is passed to the framework to draw the lines into the BufferedImage. Listing 8-5 shows a concrete implementation of LineDrawer that's based on Bresenham's algorithm, a well-known, efficient algorithm for general line drawing.

public class BasicBresenham implements LineDrawer {
   public void drawLine(BufferedImage bi, 
                        int x0, int y0, 
                        int x1, int y1) {
      int dx,dy;
      int temp;
      // reduce to half the octant cases by always 
      // drawing + in y
      if (y0>y1) {
         temp = y0;
         y0 = y1;
         y1 = temp;
         temp = x0;
         x0 = x1;
         x1 = temp;
      }
      dx = x1-x0;
      dy = y1-y0;
      if (dx>0) {
         if (dx>dy) {
            octant0(bi,x0,y0,dx,dy,1);
         } else {
            octant1(bi,x0,y0,dx,dy,1);
         }
      } else {
         dx = -dx;
         if (dx>dy) {
            octant0(bi,x0,y0,dx,dy,-1);
         } else {
            octant1(bi,x0,y0,dx,dy,-1);
         }
      }
   }
   private void octant0(BufferedImage bi, 
                        int x0, int y0, 
                        int dx, int dy,
                        int xdirection){
      int DeltaYx2;
      int DeltaYx2MinusDeltaXx2;
      int ErrorTerm;
      int pix = 0xffffffff;
      // set up initial error term and drawing values
      DeltaYx2 = dy*2;
      DeltaYx2MinusDeltaXx2 = DeltaYx2 - (dx*2);
      ErrorTerm = DeltaYx2 - dx;
      // draw loop
      bi.setRGB(x0,y0,pix); // draws a single point
      while ( dx-- > 0) {
      // check if we need to advance y
         if (ErrorTerm >=0) {
            // advance Y and reset ErrorTerm
            y0++;
            ErrorTerm += DeltaYx2MinusDeltaXx2;
         } else {
            // add error to ErrorTerm
            ErrorTerm += DeltaYx2;
         }
         x0 += xdirection;
         bi.setRGB(x0,y0,pix);
      }
   }
   private void octant1(BufferedImage bi, 
                        int x0, int y0, 
                        int dx, int dy,
                        int xdirection){
      int DeltaXx2;
      int DeltaXx2MinusDeltaYx2;
      int ErrorTerm;
      int pix = 0xffffffff;
      // set up initial error term and drawing values
      DeltaXx2 = dx * 2;
      DeltaXx2MinusDeltaYx2 = DeltaXx2 - (dy*2);
      ErrorTerm = DeltaXx2- dy;
      // draw loop
      bi.setRGB(x0,y0,pix);
      while ( dy-- > 0) {
      // check if we need to advance x
         if (ErrorTerm >= 0) {
            // advance X and reset ErrorTerm
            x0 += xdirection;
            ErrorTerm += DeltaXx2MinusDeltaYx2;
         } else {
            // add to ErrorTerm
            ErrorTerm += DeltaXx2;
         }
         y0++;
         bi.setRGB(x0,y0,pix);
      }
   }
}
General-purpose line drawing algorithm

Bresenham's2 is a very efficient, general-purpose algorithm for drawing lines. Switching to a different general-purpose line drawing algorithm probably wouldn't gain us anything. However, if you carefully consider this application's problem domain, it is possible to find a better solution. Bresenham's is designed to draw lines of any arbitrary slope, but this application draws a grid that consists of lines that are either completely horizontal or completely vertical. Listing 8-6 shows a line drawer that uses this information about the problem space to provide a simpler solution.

To compare the two algorithms, we'll use the simple benchmark wrapper in Listing 8-7, which can invoke GridTest with either line drawing solution.

public class FlatLiner implements LineDrawer {
 static int pix = 0xffffffff;
   static int[] pixa = new int[0];
   static int pixasz = 0;
  public void drawLine(BufferedImage bi, int x0, int y0, int x1, int y1){
      if (x0 == x1) {
         int h = (y1-y0);
         if (h < 0) h = 0-h;
         h++;
         if (h>pixasz) {
            pixasz = h;
            pixa = new int[h];
            for (int i=0;i<pixasz;i++) pixa[i] = 0xffffffff;
         }
         bi.setRGB(x0,y0,1,h,pixa,0,1);
      } else if (y0 == y1) {
         int w = x1-x0;
         if (w < 0) w = 0-w;
         w++;
         if (w>pixasz) {
            pixasz = w;
            pixa = new int[w];
            for (int i=0;i<pixasz;i++) pixa[i] = 0xffffffff;
         }
         bi.setRGB(x0,y0,w,1,pixa,0,pixasz);
      } else {
         throw (new RuntimeException("FlatLiner called on " 
                                     + "non-flat line"));
      }
   }
}
Special-purpose line drawing algorithm
public class GridBenchmark {
 
 public static void main(String args[]) {
      if (Array.getLength(args) < 1) {
         System.out.println("usage: GridBenchmark b|f");
         System.exit(0);
      }
      if ((args[0].startsWith("b")) || 
          (args[0].startsWith("B"))) {
         new GridTest(new BasicBreshenham());
      } else {
         new GridTest(new FlatLiner());
      }
   }
}
Benchmark wrapper for grid drawing

On our test configuration, the FlatLiner implementation ran approximately 25 percent faster than the BasicBresenham implementation. This result is typical-special-purpose algorithms usually run faster than general-purpose algorithms. When selecting an algorithm, consider your problem space and try constraining your solution to just the necessary operations.

Keep in mind that alternate solutions aren't necessarily mutually exclusive-in many cases it's possible to dynamically select the best algorithm. For example, the FlatLiner and BasicBresenham implementations can be combined by invoking Bresenham's algorithm inside FlatLiner. Instead of throwing an exception for lines with undefined slopes or slopes other than 0, simply insert the code from the BasicBresenham implementation. This results in a general-purpose solution that is just as fast as FlatLiner for simple grids and almost as fast as the general-purpose Bresenham's for all other lines. The cost of the two extra if-statements is negligible. This tactic of testing for easy-to-solve subcases is a mainstay of high-performance programming.

8.2 Using Recursive Algorithms

Recursion is a handy programming tool-many algorithms can be expressed naturally in recursive form. Problems that are complex or extremely difficult to solve using linear techniques often have simple recursive solutions. Although conventional wisdom often advocates avoiding recursion, it can offer the best solution to some problems. Not all programming languages support recursion, but the Java programming language does.

Recursive algorithms usually take the form shown in Listing 8-8, where the algorithm repeatedly calls itself, simplifying the problem at each iteration.

Solveit(problem) {
    if (problem is trivial) {
        return result;
    } else {
        simplify problem;
        return SolveIt(simplified problem);
    }
}
Pseudocode for recursion

For example, Listing 8-9 shows a simple, recursive algorithm for reversing the letters in a String.

public class TrivialApplication {
    public static void main(String args[]) {
        System.out.println( reverseString("Hello World!"));
    }
    static public String reverseString(String s){
        if (s.length() <= 1)
            return s;
        else {
            char c = s.charAt(0);
            eturn reverseString(s.substring(1))+c;
        }
    }
}
Recursive string reverser

If recursion is so useful, why would you want to avoid it? From a performance perspective, there are two potential problems with using recursion: memory consumption and the overhead of successive function calls.

Each successive call to a function creates a new stack frame. If the function is called too many times, the program could potentially run out of memory. For example, passing in a very long string to the reverseString method in Listing 8-9, might cause a StackOverflowError to be thrown.

The performance of a recursive algorithm can be affected by the overhead of the successive function calls. In languages like C++, the overhead is so great that recoding a naturally recursive algorithm using nonrecursive techniques is almost always a performance win.

In the Java language, the overhead of successive function calls can be less of an issue. For example, let's look at both recursive and nonrecursive implementations of the Towers of Hanoi problem.3 Listing 8-10 shows the recursive solution.

public static void Towers(int numDisks, 
                          char src, char dest, char temp){
   if (numDisks == 1) {
      System.out.println("Move top disk from pole "+
                         src+" to pole "+dest);
   } else {
      Towers(numDisks-1,src,temp,dest);
      Towers(1,src,dest,temp);
      Towers(numDisks-1,temp,dest,src);
   }
}
Recursive Towers of Hanoi

The nonrecursive solution is much more complex. To implement the solution as a linear routine, shown in Listing 8-11, you have to create and manage your own stack and ensure that items are pushed in the right order. Despite the increase in complexity, conventional wisdom is that this version should be faster.

public static void LinearTowers(int orig_numDisks, 
                                char orig_src, char orig_dest, 
                                char orig_temp){
   int numDisksStack[] = new int[100];
   char srcStack[] = new char[100];
   char destStack[] = new char[100];
   char tempStack[] = new char[100];
   int stacktop = 0;
   numDisksStack[0] = orig_numDisks;
   srcStack[0] = orig_src;
   destStack[0] = orig_dest;
   tempStack[0] = orig_temp;
   stacktop++;
   while (stacktop>0) { // this is the same as having
                        // call frames on the system stack
      stacktop--; // pop current off stack
      int numDisks = numDisksStack[stacktop];
      char src = srcStack[stacktop];
      char dest = destStack[stacktop];
      char temp = tempStack[stacktop];
      if (numDisks == 1) {
         System.out.println("Move top disk from pole "
                            + src + "to pole " + dest);
      } else {
         /* do this after the other two */
         /* Towers(numDisks-1,temp,dest,src); */
         numDisksStack[stacktop] = numDisks -1;
         srcStack[stacktop] = temp;
         destStack[stacktop] = dest;
         tempStack[stacktop++] = src;
         /* do this after the first */
         /* Towers(1,src,dest,temp); */
         numDisksStack[stacktop] =1;
         srcStack[stacktop] = src;
         destStack[stacktop] = dest;
         tempStack[stacktop++] = temp;
         /* do this first */
         /* Towers(numDisks-1,src,temp,dest); */
         numDisksStack[stacktop] = numDisks -1;
         srcStack[stacktop] = src;
         destStack[stacktop] = temp;
         tempStack[stacktop++] = dest;}
   }
}
Nonrecursive Towers of Hanoi

To compare the two algorithms, we'll use the program shown in Listing 8-12. Note that the println call in each implementation is actually much slower than the calculation and needs to be removed so we can get a realistic measurement of the calculation speeds.

public static void main(String args[]) {
   Stopwatch watch = new Stopwatch();
   watch.reset().start();
   Towers(25,'A','B','C');
   watch.stop();
   System.out.println("Time to solve recursively = "+
                         watch.getElapsedTime());
   watch.reset().start();
   LinearTowers(25,'A','B','C');
   watch.stop();
   System.out.println("Time to solve lineraly = "+
                         watch.getElapsedTime());
}
Timing harness for the Towers of Hanoi

As it turns out, the recursive version runs about 15 percent faster under our test configuration. Why isn't the nonrecursive version faster, as it probably would be in a C++ program?

Many variables might be affecting the performance of these algorithms, but one possible reason that the linear version is slower is that the JVM performs bounds checking on array access. This means that every time the linear version of the algorithm manipulates its artificial stacks, the JVM has to check the array index to make sure it is valid. In this case, these bounds checks seem to cost more than invoking a static method.

This example illustrates the importance of measuring performance before you begin optimizing your code. Unless you run a benchmark to compare the two algorithms, you might be inclined to linearize this method on the assumption that it would improve performance. In this case, that would just lead to slower, less maintainable code.

8.3 Beyond Simple Algorithms

When considering algorithms, programmers often focus on the merits of heavily studied algorithms, such as different sorting and hashing algorithms. While this is important, keep in mind that algorithms aren't just neat, well-defined solutions for standard tasks. Algorithms pervade your code and have a major impact on the performance of your software.

A real-life example of this was found in the JComboBox component in early versions of the Swing toolkit. Bugs were filed against JComboBox, noting that it took an extremely long time to add items to a JComboBox object's drop-down list. Further testing showed that performance was fine for small numbers of items, but as the number of items grew, the time it took to add them grew exponentially. It turned out that adding items to a JComboBox exhibited O(n2) behavior. In other words, if it took 100 milliseconds to add 10 items, it might take 10 seconds to add 100 items. For large numbers of items, this behavior obviously wasn't acceptable.

As it turned out, this N-squared behavior wasn't caused by the implementation of a single method. It was the result of interactions between several classes involved in the operation of JComboBox. Once the problem was identified, it was possible to rewrite selected classes to make the behavior more closely resemble a O(n) algorithm. These changes significantly reduced the start-up times of many applications. Understanding this problem also made it possible to document that there were different ways to use JComboBox. One technique allows items to be added in a constant-time fashion, which further improved the performance of many applications. (For more information about JComboBox performance issues, see Chapter 10, Swing Models and Renderers.)

8.4 Selecting Data Structures

Selecting the appropriate data structure for your task is often just as important as selecting the algorithm. Just as with algorithms, there are trade-offs that need to be weighed when selecting a data structure-no single data structure is appropriate for all situations. This section discusses the data structures supplied with the Java 2 platform and some of the trade-offs to consider when using them.

8.4.1 Java 2 Collections

The Java platform has always provided a few basic data structures in the form of the java.util.Vector and java.util.Hashtable classes. Although they have been used successfully by thousands of developers, these classes have a number of performance-related problems. The Java 2 platform introduced an expanded set of data structures known as the Collections Framework. The classes in this framework address a number of the shortcomings of the Vector and Hashtable classes and add some high-performance algorithms for tasks such as sorting.

What Is a Collection?

A collection is an object that groups multiple elements into a single unit. Collections are used to store, retrieve, and manipulate data. They are also used to transmit data from one method to another. Collections typically represent data items that form a natural group such as

The Collections Framework is a unified architecture for representing and manipulating collections that reduces programming effort while increasing performance. The framework is designed to

By providing high-performance, high-quality implementations of useful data structures and algorithms, the Collections Framework helps you improve the quality and performance of your software. The implementations of each interface are interchangeable, which makes it possible to tune a program by simply switching collection implementations. These ready-to-use data structures free you from the drudgery of writing custom data structures, leaving you with more time to devote to improving the quality and performance of the rest of your program.

8.4.2 The Collection Interfaces

The Collections Framework is based on six collection interfaces. The framework provides implementations of each these interfaces and algorithms to manipulate them. Figure 8-2 shows the hierarchy of interfaces and classes that make up the core of the framework.
Collections Framework class hierarchy

The interfaces in the Collections Framework define the types of data structures the framework provides. Each type of data structure is best suited for a particular type of task. Understanding the purpose of each of these data structures will enable you to select the best data structure for a task. For some of the interfaces, the framework provides multiple, concrete implementations. To achieve optimal speed and memory utilization, you need to use the implementation best suited to the task. The purpose of each interface and the trade-offs involved in using the various implementations are discussed in the following sections.

8.4.3 Collection Interface

The Collection interface is the foundation of the Collections Framework; it defines the basic functionality implemented by all collections. A Collection represents a group of objects, which are known as the collection's elements. Listing 8-13 shows the methods defined by the Collection interface.

public interface Collection {
    int         size();
    boolean     isEmpty();
    boolean     contains(Object o);
    Iterator    iterator();
    Object[]    toArray();
    Object[]    toArray(Object a[]);
    boolean     add(Object o);
    boolean     remove(Object o);
    boolean     containsAll(Collection c);
    boolean     addAll(Collection c);
    boolean     removeAll(Collection c);
    boolean     retainAll(Collection c);
    void        clear();
    boolean     equals(Object o);
    int         hashCode();
}
Collection interface

The Collections Framework doesn't provide any direct implementations of the Collection interface. Instead, it provides implementations of more specific subinterfaces like Set and List. Some of these implementations allow duplicate elements and some don't. Similarly, some implementations are ordered and others are not. Features such as ordering and duplicate elimination have a performance cost, so you should select the collection type with the fewest features that still meets your needs.

8.4.4 List Objects

A List is an ordered collection, sometimes referred to as a sequence. List objects can contain duplicate elements. If you've used java.util.Vector, you're already familiar with the general characteristics of List. When using a List, you generally have precise control over where each element is inserted in the List. You can access elements in a list by their integer indexes.

Most of the time when you need a List, the ArrayList implementation is the best choice. ArrayList offers constant-time positional access and is quite fast. It doesn't have to allocate a node object for each element in the List and it uses the native method System.arraycopy to move multiple elements simultaneously. An ArrayList is essentially a Vector without the synchronization overhead.

ArrayList has one tuning parameter, its initial capacity. This is the number of elements that the ArrayList can hold before it has to grow. If you have a good idea how big the collection is, you can avoid unnecessary allocations when items are added to the collection by setting this parameter.

You might want to consider using LinkedList if you need to frequently add elements to the beginning of the List or iterate over the List and delete elements from its interior. These are constant-time operations in a LinkedList and linear-time operations in an ArrayList. However, keep in mind that this performance gain might be more than offset by other factors. Positional access can be much slower in a LinkedList-it's a linear-time operation in a LinkedList and a constant-time operation in an ArrayList. The fixed overhead for LinkedList operations is also much greater than for ArrayList operations.

The best way to decide which type of List to use is to create some benchmarks that reflect how the List is used in your program. Since ArrayList and LinkedList share the same interface, it's easy to swap implementations.

8.4.5 Set Objects

A Set is a collection that cannot contain duplicate elements. This interface models the mathematical abstraction set. It is used to represent sets like the cards in a poker hand, the courses that make up a student's schedule, or the processes running on a machine.

The two general-purpose Set implementations are HashSet and TreeSet. Deciding which to use is very straightforward. HashSet is much faster4 but offers no ordering guarantees. If you need to use the operations in SortedSet or if in-order iteration is important to you, use TreeSet. Otherwise, use HashSet. You will probably use HashSet most of the time.

If iteration performance matters, it's important to choose an appropriate initial capacity for a HashSet. Choosing a capacity that's too high can waste space as well as time. (Iteration is linear in the sum of the number of entries and the capacity.) The default initial capacity is 101, which is often more than you need. In general, you should set the initial capacity to about twice the size that you expect the Set to grow to. If your estimate is way off, the Set might have to grow or you might waste a bit of space, but it won't have a big impact. If you know a prime number that's about the right size, use it. If not, use an odd number.

The initial capacity can be specified using the int constructor. For example, to construct a HashSet with an initial capacity of 17:

Set s= new HashSet(17);
HashSets have one other tuning parameter, the load factor. This is a measure of how full the HashSet is allowed to get before its capacity is automatically increased. When the number of entries exceeds the product of the load factor and the current capacity, the capacity is doubled. Generally, the default load factor (.75) offers a good trade-off between time and space costs. Higher values decrease the space overhead, but increase the time it takes to look up an entry.

8.4.6 Map Objects

A Map is an object that maps keys to values. Maps cannot contain duplicate keys: each key can map to at most one value. If you've used java.util.Hashtable, you're already familiar with the general characteristics of Map. Using Map is analogous to using Set: If you need SortedMap operations or in-order collection-view iteration, use TreeMap; otherwise, use HashMap.

8.4.7 Synchronized Collections

The older data structure classes, Hashtable and Vector, are fully synchronized. Programs pay the costs associated with thread synchronization even when they're used in a single-threaded environment. By default, the classes in the Collections Framework are not synchronized. This provides the fastest possible access to these structures when they're used in a single-threaded environment. The Collections Framework supports synchronized collections through a set of wrapper classes.

These wrapper classes add functionality to the standard Collection implementations, but delegate all of their real work to the standard implementations. Design patterns fans will recognize this as an example of the Decorator pattern.5 The wrapper implementations are anonymous: Rather than exposing additional public classes, the API provides a static factory method6 for each implementation. These factory methods, shown in Listing 8-14, are defined in the Collections class (not to be confused with the Collection interface).

public static Collection synchronizedCollection(Collection c);
public static Set synchronizedSet(Set s);
public static List synchronizedList(List list);
public static Map synchronizedMap(Map m);
public static SortedSet synchronizedSortedSet(SortedSet s);
public static SortedMap synchronizedSortedMap(SortedMap m);
Factory methods for synchronized collections

Each factory method returns a synchronized (thread-safe) Collection that is backed by an instance of the specified collection type. For example, to create a thread-safe ArrayList object, you call Collections.synchronizedList:

List list = Collections.synchronizedList(new ArrayList());
A Collection created in this fashion is every bit as thread-safe as a "normally" synchronized collection like a Vector. To guarantee serial access, however, it is critical that all access to the backing collection be done through the Collection returned by the factory method. The easiest solution is to avoid keeping a reference to the backing Collection.

When you iterate over a synchronized collection, you have to manually synchronize the iteration operation. This is because iteration is performed through multiple calls into the collection and these calls have to be composed into a single atomic operation. Listing 8-15 shows the proper technique for iterating over a wrapper-synchronized collection. Failing to use this idiom can result in non-deterministic behavior.

Collection c = Collections.synchronizedCollection(myCollection);
synchronized(c) {
    Iterator i = c.iterator(); // Must be in synchronized block!
    while (i.hasNext()) 
        foo(i.next());
}
Iterating over a thread-safe collection

With these synchronization wrappers, you can get complete thread-safety when you need it without having to incur the overhead if you're working in a single-threaded environment.

8.4.8 Collections Framework Algorithms

The Collections Framework provides some common algorithms that operate on Collections. Two are particularly interesting from a performance perspective: sort and binarySearch. Sorting and searching are common tasks that can take up a great deal of time. The Collections Framework provides well-tested implementations of these tasks that have well-documented performance characteristics. For example, the sort method uses a merge sort7 that provides good performance across a wide variety of situations.

These algorithm implementations are accessed through static utility methods defined in the Collections class. The method signatures are shown in Listing 8-16.

public static int binarySearch(List list, Object key);
public static int binarySearch(List list, Object key, 
                               Comparator c);
public static void sort(List list) ;
public static void sort(List list, Comparator c);
Sorting and searching utilities

8.4.9 Plain Arrays

Most of the Collections Framework is geared toward handling collections of objects-all the methods in Collection are typed for Object. Many times, however, you need to work with primitive types such as int, double, and boolean. One approach is to use the object wrappers for primitive types such as Integer, Double, and Boolean and then insert them into the Collection. This works, but in performance-critical situations it's not a good solution. The overhead of allocating a wrapper for each primitive and then extracting the primitive value from the wrapper each time it's used is quite high. In performance-critical situations, a better solution is to work with plain array structures when you're dealing with collections of primitive types.

While working with plain arrays can be less convenient and more error-prone than using the Collection interface, the Collections Framework does include some utilities to make it easier. The java.util.Arrays class provides similar functionality to the java.util.Collections class. This class gives you access to the same well-tested, high-performance sorting and searching functions that you can use with other collections.

8.4.10 Immutable Collections

Chapter 7 addresses the importance of making informed decisions about object mutability. The ability to return immutable objects can be important when you're working with collections, which can be expensive to copy. The Collections Framework provides wrapper implementations you can use to make a collection immutable. These wrappers are similar to the synchronization wrappers. They simply intercept any operations that would modify the collection and throw an UnsupportedOperationException.

By making the collection returned by a method immutable, you can avoid the overhead of copying the collection while maintaining full encapsulation of your data.

Like the synchronization wrappers, there is one static factory method for each of the six core Collection interfaces, as shown in Listing 8-17. These methods reside in the Collections class.

public static Collection unmodifiableCollection(Collection c);
public static Set unmodifiableSet(Set s);
public static List unmodifiableList(List list);
public static Map unmodifiableMap(Map m);
public static SortedSet unmodifiableSortedSet(SortedSet s);
public static SortedMap unmodifiableSortedMap(SortedMap m);
Immutable collection wrappers

8.5 Collections Example

Throughout this chapter, we've emphasized how the algorithms and data structures you use can have a large impact on the performance of your software. To help illustrate this point, this section examines a simple benchmark that compares the speed of various implementations of the Collection interface. It demonstrates just how large an impact the choices you make can have on performance.

Listing 8-18 contains the benchmark we'll use to test the speed of different operations on a Collection.

public static void test(Collection c) {
   System.out.println("Testing "+c.getClass());
   Stopwatch timer = new Stopwatch().start();
   add(c);
   timer.stop();
   System.out.println("add: "+ 
                      timer.getElapsedTime());
   timer.reset().start();
   iterate(c);
   timer.stop();     
   System.out.println("iter: "+ 
                      timer.getElapsedTime());
   if (c instanceof List) {
      List l = (List)c;
      timer.reset().start();
      randomAccess(l);
      timer.stop();
      System.out.println("random: "+ 
                         timer.getElapsedTime());        
   }
   timer.reset().start();      
   remove(c);
   timer.stop();
   System.out.println("remove: "+ 
                      timer.getElapsedTime());
}
Collections benchmark

The collection operations performed by this benchmark are shown in Listing 8-19. These methods test the speed of adding items to the collection, iterating over the collection, and removing items from the collection. For collections that also implement the List interface, a method is provided to test the speed of randomly accessing items. (This function is only tested on List objects because other collections don't provide a get(int) method.)

public static void add(Collection c) {
   for (int i = 0; i < NUM_ITEMS; i++) {
      c.add(objects[i]);
   }
}
public static void iterate(Collection c) {
   for (int i = 0; i < 100; i++) {
      Iterator iter = c.iterator();
      while (iter.hasNext()) {
         Object o = iter.next();
      }
   }
}
public static void remove(Collection c) {
   Iterator iter = c.iterator();
   while (iter.hasNext()) {
      Object o = iter.next();
      iter.remove();
   }
}
public static void randomAccess(List c) {
   for (int i = 0; i < NUM_ITEMS; i++) {
      Object o = c.get(random());
   }
}
Collection operations

8.5.1 Collection Benchmark Results

The collection benchmark defined in Section 8.5 can be used to perform basic performance testing on different implementations of the Collection interface. In this section, we examine the results for five classes:


Random Numbers in Benchmarks

The randomAccess method in Listing 8-19 contains a call to a method called random. The initial version of this benchmark used the java.util.Random class to generate the random index for the item to retrieve. However, it turned out that generating the random numbers sometimes took longer than getting the items. Since we wanted to focus on the performance characteristics of the List, not the random number generator, this version of the benchmark uses a different approach. A large array of random numbers is created at start-up time and the random method simply pulls them out of the array during the timed part of the test. This helped focus the results of the benchmark on the List behavior.



Listing 8-20 shows how instances of these classes are created and passed to the benchmark for testing.

public static void main(String[] args) {
   Collection arrayList = new ArrayList();
   test(arrayList);
   Collection linkedList = new LinkedList();
   test(linkedList);
   Collection vector = new Vector();
   test(vector);
   Collection treeSet = new TreeSet();
   test(treeSet);
   Collection hashSet = new HashSet();
   test(hashSet);     
}
Testing the collections

Table 8-1 shows how long it took to execute each test on our test configuration. Note that columns that contain 0 indicate that the time it took to execute the test was below the resolution of the System.getCurrentMillis timer. (In other words, these operations were really fast.)

Collection Benchmark Results

Class


Add


Iterate


Random


Remove


ArrayList


0 ms


660 ms


0 ms


2,360 ms


LinkedList


50 ms


1,100 ms


26,800 ms


0 ms


Vector


0 ms


880 ms


0 ms


2,580 ms


TreeSet


330 ms


1,430 ms


N/A


60 ms


HashSet


110 ms


1,430 ms


N/A


50 ms

These results illustrate how important the selection of algorithms and data structures can be. Look at the results from the different List implementations. ArrayList and Vector, two similar classes that are both array-based structures, show similar performance characteristics. Since Vector provides synchronization by default, it's slightly slower. If you compare the ArrayList and LinkedList results, however, you'll see that they have very different performance characteristics. While these two structures are capable of performing the same operations, operations that are practically free on the ArrayList are very costly with the LinkedList and vice-versa. The results also show how the set-based structures have very different performance characteristics from the list-based structures.

As you can see from this simple benchmark, you need to understand how your data structures will be used before settling on any particular implementation. Abstractions such as the Collection interface make it fairly easy to try several alternatives, and benchmarking and profiling will help you identify which is best.

8.6 References on Algorithms and Data Structures

Abrash, Michael. Graphics Programming Black Book, Special Edition, The Coriolis Group, Scottsdale, AZ, 1997.

Binstock, Andrew and John Rex. Practical Algorithms for Programmers, Addison-Wesley, Reading, MA, 1995.

Sedgewick, Robert. Algorithms (Second Edition), Addison-Wesley, Reading, MA, 1988.

Key Points

  • The performance of any algorithm or data structure depends on the context in which it is used.
  • You must consider your problem domain when designing or selecting algorithms.
  • Recursion can be a useful tool and might be less costly than you expect.
  • The Collections Framework provides pretested, high-performance algorithms and data structures that can be used in many common situations.
  • The Collections Framework gives you a high degree of control over synchronization and mutability.



[Contents] [Prev] [Next] [Index]

1

For more information about big-oh notation, see Andrew Binstock and John Rex, Practical Algorithms for Programmers, p. 2. Addison-Wesley, 1995.

2

For more information about Bresenham's algorithm, see Michael Abrash, Graphics Programming Black Book, Special Edition, pp. 657-678. The Coriolis Group, 1997.

3

For more information about the Towers of Hanoi problem, see http://www.pangea.ca/kolar/javascript/Hanoi/HTonWEBE.html.

4

O(1) instead of log(n) time for most operations.

5

For more information about the Decorator pattern, see Erich Gamma, et al., Design Patterns: Elements of Reusable Object-Oriented Software, pp. 127-134. Addison-Wesley, 1995.

6

For more information about Factory methods, see Erich Gamma, et al., pp. 107-116.

7

For more information about merge sort, see Robert Sedgewick, Algorithms (Second Edition), pp. 163-175. Addison-Wesley, 1988.

Copyright © 2001, Sun Microsystems,Inc.. All rights reserved.