Sun Java Solaris Communities My SDN Account Join SDN
 
Technical Articles and Tips

String Concatenation/Performance and Improving Java I/O Performance

 

Tech Tips Archive


WELCOME to the Java Developer Connection (JDC) Tech Tips, March 5, 2002. This issue covers:

Note that performance results vary depending on environment. The results shown in these tips were obtained by the author -- you might see different results on your system.

These tips were developed using Java 2 SDK, Standard Edition, v 1.3.

This issue of the JDC Tech Tips is written by Glen McCluskey.

Pixel

STRING CONCATENATION AND PERFORMANCE

The February 5, 2002 Tech Tip, "Writing toString Methods," included the following statement:

Note that using "+" within toString to build up the return value is not necessarily the most efficient approach. You might want to use StringBuffer instead.

A reader of the Tech Tip noted that the Java documentation says that StringBuffer is used to actually implement the + operator, and so the question is, what if anything do you gain in performance by explicitly using StringBuffer in your programs? This tip attempts to answer that question.

To start, let's look at an example -- one where a string is built up by repeating the same character over and over:

    class MyTimer {
        private final long start;
    
        public MyTimer() {
            start = System.currentTimeMillis();
        }
    
        public long getElapsed() {
            return System.currentTimeMillis() - start;
        }
    }
    
    public class AppDemo1 {
        static final int N = 47500;
    
        public static void main(String args[]) {
    
            // build up string using +
    
            MyTimer mt = new MyTimer();
            String str1 = "";
            for (int i = 1; i <= n; i++) {
                str1 = str1 + "*";
            }
            system.out.println("elapsed time #1 = " +
                mt.getelapsed());
   
            // build up string using stringbuffer
    
            mt = new mytimer();
            stringbuffer sb = new stringbuffer();
            for (int i = 1; i <= n; i++) {
                sb.append("*");
            }
            string str2 = sb.tostring();
            system.out.println("elapsed time #2 = " +
                mt.getelapsed());
    
            // sanity check
    
            if (!str1.equals(str2)) {
              system.out.println("str1/str2 mismatch");
            }
        }
    }

When you run this program, the result should be something like this:

    elapsed time #1 = 61890
    elapsed time #2 = 16

Approach #2 explicitly uses StringBuffer, while approach #1 uses it implicitly as part of the implementation of the + operator. You can examine the bytecodes used to implement the first approach by saying:

    javap -c -classpath . AppDemo1

Why such a huge time difference between the two approaches? The second approach appends characters to a StringBuffer, which is quite efficient. Doesn't the first approach do the same? Actually, it doesn't. The statement:

    str1 = str1 + "*";

does not append characters to the string referenced by str1. That's because Java strings are immutable, they don't change after they're created. What actually happens is that:

  • A StringBuffer is set up
  • str1 is copied to it
  • The "*" is appended to the buffer
  • The result is converted to a string
  • The str1 reference is made to point at that string.
  • The old string that str1 previously referenced is then made available for garbage collection.

The loop in the demo goes through N iterations, and on each iteration, the contents of str1 (containing N-1 characters) must be copied to the buffer. This behavior implies that the first approach above has quadratic or worse performance. "Quadratic" means that the running time is proportional to the square of N. It is possible to effectively freeze an application by using this kind of loop.

The AppDemo1 example illustrates the case where you're repeatedly concatenating one string to another, such that both strings must be copied to a temporary area (StringBuffer), a new string created, and then the original string reference made to point at the new string.

But what if you're not doing this kind of operation, and instead, you simply have some code like this:

    public String toString() {
        return "X=" + x + " Y=" + y; 
    }

There's no loop or repetition here, and no string that keeps getting longer and longer. Is there any harm in using + instead of StringBuffer in this example?

To answer that question, let's look at some more code:

    class MyPoint {
        private final int x, y;
        private final String cache;
    
        public MyPoint(int x, int y) {
            this.x = x;
            this.y = y;
            cache = "X=" + x + " Y=" + y; 
        }
    
        public String toString1() {
            return "X=" + x + " Y=" + y; 
        }
    
        public String toString2() {
            StringBuffer sb = new StringBuffer();
            sb.append("X=");
            sb.append(x);
            sb.append(" Y=");
            sb.append(y);
            return sb.toString();
        }
    
        public String toString3() {
            String s = "";
            s = s + "X=";
            s = s + x;
            s = s + " Y=";
            s = s + y;
            return s;
        }
    
        public String toString4() {
            return cache;
        }
    }
    
    class MyTimer {
        private final long start;
    
        public MyTimer() {
            start = System.currentTimeMillis();
        }
    
        public long getElapsed() {
            return System.currentTimeMillis() - start;
        }
    }
    
    public class AppDemo2 {
        static final int N = 1000000;
        public static void main(String args[]) {
            MyPoint mp = new MyPoint(37, 47);
            String s1 = null;
            String s2 = null;
            String s3 = null;
            String s4 = null;
    
            // toString1
    
            MyTimer mt = new MyTimer();
            for (int i = 1; i <= N; i++) {
                s1 = mp.toString1();
            }
            System.out.println("toString1 " + 
                mt.getElapsed());
    
            // toString2
    
            mt = new MyTimer();
            for (int i = 1; i <= N; i++) {
                s2 = mp.toString2();
            }
            System.out.println("toString2 " + 
                mt.getElapsed());
    
            // toString3
    
            mt = new MyTimer();
            for (int i = 1; i <= N; i++) {
                s3 = mp.toString3();
            }
            System.out.println("toString3 " + 
                mt.getElapsed());
    
            // toString4
    
            mt = new MyTimer();
            for (int i = 1; i <= N; i++) {
                s4 = mp.toString4();
            }
            System.out.println("toString4 " + 
                mt.getElapsed());
    
            // sanity check to make sure equivalent
            // results returned from each toString method
    
            if (!s1.equals(s2) || !s2.equals(s3) ||
                !s3.equals(s4)) {
                System.out.println("check error");
            }
        }
    }

This demo sets up a class MyPoint, that is used to represent X,Y points. The demo implements a variety of toString methods for the class. The result of running the program should look something like this:

    toString1 2797
    toString2 2703
    toString3 5656
    toString4 32

The first two approaches, using + and StringBuffer, have nearly identical performance. So you might conclude that these two approaches are in fact the same. The bytecodes generated for toString1 indicate that a StringBuffer is set up, and then the various strings are simply appended to it. The resulting code is very similar to toString2.

But it's not quite that simple. The first problem is that you can't always formulate the value to be returned from toString as a single expression. toString3 and toString1 produce identical results. But toString3 takes twice as long to run, for the same reasons as described for the AppDemo1 example. The AppDemo2 example illustrates the situation where you need to build up the return value a piece at a time. In such a case, toString2, using an explicit StringBuffer, is a better choice.

The other problem concerns a statement found in the Java Language Specification section 15.18.1.2, which says:

An implementation may choose to perform conversion and concatenation in one step to avoid creating and then discarding an intermediate String object. To increase the performance of repeated string concatenation, a Java compiler may use the StringBuffer class or a similar technique to reduce the number of intermediate String objects that are created by evaluation of an expression.

This statement suggests that a Java compiler is not required to optimize an expression such as:

    str1 + str2 + str3 + ...

as is currently done for the toString1 method above, but instead can create intermediate string objects.

So it pays to be cautious in your use of the + operator, especially for long strings or within loops.

Note that there's an even faster way to implement toString for this example. MyPoint is an immutable class, meaning that instances cannot be modified after creation. Given this, the value returned by toString will always be the same. Because the value is always the same, it can be computed once in the MyPoint constructor, and then simply returned by toString4.

This kind of caching is often very useful, but not without drawbacks. If the class is mutable, then caching might not make sense. The same is true if computing the cache value is expensive, if the cache ties up a lot of memory, or if the toString method is called infrequently.

For more information about this topic, see Section 15.18.1, String Concatenation Operator +, in "The Java Language Specification Second Edition", and Chapter 5, Strings, in "Java Performance Tuning" by Jack Shirazi.

Pixel

IMPROVING JAVA I/O PERFORMANCE

This tip looks at some of the issues and techniques around improving Java I/O performance. This is a complex subject, so the tip is only able to illustrate a few key points.

The first thing to understand is that some factors that can significantly impact I/O performance exist primarily outside of Java applications. One of these items is operating system file caching. When you read a file from disk, the operating system often keeps a copy of the file in memory. The operating system does this so that no disk access is required if the file is read a second time. This technique can result in huge speedups, and you need to be careful to correct for this factor when doing performance analysis of your Java programs. This is one example where adding extra memory often improves application performance, even if the application does not directly need the memory for its internal data structures.

Another performance factor is disk file fragmentation, where pieces of a file are scattered across a disk. Modern disk drives typically require 10-15 milliseconds to move from one part of a disk surface to another (seek time). This is a long time, especially compared to ultrafast processor speeds. The same issue occurs if you're using java.io.RandomAccessFile on a big database, and the database needs to move between items of information distributed across the disk. Yet another example is if you're copying one file to another, and the files are at widely-separated locations on the same disk so that the disk head must move back and forth repeatedly.

This tip is not going to illustrate the caching or seek time issues with example code, but these issues are extremely important, and worth considering when you're designing your Java applications.

Let's look at some code. The first issue to illustrate is the cost of enumerating and opening files. Suppose that you're writing a Java application that searches through all the files in a directory structure. Suppose too that as part of this project, you need to write some code that creates a list of path names based on a starting point in a directory. The code looks like this:

    import java.io.*;
    import java.util.*;
    
    class MyTimer {
        private final long start;
    
        public MyTimer() {
            start = System.currentTimeMillis();
        }
    
        public long getElapsed() {
            return System.currentTimeMillis() - start;
        }
    }
   
    public class IODemo1 {
    
        // recursive method that walks directory 
        // structure
    
        static void getFileList2(File f, List list) {
            if (f.isDirectory()) {
                String entries[] = f.list();
                int maxlen = (
                 entries == null ? 0 : entries.length);
                for (int i = 0; i < maxlen; i++) {
                    getFileList2(
                        new File(f, entries[i]), list);
                }
            }
            else if (f.isFile()) {
                list.add(f.getPath());
            }
        }
    
        // top-level method to get list of path names 
        // starting at a specified point in directory 
        // structure
    
        static List getFileList(String fn) {
            List list = new ArrayList();
            getFileList2(new File(fn), list);
            return list;
        }
   
        public static void main(String args[]) {
            MyTimer mt = new MyTimer();
            List list = getFileList(args[0]);
    //      for (int i = 0; i < list.size(); i++) {
    //          System.out.println(list.get(i));
    //      }
            System.out.println(
                   "number of files = " + list.size());
            System.out.println(
                   "microseconds per file = " +
                 mt.getElapsed() * 1000 / list.size());
        }
    }

If you want to see the resulting list of path names, you can uncomment the three lines in IODemo1, starting with for (int i = 0; ...). When you run the program, for example in Windows by saying:

    java IODemo1 C:\\

the result is something like:

    number of files = 11298
    microseconds per file = 525

on the first run, and something like:

    number of files = 11298
    microseconds per file = 157

on the second. The time discrepancy is due to caching effects, as mentioned above.

The result of this demo indicates that it takes about 500 microseconds per pathname to enumerate a set of files, or about 2,000 files/second. So if you're writing an application, and it searches a directory structure of 100,000 files, it will take at least 50 seconds just to enumerate the files. This time is separate from actually opening and reading the files.

Here's another example. It measures the time required to repeatedly open a file, read one byte, and close the file:

    import java.io.*;
    
    class MyTimer {
        private final long start;
    
        public MyTimer() {
            start = System.currentTimeMillis();
        }
   
        public long getElapsed() {
            return System.currentTimeMillis() - start;
        }
    }
    
    public class IODemo2 {
        static final int N = 100000;
        static final String TESTFILE = "testfile";
     
        public static void main(String args[]) 
                                   throws IOException {
    
            // create file and write one byte into it
    
            FileOutputStream fos = 
                        new FileOutputStream(TESTFILE);
            fos.write('A');
            fos.close();
    
            // repeatedly open file and 
            // read a byte from it
    
            MyTimer mt = new MyTimer();
            for (int i = 1; i <= N; i++) {
                FileInputStream fis = 
                         new FileInputStream(TESTFILE);
                fis.read();
                fis.close();
            }
    
            System.out.println(
                 "microseconds per open/read/close = "
                 + mt.getElapsed() * 1000 / N);
        }
    }

The result of running the program should look something like this:

    microseconds per open/read/close = 101

This result translates to about 10,000 file open/read/closes per second.

It's hard to give simple advice on how to improve the performance of the IODemo1 and IODemo2 examples. For the IODemo1 example, you could do the enumeration once and keep the result in a cache. That approach could work if you know that the directory contents will not change. For the IODemo2 example, an obvious approach is to keep files open instead of repeatedly opening and closing them.

These areas tend to be more amenable to changes in architecture than to applying performance techniques after the fact. For example, if you have 100,000 small files, and you want to search them, could you change the architecture to instead use 1,000 large files?

Let's now look at performance issues related to reading bytes from a file. Suppose that you have a file containing a stream of ABABAB... bytes, and you'd like to count the number of "A" bytes. What's the fastest way of doing this? Here's some code that illustrates three approaches:

    import java.io.*;
    
    class MyTimer {
        private final long start;
    
        public MyTimer() {
            start = System.currentTimeMillis();
        }
    
        public long getElapsed() {
            return System.currentTimeMillis() - start;
        }
    }
    
    public class IODemo3 {
        static final int N = 1000000;
        static final String TESTFILE = "testfile";
        static int cnt1, cnt2, cnt3;
    
        static void write() throws IOException {
    
            // write a sequence of ABABAB... 
            // bytes to the file
    
            FileOutputStream fos = 
                      new FileOutputStream(TESTFILE);
            boolean flip = false;
            for (int i = 1; i <= N; i++) {
                fos.write(flip ? 'B' : 'A');
                flip = !flip;
            }
            fos.close();
        }
    
        static void read1() throws IOException {
    
            // read bytes from the file one at a time
    
            MyTimer mt = new MyTimer();
            FileInputStream fis = 
                         new FileInputStream(TESTFILE);
            cnt1 = 0;
            int c;
            while ((c = fis.read()) != -1) {
                if (c == 'A') {
                    cnt1++;
                }
            }
            fis.close();
            System.out.println("read1 time = " + 
                                      mt.getElapsed());
        }
    
        static void read2() throws IOException {
    
            // read bytes from the file one at a time 
            // with buffering
    
            MyTimer mt = new MyTimer();
            FileInputStream fis = 
                         new FileInputStream(TESTFILE);
            BufferedInputStream bis = 
                          new BufferedInputStream(fis);
            cnt2 = 0;
            int c;
            while ((c = bis.read()) != -1) {
                if (c == 'A') {
                    cnt2++;
                }
            }
            fis.close();
            System.out.println("read2 time = " + 
                                      mt.getElapsed());
        }
    
        static void read3() throws IOException {
    
            // read from the file with buffering
            // and with direct access to the buffer
    
            MyTimer mt = new MyTimer();
            FileInputStream fis = 
                         new FileInputStream(TESTFILE);
            cnt3 = 0;
            final int BUFSIZE = 1024;
            byte buf[] = new byte[BUFSIZE];
            int len;
            while ((len = fis.read(buf)) != -1) {
                for (int i = 0; i < len; i++) {
                    if (buf[i] == 'A') {
                        cnt3++;
                    }
                }
            }
            fis.close();
            System.out.println("read3 time = " 
                                    + mt.getElapsed());
        }
    
        public static void main(String args[]) 
                                   throws IOException {
    
            // write the test file
    
            write();
    
            // read from the file the first time
    
            read1();
            read2();
            read3();
    
            // sanity check
    
            if (cnt1 != cnt2 || cnt2 != cnt3) {
                System.out.println(
                            "cnt1/cnt2/cnt3 mismatch");
            }
    
            // read from the file the second time
    
            read1();
            read2();
            read3();
        }
    }

If you run the the program, you should see a result that looks something like this:

    read1 time = 4063
    read2 time = 140
    read3 time = 31
    read1 time = 4079
    read2 time = 140
    read3 time = 16

The program does everything twice, just to make sure that there are no file caching effects.

The read1 method opens a FileInputStream and then calls the read method repeatedly. The read2 method works similarly, but sets up a buffer on top of the input stream. The read2 method runs much faster than read1 -- in this case, about 25 times as fast. Why? The reason is that FileInputStream.read is implemented as a native method, and calls the underlying system for each byte that is read. This process may involve such things as invoking system calls, and is very expensive. The read2 method uses BufferedInputStream. This class overrides read to grab a byte out of a buffer. Only when the buffer is empty is a FileInputStream.read call made to refill it. Because the buffer is long (currently 2048 bytes), such a call occurs infrequently.

The read3 method is even faster than read2. It uses an explicit buffer to grab chunks of bytes from the file. This approach is similar to the buffer used within BufferedInputStream. But instead of repeatedly calling the read method, it iterates over the buffer and directly accesses bytes from it.

The difference between read1 and read2 is buffering. The difference between read2 and read3 is the elimination of method call overhead, and the processing that BufferedInputStream.read must do for each call. This processing includes checking whether the buffer is empty. Both of these techniques -- the use of buffering, and exercising caution in the use of per-byte or per-character I/O operations -- are central to improving I/O performance.

It is possible to go too far with this kind of optimization. For example, the following line appears in the read3 code (it's used to iterate across the buffer):

    for (int i = 0; i < len; i++) {

If the line was this:

    for (int i = 1; i < len; i++) {

it would represent an off-by-one error. If you really need the last bit of performance, then read3 might be the appropriate approach. But getting into low-level details can be a big source of coding errors.

Here's a final example, one that counts the number of lines in a text file:

    import java.io.*;
    
    class MyTimer {
        private final long start;
    
        public MyTimer() {
            start = System.currentTimeMillis();
        }
  
        public long getElapsed() {
            return System.currentTimeMillis() - start;
        }
    }
    
    public class IODemo4 {
        static final int N = 1000000;
        static final String TESTFILE = "testfile";
        static int cnt1, cnt2;
      
        static void write() throws IOException {
    
            // write a sequence of text lines 
            // to the test file, 
            // terminating with \r, \n, or \r\n
    
            FileWriter fw = 
                              new FileWriter(TESTFILE);
            BufferedWriter bw = new BufferedWriter(fw);
            String str = "test";
            int slen = str.length();
            int state = 0;
            for (int i = 1; i <= N; i++) {
                bw.write(str, 0, slen);
                if (state == 0) {
                    bw.write('\r');
                    state = 1;
                }
                else if (state == 1) {
                    bw.write('\n');
                    state = 2;
                }
                else {
                    bw.write('\r');
                    bw.write('\n');
                    state = 0;
                }
            }
            bw.close();
        }
    
        static void read1() throws IOException {
    
            // read from text file using readLine()
    
            MyTimer mt = new MyTimer();
            FileReader fr = new FileReader(TESTFILE);
            BufferedReader br = new BufferedReader(fr);
            cnt1 = 0;
            while (br.readLine() != null) {
                cnt1++;
            }
            br.close();
            System.out.println(
                           "read1 " + mt.getElapsed());
        }
    
        static void read2() throws IOException {
    
            // read from text file using a buffer
            // and reading individual characters
    
            MyTimer mt = new MyTimer();
            FileReader fr = new FileReader(TESTFILE);
            BufferedReader br = new BufferedReader(fr);
            cnt2 = 0;
            int prev = -1;
    
            final int BUFSIZE = 1024;
            char cbuf[] = new char[BUFSIZE];
            int currpos = 0;
            int maxpos = 0;
    
            while (true) {
                if (currpos == maxpos) {
                    maxpos = br.read(cbuf, 0, BUFSIZE);
                    if (maxpos == -1) {
                        break;
                    }
                    currpos = 0;
                }
                int c = cbuf[currpos++];
                if (c == '\r' || 
                         (c == '\n' && prev != '\r')) {
                    cnt2++;
                }
                prev = c;
            }
            br.close();
            System.out.println(
                           "read2 " + mt.getElapsed());
        }
    
        public static void main(String args[]) 
                                   throws IOException {
    
            // write the test file
    
            write();
    
            // read the test file the first time
    
            read1();
            read2();
    
            // sanity check
    
            if (cnt1 != cnt2) {
                System.out.println
                                ("cnt1/cnt2 mismatch");
            }
   
            // read the test file the second time
    
            read1();
            read2();
        }
    }

The result of running this program should look something like this:

    read1 1125
    read2 578
    read1 1093
    read2 563

The IODemo4 program creates some test data, consisting of a series of lines. A line of text is defined to end with carriage return, line feed, or carriage return line feed. All three of these cases are represented in the data.

Two read methods are defined to count the number of lines. The first calls BufferedReader.readLine. The second uses a custom approach involving explicit buffering. Characters are read from the buffer and each line terminator is counted.

The read2 method runs about twice as fast as read1. One reason for this difference is that the readLine method used by read1 returns a string for each line that is read. That string must be created and built up from individual characters. In this particular application, the string is not used for anything. It's inefficient to process the lines in this way -- unnecessary work is being done.

But there's another side to this issue. In fact, you might want to use read1, even though it's slower. One problem with read2 is that the logic for detecting the end of line is tricky, and the three possible line terminators are explicitly coded into the read2 method. There's a good argument for leaving such details to the readLine method in the library. Also, what about an ambiguous case, where the last line of a text file does not end with a line terminator? Should it count as a line? In this case, what if read2 does something different than readLine?

The extra speed of read2 might indeed be worth it, but it's a good idea to ask yourself whether the improved performance is worth the added complexity.

The examples above touched on some of the core principles involved in speeding up I/O. There are other issues that have not been discussed. These include:

  • Performance issues for serialization
  • Byte/character and character/byte conversions
  • Converting numbers to strings and strings to numbers
  • Compression
  • Network I/O
  • Terminal I/O

These areas are sometimes quite important.

For more information about improving Java I/O performance, see Chapter 8, I/O, Logging, and Console Output, in "Java Performance Tuning" by Jack Shirazi.

Pixel

IMPORTANT: Please read our Terms of Use and Privacy policies:
http://www.sun.com/share/text/termsofuse.html
http://www.sun.com/privacy/
http://developer.java.sun.com/berkeley_license.html

FEEDBACK
Comments? Send your feedback on the JDC Tech Tips to:
jdc-webmaster@sun.com

SUBSCRIBE/UNSUBSCRIBE
- To subscribe, go to the subscriptions page, choose the newsletters you want to subscribe to and click "Update".
- To unsubscribe, go to the subscriptions page, uncheck the appropriate checkbox, and click "Update".

- ARCHIVES
You'll find the JDC Tech Tips archives at:
http://java.sun.com/jdc/TechTips/index.html

- COPYRIGHT
Copyright 2002 Sun Microsystems, Inc. All rights reserved.
901 San Antonio Road, Palo Alto, California 94303 USA.

This document is protected by copyright. For more information, see:
http://java.sun.com/jdc/copyright.html

JDC Tech Tips
March 5, 2002

Sun, Sun Microsystems, Java, and Java Developer Connection are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries.