Tech Tips Archive
June 4, 2002
WELCOME to the Java Developer Connection (JDC) Tech Tips, June 4, 2002. This issue covers:
- Using the CharSequence Interface
- Programming With Buffers
These tips were developed using Java 2 SDK, Standard Edition, v 1.4.
This issue of the JDC Tech Tips is written by Glen McCluskey.
USING THE CHARSEQUENCE INTERFACE
Suppose that you're writing a method in the Java programming language. Suppose too that you want to specify one of the method parameters such that a user of the method can pass a string or sequence of characters to the method. An obvious way to do this is to use a String parameter. But what happens if the method caller has a StringBuffer or CharBuffer object instead of a String? Or what happens if the caller wants to pass a character array to your method? In those cases, using a String parameter will not work. Instead, you can use the java.lang.CharSequence interface. This is an interface that supports generalization of character sequences.
A class that implements the CharSequence interface must define the following methods:
length - returns the number of characters in the sequence
charAt - returns a character at a given position
subSequence - returns a subsequence of a sequence
toString - returns a String containing the sequence characters
The String, StringBuffer, and CharBuffer classes implement the CharSequence interface, so passing an object of one of these classes to a method with a CharSequence parameter will work. Let's look at an example:
import java.nio.*;
public class CSDemo1 {
// dump out information about a CharSequence
public static void dumpSeq(CharSequence cs) {
System.out.println(
"length = " + cs.length());
System.out.println(
"first char = " + cs.charAt(0));
System.out.println("string = " + cs);
System.out.println();
}
public static void main(String args[]) {
// String
String s = "test";
dumpSeq(s);
// StringBuffer
StringBuffer sb = new StringBuffer("ing");
dumpSeq(sb);
// CharBuffer
CharBuffer cb = CharBuffer.allocate(3);
cb.put("123");
cb.rewind();
dumpSeq(cb);
}
}
In the CSDemo1 program, dumpSeq is a method with a CharSequence parameter. The parameter is passed instances of String, StringBuffer, and CharBuffer. When you run the program, the output is:
length = 4
first char = t
string = test
length = 3
first char = i
string = ing
length = 3
first char = 1
string = 123
This example makes clear that you can use CharSequence as a generalization of character sequences. It shows that you can write methods that accept objects of any class type that implements the CharSequence interface.
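As a further illustration of the same idea, here's a minimal sketch of a utility method that relies only on length and charAt, so it works with any CharSequence implementation. The class name CSDigits and the method countDigits are made up for this illustration and are not part of the original tip.

```java
import java.nio.CharBuffer;

class CSDigits {
    // Hypothetical helper: count the decimal digits in any
    // character sequence, using only length() and charAt().
    static int countDigits(CharSequence cs) {
        int count = 0;
        for (int i = 0; i < cs.length(); i++) {
            if (Character.isDigit(cs.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // String
        System.out.println(countDigits("a1b22"));
        // StringBuffer
        System.out.println(countDigits(new StringBuffer("2002")));
        // CharBuffer wrapped around a string
        System.out.println(countDigits(CharBuffer.wrap("no digits")));
    }
}
```

When run, this sketch should print 3, 4, and 0. The method never needs to know which concrete class it was handed.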
Let's look at another example, one that defines a CharArrayWrapper class that implements the CharSequence interface. The class is used to wrap a character array such that the wrapper object can be passed as an argument to a method expecting a CharSequence. Here's what the code looks like:
class CharArrayWrapper implements CharSequence {
private char vec[];
private int off;
private int len;
// length of sequence
public int length() {
return len;
}
// character at a given index
public char charAt(int index) {
if (index < 0 || index >= len) {
throw new IndexOutOfBoundsException(
"invalid index");
}
return vec[index + off] ;
}
// subsequence from start (inclusive)
// to end (exclusive)
public CharSequence subSequence(
int start, int end) {
if (start < 0 || end < 0 || end > len ||
start > end) {
throw new IndexOutOfBoundsException(
"invalid start/end");
}
return new CharArrayWrapper(
vec, start + off, end - start);
}
// convert to string
public String toString() {
return new String(vec, off, len);
}
// construct a CharArrayWrapper
// from a portion of a char array
public CharArrayWrapper(
char vec[], int off, int len) {
if (vec == null) {
throw new NullPointerException(
"null vec");
}
if (off < 0 || len < 0 ||
off > vec.length - len) {
throw new IllegalArgumentException(
"bad off/len");
}
this.vec = vec;
this.off = off;
this.len = len;
}
// construct a CharArrayWrapper
// from a char array
public CharArrayWrapper(char vec[]) {
this(vec, 0, vec.length);
}
}
public class CSDemo2 {
public static void main(String args[]) {
// create array and wrap it
char vec[] = {'a', 'b', 'c', 'd', 'e'};
CharSequence cs =
new CharArrayWrapper(vec);
// test the CharArrayWrapper
// interface implementation
System.out.println("string = " + cs);
System.out.println(
"length = " + cs.length());
System.out.println("subSequence(2,4) = " +
cs.subSequence(2, 4));
System.out.println(
"charAt(0) = " + cs.charAt(0));
if (cs.charAt(0) != 'a')
System.out.println("*** error ***");
if (cs.subSequence(2, 4).charAt(0) != 'c')
System.out.println("*** error ***");
if (cs.subSequence(2, 4).subSequence(0, 1).charAt(0) != 'c')
System.out.println("*** error ***");
}
}
In the CSDemo2 program, the subSequence method is implemented by creating another CharArrayWrapper object, one that has a view of a portion of the original character array. The toString method is implemented by calling a String constructor with the array, offset, and length as its arguments. Note that creating a string in this way is fairly expensive, because the characters are copied. If, instead, you simply wrap the array in a CharArrayWrapper, no copying is involved.
The CharArrayWrapper class and CharSequence provide a read-only interface. There's no way to modify the character sequence using this interface. If you're familiar with the java.nio.CharBuffer class, you'll notice that using that class to wrap a character array achieves a similar effect to CharArrayWrapper. That's because CharBuffer implements the CharSequence interface.
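To make the comparison with CharBuffer concrete, here's a minimal sketch that uses CharBuffer.wrap in place of CharArrayWrapper. The class name CSWrapDemo and its wrap helper are invented for this illustration. The sketch also changes the array after wrapping it, to show that the buffer shares the array rather than copying it.

```java
import java.nio.CharBuffer;

class CSWrapDemo {
    // Hypothetical helper: wrap a char array in a CharBuffer.
    // No characters are copied; the buffer is backed by the array.
    static CharSequence wrap(char[] vec) {
        return CharBuffer.wrap(vec);
    }

    public static void main(String[] args) {
        char[] vec = {'a', 'b', 'c', 'd', 'e'};
        CharSequence cs = wrap(vec);

        System.out.println("string = " + cs);
        System.out.println("subSequence(2,4) = " + cs.subSequence(2, 4));

        // change the array; the change shows through the wrapper
        vec[0] = 'z';
        System.out.println("charAt(0) = " + cs.charAt(0));
    }
}
```

This sketch should print "string = abcde", "subSequence(2,4) = cd", and "charAt(0) = z". Keep in mind that CharBuffer indexes are relative to the buffer's current position, which is zero here.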
If you run the CSDemo2 program, the results look like this:
string = abcde
length = 5
subSequence(2,4) = cd
charAt(0) = a
The CharSequence interface is used in the regular expression package (java.util.regex). You can see an example of its use by examining the BufferDemo7 program in the Programming With Buffers tip that follows this tip.
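Here's a minimal sketch of how the regular expression package accepts a CharSequence. Pattern.matcher takes a CharSequence argument, so the same matching code works for a String, a StringBuffer, or a CharBuffer. The class name CSRegexDemo and the method firstNumber are made up for this illustration.

```java
import java.nio.CharBuffer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CSRegexDemo {
    // Hypothetical helper: return the first run of digits
    // in the sequence, or null if there is none.
    static String firstNumber(CharSequence cs) {
        Matcher m = Pattern.compile("[0-9]+").matcher(cs);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // String
        System.out.println(firstNumber("June 4, 2002"));
        // StringBuffer
        System.out.println(firstNumber(new StringBuffer("SDK v 1.4")));
        // CharBuffer with no digits
        System.out.println(firstNumber(CharBuffer.wrap("no digits")));
    }
}
```

This sketch should print 4, 1, and null.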
For more information about the CharSequence interface, see the interface description.
PROGRAMMING WITH BUFFERS
If you've done any I/O programming in the Java programming language, you've probably had a situation where you're reading from an input stream into an array, and then writing the array contents to an output stream. Typically, you might have an array that's 1024 bytes long, and each read of the stream puts 0-to-1024 bytes into the array.
An array used in this way is also called a buffer. Version 1.4 of the Java 2 Platform, Standard Edition adds a major new set of buffer classes. These classes are found in the java.nio package. The classes are based on the superclass java.nio.Buffer. Formally, a buffer is a linear, finite sequence of elements of a specific primitive type such as char (CharBuffer) or double (DoubleBuffer). Buffers are typically used in I/O operations, but you can also use them in other areas.
This tip examines some uses of buffers. Let's start by looking at an example that shows how buffers are allocated:
import java.nio.*;
public class BufferDemo1 {
public static void main(String args[]) {
// allocate non-direct buffer
ByteBuffer bb1 = ByteBuffer.allocate(100);
System.out.println(bb1);
// allocate direct buffer
ByteBuffer bb2 =
ByteBuffer.allocateDirect(100);
System.out.println(bb2);
}
}
Buffers have a fixed size, in this example 100 bytes. Here, the program allocates one buffer as a direct buffer, and another as non-direct. What does this mean? The documentation for ByteBuffer explains it this way:
A byte buffer is either direct or non-direct. Given a direct byte buffer, the Java virtual machine* will make a best effort to perform native I/O operations directly upon it. That is, it will attempt to avoid copying the buffer's content to (or from) an intermediate buffer before (or after) each invocation of one of the underlying operating system's native I/O operations.
Direct buffers could have higher allocation costs, and they're not always the right choice. However, they are worth considering to improve the performance of I/O.
When you run the BufferDemo1 program, the result is:
java.nio.HeapByteBuffer[pos=0 lim=100 cap=100]
java.nio.DirectByteBuffer[pos=0 lim=100 cap=100]
HeapByteBuffer and DirectByteBuffer are internal classes used in the implementation of ByteBuffer.
You might wonder what the pos, lim, and cap values represent in the result. These are basic properties of a buffer. They represent the position, limit, and capacity of the buffer, respectively. A buffer's position is the zero-based index of the next element to be read or written. A buffer's limit is the index of the first element that should not be read or written. (The difference between the limit and the position is a buffer's "remaining value," that is, the number of elements remaining in the buffer.) A buffer's capacity is its fixed number of elements. In the BufferDemo1 example, the capacity is 100 elements, and each element is a single byte.
These basic properties of a buffer represent one answer to the question "how does a buffer differ from an array of the same primitive type?" A buffer is essentially a wrapper on top of an array. It contains additional state information about the use of the array, for example, the index of the next location to be read or written.
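The wrapper idea can be sketched directly with ByteBuffer.wrap, which creates a non-direct buffer on top of an existing array. In this minimal sketch (the class name WrapArrayDemo and the method putThroughBuffer are invented for the illustration), a value written through the buffer lands in the array, and the buffer tracks the index of the next location.

```java
import java.nio.ByteBuffer;

class WrapArrayDemo {
    // Hypothetical helper: write one byte through a buffer wrapped
    // around the given array, and return the buffer's position afterward.
    static int putThroughBuffer(byte[] data, byte value) {
        ByteBuffer bb = ByteBuffer.wrap(data);
        // relative put: stores at index 0 and advances the position
        bb.put(value);
        return bb.position();
    }

    public static void main(String[] args) {
        byte[] data = new byte[4];
        int pos = putThroughBuffer(data, (byte) 42);

        // the array itself now holds the value
        System.out.println("data[0] = " + data[0]);
        // the buffer remembered where the next write would go
        System.out.println("position = " + pos);
    }
}
```

This sketch should print "data[0] = 42" and "position = 1". The array holds only the data; the position is extra state kept by the buffer.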
Let's solidify these concepts with another example:
import java.nio.*;
public class BufferDemo2 {
// dump state of a buffer
static void dumpState(Buffer b) {
System.out.println(
"position=" + b.position());
System.out.println("limit=" + b.limit());
System.out.println(
"capacity=" + b.capacity());
System.out.println(
"remaining=" + b.remaining());
System.out.println();
}
public static void main(String args[]) {
// allocate buffer
IntBuffer ib = IntBuffer.allocate(2);
dumpState(ib);
// add a value to it
ib.put(37);
dumpState(ib);
// add another value to it
ib.put(47);
dumpState(ib);
}
}
The BufferDemo2 program allocates a two-long int buffer. It then adds two values to the buffer using relative put operations. These operations are relative to the current position in the buffer. Each time the program adds an element, the position is incremented, and the number of remaining elements decreases by one. If you run the program, the output looks like this:
position=0
limit=2
capacity=2
remaining=2
position=1
limit=2
capacity=2
remaining=1
position=2
limit=2
capacity=2
remaining=0
Let's look at a more complicated example. It's another example of using buffers for I/O -- here buffers are used to copy one file to another. However, the example has a couple of unusual aspects to it. First, the input and output "files" are byte arrays. Second, the example simulates a case where I/O doesn't work very well, that is, the case where an I/O request is only partially successful. For example, in this case, you might request that two bytes be read from a file, but zero, one, or two bytes might actually be read.
Here's the code:
import java.nio.ByteBuffer;
import java.util.Random;
public class BufferDemo3 {
static Random rn = new Random(0);
// size of input "file" (buffer)
static final int FILESIZE = 10;
// input and output "files" (arrays of bytes)
static byte infile[] = new byte[FILESIZE];
static int inptr = 0;
static byte outfile[] = new byte[FILESIZE];
static int outptr = 0;
// initialize input
static {
for (int i = 0; i < infile.length; i++) {
infile[i] = (byte)(i + 1);
}
}
// read 0-2 bytes from input file
static int read(ByteBuffer bb) {
// at end of file?
if (inptr == infile.length) {
return -1;
}
// read 0-2 bytes from input
// and put into buffer
int n = rn.nextInt(3);
int cnt = 0;
while (bb.hasRemaining() && n > 0 &&
inptr < infile.length) {
bb.put(infile[inptr++]);
cnt++;
n--;
}
// return the amount read
return cnt;
}
// write to output file
static int write(ByteBuffer bb) {
// read 0-2 bytes from buffer
// and add to output
int n = rn.nextInt(3);
int cnt = 0;
while (bb.hasRemaining() && n > 0) {
outfile[outptr++] = bb.get();
cnt++;
n--;
}
// return amount written
return cnt;
}
public static void main(String args[]) {
// allocate buffer
ByteBuffer bb = ByteBuffer.allocate(2);
bb.clear();
// transfer from input to output file,
// using buffer as intermediary
int cnt = 0;
while (cnt < FILESIZE) {
// handle case where at end of file
// but buffer not yet empty
if (
read(bb) < 0 && bb.position() == 0) {
break;
}
bb.flip();
cnt += write(bb);
// handle case where write only
// partially successful
bb.compact();
}
// check results of copy
for (int i = 0; i < outfile.length; i++) {
if (outfile[i] != i + 1) {
System.out.println(
"compare error");
}
}
}
}
The main method of the BufferDemo3 program allocates a two-long byte buffer, clears it, and then uses the buffer to copy from one file (byte array) to another. Clearing the buffer makes it ready for use by setting the limit to the capacity and the position to zero.
Each time it calls the read method, the program transfers zero, one, or two bytes from the input into the buffer. Each write call transfers zero, one, or two bytes from the buffer into the output.
Notice that for each copy iteration, the program reads from the input, flips the buffer, and writes to the output. What does it mean to flip the buffer? Flipping sets the buffer limit to the current position, and then sets the position to zero. What's the benefit of flipping? Consider that as you write into a buffer, its position keeps going up. At some point, when you're done writing, you want to read from that buffer. Flipping captures the position (the amount you've written), and then sets the position to zero. This allows you to read up to the limit.
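The effect of flipping is easier to see in isolation. Here's a minimal sketch, separate from BufferDemo3, that writes three bytes into an eight-byte buffer, flips it, and then reads the bytes back. The class name FlipDemo and the state helper are made up for this illustration.

```java
import java.nio.ByteBuffer;

class FlipDemo {
    // Hypothetical helper: format the buffer's position, limit, and capacity.
    static String state(ByteBuffer bb) {
        return "pos=" + bb.position() + " lim=" + bb.limit()
            + " cap=" + bb.capacity();
    }

    public static void main(String[] args) {
        ByteBuffer bb = ByteBuffer.allocate(8);

        // write three bytes; the position moves up to 3
        bb.put((byte) 1).put((byte) 2).put((byte) 3);
        System.out.println(state(bb));

        // flip: limit becomes the old position, position becomes 0
        bb.flip();
        System.out.println(state(bb));

        // read back exactly the bytes that were written
        while (bb.hasRemaining()) {
            System.out.println(bb.get());
        }
    }
}
```

This sketch should print "pos=3 lim=8 cap=8", then "pos=0 lim=3 cap=8", and then the values 1, 2, and 3. Without the flip, the reads would start at index 3 and return the unwritten part of the buffer.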
The read and write methods have to be sensitive to the fact that the buffer they're passed might not be empty. That's why the read and write methods call the hasRemaining method. Also, the loop in the main method has to handle the case where an end of file has been reached, but the buffer is not yet empty. That's why the main method includes the test:
if (read(bb) < 0 && bb.position() == 0)
If you omit the latter part of this test, the program will work most of the time, but it occasionally does an incorrect copy.
The main loop also has to handle the case where a write is only partially successful. In this case, the buffer still contains data to be written after the write method returns. This situation is handled by the compact method. This method copies the bytes between the buffer's current position and its limit to the beginning of the buffer. If you omit the call to the compact method, the copy hangs.
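Here's a minimal sketch of what compacting does after a partial write, again separate from BufferDemo3. The buffer holds three bytes, only one of them is consumed, and compact moves the two unread bytes to the front so that new data can be added after them. The class name CompactDemo and the method partialDrain are invented for this illustration.

```java
import java.nio.ByteBuffer;

class CompactDemo {
    // Hypothetical helper: fill a buffer, consume only part of it,
    // then compact it and return the result.
    static ByteBuffer partialDrain() {
        ByteBuffer bb = ByteBuffer.allocate(4);
        bb.put((byte) 10).put((byte) 20).put((byte) 30);

        // prepare for reading: position=0, limit=3
        bb.flip();

        // simulate a partial write that consumes one byte only
        bb.get();

        // move the unread bytes (20, 30) to indexes 0 and 1;
        // position becomes 2 and limit becomes the capacity
        bb.compact();
        return bb;
    }

    public static void main(String[] args) {
        ByteBuffer bb = partialDrain();
        System.out.println("position=" + bb.position());
        System.out.println("limit=" + bb.limit());
        System.out.println("bb[0]=" + bb.get(0) + " bb[1]=" + bb.get(1));
    }
}
```

This sketch should print "position=2", "limit=4", and "bb[0]=20 bb[1]=30". After compacting, the buffer is ready for more relative put operations, and the next flip exposes the leftover bytes first.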
Let's look at some additional buffer features. Suppose that you have a byte array containing the bytes 0x39 and 0x30. These bytes represent the quantity 12345 as a little-endian 16-bit value (least significant byte first). How can you extract the value from this byte array? Here's one way:
import java.nio.*;
public class BufferDemo4 {
// byte value for the short value 12345
// (little-endian)
static final byte buf[] = {0x39, 0x30};
public static void main(String args[]) {
// allocate buffer
ByteBuffer bb = ByteBuffer.allocate(2);
// add contents of buf to it
bb.put(buf);
// prepare buffer for reading
bb.flip();
// make read-only, change order to
// little-endian, and create view
ShortBuffer sb = bb.asReadOnlyBuffer().
order(ByteOrder.LITTLE_ENDIAN).
asShortBuffer();
// dump out original two bytes
System.out.println(Integer.toHexString(bb.get(0)));
System.out.println(Integer.toHexString(bb.get(1)));
// dump out original two bytes as a short
System.out.println(bb.getShort());
// dump out little-endian view of short
System.out.println(sb.get());
}
}
The BufferDemo4 program creates a two-long buffer, and then does a bulk add of the byte array containing the two values.
The program then creates a read-only view of the buffer. Here, the ordering is changed from big-endian to little-endian. Finally, the program creates a short view of the buffer. This process is an example of "invocation chaining." In invocation chaining, each method returns a buffer, such that method invocations can be chained together. The net result of this operation is a view of the byte buffer as a sequence of read-only short values. Each value is composed of two bytes, with the bytes of each short assumed to be in little-endian order.
The original two bytes are dumped out using absolute get methods that do not affect the buffer position. Then the short value is displayed from the original buffer (which defaults to big-endian order). Last, the short value is displayed from the short view buffer, using little-endian order. The result is:
39
30
14640
12345
The short values are different because two different byte orderings are applied. In other words, the byte values 0x39 and 0x30 have the value 14640 if viewed as a big-endian short. They have the value 12345 if viewed as a little-endian short.
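The arithmetic behind these two results can be checked by hand. Here's a minimal sketch that decodes the same two bytes with a byte buffer in each order, and also combines them manually with shifts. The class name EndianDemo and its two methods are made up for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class EndianDemo {
    // Hypothetical helper: decode two bytes as a short in the given order.
    static short decode(byte[] two, ByteOrder order) {
        return ByteBuffer.wrap(two).order(order).getShort();
    }

    // Hypothetical helper: combine a high byte and a low byte by hand.
    static int combine(byte hi, byte lo) {
        return ((hi & 0xff) << 8) | (lo & 0xff);
    }

    public static void main(String[] args) {
        byte[] buf = {0x39, 0x30};

        // big-endian: first byte is most significant, 0x3930
        System.out.println(decode(buf, ByteOrder.BIG_ENDIAN));
        System.out.println(combine(buf[0], buf[1]));

        // little-endian: first byte is least significant, 0x3039
        System.out.println(decode(buf, ByteOrder.LITTLE_ENDIAN));
        System.out.println(combine(buf[1], buf[0]));
    }
}
```

This sketch should print 14640 twice and then 12345 twice: 0x39 * 256 + 0x30 is 14640, and 0x30 * 256 + 0x39 is 12345.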
The buffer classes include two methods, duplicate and slice, that you can use to create copies of buffers. These methods make shallow copies, that is, they copy only state information, and share the underlying buffer contents. Changes to the contents of one buffer are visible in a copy. The duplicate method copies the capacity, limit, and position values of a buffer. The slice method sets the position to zero, and the limit and capacity to the number of elements remaining in the original buffer. Let's look at an example of how duplicate and slice work:
import java.nio.*;
public class BufferDemo5 {
public static void main(String args[]) {
// allocate a buffer and add values to it
IntBuffer ib = IntBuffer.allocate(2);
ib.put(37);
ib.put(47);
// create duplicate and slice of buffer
ib.position(1);
IntBuffer dup = ib.duplicate();
IntBuffer slc = ib.slice();
// display buffer details
System.out.println(dup);
System.out.println(slc);
// change index 0 of duplicate and then
// display value of original
dup.put(0, 57);
System.out.println(ib.get(0));
}
}
If you run this program, the result is:
java.nio.HeapIntBuffer[pos=1 lim=2 cap=2]
java.nio.HeapIntBuffer[pos=0 lim=1 cap=1]
57
The first line of output is the duplicate of the original buffer. The duplicate contains the same position, limit, and capacity as the original. The second line of output is the slice of the original. The position was set to 1 when the slice was created. So the slice buffer has a smaller limit and capacity, and contains a view of only the last element of the original buffer.
The last output line illustrates the idea that changes to the duplicate are reflected in the original. When the value at index 0 in the duplicate is changed, the original buffer is changed as well. That's because the original and the duplicate share the same contents.
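A slice shares contents in the same way, but its indexes are shifted by the original buffer's position at the time the slice was created. Here's a minimal sketch of that offset; the class name SliceDemo and the method sliceFrom are invented for this illustration.

```java
import java.nio.IntBuffer;

class SliceDemo {
    // Hypothetical helper: return a slice that starts at the given
    // index, without disturbing the position of the buffer passed in.
    static IntBuffer sliceFrom(IntBuffer ib, int start) {
        // work on a duplicate so the caller's position is untouched
        IntBuffer dup = ib.duplicate();
        dup.position(start);
        return dup.slice();
    }

    public static void main(String[] args) {
        IntBuffer ib = IntBuffer.wrap(new int[] {10, 20, 30, 40});
        IntBuffer slc = sliceFrom(ib, 2);

        // index 0 of the slice is index 2 of the original
        slc.put(0, 99);
        System.out.println(ib.get(2));

        // the slice covers only the last two elements
        System.out.println(slc.capacity());

        // the original buffer's position did not move
        System.out.println(ib.position());
    }
}
```

This sketch should print 99, 2, and 0.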
A final example illustrates the use of MappedByteBuffer, a subclass of ByteBuffer. Use a MappedByteBuffer to map a region of a file to a buffer. A MappedByteBuffer was used in the May 7, 2002 Tech Tip "File Channels".
Recall that the earlier Tech Tip in this issue, "Using the CharSequence Interface," showed how to implement the CharSequence interface in an array wrapper class. Here's another example of implementing CharSequence, this time with the character sequence in a file.
Suppose you have the following program:
import java.io.*;
public class BufferDemo6 {
public static void main(String args[])
throws IOException {
// write "testing" as a string to the
// output file, each char encoded as
// two bytes
FileOutputStream fos =
new FileOutputStream("data");
DataOutputStream dos =
new DataOutputStream(fos);
dos.writeChars("testing");
dos.close();
fos.close();
}
}
The BufferDemo6 program writes the string "testing" to a file, with each Unicode character written as two bytes.
What if you could make the contents of this file "implement" the CharSequence interface, so that the contents could be passed to a method that expects a CharSequence parameter? Is there any way to do this? Here's one approach, which extends the CSDemo1 program example from the Tech Tip "Using the CharSequence Interface":
import java.io.*;
import java.nio.*;
import java.nio.channels.*;
public class BufferDemo7 {
// dump out information about a CharSequence
public static void dumpSeq(CharSequence cs) {
System.out.println(
"length = " + cs.length());
System.out.println(
"first char = " + cs.charAt(0));
System.out.println("string = " + cs);
System.out.println();
}
public static void main(String args[])
throws IOException {
// String
String s = "test";
dumpSeq(s);
// StringBuffer
StringBuffer sb = new StringBuffer("ing");
dumpSeq(sb);
// CharBuffer
CharBuffer cb = CharBuffer.allocate(3);
cb.put("123");
cb.rewind();
dumpSeq(cb);
// mapped file
FileInputStream fis =
new FileInputStream("data");
FileChannel fc = fis.getChannel();
MappedByteBuffer mbb =
fc.map(FileChannel.MapMode.READ_ONLY,
0, fc.size());
cb = mbb.asCharBuffer();
dumpSeq(cb);
fc.close();
fis.close();
}
}
The first part of the BufferDemo7 program is the same as the CSDemo1 program. The difference is the last part of BufferDemo7. In that part, the program opens the data file written by BufferDemo6, gets a channel, and then maps the file contents to a MappedByteBuffer. It then creates a CharBuffer view of the MappedByteBuffer.
Because CharBuffer implements the CharSequence interface, the CharBuffer object can be passed to the dumpSeq method, and is treated just like a String or StringBuffer. The CharBuffer view in this case represents a region of a file. So accessing characters through the CharSequence interface means that characters are being read from a file.
If you run the BufferDemo7 program, the output looks like this:
length = 4
first char = t
string = test
length = 3
first char = i
string = ing
length = 3
first char = 1
string = 123
length = 7
first char = t
string = testing
For more information about programming with buffers, see "New I/O Functionality for Java 2 Standard Edition 1.4".
IMPORTANT: Please read our Terms of Use and Privacy policies:
http://www.sun.com/share/text/termsofuse.html
http://www.sun.com/privacy/
http://developer.java.sun.com/berkeley_license.html
ARCHIVES
You'll find the JDC Tech Tips archives at:
http://java.sun.com/jdc/TechTips/index.html
COPYRIGHT
Copyright 2002 Sun Microsystems, Inc. All rights reserved.
901 San Antonio Road, Palo Alto, California 94303 USA.
This document is protected by copyright. For more information, see:
http://java.sun.com/jdc/copyright.html
Sun, Sun Microsystems, Java, and Java Developer Connection are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries.
* As used in this document, the terms "Java virtual machine" or "JVM" mean a virtual machine for the Java platform.