The Swing Connection - JCanyon - Grand Canyon Demo

 

JCanyon: Grand Canyon for Java

As seen at JavaOne

Kenneth Russell, Sun Microsystems, Inc.

i. Quick Start with Java Web Start

System requirements:

  • 330 MB of free disk space
  • 192 MB or more RAM
  • OpenGL(R) (pre-installed on Windows(R) PCs and many GNU/Linux distributions)

System recommendations:

  • 3D accelerator card with hardware texture mapping

The launch link below requires Java Web Start and runs on all platforms supported by JOGL. As of this writing, these include:

  • Solaris/SPARC 2.8 or later with OpenGL for Solaris/SPARC
  • GNU/Linux on x86 (Mesa is usually pre-installed; vendors may provide drivers for hardware acceleration)
  • Windows 95/98/ME/NT/2000/XP (OpenGL is pre-installed)
  • Macintosh OS X 10.3 or later (OpenGL is pre-installed)

Launch the jcanyon demo

The first time the program is launched, a 20 MB jar file will be downloaded and decompressed onto your local disk into a directory you choose. The location of this directory is written into a file called ".jcanyon" located in your home directory (as specified by the system property "user.home"). In case of problems during data downloading or decompression, delete the .jcanyon file as well as the associated data directory and relaunch the demonstration.

When the demo is launched, a window will pop up indicating the available resolutions for full-screen mode; select a resolution and click "OK". If you would like to run the program in a window instead, click the "Cancel" button of this dialog.

The complete source code for jcanyon is available below.

Mouse controls:

  • Mouse left/right: roll left/right
  • Mouse up/down: pitch down/up
  • Left button click: slow down
  • Right button click: speed up

Keyboard controls:

  • a: toggle afterburner
  • f: toggle fillets
  • g: toggle fog
  • q: quit
  • r: reset simulator
  • t: toggle control panel

1. Introduction

Before JDK 1.4, the only mechanisms for transferring large amounts of data into and out of the Java platform were the JNI Get<PrimitiveType>ArrayElements and GetPrimitiveArrayCritical entry points. The implementation of Get<PrimitiveType>ArrayElements either incurs a memory allocation and data copy, which is expensive, or requires pinning support in the garbage collector, which is difficult or impossible depending on the garbage collection algorithm used by the Java virtual machine implementation (JVM). GetPrimitiveArrayCritical is fast, but severe limitations on how it can be used make it unacceptable for many kinds of applications.

1.1. java.nio Direct Buffers

JDK 1.4 introduces the concept of direct buffers in the new java.nio package (JSR-000051). Direct buffers provide a safe platform-, vendor-, and JVM-independent zero-copy mechanism that has improved I/O throughput significantly in JDK 1.4. They are applicable to more domains than just I/O; any high-bandwidth application, such as I/O, 2D, 3D, database access, or inter-language interoperability, can potentially benefit from direct buffers. The Java HotSpot Client and Server VMs have been optimized to generate very efficient code for buffer operations.
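As a minimal sketch of what allocating such a buffer looks like, the following uses only standard java.nio calls. The class and method names (DirectBufferSketch, allocateFloats) are hypothetical and are not taken from the jcanyon sources; Float.BYTES assumes a newer JDK than 1.4, where the literal 4 would be used instead.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical helper, not part of jcanyon: allocates storage outside the
// garbage-collected heap so native code can be handed a stable address.
final class DirectBufferSketch {

    /** Returns a direct buffer with room for `count` floats in native byte order. */
    static FloatBuffer allocateFloats(int count) {
        return ByteBuffer.allocateDirect(count * Float.BYTES)
                .order(ByteOrder.nativeOrder()) // match the platform's layout for native consumers
                .asFloatBuffer();               // a float view of a direct byte buffer is itself direct
    }

    public static void main(String[] args) {
        FloatBuffer buf = allocateFloats(3);
        buf.put(1.0f).put(2.0f).put(3.0f);
        buf.flip(); // make the written floats readable from position 0
        System.out.println("direct=" + buf.isDirect() + " remaining=" + buf.remaining());
    }
}
```

Native byte order is chosen here because a consumer such as an OpenGL driver reads the memory directly; the default big-endian order would require byte swapping on little-endian hardware.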

The jcanyon demonstration is a simple flight simulator which visualizes a large data set (roughly three hundred megabytes in size) using the free software OpenGL binding JOGL. JOGL is not a Java technology standard extension, but it is lightweight, highly portable, and has built-in support for the use of java.nio direct buffers. Standard Java programming language extensions which operate on large amounts of data, such as Java3D and Java Advanced Imaging, have been modified to take similar advantage of the java.nio features.

JCanyon uses a multiresolution rendering algorithm to avoid holding the entire data set in memory at once, and uses memory-mapped files, accessed via java.nio, to minimize data copying and heap size. The innermost rendering loop decimates the geometric data at run time while sending it to the graphics card using java.nio direct buffers. The entire application, including the physically accurate flight model and rendering loop, is written in the Java programming language. No native code aside from the OpenGL binding is required or used. The simulator runs at real-time rates using the Java HotSpot Client VM on a single-CPU 750 MHz Pentium III PC with an NVIDIA GeForce or GeForce 2 graphics card.

The rest of this paper is organized as follows. Section 2 gives an overview of the jcanyon demonstration, its use of java.nio, and a description of the inner rendering loop. Section 3 provides details on the multiresolution algorithm and the flight dynamics. Section 4 contains answers to frequently asked questions. Section 5 provides links to the complete source code of the jcanyon demonstration. Section 6 contains acknowledgments, Section 7 the alias for feedback, and Section 8 references for further reading.

2. JCanyon Overview

The jcanyon demonstration uses java.nio in two ways; first, it takes advantage of the new file I/O facilities to minimize data copying when reading data from the disk, and second, it uses java.nio direct buffers to send data to the graphics card via OpenGL.

2.1. Use of New File I/O

The data set which jcanyon renders consists of roughly 100 MB (uncompressed) of elevation data, which provides the shape of the canyon and mountains, and 200 MB of satellite imagery, which provides the surface color. The data is too large to fit in RAM on most machines; a multiresolution algorithm, described in Section 3.1, is used to shrink the data down to a manageable size. Memory-mapped file I/O is used to transfer both the elevation and image data into the application. The elevation data is currently always mapped in at the highest resolution and is decimated at run time. The image data is downsampled into pyramids ahead of time and only the data of the needed resolution is mapped into memory at any given time. Imagery is fetched from the disk in a background thread and is scanned by that thread to ensure the data is in the disk cache before it becomes accessible to the main thread; this reduces the pauses associated with fetching data from the disk, especially on multi-CPU systems.
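The mapping and scanning steps described above can be sketched roughly as follows. This is an illustration only: the names MappedTileSketch, mapRegion, and prefetch are hypothetical, the page size is a caller-supplied assumption, and the java.nio.file API used to open the channel postdates JDK 1.4 (at the time, a channel would typically have been obtained from a stream or RandomAccessFile).

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of jcanyon.
final class MappedTileSketch {

    /** Maps `length` bytes of `file`, starting at `offset`, as a read-only buffer. */
    static MappedByteBuffer mapRegion(Path file, long offset, long length) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // A mapping stays valid after the channel that created it is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, offset, length);
        }
    }

    /**
     * Reads one byte per assumed page so the operating system pulls the data
     * into its cache. Intended to run on a background thread before the
     * buffer is handed to the rendering thread. The returned sum only keeps
     * the reads from being optimized away.
     */
    static int prefetch(ByteBuffer buf, int pageSize) {
        int sum = 0;
        for (int i = 0; i < buf.limit(); i += pageSize) {
            sum += buf.get(i); // absolute get: does not disturb the buffer position
        }
        return sum;
    }
}
```

Because the mapped region is backed by the file rather than by the Java heap, only the pages actually touched occupy physical memory, which is what keeps the heap small for a data set of this size.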

2.2. Rendering Loop

The inner rendering loop strides through the geometric data for a portion of the data set and assembles 3D points, as well as references to the 2D imagery in the form of texture coordinates, into a large buffer containing single-precision floating-point values. This buffer is then transferred to the graphics card for rendering. In order for OpenGL to render the buffer, it requires a persistent pointer to the buffer's starting address.
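A loop of this shape might look like the sketch below. It is not the jcanyon rendering loop: the class name, the interleaved (x, y, z, s, t) layout, the float[] height source, and the spacing parameter are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical sketch, not part of jcanyon.
final class StridedFillSketch {
    static final int FLOATS_PER_VERTEX = 5; // assumed layout: x, y, z, s, t

    /**
     * Visits every `stride`-th sample of a size x size height grid and writes
     * one interleaved position and texture-coordinate record per visited sample.
     */
    static FloatBuffer fill(float[] heights, int size, int stride, float spacing) {
        int perSide = (size - 1) / stride + 1; // samples visited along each axis
        FloatBuffer out = ByteBuffer
                .allocateDirect(perSide * perSide * FLOATS_PER_VERTEX * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        for (int row = 0; row < size; row += stride) {
            for (int col = 0; col < size; col += stride) {
                out.put(col * spacing);             // x: east-west position
                out.put(heights[row * size + col]); // y: elevation
                out.put(row * spacing);             // z: north-south position
                out.put(col / (float) (size - 1));  // s: texture coordinate in [0, 1]
                out.put(row / (float) (size - 1));  // t: texture coordinate in [0, 1]
            }
        }
        out.flip(); // ready to be read from the start by the consumer
        return out;
    }
}
```

Decimation falls out of the stride alone: a stride of 2 visits a quarter of the samples without any separate lower-resolution copy of the elevation data.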

Earlier versions of the jcanyon demonstration included two versions of the inner rendering loop: one used a java.nio direct buffer to hold the data to be sent to OpenGL, and the other used a standard Java programming language float[]. When using java.nio, since the buffer is "direct", it is not managed by the Java garbage-collected heap and therefore a pointer to its data can be safely passed to OpenGL using the JNI entry point GetDirectBufferAddress, which is new in JDK 1.4. Because of the semantics of the OpenGL APIs, the only correct method of sending data down to the card from a Java programming language float[] is to either use Get<PrimitiveType>ArrayElements (which requires a memory allocation and data copy on any JVM which does not support pinning) or to use GetPrimitiveArrayCritical and copy the data manually.

Previous versions of the jcanyon demonstration implemented the latter technique as a comparison to show that the data copy very quickly became the bottleneck in this application; it was roughly a factor of ten slower than using New I/O to avoid the data copy. The current version of the jcanyon demo implements only the New I/O loop, as maintaining separate native code for the slower rendering loop had become too problematic.

This demonstration was originally intended to use the NVIDIA GL_NV_vertex_array_range (VAR) extension to OpenGL for high-throughput rendering of the terrain mesh. However, after implementing the terrain engine as well as supporting code to allow the use of the VAR extension, it was discovered that enabling VAR occasionally caused a slowdown when the polygon count went up; more work is needed to resolve this performance penalty. In the meantime, VAR can be enabled by passing in the -nv command line option to DBFlyer when running on a GeForce-based card with the latest drivers from NVIDIA.

3. Algorithm Details

3.1. Multiresolution Algorithm

The terrain renderer uses a variant of the algorithm described in [LKR96]. The data set is subdivided into roughly 200 tiles and measures 13 by 15 tiles. Each tile contains both elevation and image data. At the highest resolution, each tile measures 513x513 elevation samples and 512x512 image samples. Tiles are downsampled into pyramids by recursively dropping every other sample. Image pyramids for texture mapping are created off-line. The geometric data is decimated at run time by iterating through the highest-resolution data with an appropriate stride.
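The arithmetic behind the pyramids is simple enough to sketch. The class and method names below are hypothetical; only the 513 and 512 sample counts and the rule of dropping every other sample come from the text.

```java
// Hypothetical sketch, not part of jcanyon.
final class PyramidSketch {

    /** Elevation samples per side at `level`; grids of 2^n + 1 samples keep both edges. */
    static int elevationSamples(int fullSize, int level) {
        return ((fullSize - 1) >> level) + 1;
    }

    /** Image samples per side at `level`; each level halves the resolution. */
    static int imageSamples(int fullSize, int level) {
        return fullSize >> level;
    }

    /** Stride through the full-resolution elevation data that yields `level`. */
    static int stride(int level) {
        return 1 << level;
    }

    public static void main(String[] args) {
        for (int level = 0; level < 4; level++) {
            System.out.println("level " + level
                    + ": elevation " + elevationSamples(513, level)
                    + ", image " + imageSamples(512, level)
                    + ", stride " + stride(level));
        }
    }
}
```

The extra elevation sample (513 rather than 512) means adjacent tiles share their edge samples, and that shared edge survives at every level because both 0 and 512 are multiples of every power-of-two stride.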

Maximum deltas as described in [LKR96] are computed off-line for each level of the elevation data pyramid. These deltas are defined as the maximum world-coordinate error incurred in the vertical direction by rendering a given tile at the next-lower resolution. The level-of-detail (LOD) selection algorithm is somewhat dependent on the deltas being monotonically increasing, and it was found that this was not always the case. For this reason, the deltas in the data set are defined as the sum of all of the maximum deltas for all of the previous levels. The vertical error metric has been increased to compensate for these relatively larger deltas.
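One way to read the summation described above is as a running sum over the per-level maximum deltas, sketched below. Whether the sum for a level includes that level's own delta is not stated in the text; this sketch assumes it does, and the names are hypothetical.

```java
// Hypothetical sketch, not part of jcanyon.
final class DeltaSketch {

    /**
     * Replaces each per-level maximum delta with the sum of itself and all
     * finer-level deltas (assumed inclusive). Since deltas are non-negative,
     * the result never decreases from one level to the next.
     */
    static float[] accumulate(float[] maxDeltas) {
        float[] out = new float[maxDeltas.length];
        float sum = 0f;
        for (int i = 0; i < maxDeltas.length; i++) {
            sum += maxDeltas[i];
            out[i] = sum;
        }
        return out;
    }
}
```

For example, raw deltas of 1, 3, 2, 5 (not monotonic) become 1, 4, 6, 11, which is why the vertical error metric then needs to be loosened to compensate.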

During rendering of each frame, the position of the camera is compared to the centroid of each visible tile. The maximum screen-space error is computed and compared against the precomputed maximum deltas; the appropriate decimation level is chosen so as not to exceed the current error. Thus, LOD selection is performed for each tile every frame and each tile's LOD is independent.
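The text does not give the exact error formula, so the following is only one plausible reading of per-tile selection: convert a pixel tolerance into a world-space error at the tile's distance, then pick the coarsest level whose delta fits. The perspective model, the parameter names, and the convention that deltas[0] is zero for the full-resolution level are all assumptions.

```java
// Hypothetical sketch, not part of jcanyon.
final class GeometryLodSketch {

    /** World-space vertical error that projects to `pixelTolerance` pixels at `distance`. */
    static double allowedWorldError(double distance, double fovYRadians,
                                    int viewportHeight, double pixelTolerance) {
        // Pixels covered by one world unit at unit distance under a perspective projection.
        double pixelsPerUnit = viewportHeight / (2.0 * Math.tan(fovYRadians / 2.0));
        return pixelTolerance * distance / pixelsPerUnit;
    }

    /**
     * Returns the coarsest level whose delta does not exceed `allowed`.
     * Assumes deltas[k] is the error of rendering at level k and that the
     * array is monotonically non-decreasing.
     */
    static int selectLevel(double[] deltas, double allowed) {
        int level = 0;
        for (int k = 1; k < deltas.length; k++) {
            if (deltas[k] > allowed) {
                break; // every coarser level is at least as bad
            }
            level = k;
        }
        return level;
    }
}
```

The early exit in selectLevel is only valid when the deltas are monotonic, which is the dependency the preceding paragraph describes.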

The two primary contributions of jcanyon's rendering algorithm are its texture selection technique and a method for eliminating cracks in the surface called filleting.

3.1.1. Texture selection

It was observed that using the same algorithm for texture LOD as is used for geometry LOD caused unacceptable visual artifacts, as too-coarse textures were frequently selected. Synthetic "deltas" for the textures were created and the texture LOD selection decoupled from the geometry, but tuning of the texture deltas did not yield acceptable results; among the reasons for this was a propensity of the LOD algorithm to incorrectly shift to too-coarse LODs for tiles directly underneath the camera. Instead, a new algorithm was devised for selection of texture LODs. This algorithm is an approximation to screen-space error for images, and takes into account that the visible area of the texture decreases first as the tile moves further away from the camera, and second as the angle between the normal vector of the tile and the forward vector of the camera approaches 90 degrees. It attempts to select an LOD for the texture at which magnification of texels will not be visible.

First, the relative areas of the downsampled tiles' textures are computed ahead of time. The zeroth (highest resolution) level has a relative area of one; the first, a relative area of 1/4; the second, a relative area of 1/16; and so on. The distance between the camera and the centroid of the tile is computed; the screen-space area of the rendered tile decreases as 1 / distance^2. The current camera parameters are consulted to discover the distance from the camera at which the tile precisely fills the screen either horizontally or vertically; currently the system assumes that the window is either square or wider than it is tall. This "filling distance" is squared and divided by the square of the camera distance. The result is an approximation to the area, relative to the highest resolution version of the data, which the tile fills in the output window.
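These quantities can be sketched in a few lines. The names are hypothetical, and the filling-distance formula assumes a symmetric perspective frustum and a square, face-on tile measured against the vertical field of view (the binding direction when the window is square or wider than it is tall); the text does not spell out how jcanyon derives it from the camera parameters.

```java
// Hypothetical sketch, not part of jcanyon.
final class TextureAreaSketch {

    /** Relative texture area per pyramid level: 1, 1/4, 1/16, and so on. */
    static float[] relativeAreas(int levels) {
        float[] areas = new float[levels];
        float area = 1.0f;
        for (int k = 0; k < levels; k++) {
            areas[k] = area;
            area *= 0.25f; // halving the resolution on both axes quarters the area
        }
        return areas;
    }

    /** Distance at which a face-on tile of side `tileSize` spans the given field of view. */
    static double fillingDistance(double tileSize, double fovRadians) {
        return (tileSize / 2.0) / Math.tan(fovRadians / 2.0);
    }

    /** Approximate on-screen area relative to the area at the filling distance. */
    static double screenAreaRatio(double fillingDistance, double cameraDistance) {
        return (fillingDistance * fillingDistance) / (cameraDistance * cameraDistance);
    }
}
```

At twice the filling distance the ratio is 1/4, which matches the relative area of the first downsampled level.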

The second factor considered is the angle at which the camera views the tile. The dot product of the camera's forward vector in world coordinates is taken with the world-coordinate normal vector of the tile; for unit vectors, this is the cosine of the angle between them. It was observed that using the square of the cosine of this angle yields visually better results than simply the cosine, and that weighting the cosine term too heavily when the tile is too close to the camera yields visible artifacts. A linear ramp is therefore used to weight the distance from the camera more when the tile is close to the near clipping plane and to weight the angle the tile makes with the camera more when the tile is close to the far clipping plane.

These two factors are combined to compute the relative area the tile currently makes in the field of view; see Equation 1. The precomputed relative areas are searched to find the appropriate level at which the texture should be rendered.

relArea = (1.0f - cos2Alpha) * ooLength2 + (cos2Alpha * dotp * dotp) * ooLength2
where
ooLength2 = (dist0 * dist0) / diffLen2

dist0 = distance from eye point in world-coordinate units when a face-on tile fills the window completely either horizontally or vertically

diffLen2 = squared length of vector (eye_position - tile_position)

cos2Alpha = (diffLen2 - dist0 * dist0) / (farClipDist * farClipDist - dist0 * dist0)

dotp = abs(dot(camera_forward_vector, tile_normal_vector))
Equation 1. Computation of relative area of current tile.
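Equation 1 translates almost directly into code. In the sketch below the relArea method follows the equation term by term using the same variable names; the class name and the selectLevel search are assumptions, since the text says only that the precomputed areas are "searched". The search shown picks the coarsest level whose area still covers the tile, on the reading that a smaller texture would have to be magnified.

```java
// Hypothetical sketch, not part of jcanyon.
final class TextureLodSketch {

    /** Equation 1, written as given; no clamping of the ramp term is applied here. */
    static float relArea(float dist0, float diffLen2, float farClipDist, float dotp) {
        float ooLength2 = (dist0 * dist0) / diffLen2;
        // Linear ramp: 0 at the filling distance, 1 at the far clipping plane.
        float cos2Alpha = (diffLen2 - dist0 * dist0)
                / (farClipDist * farClipDist - dist0 * dist0);
        return (1.0f - cos2Alpha) * ooLength2 + (cos2Alpha * dotp * dotp) * ooLength2;
    }

    /**
     * Returns the coarsest level whose relative area is at least `relArea`.
     * Assumes relativeAreas is decreasing (1, 1/4, 1/16, and so on).
     */
    static int selectLevel(float[] relativeAreas, float relArea) {
        int level = 0;
        for (int k = 1; k < relativeAreas.length; k++) {
            if (relativeAreas[k] < relArea) {
                break; // this level and all coarser ones would be magnified
            }
            level = k;
        }
        return level;
    }
}
```

At the filling distance the ramp term is zero and the result is exactly 1, so the full-resolution texture is chosen regardless of viewing angle; at the far clipping plane the result is the inverse-square falloff scaled by the squared cosine.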

This image level selection algorithm is completely automatic and does not require any tunable parameters. Further, it has been observed to select appropriate levels of texture detail for large terrain data sets. An attempt was made to replace the geometry LOD selection algorithm from [LKR96] with the above-mentioned texture LOD algorithm; this did not work well, as the geometry was in general selected at too high a level of detail, causing the frame rate to drop. Further work is needed to address the degenerate cases of the geometry LOD algorithm.

3.1.2. Fillets

Previous terrain rendering algorithms, for example [LKR96] and [DWS97], have been global: dependencies are propagated between tiles of the terrain to generate a continuous mesh and so avoid visible cracks between adjacent tiles rendered at differing levels of detail. The recording and processing of these dependencies is extremely CPU-intensive, as it requires many or all vertices of the mesh to be processed using binary tree or quadtree algorithms.

An alternative solution was sought that would enable completely independent ("local") processing of tiles in the terrain database. Observation of the behavior of the system indicated that cracks in the terrain were usually a minor feature. It appeared that one could eliminate them by applying an image processing filter to fill in the gaps by interpolating along vertical segments that remained unfilled by the core rendering algorithm. However, reading the image back from the frame buffer to perform such processing is computationally expensive.

A scheme called filleting was devised to solve the problem geometrically. Figure 1 shows one tile from the Grand Canyon data set. A band of polygons is drawn around the edge of each tile which stretches down to the minimum elevation in the database. Each side of this band is textured with the appropriate line of the edge texels. This band covers the space, if any, between adjacent tiles, allowing them to be rendered independently and avoiding global mesh algorithms. In practice, only the forward-facing fillets need be drawn. Figure 2 compares the mesh output with and without fillets. With fillets, there are effectively no visible artifacts, especially when the terrain is in motion. In our application, fillets added roughly 4% to the rendering time, which is relatively inexpensive compared to performing complex selection algorithms on a per-vertex basis. While popping is occasionally visible in the mesh since no vertex morphing is performed, this appears to be largely caused by degenerate behavior of the geometry LOD algorithm and we hope to correct this defect in the future without resorting to per-vertex algorithms.


Figure 1. Relationship of fillets to the tile surface.


Figure 2. Surface without and with fillets.
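The geometry of a fillet is straightforward to sketch for a single tile edge. This is not the jcanyon implementation: the class name, the choice of the row-0 edge, the interleaved (x, y, z, s, t) layout, and the triangle-strip ordering are assumptions. What it takes from the text is that each edge sample is paired with a point at the minimum elevation and that both share the edge texel's texture coordinate.

```java
// Hypothetical sketch, not part of jcanyon.
final class FilletSketch {
    static final int FLOATS_PER_VERTEX = 5; // assumed layout: x, y, z, s, t

    /**
     * Builds a vertical band along the row-0 edge of a size x size height grid,
     * as alternating top and bottom vertices suitable for a triangle strip.
     */
    static float[] edgeFillet(float[] heights, int size, int stride,
                              float spacing, float minElevation) {
        int count = (size - 1) / stride + 1; // edge samples at this decimation level
        float[] strip = new float[count * 2 * FLOATS_PER_VERTEX];
        int o = 0;
        for (int col = 0; col < size; col += stride) {
            float x = col * spacing;
            float s = col / (float) (size - 1);
            // Top vertex: sits on the terrain surface at the tile edge.
            strip[o++] = x;
            strip[o++] = heights[col];
            strip[o++] = 0f;
            strip[o++] = s;
            strip[o++] = 0f;
            // Bottom vertex: same position dropped to the database minimum,
            // with the same texture coordinate so the edge texel is stretched down.
            strip[o++] = x;
            strip[o++] = minElevation;
            strip[o++] = 0f;
            strip[o++] = s;
            strip[o++] = 0f;
        }
        return strip;
    }
}
```

Because the band depends only on the tile's own edge samples and a global minimum, it can be generated without consulting neighboring tiles, which is the property that keeps tile processing local.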

3.2. Flight Dynamics

The flight dynamics are a port of Prof. Richard Murray's public domain simplified F-16 model from Fortran to the Java programming language. An option has been added to allow the angle-of-attack and sideslip aerodynamic forces to be disabled, improving the simulator's stability at the expense of realism. This option is enabled in the jcanyon demonstration.

4. Frequently Asked Questions

  • Q. Isn't Java slow?

    A. No.

  • Q. Is the comparison with earlier JNI calls really fair?

    A. The rendering loop which uses pre-1.4 JNI calls was not optimized, and performed some unnecessary data copying. Based on earlier experiments, optimization could probably bring the pre-1.4 version of the inner rendering loop within a factor of three of the 1.4 version, rather than the factor of ten observed when it was last measured.

5. Source Code

jcanyon-src.zip contains all of the source code for the jcanyon demonstration. The source code is covered by the modified BSD license (the version without the advertising clause); see the enclosed documentation for details.

The version of JOGL used by jcanyon is an unmodified distribution which can be found at the JOGL home page.

6. Acknowledgments

  • Sven Goethel, author of "OpenGL for Java" (used by earlier versions of the jcanyon demonstration)
  • Prof. Richard Murray
  • Rene W. Schmidt
  • Benjamin Sleeter, US Geological Survey
  • Maintainers of aria.arizona.edu
  • US Geological Survey and EROS Data Center in Sioux Falls, SD

7. Feedback

Please direct all feedback to jcanyon-feedback@eng.sun.com. While we will read all email sent, we cannot guarantee a reply.

8. References

[LKR96] P. Lindstrom, D. Koller, W. Ribarsky, L. Hodges, N. Faust, and G. Turner. "Real-time, continuous level of detail rendering of height fields", Proceedings of SIGGRAPH 96, August 4-9, 1996, pp. 109-118.

[DWS97] M. Duchaineau, M. Wolinsky, D. Sigeti, M. Miller, C. Aldrich, and M. Mineev-Weinstein. "Roaming terrain: real-time optimally adapting meshes", IEEE Visualization 97, November 1997, pp. 81-88.

9. Trademarks

Java, HotSpot, Solaris, Sun, and Sun Microsystems are trademarks or registered trademarks of Sun Microsystems, Inc. in the U.S. and other countries.

NVIDIA and GeForce are registered trademarks or trademarks of NVIDIA Corporation in the United States and/or other countries.

OpenGL is a registered trademark of Silicon Graphics, Inc.

All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the United States and other countries.

Windows is a registered trademark of Microsoft Corporation in the United States and/or other countries.