PersonalJava™ 1.1
Application Environment Memory Usage Technical Note
July 1998
Abstract
One of the primary design goals for the PersonalJava application environment
is the minimization of the static memory footprint and the runtime memory
usage. The static memory footprint consists mostly of ROM, but also of
some RAM. It is the memory which is used when the PersonalJava application
environment classes are preloaded with JavaCodeCompact. The runtime memory
usage, as the term suggests, is the memory which is consumed at runtime
by the PersonalJava virtual machine for dynamically loaded classes, native
and Java™ stacks, and heap storage to support dynamic allocation and
garbage collection. Through Sun™'s work on the PersonalJava 1.1 virtual
machine, the following areas of memory usage have been decreased: the
preloaded and dynamically loaded class memory footprint, the native stack
usage, and the Java stack usage. The result is a system with drastically
reduced ROM and RAM requirements and no perceivable degradation in
execution speed.
This paper discusses in detail the optimizations which were made to
achieve an approximately 28% decrease in class memory footprint from the PersonalJava
1.0 application environment to the PersonalJava 1.1 application environment.
It is assumed that the reader is familiar with Java technology and the
workings of the Java virtual machine* as specified in "The Java Virtual
Machine Specification" by Lindholm and Yellin. Familiarity with one of
the Java virtual machines implemented by Sun is also helpful.
Note that the optimizations discussed in this paper affect the implementation of the Java virtual machine and tools such as JavaCodeCompact only.
No change has been made to either the Java Virtual Machine specification or the PersonalJava API.
Introduction
An extensive analysis of memory usage in the Java application
environment identified a few categories, from among the top memory
consumers, as having the most potential for memory usage reduction:
1. The memory footprint of classes and their various components
   - Preloaded (mostly ROM use)
   - Dynamically loaded
2. Native stack usage
3. Java stacks
4. Java heap usage
5. C code size
#1, #2 and #3 were selected as having the most potential for application-independent
memory use reductions. The focus of Sun's work is on these three categories.
#4 is application dependent and therefore hard to reduce generically. #5
does not lend itself to much reduction without significant rewrites
of the virtual machine and native methods.
Class Footprint Reductions
It is customary for the classes of a Java program to be loaded as late
during the program's execution as possible: they are loaded on demand from
the network (stored on a server), or from a local file system when first
referenced during the program's execution. The virtual machine locates
and loads each class, parses the class file, allocates internal data structures
for its various components, and links it in with other loaded classes.
This process makes the method code in the class readily executable by the
virtual machine. A large part of the runtime memory consumption in Java
is due to the dynamic allocation of the internal data structures for classes
and their components (class metadata), and of the memory allocated
for the method bytecodes.
For small and embedded systems which lack fast dynamic class loading
facilities such as a network connection, a local file system, or other
permanent storage, it makes sense to have a "class preloader" to load
classes off-line. Source licensees of the PersonalJava application environment
receive a class preloader called JavaCodeCompact which performs
this task. JavaCodeCompact creates the internal virtual machine data structures
representing a class off-line, and lays them out in mostly read-only memory.
Changes were made that reduce the memory allocations required per class.
Some of these changes occur only in JavaCodeCompact, thus affecting only
the ROM footprint. Some of the other techniques apply both to preloaded
and dynamically loaded classes, affecting both the ROM and RAM footprints.
Memory reductions that apply to preloaded classes only
Using JavaCodeCompact has the advantage of preloading all of the classes
of the PersonalJava application environment at once. This allows for the
sharing of information between classes, thereby reducing the ROM requirement
considerably. Space is saved by exploiting the fact that all symbolic references
in these classes are resolved, and that bytecode instructions referring
to these symbolic entries have been quickened (i.e. re-written
with versions that do not do symbolic lookups).
-
Reduced constant pools
During the preloading process, all symbolic references are resolved
in constant pools, and all bytecodes that make references to these entries
are quickened. Therefore NameAndType and Utf8 constants no
longer need to be kept in class constant pools since Java bytecodes never
have direct references to these constants. Method bytecodes are modified
to refer to the updated constant pool indices after the deletion of the
entries.
-
No type-tables for completely resolved constant pools
Constant pools whose references have been resolved have also had the
referring bytecode instructions quickened. Therefore, there is no need
for type information on these constant pools since the virtual machine
is never going to be resolving their entries.
-
Shared string tables
Each Java class file has occurrences of Utf8 strings in its constant
pool. When JavaCodeCompact preloads multiple classes, it keeps track of
all occurrences of Utf8 strings in its input classes and outputs a shared
Utf8 table to which individual preloaded classes point. Moreover, for any
two strings A and B, for which B is a suffix of A, only string A is generated
in the global string table. B would simply point to the middle of A. In
this manner, common string suffixes are shared in order to save space.
-
Sharing Java String bodies
JavaCodeCompact resolves all class constant pool entries, including
those of type String. It therefore creates Java Strings
with handles as if the runtime memory allocator created them. Each Java
String is formed of an array of chars (the primitive type, not
java.lang.Character objects), an offset field, and a count field.
When creating each String and associated handle, JavaCodeCompact has to
generate the correct String object fields. In the PersonalJava 1.0 application
environment, a separate handle was created for each embedded char
array. To improve upon this in the PersonalJava 1.1 application environment,
one giant char array is created instead that contains the
bodies of all Java Strings. Each Java String is laid
out to refer to a portion of the same array, with the right offset
and count fields initialized. Separate handles for the separate
char arrays are therefore eliminated.
-
Table element sharing/merging
Many of the internal virtual machine data structures are short tables
formed of small integers, like constant pool indices. Examples are tables
listing the exceptions that a method throws, or the interfaces that a class
implements. JavaCodeCompact tracks the contents of each of these tables,
and shares them where appropriate.
-
Eliminating duplication by moving more data structures into read-only
segments
JavaCodeCompact generates most of its preloaded class data structures
in read-only segments. There are, however, certain kinds of data, such
as Java static fields, which need to be stored in read-write segments in
order to allow the virtual machine to write to them. The disadvantage of
this in the PersonalJava 1.0 application environment was that a system
employing preloaded classes needed to copy the data writeable by the virtual
machine from persistent storage to RAM, essentially causing duplication
of that segment. Changes were made in the virtual machine and JavaCodeCompact
to allow the putting of as much data as possible in read-only memory or
BSS (the uninitialized global data segment of the executable program) to
reduce the memory size impact. The BSS section does not suffer from the
extra copy problem, since it is only allocated at load time, and does not
occupy any space in permanent storage.
-
Making fieldblocks read-only: static storage space is now emitted in the
BSS section, instead of emitting fieldblocks representing statics in RAM.
That way the fieldblocks are all read-only, at the price of an extra indirection
to an extra word of storage per static field in the BSS section.
-
Making all method code read-only: changes were made in the virtual machine
which obviate the need for writes into the instruction stream on interface
invocations. That way, all method code can be placed in ROM.
-
Making Str2ID tables read-only: The Str2ID hash tables
for class member signature data and interned Java Strings have
been made read-only. This was made possible through virtual machine changes
which disallow writes into the hash tables.
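The suffix sharing described under "Shared string tables" above can be illustrated with a small sketch. This is not JavaCodeCompact's code: the class name SuffixSharedTable and its methods are hypothetical, strings are assumed to be ASCII so that one char stands for one byte, and each emitted string is assumed to be NUL-terminated as in a C string table. Longer strings are emitted first so that any string which is a suffix of an already emitted one can simply point into its middle.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a suffix-sharing string table; not JavaCodeCompact's code. */
final class SuffixSharedTable {
    private final StringBuilder pool = new StringBuilder();          // the single shared table
    private final Map<String, Integer> offsets = new LinkedHashMap<>();

    /** Builds the table, emitting longer strings first so shorter suffixes can reuse them. */
    SuffixSharedTable(List<String> strings) {
        List<String> sorted = new ArrayList<>(strings);
        sorted.sort((a, b) -> Integer.compare(b.length(), a.length()));
        List<String> emitted = new ArrayList<>();
        List<Integer> emittedAt = new ArrayList<>();
        for (String s : sorted) {
            if (offsets.containsKey(s)) continue;                     // exact duplicate
            int at = -1;
            for (int i = 0; i < emitted.size() && at < 0; i++) {
                String a = emitted.get(i);
                // B is a suffix of A: point into the middle of A.
                if (a.endsWith(s)) at = emittedAt.get(i) + a.length() - s.length();
            }
            if (at < 0) {                                             // no host found: emit it
                at = pool.length();
                pool.append(s).append('\0');                          // NUL terminator (assumed layout)
                emitted.add(s);
                emittedAt.add(at);
            }
            offsets.put(s, at);
        }
    }

    /** Offset of a string that was passed to the constructor. */
    int offsetOf(String s) { return offsets.get(s); }

    /** Total size of the shared table, terminators included. */
    int size() { return pool.length(); }

    /** Reads the NUL-terminated string starting at the given offset. */
    String read(int offset) {
        return pool.substring(offset, pool.indexOf("\0", offset));
    }

    public static void main(String[] args) {
        List<String> in = List.of("Ljava/lang/String;", "(I)Ljava/lang/String;", "String;", "()V");
        SuffixSharedTable table = new SuffixSharedTable(in);
        int unshared = 0;
        for (String s : in) unshared += s.length() + 1;
        System.out.println("unshared size: " + unshared + ", shared size: " + table.size());
        for (String s : in) System.out.println(table.offsetOf(s) + " -> " + table.read(table.offsetOf(s)));
    }
}
```

The quadratic suffix search keeps the sketch short; a real preloader processing thousands of strings would more plausibly sort reversed strings or use a trie.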
Memory reductions that apply to both preloaded and dynamically loaded classes
-
Large static initializer removals:
The specification of the class file format does not allow for a way
to express initialized data. For example, to initialize a large static
constant array, the Java compiler needs to emit class initializer code
that allocates an array and fills it in element by element. For large arrays
typically encountered in code related to internationalization, these static
initializers can get quite large. An example is java.lang.Character
where the bulk of the class file size consists of such static initializer
code. Classes with large static initializers were identified in the PersonalJava
core class libraries and changed to make the initializers unnecessary,
thus reducing their size drastically.
-
Removal of debug information:
The virtual machine and JavaCodeCompact have been changed to ignore
all debugging information in classes, such as the source file name attribute,
line number tables, and local variable tables. This information is only
processed when a debugging option is passed to the virtual machine and
JavaCodeCompact at build time, and ignored otherwise in order to save memory.
-
Data structure reductions:
An examination of the class memory footprint requirements showed
that class metadata supporting loaded classes took up excessive
space. For example, in many cases, a methodblock describing a class
method takes up much more space than the bytecodes for the method. To rectify
this, aggressive changes were made to the virtual machine data structures
representing class components. The changes were done in concert with changes
in JavaCodeCompact, allowing for the reduction of both the ROM footprint
of preloaded classes and the RAM footprint of dynamically loaded classes.
-
Make methodblocks smaller: A total of 48 bytes was removed from each
methodblock, reducing its size from 92 bytes to 44
bytes. Among the fields removed are direct references to a method's name
and signature, and compiler and debugging related information. For the PersonalJava
application environment, with nearly 7000 methods, this reduction alone
translates to around 330 kbytes of ROM savings (7000 x 48 bytes).
-
Remove alignment requirement for method tables: The PersonalJava 1.0 virtual
machine required all method tables to be 32-byte aligned so that
the rightmost 5 bits of a methods pointer could be used to indicate
an object's type. The virtual machine was modified to eliminate this requirement.
This saved an average of 16 bytes per class that were wasted on alignment
padding, and another 4 bytes in the class block that pointed to the
non-aligned memory block allocated for the methodtable.
-
Remove cbSupername, cbFinalizer: The implementation of runtime
class linking was changed to obviate the need for the classblock fields
indicating the class superclass name and the class finalizer method. This
saved 8 bytes per class.
-
Remove wasted first word of the methodtable: A word of space per class
was wasted in the PersonalJava 1.0 application environment when storing
its methodtable. The virtual machine implementation was changed to be able
to pack the methodtable tighter and to eliminate the extra word of storage.
This extra word was a vestige of the original implementation and was not
used to store any data.
-
Enforce class-file size limits in runtime data structures: The PersonalJava
1.0 virtual machine had wasteful internal representations for some class
file components which were constrained in size by the class file specification.
For example, for certain types of data represented in 16-bits in the class
file format, the PersonalJava 1.0 virtual machine allocated a full 32-bit
machine word. The types of data which were changed to conform to the class
file constraints include:
-
Exception table lengths
-
PC values in line number tables and exception tables
-
Method code lengths
The result is the halving of the widths of these fields in the internal
representations yielding more memory savings.
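The link between 32-byte alignment and the "rightmost 5 bits" mentioned under the method table item above may be easier to see in code. The following is a hypothetical model only: addresses are plain long values, the class and method names are invented, and the note does not describe how the PersonalJava 1.0 virtual machine actually encoded the type. Because 32 = 2^5, every aligned address has five zero low-order bits that can carry a small tag, at the cost of padding each table up to the next 32-byte boundary.

```java
/** Hypothetical model of pointer tagging on 32-byte-aligned tables; addresses are plain longs. */
final class AlignedTagDemo {
    static final int ALIGNMENT = 32;               // bytes, as required by the 1.0 virtual machine
    static final long TAG_MASK = ALIGNMENT - 1;    // 0x1F: the rightmost 5 bits

    /** Rounds an address up to the next multiple of ALIGNMENT. */
    static long alignUp(long address) {
        return (address + TAG_MASK) & ~TAG_MASK;
    }

    /** Bytes of padding lost when a block starting at this address is aligned. */
    static long padding(long address) {
        return alignUp(address) - address;
    }

    /** Stores a small type tag in the low bits of an aligned pointer. */
    static long tag(long alignedPointer, int typeTag) {
        if ((alignedPointer & TAG_MASK) != 0) {
            throw new IllegalArgumentException("pointer is not 32-byte aligned");
        }
        if (typeTag < 0 || typeTag > TAG_MASK) {
            throw new IllegalArgumentException("tag does not fit in 5 bits");
        }
        return alignedPointer | typeTag;
    }

    /** Recovers the aligned pointer by clearing the tag bits. */
    static long pointerOf(long tagged) { return tagged & ~TAG_MASK; }

    /** Recovers the tag from the low 5 bits. */
    static int tagOf(long tagged) { return (int) (tagged & TAG_MASK); }

    public static void main(String[] args) {
        long raw = 0x1004L;                        // made-up allocator result
        long table = alignUp(raw);
        long tagged = tag(table, 7);               // 7 is an arbitrary example tag
        System.out.printf("raw=0x%x aligned=0x%x padding=%d tagged=0x%x tag=%d%n",
                raw, table, padding(raw), tagged, tagOf(tagged));
    }
}
```

Once the type is no longer kept in the pointer, the table can start at whatever address the allocator returns, and the padding disappears.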
Numbers
Here are some measurements that demonstrate the effectiveness of the class
memory footprint reductions in the PersonalJava 1.1 application environment.
We measure the memory footprint of the pre-loaded classes in ROM and RAM.
This measurement is also highly characteristic of memory footprint for
dynamically loaded classes.
The first comparison is of the PersonalJava 1.1 application environment
with all optional packages included (rmi, sql, math, zip, and code signing),
with a similarly configured Java 1.1.6 application environment, adjusted
to exclude components unsupported by the PersonalJava application environment
(e.g. java.security.acl).
| | PersonalJava 1.1 Application Environment | Java 1.1.6 Application Environment |
|---|---|---|
| No. of classes [1] | 913 | 911 |
| Total size of class files | 1.997M | 2.022M |
| Total class memory footprint | 1.593M | 2.222M |
| % of RAM in total footprint | 0.2% | 6.5% |

[1] For the PersonalJava 1.1 application environment, this number includes core as well as all optional packages. For the Java application environment, this number includes the packages which would be necessary to implement as close to the same functionality as for a fully configured PersonalJava application environment.
The Java 1.1.6 application environment comes out to be around 39.5% larger.
It also has about 32.5 times the RAM requirement of the PersonalJava 1.1
application environment.
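The two figures quoted for this comparison can be recomputed from the table values. The short sketch below does so; the class and method names are hypothetical. Note that the factor of 32.5 is the ratio of the two "% of RAM in total footprint" entries, that is, a ratio of RAM shares rather than of absolute RAM sizes.

```java
/** Recomputes the ratios quoted for the first comparison from the table values. */
final class FootprintRatios {
    /** How much larger b is than a, in percent. */
    static double percentLarger(double a, double b) {
        return (b / a - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        double pjavaFootprint = 1.593;   // PersonalJava 1.1 total class memory footprint, M
        double javaFootprint = 2.222;    // Java 1.1.6 total class memory footprint, M
        double pjavaRamShare = 0.2;      // % of RAM in total footprint, PersonalJava 1.1
        double javaRamShare = 6.5;       // % of RAM in total footprint, Java 1.1.6

        System.out.printf("Java 1.1.6 footprint is %.1f%% larger%n",
                percentLarger(pjavaFootprint, javaFootprint));
        System.out.printf("Ratio of RAM shares: %.1f%n", javaRamShare / pjavaRamShare);
    }
}
```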
The second comparison is of the PersonalJava 1.1 application environment
with no optional packages included, with the PersonalJava 1.0 application
environment.
| | PersonalJava 1.0 Application Environment | PersonalJava 1.1 Application Environment |
|---|---|---|
| No. of classes [2] | 641 | 689 |
| Total size of class files | 1.284M | 1.506M |
| Total class memory footprint | 1.402M | 1.227M |
| % of RAM in total footprint | 7.6% | 0.3% |

[2] For both the PersonalJava 1.0 and 1.1 application environments, this number includes core packages only. Since the PersonalJava 1.1 application environment has a richer set of features than the PersonalJava 1.0 application environment, there are more classes in the PersonalJava 1.1 application environment.
The PersonalJava 1.0 application environment comes out to be 14% bigger,
even though it has only 83% of the .class data of the PersonalJava 1.1
application environment. Normalized, this translates to the class footprint
of the PersonalJava 1.0 application environment being 38% bigger than that
of the PersonalJava 1.1 application environment. Similarly, the RAM requirement
for the classes has gone down as much as 25-fold when measured relative
to total class memory footprint. The normalized value is around 30-fold.
Note that the results shown above are for the core classes of the PersonalJava
application environment which are stored primarily in ROM. More important,
perhaps, is the memory usage savings which can be achieved in RAM. By recognizing
that the core classes contain a representative cross section of the classes
which will appear in PersonalJava applications, the above analysis can
be extended to dynamically loaded applications which are the primary consumers
of RAM. The result is that an equivalent savings is expected for RAM usage.
Native Stack Size Reductions
Extensive static analysis of the native code in the virtual machine and
supporting libraries was performed with the aim of:
-
Determining the typical native stack requirements
-
Ensuring system robustness against stack overflow situations caused by
tight stacks
These make it possible to determine a tight upper bound on the stack usage
of any thread. To realize this, the following investigations and modifications
were performed:
-
Recursion removal and introduction of dynamic stack checks:
First, all self-recursions and call cycles in the native code which
could result in uncontrolled stack overflow were identified. Most of the
cycles could be eliminated by iterative rewrites. There were, however,
some call cycles that could not be removed. An example is native code to
Java bytecode transitions and back: there can essentially be several such
transitions, with no real bound on the number. On each such cycle, a routine
was chosen which employs a dynamic stack check at its beginning, testing
for remaining stack space before executing the current routine. These test
points are called safe points, since in the case of insufficient
stack space, these are the places to throw a StackOverflowError. If execution
were to continue, there would be an uncontrolled stack overflow with possible
memory corruption. If the remaining stack space at a safe point is under
a certain limit (the stack red zone), a StackOverflowError is thrown.
-
Making the stack red zone smaller:
After having dealt with recursion, all call paths requiring excessive
stack usage were identified. The frames that "contributed" the most to
these paths were chosen and analyzed on a case by case basis. It was possible
to significantly reduce the stack usage of most of these by moving large
local variable storage to global static storage for non-reentrant routines
and by dynamically allocating storage for reentrant routines. In addition,
the worst case stack requirement was reduced on any path between two stack
checks by introducing extra stack checks. This allowed the stack red zone
to be made smaller.
These efforts resulted in well-defined, exact stack requirements. Static
analysis of stack consumption performed on Solaris™/SPARC™
revealed that the appropriate red zone in the virtual machine implementation
is 3.3 kbytes plus the maximum stack consumption of the underlying library
functions. As for native method implementations including AWT and socket
features, another red zone called the native red zone was introduced.
Unfortunately, since some Motif routines require large stacks, the native
red zone had to be set to 80 kbytes for the reference implementation on
Solaris/SPARC.
-
Checking stack consumption of an actual application:
The virtual machine's stack usage was checked by running actual applications
like the Personal Applications Browser (a Web browser). It was found that
the worst case stack usage at any stack check
point in the virtual machine code was less than 12 kbytes. As for the check
point for the "native red zone", the maximum usage was less than 10 kbytes.
Thus, a typical upper bound for the thread stack size of the SPARC implementation
would be:
upperBound = max {12k + 3.3k + libfuncStackMax, 10k + nativeImplStackMax}
where libfuncStackMax is the maximum stack consumption of the
library functions used by the virtual machine like vfprintf, abort,
free, etc., and nativeImplStackMax is the maximum stack
consumption of the native method implementations including underlying libraries.
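As a worked instance of the bound above, the sketch below evaluates the formula with the measured figures from this note (12 kbytes, 3.3 kbytes and 10 kbytes). The two library terms are not given in the note, so the values passed in main are placeholders chosen purely for illustration, and the class name StackBound is hypothetical.

```java
/** Evaluates the thread stack upper bound from this note; the library figures are placeholders. */
final class StackBound {
    static final int VM_WORST_CASE = 12 * 1024;                     // worst case at any VM stack check point
    static final int VM_RED_ZONE = (int) Math.round(3.3 * 1024);    // 3.3 kbytes
    static final int NATIVE_WORST_CASE = 10 * 1024;                 // worst case at the native red zone check

    /** upperBound = max {12k + 3.3k + libfuncStackMax, 10k + nativeImplStackMax}, in bytes. */
    static int upperBound(int libfuncStackMax, int nativeImplStackMax) {
        return Math.max(VM_WORST_CASE + VM_RED_ZONE + libfuncStackMax,
                        NATIVE_WORST_CASE + nativeImplStackMax);
    }

    public static void main(String[] args) {
        // Placeholder values: stack-frugal libraries versus a stack-hungry native toolkit.
        System.out.println("frugal libraries: " + upperBound(4 * 1024, 8 * 1024) + " bytes");
        System.out.println("80 kbyte native requirement: " + upperBound(4 * 1024, 80 * 1024) + " bytes");
    }
}
```

With small library terms the first operand dominates and the bound stays under 20 kbytes; with a large native requirement the second operand takes over.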
Based on these efforts, a default thread stack size of 20 kbytes is expected
to be a good candidate if libraries optimized to reduce stack consumption
are used. Moreover, the potential for a memory corruption caused by stack
overflow is eliminated since it is guaranteed that a StackOverflowError
will be thrown in advance at a safe point. This result is a significant
improvement over the PersonalJava 1.0 virtual machine stack size of 128 kbytes,
which was very conservative and yet still not guaranteed to be safe.
Note that the SPARC architecture consumes more stack because of its
register windows architecture, and that a lower default stack size can
be expected for embedded purpose CPUs.
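The safe point mechanism described in this section can be modeled in a few lines. The sketch below is a toy model under stated assumptions: stack usage is tracked as a byte counter instead of comparing real stack addresses, every frame is assumed to have the same size, and the names SafePointModel, safePoint and transition are hypothetical. It shows the key property: as long as no path between two checks consumes more than the red zone, the error is raised at a check before the stack limit is actually crossed.

```java
/** Toy model of safe-point stack checks; a real VM compares stack addresses in native code. */
final class SafePointModel {
    private final int stackSize;   // total thread stack, bytes
    private final int redZone;     // headroom required at a safe point, bytes
    private int used;              // bytes currently in use

    SafePointModel(int stackSize, int redZone) {
        this.stackSize = stackSize;
        this.redZone = redZone;
    }

    /** Dynamic check placed at the start of one routine on each unavoidable call cycle. */
    void safePoint() {
        int remaining = stackSize - used;
        if (remaining < redZone) {
            throw new StackOverflowError("only " + remaining + " bytes left at safe point");
        }
    }

    /** Simulates a call cycle (for example native to Java and back) repeated depth more times. */
    void transition(int depth, int frameBytes) {
        safePoint();                   // check before consuming the frame
        used += frameBytes;
        try {
            if (depth > 0) transition(depth - 1, frameBytes);
        } finally {
            used -= frameBytes;        // frame is released on return or unwind
        }
    }

    int used() { return used; }

    public static void main(String[] args) {
        SafePointModel stack = new SafePointModel(20 * 1024, 3379);
        stack.transition(10, 512);     // shallow nesting fits
        try {
            stack.transition(1000, 512);   // unbounded nesting is stopped at a safe point
        } catch (StackOverflowError e) {
            System.out.println("controlled overflow: " + e.getMessage());
        }
    }
}
```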
Java Stack Size Reductions
Java stacks in the PersonalJava virtual machine are allocated in chunks
during execution of a program; whenever the virtual machine determines
that more Java stack space is required, it allocates a new stack chunk.
A few typical applications were analyzed to find an optimal Java
stack chunk size for the PersonalJava application environment, with the
goal of minimizing wasted stack space while letting the average case run in
as few chunks as possible. A stack chunk size of 2 kbytes was found to
work well, improving upon the 8 kbytes size
in the PersonalJava 1.0 virtual machine.
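The trade-off behind the chunk size can be sketched as follows. This is a simplified illustration, not the analysis that was performed: the per-thread stack demands are invented, every thread is assumed to hold at least one chunk, frames are assumed to pack perfectly into chunks, and the class name StackChunks is hypothetical.

```java
/** Hypothetical comparison of Java stack chunk sizes; the per-thread demands are invented. */
final class StackChunks {
    /** Number of fixed-size chunks needed to hold demandBytes of Java stack (at least one). */
    static int chunksNeeded(int demandBytes, int chunkBytes) {
        return Math.max(1, (demandBytes + chunkBytes - 1) / chunkBytes);
    }

    /** Allocated but unused bytes for one thread. */
    static int wasted(int demandBytes, int chunkBytes) {
        return chunksNeeded(demandBytes, chunkBytes) * chunkBytes - demandBytes;
    }

    public static void main(String[] args) {
        int[] demands = {300, 900, 1500, 2500, 5000};   // invented peak stack use per thread, bytes
        for (int chunk : new int[] {2 * 1024, 8 * 1024}) {
            int totalWaste = 0;
            int totalChunks = 0;
            for (int d : demands) {
                totalWaste += wasted(d, chunk);
                totalChunks += chunksNeeded(d, chunk);
            }
            System.out.println(chunk + "-byte chunks: " + totalChunks
                    + " chunks allocated, " + totalWaste + " bytes unused");
        }
    }
}
```

Smaller chunks waste less memory for threads with shallow stacks but require more allocations for deep ones, which is why the average case was used to pick the size.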
Conclusion
The consumer device market is one which is driven by thin margins and high
volume. In order to achieve profitability in this market, device manufacturers
are always seeking ways to reduce component costs. A large part of this
cost is the cost of memory. Sun understands this need and has undertaken
efforts to optimize the PersonalJava application environment in order to
minimize the memory requirements. This paper describes the techniques which
were employed to achieve an approximately 28% reduction in memory requirements for the
core classes of the PersonalJava application environment from version 1.0
to version 1.1. Because these classes are a representative cross-section
of the classes used by dynamically-loaded applications, a similar reduction
is expected in the RAM usage requirements. Going forward, Sun will
continue to make improvements to reduce memory requirements in order to
support customer needs as the industry evolves.
Copyright © 1998 Sun Microsystems, Inc., 901 San
Antonio Road, Palo Alto, CA 94303 USA. All rights reserved.
*As used on this web site, the terms "Java virtual machine" or "JVM"
mean a virtual machine for the Java platform.