Bio: Tony Printezis is a Staff Engineer at Sun Microsystems, based in Burlington, Massachusetts. He joined Sun in 2002 after a two-and-a-half-year collaboration while he was a member of the faculty of the Department of Computing Science of the University of Glasgow in Scotland. He obtained a BSc(Hons) in 1995, and a Ph.D. in 2000, both from the University of Glasgow. Tony has been contributing to the Java HotSpot Virtual Machine, as well as the Java Real-Time System, since the beginning of 2006. Before that, he worked at Sun Labs for three-plus years. He spends most of his time working on dynamic memory management for the Java platform, concentrating on performance, scalability, responsiveness, parallelism, and visualization of garbage collectors.
Q: Tell us about the Garbage First Garbage Collector, which you have described as the next-generation low-pause garbage collector that will be included in the Java HotSpot virtual machine.
A: I'll refer to it as G1, our internal nickname. G1 will be the low-pause garbage collector that will ultimately replace the Concurrent Mark-Sweep (CMS) garbage collector, Sun's current low-pause garbage collector in the HotSpot JVM. CMS is widely used today by customers and, hopefully, G1 will also be widely used when it replaces CMS.
G1 is a departure from what we've done in the past. All our previous collectors have had a physical separation between the young and old generations. With G1, even though it is generational, there is no physical separation between the two generations.
The heap is split into fixed-size regions, and the separation between the two generations is basically logical: some regions are considered young, some old. All space reclamation in G1 is done through copying. G1 selects a set of regions, picks the surviving objects from those regions, and copies them to another set of regions. This is how all space reclamation happens in G1, instead of the combination of copying and in-place de-allocation that CMS does.
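As a rough illustration of the region idea described above, the following toy model treats a generation as a label on a region and reclaims space only by copying survivors out. It is a sketch under simplifying assumptions, not HotSpot code: the names Region, Obj, Kind, and evacuate are hypothetical, liveness is given as a flag rather than computed by marking, and region capacity is ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only. Region, Obj, Kind, and evacuate are hypothetical names
// chosen for illustration; they are not HotSpot internals.
class RegionEvacuationSketch {
    enum Kind { FREE, YOUNG, OLD }

    static final class Obj {
        final String name;
        final boolean live; // assumed to be known already (no marking here)
        Obj(String name, boolean live) { this.name = name; this.live = live; }
    }

    static final class Region {
        Kind kind = Kind.FREE; // the generation is a label, not an address range
        final List<Obj> objects = new ArrayList<>();
    }

    // Copy the survivors of every region in the collection set into the
    // destination region, then return the source regions to the free pool.
    // The destination must not be part of the collection set.
    static void evacuate(List<Region> collectionSet, Region destination, Kind destinationKind) {
        destination.kind = destinationKind;
        for (Region source : collectionSet) {
            for (Obj o : source.objects) {
                if (o.live) {
                    destination.objects.add(o);
                }
            }
            source.objects.clear();
            source.kind = Kind.FREE;
        }
    }

    public static void main(String[] args) {
        Region young = new Region();
        young.kind = Kind.YOUNG;
        young.objects.add(new Obj("a", true));
        young.objects.add(new Obj("b", false));

        Region target = new Region();
        evacuate(List.of(young), target, Kind.OLD);

        System.out.println("survivors copied: " + target.objects.size());
        System.out.println("source region is now: " + young.kind);
    }
}
```

Because survivors are packed into the destination region, the freed source regions are entirely empty, which is what makes copying a form of compaction.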
Three Objectives of G1
Q: You identify three objectives of G1.
A: The first objective is consistent low pauses over time. In essence, G1 compacts as it proceeds, because it copies objects from one area of the heap to another. Thus, because of compaction, it will not encounter the fragmentation issues that CMS might. There will always be areas of contiguous free space from which to allocate, allowing G1 to have consistent pauses over time.
The second objective is to avoid, as much as possible, having a full GC. After G1 performs a global marking phase determining the liveness of objects throughout the heap, it will immediately know where in the heap there are regions that are mostly empty. It will tackle those regions first, making a lot of space available. This way, the garbage collector will obtain more breathing space, decreasing the probability of a full GC. This is also why the garbage collector is called Garbage-First.
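The "mostly empty regions first" ordering described above can be sketched as a simple sort. This is only an illustration under stated assumptions: RegionInfo, chooseCollectionSet, and the copy budget are hypothetical, and the per-region live-byte counts are taken as given (in the interview they come from the global marking phase). The sketch does not model how G1 actually decides how many regions to collect.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only. RegionInfo, chooseCollectionSet, and the
// copy budget are hypothetical; they are not the HotSpot implementation.
class GarbageFirstSelectionSketch {
    static final class RegionInfo {
        final int id;
        final long liveBytes; // assumed to come from a completed marking phase
        RegionInfo(int id, long liveBytes) { this.id = id; this.liveBytes = liveBytes; }
    }

    // Visit regions from least live data to most (that is, most garbage
    // first) and keep adding them until the next region's survivors would
    // exceed the amount of copying we are willing to do.
    static List<Integer> chooseCollectionSet(List<RegionInfo> regions, long copyBudgetBytes) {
        List<RegionInfo> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparingLong((RegionInfo r) -> r.liveBytes));

        List<Integer> chosen = new ArrayList<>();
        long bytesToCopy = 0;
        for (RegionInfo r : sorted) {
            if (bytesToCopy + r.liveBytes > copyBudgetBytes) {
                break;
            }
            bytesToCopy += r.liveBytes;
            chosen.add(r.id);
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<RegionInfo> regions = List.of(
                new RegionInfo(0, 900),
                new RegionInfo(1, 50),
                new RegionInfo(2, 0),
                new RegionInfo(3, 400));
        // With a budget of 500 bytes of copying, this prints [2, 1, 3]:
        // the empty region first, then the nearly empty ones.
        System.out.println(chooseCollectionSet(regions, 500));
    }
}
```

Regions with little live data are cheap to evacuate and free the most space per unit of copying work, which is the breathing space the answer refers to.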
The final objective is good throughput. For many of our customers, throughput is king. We want G1 to have good throughput to meet our customers' requirements.
Q: What misconceptions do you encounter about garbage collection and memory management?
A: Java developers tell us that they want to do their best to optimize their applications to help the garbage collector. However, the garbage collector generally performs well without this help. I recommend that developers keep their code simple and understandable, and the garbage collector will do just fine in most cases.
Which GC to Use
Q: When are the different forms of garbage collector most appropriate?
A: Most production VMs have at least two flavors. One is the throughput garbage collector, which tries to maximize throughput. The second is a low pause time or low latency garbage collector, which optimizes for responsiveness.
So, if you care about getting the job done as quickly as possible, and don't care much about how long your application is going to be stopped by the garbage collector, the throughput collector is the best choice. For example, if you have a batch job that is going to take a few minutes or a few hours and you want it to be done as quickly as possible, then a throughput collector is clearly the best choice. But, if you are working on a very interactive job that needs to interact with people, other applications, or users through web pages, then a low latency garbage collector is the best choice.
And there are clear trade-offs between them. The low latency garbage collector is not optimized to maximize throughput and, thus, might not be the best choice for an application in which throughput is of most importance. The opposite is true too: a throughput garbage collector is not optimized to provide low latencies.
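To see which collectors a given JVM is actually running with, one option is to ask the standard java.lang.management API. The sketch below is not from the interview; the class name ActiveCollectorsSketch is hypothetical, and the collector names it prints are chosen by the JVM and vary by vendor and version. On HotSpot, the collector is typically selected with a startup flag such as -XX:+UseParallelGC or -XX:+UseG1GC, depending on what the installed version supports.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Minimal sketch: ActiveCollectorsSketch is a hypothetical name. The MXBean
// calls are part of the standard java.lang.management API.
class ActiveCollectorsSketch {
    // Returns the names the running JVM reports for its garbage collectors.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // The output depends on the JVM and on the flags it was started with.
        System.out.println("Active collectors: " + collectorNames());
    }
}
```

Running the same program under different collector flags is a quick way to confirm that a throughput or low-latency configuration took effect.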
Q: Explain what you mean by "simple code".
A: Understandable code without complicated optimization. Unnecessary optimization might introduce bugs and doesn't help the garbage collector. Simple is usually best with regard to garbage collection.
Q: Why does garbage collection take so long?
A: Garbage collection is very memory-bound. And memory speeds these days are quite slow compared to CPU speeds.
Q: Brian Goetz, another Java Rock Star, once remarked in an interview, "Often the way to write fast code in Java applications is to write dumb code". Would this apply to what you say about garbage collection?
A: Absolutely. The more you complicate things, the more likely you'll confuse the collector. In a 2007 JavaOne session, titled Garbage Collection Friendly Programming, Peter Kessler, John Coomes, and I talked a lot about this. The session is available to download.
Conceptualizing GC Performance - The "Pick Two" Rule
Q: How do you conceptualize garbage collection performance?
A: There are three basic components. One of them is throughput -- basically what overhead the garbage collector imposes on the application. The second is responsiveness, or how low and how consistent the garbage collection pauses are. And the third one is footprint -- how much memory space the garbage collector requires to do its job.
And typically, for any collector, you have to pick two and sacrifice the third. So, if you want, for example, a garbage collector that has very good throughput and very good footprint, then you're going to have to sacrifice low pause times. You're going to do collections more often and the pause times might be longer.
If you want good throughput and good responsiveness, then you usually have to run with a considerably larger heap, so the footprint is not as good. A "pick two" rule seems to apply here.
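Two of the three components above, throughput overhead and footprint, can be roughly observed from inside a running JVM. The sketch below is an approximation under stated assumptions: GcTradeoffSketch and its method names are hypothetical, the reported collection time is whatever the JVM chooses to accumulate (for concurrent collectors it may not equal time the application was stopped), and individual pause lengths are not measured here at all.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Rough sketch only. GcTradeoffSketch and its methods are hypothetical names;
// the figures are approximations, not a substitute for GC logs.
class GcTradeoffSketch {
    // Sum of the accumulated collection time reported by all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the JVM does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Throughput view: fraction of elapsed time attributed to garbage collection.
    static double gcOverhead(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return (double) gcMillis / uptimeMillis;
    }

    // Footprint view: heap memory currently committed to this JVM.
    static long committedHeapBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getCommitted();
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.println("GC overhead (approx.): " + gcOverhead(totalGcMillis(), uptime));
        System.out.println("Committed heap (bytes): " + committedHeapBytes());
    }
}
```

Comparing these two numbers across runs with different heap sizes gives a feel for the trade-off: a larger heap usually lowers the overhead figure while raising the footprint figure.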
Q: Do you expect the "pick two" rule to change?
A: Not anytime soon.
Photography and Programming
Q: You are a talented photographer. Do you see any connection between your work as a developer and the skills needed in photography?
A: It's basically a matter of commitment. I have been known to wait for two or three hours for the light to be just right for a particular picture, and I think that's similar to development. You need to be committed and to be patient and try out things again and again, to make sure that you get it just right. I see some parallels between photography and development.
Advice for Beginners
Q: What advice would you give to a programmer just starting out?
A: Read and study lots of existing code and code samples, and use some of the APIs -- it's the best way to learn how other people program. And don't be afraid to search the web for code examples.
Q: What do you enjoy most and least about programming?
A: This may sound cheesy, but I like most the joy of creating something that many people use. It's quite a kick.
What I enjoy least about programming has to be finding concurrency bugs in garbage collectors!
Q: Any particular code or project that you consider the highlight of your career?
A: As part of my internship at Sun, at Sun Labs in 1998, I wrote the first version of CMS.
Q: Finally, what do you consider beautiful code?
A: Beautiful code is code that is simple, easy to understand, and efficient. And often, it gets its efficiency from simplicity. At times, you can't avoid writing ugly code to get around bottlenecks, but sometimes there's a beautiful solution that is simple, easy to understand, and efficient.
For More Information
- Tony Printezis's JavaOne Conference Sessions:
- TS-5419
Session Title: The Garbage-First Garbage Collector
Paul Ciciora, Chicago Board Options Exchange; Antonios Printezis, Sun Microsystems, Inc.
- Tony will also be present to answer questions at two BOFs:
- BOF-5268
Session Title: Java SE Platform: Your Performance Questions and Answers
- BOF-5218
Session Title: Meeting the Java SE Platform Virtual Machine Engineering Team