Continuous Integration in the Cloud With Hudson
In the session titled Continuous Integration in the Cloud With Hudson (TS-5301), Sun's Kohsuke Kawaguchi, project owner of java.net's Hudson Project, aided by Sun's Jesse Glick, gave a presentation on Hudson, the five-year-old open-source continuous integration system written in Java. Hudson enables developers to automate various aspects of the development process, such as builds and tests, and brings transparency to projects.
Continuous integration is a computing trend that focuses on offloading work from people to computers and making use of greater CPU power. It helps developers cope with bloated IDEs, static code analysis tools, and the need to reduce build/test executions. This wave is happening on the server side.
Hudson, an open-source CI (continuous integration) server at java.net, is easy to install and offers a GUI for human users. It now has more than 140 community-developed public plug-ins from 120 or more contributors, and an estimated 13,000 installations. Hudson has been adopted by such leading companies and projects as eBay, Yahoo, SwingLabs, GlassFish, JBoss, Lucene, and MySQL, and adoption is strong across a variety of industries.
A new Hudson enhancement enables it to interface with cloud services and virtualization technologies so that users can improve resource utilization, reduce maintenance overhead, and cope with sudden spikes in system load. The session briefly introduced the lower-level libraries that enable Hudson to interface with cloud and virtualization services, and explained related enhancements, such as Project Kenai and NetBeans-related options.
Kawaguchi began by observing that many organizations have under-utilized computers but lack the software needed to use them optimally.
"The price of computing power is getting cheaper and cheaper," said Kawaguchi, "while the price of human beings gets more expensive, so it makes sense to spend more on machines and to equip each one of us with more computers to get more bang for the buck." Hudson places great emphasis on ease of use and installation, plus extensibility in an effort to harness the power of the community. "I'm the only guy at Sun working on Hudson full time, and we have more than 140 plug-ins, so this is truly a community-driven project." Going Distributed
In addition to the economic pressure on companies to rely more on computers, there are practical reasons for distributing builds as well. "Your software may need to run on multiple environments so you must test them on multiple environments," said Kawaguchi. "Or your task depends on specific resources -- to test them multiply and concurrently, you need multiple environments."
Distributed builds with Hudson begin with what Kawaguchi called a "master" that serves HTTP requests and stores all important information about builds, test results, artifacts, and so on. The master controls "slaves:"
Master and slave interact via SSHD (Secure Shell Daemon) through the following process:
"A key part of the Hudson architecture is that the master and slave only need a single communication channel which is bi-directional," said Kawaguchi. "Hudson can be deployed on very different networks." Hudson works with OpenSolaris, Ubuntu, CentOS and Fedora. Once started, Hudson can be installed as a Windows service in which:
Tools and Hudson
Sun's Jesse Glick addressed the issue of tools. "One of the problems with large builds and lots of machines is that they do not have the right kind of tools for development for builds and tests pre-installed," explained Glick. "So, for example, if you are installing Ubuntu Server Edition on a new slave with a default installation, it probably won't have development tools like Maven and others." As a result, with a large number of nodes, it becomes cumbersome to perform builds on additional machines. Glick explained various customized tool options being created for Hudson, along with techniques for easier installation.
The Heterogeneous Cluster Challenge
Kawaguchi then spoke of the "heterogeneous cluster challenge": builds and tests need to run on specific environments, yet tying jobs to individual nodes hurts utilization. Slaves are easiest to manage when they all look alike, which is hard to achieve in a heterogeneous environment. Hudson responds to the challenge with "labels," which tie jobs to a group of slaves rather than to a single machine. So Wombat, Hudson, and GlassFish Windows tests might all be assigned to a Windows label, which Hudson then resolves to specific slaves. "A single label can have multiple nodes, or each node can have different multiple labels," explained Kawaguchi. "So if Windows needs to run on Windows labels, Hudson can pick Windows slaves that satisfy that capacity. You can maintain a heterogeneous cluster fairly painlessly."
Forecasting Failures
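Before moving on, the label mechanism from the heterogeneous cluster discussion can be illustrated with a minimal Python sketch. This is not Hudson's actual code: the `Node` class, the `pick_node` function, and the node and label names are all hypothetical, chosen only to show how a job tied to a label can be resolved to any idle slave carrying that label.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A build slave and the labels attached to it (hypothetical model)."""
    name: str
    labels: set = field(default_factory=set)
    busy: bool = False


def pick_node(nodes, required_label):
    """Return the first idle node carrying the required label, or None."""
    for node in nodes:
        if required_label in node.labels and not node.busy:
            return node
    return None


# One label can cover several nodes, and one node can carry several labels.
nodes = [
    Node("slave-1", {"windows"}, busy=True),
    Node("slave-2", {"windows", "jdk6"}),
    Node("slave-3", {"linux"}),
]

chosen = pick_node(nodes, "windows")
print(chosen.name if chosen else "no idle node; job stays in the queue")
```

Because the job names only the label, adding another Windows slave increases capacity without touching any job configuration.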
Hudson monitors key health metrics of slaves, such as low disk space, insufficient swap, and an out-of-sync clock. When a metric falls outside its healthy range, Hudson automatically takes the slave offline so that it does not affect the builds or tests being run; the system is designed to catch problems before builds are broken. "In running a big cluster, keeping the cluster healthy is really important because when people start seeing build failures because of infrastructure instability, a 'crying wolf' effect prevents people from paying attention to the build failures," explained Kawaguchi.
When to Add Slaves?
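Before moving on, the health monitoring just described can be sketched in a few lines of Python. The thresholds, the `SlaveHealth` class, and the function names below are assumptions made for illustration; the session did not give concrete values, and this is not how Hudson itself is implemented.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the session did not state concrete values.
MIN_FREE_DISK_MB = 1024
MIN_FREE_SWAP_MB = 256
MAX_CLOCK_SKEW_S = 60


@dataclass
class SlaveHealth:
    name: str
    free_disk_mb: int
    free_swap_mb: int
    clock_skew_s: float  # absolute difference from the master's clock
    online: bool = True


def health_problems(slave):
    """List the health checks that this slave currently fails."""
    problems = []
    if slave.free_disk_mb < MIN_FREE_DISK_MB:
        problems.append("low disk space")
    if slave.free_swap_mb < MIN_FREE_SWAP_MB:
        problems.append("insufficient swap")
    if slave.clock_skew_s > MAX_CLOCK_SKEW_S:
        problems.append("clock out of sync")
    return problems


def enforce(slaves):
    """Take unhealthy slaves offline and return {name: problems} for them."""
    report = {}
    for slave in slaves:
        problems = health_problems(slave)
        if problems:
            slave.online = False  # no new builds are scheduled here
            report[slave.name] = problems
    return report
```

Running `enforce` periodically, before jobs are dispatched, is what lets infrastructure trouble surface as an offline slave rather than as a failed build.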
"When is it time to add more slaves?" asked Kawaguchi. "When there's almost always something in the queue." Hudson detects excessive workload and measures exponential decay to filter out noise and then notifies plug-ins, which provide more slaves. Up Into the Clouds: The Good and Bad with Amazon EC2
The session turned to Amazon's EC2 (Elastic Compute Cloud). "The cloud is Hudson's infrastructure," commented Kawaguchi. "A lot of companies work on different clouds, including Sun. The major one is Amazon EC2. From Hudson's perspective, the price model is the good thing about EC2; you are only charged per hour. Hudson can start as many nodes as needed and turn them off when they are unnecessary."
EC2 exposes a programmable API, so Hudson can talk to it directly without requiring an email or other human interaction, and instances launch fairly quickly, especially on Linux. EC2 instances are also forgetful: they do not retain their state once shut down. That fits well with Hudson, where the master stores the important build information.
The negative side of Amazon EC2? The data is still inside the firewall, so it takes time to check out code or to archive build artifacts.
Hudson's EC2 plug-in runs on top of typica, a Java client library for a variety of Amazon Web Services. Typica:
The Hudson Hadoop Plug-in
The session turned to the Hudson Hadoop Plug-in that:
The session closed with a brief discussion of Selenium Grid, a framework for testing web applications that works well with Hudson for rapid testing.
For More Information
» Hudson
Conference content is subject to change. Copyright 1996 - 2009 Sun Microsystems, Inc.