Self-Managing Computers Come Online
Scientific community collaborates on future systems that are resilient and highly adaptive.
Industry is focusing on how to reduce computer system complexity by modeling the human body’s autonomic nervous system. From servers to software, researchers are building all components of the infrastructure based on the same characteristics—regulation and protection of key functions without conscious involvement. Autonomic computers will make more decisions on their own and require less human intervention.
Over the past 20 years, computing technologies have increased in complexity, continually adding to the tasks of the people who manage them. As systems grow more intricate, they become more prone to error as well as more costly to manage. Additionally, experts predict a skills shortage, with demand for experienced professionals growing by more than 100 percent in the next six years.
According to Miles Barel, program director for autonomic computing, IBM, Somers, New York, the emphasis of autonomic computing is self-management. “We want to balance the interaction between people and the systems so that system administrators can focus on enabling business needs,” he says. “The work that is being done will help facilitate a system’s ability to understand the environment in which it is running; analyze the environment, its current behavior and its performance; develop plans to achieve desired behavioral objectives; and execute those plans.”
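The four steps Barel lists (understand the environment, analyze it, plan and execute) can be read as a control loop. The sketch below is one possible illustration of that loop and not IBM's implementation. The `Server` class, the worker and queue figures, and the target of 10 queued requests per worker are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical managed resource with a single tunable setting."""
    workers: int
    queue_length: int


def monitor(server):
    # Understand the environment: read the current state.
    return {"workers": server.workers, "queue_length": server.queue_length}


def analyze(metrics, target_per_worker):
    # Compare current behavior with the objective.
    # A positive result means each worker has too much queued work.
    return metrics["queue_length"] / metrics["workers"] - target_per_worker


def plan(metrics, target_per_worker):
    # Develop a plan: the smallest worker count that meets the target
    # (ceiling division, with at least one worker).
    needed = -(-metrics["queue_length"] // target_per_worker)
    return max(1, needed)


def execute(server, workers):
    # Carry out the plan by adjusting the resource.
    server.workers = workers


def control_loop_step(server, target_per_worker=10):
    """Run one monitor-analyze-plan-execute pass; return the worker count."""
    metrics = monitor(server)
    if analyze(metrics, target_per_worker) > 0:
        execute(server, plan(metrics, target_per_worker))
    return server.workers
```

In this sketch the loop acts only when the objective is missed, so a server that already meets its target is left alone.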
To develop self-managing systems, researchers are exploring four attributes of the concept: machines that are self-configuring, self-optimizing, self-healing and self-protecting. The first attribute, self-configuration, enables systems to define themselves on-the-fly, allowing infrastructure to make adjustments as necessary. For example, new software could be added to the infrastructure automatically with no service disruption. Next-generation capabilities will work with plug-and-play devices, configuration setup wizards and wireless server managers so that new functions can be added with minimal human intervention.
Barel notes that IBM’s recently released DB2 version 8.1 is an example of next-generation database technology that has self-configuring capabilities. It features a configuration adviser that enables database administrators (DBAs) to accomplish database configuration tasks for optimal performance in a matter of minutes. “The configuration adviser takes basic information about how you’re going to use the database and queries the environment in which it is running,” he explains. “It uses heuristics to make recommendations about how to set approximately 100 tuning parameters that one would otherwise adjust by hand. The configuration adviser can achieve in minutes what it would take DBAs days or weeks to accomplish. We have seen results where performance has doubled.”
The second attribute of autonomic computing is self-optimization. Components such as storage, software and networks must be adjusted frequently so that operations run efficiently even during unpredictable circumstances. Academic institutions from around the world are addressing this issue. For example, computer scientists at the University of California–Berkeley are designing OceanStore, a global persistent data storage technology capable of accommodating millions of users (SIGNAL, March, page 53). Any computer may join the infrastructure, contributing storage and bandwidth. The visionary aspect of the work is that the system will provide continuous online adaptation, constantly optimizing itself to find the shortest route for information relay.
“What we get from self-configuration and self-optimization is the ability to deploy new solutions much more rapidly,” Barel shares. “When we look at IT [information technology] management and when we speak with customers, we find that a tremendous amount of the IT budget is going to just managing the systems in place. Customers are experiencing significant backlogs of new projects they want to undertake. As you reduce the amount of money you spend managing what you have, you can now address the need for new applications and systems that will help overall operations become more efficient.”
Berkeley’s OceanStore also features self-protection, the third attribute of autonomic technology. It adapts to scenarios such as server failures or attacks by automatically switching to other available or uncorrupted servers. Future computer components must detect and protect themselves from attacks anywhere. They also must be able to define and manage user access, protect against unauthorized access and report activities as they occur.
When prevention is not possible, self-healing should take place. A system must diagnose and react to any potential or actual disruptions. To help any part of the infrastructure recover from a component failure, scientists are designing technologies to locate and isolate the failed component, take it offline, fix the problem if possible and reintroduce the fixed or replacement part without system shutdown.
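The recovery sequence described above (locate the failed component, take it offline, fix it if possible, and reintroduce the fixed or replacement part) can be sketched in a few lines. This is a simplified illustration under assumed names, not a description of any vendor's technology. The `Component` class, its flags and the `heal` function are hypothetical, and the "repair" is a stand-in for real diagnostic work.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """Hypothetical infrastructure component."""
    name: str
    healthy: bool = True
    online: bool = True
    repairable: bool = True


def heal(pool, spares):
    """Run one self-healing pass over the pool.

    Returns the names of the components brought back online.
    The pool itself keeps running; only failed members are touched.
    """
    restored = []
    for i, comp in enumerate(pool):
        if comp.healthy:
            continue                 # locate: skip working components
        comp.online = False          # isolate the failure and take it offline
        if comp.repairable:
            comp.healthy = True      # stand-in for an actual repair
        elif spares:
            comp = spares.pop()      # swap in a replacement part
            pool[i] = comp
        else:
            continue                 # nothing to do; it stays offline
        comp.online = True           # reintroduce without a shutdown
        restored.append(comp.name)
    return restored
```

A component that can be neither repaired nor replaced simply stays offline in this sketch, while the rest of the pool continues to serve.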
Duke University researchers are addressing digital self-healing with a technology called software rejuvenation, a proactive fault management technique that continuously cleans a system’s internal state to prevent crashes as well as performance degradation.
“Autonomic restoration gives businesses a more resilient infrastructure,” Barel says. “Whether a commercial or government entity, we have tremendous dependence on the applications to be there all the time. Self-healing and self-protecting attributes help ensure systems are secure and present when you need them to be. It’s not just about a resilient infrastructure; it’s about a resilient business. If IT goes down, business stops.”
Self-managing capabilities are needed for the entire infrastructure, not just individual servers or software, and require a lot of coordination among the pieces, Barel says. “The value of autonomic computing is greatest when the components of IT that support a business process interoperate in an autonomic way.”
The key to achieving comprehensive interoperable solutions is using open industry standards. Barel notes that IBM is conducting its research and development work through collaboration with industry bodies such as the Global Grid Forum, a community-initiated forum of individual researchers working on grid technologies and promoting the use of best practice technical specifications, user experiences and implementation guidelines. “We have a coordinated effort to implement these technologies across the IBM product portfolio as well as a strong effort to work with other providers in the industry,” he points out. “We are promoting the acceptance and adoption of these technologies and open standards so that all of the pieces of a solution, no matter where they come from, can work in an autonomic fashion.”
A significant impact of this work is a business’s return on information technology investments, he adds. “As you provide greater levels of automation across an IT infrastructure and across your business processes, you have a direct impact on lowering operating costs. You improve utilization rates of your systems. You get more productivity out of the resources you’ve invested in. You can also avoid things like downtime and other intangibles that cost money.”
Additionally, system administrators are freed from overseeing mundane tasks and can customize technology to meet business goals. Today, administrators often manage individual components of the infrastructure according to information technology-oriented metrics such as transactions per minute and response time.
“We want to measure business performance,” Barel explains. “I often talk about high-volume commerce sites where a company may do an analysis that indicates that the top one percent of its customers generate the largest amount of revenue. Administrators could set up a business rule to respond to the top one percent of customers in one second or less, to the top five percent in 30 seconds and the top 10 percent in two minutes.” Technology managers can encode business rules to help manage systems. The system automatically will know how to manage all components of a transaction from a customer’s perspective and meet performance goals regardless of the number of applications involved.
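Barel's example of tiered response goals can be encoded as a small rule table. The sketch below is only one way such a business rule might be written down. The `RULES` table and the `response_target` function are hypothetical names, and the sketch assumes each customer has already been ranked by revenue percentile, with lower values meaning higher revenue.

```python
# Each rule pairs a revenue-percentile cutoff with a response-time goal
# in seconds, taken from the example in the text:
# top 1 percent -> 1 second, top 5 percent -> 30 seconds,
# top 10 percent -> 2 minutes.
RULES = [
    (1.0, 1.0),
    (5.0, 30.0),
    (10.0, 120.0),
]


def response_target(customer_percentile):
    """Return the response-time goal in seconds for a customer.

    customer_percentile is the customer's rank by revenue, where 0.5
    means the customer is in the top 0.5 percent. Customers outside
    every tier get None, meaning no special goal applies.
    """
    for cutoff, seconds in RULES:
        if customer_percentile <= cutoff:
            return seconds
    return None
```

Because the rules are checked from the most valuable tier down, a customer who qualifies for several tiers receives the strictest goal.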
Industry is doing a good job of building new capabilities that can talk to each other, Barel adds. “Just recognizing the overall problem and focusing on it in total is really one of the most significant contributions. To just undertake one attribute and have a component be self-healing will not solve the problem. The innovation comes from looking at all aspects of the IT infrastructure.
“With every new release of [systems technology] you can expect to see more autonomic capabilities being embedded,” Barel states.
Additional information on autonomic computing is available on the World Wide Web at http://www.research.ibm.com/autonomic.