Software suite roots out causes of slowdowns.
Network monitoring tools, long the purview of the U.S. Defense Department logistics community, now are moving to the warfighting environment to support future military operations. The technology continuously examines the health of networks, then reports this information to a central location. It also can prevent system slowdowns by predicting problems and offering solutions.
Technologies that monitor networks address issues that have evolved from the combination of a growing dependence on information systems and a shortage of technical personnel. Software that anticipates and reports problems to the systems administrator ensures that critical systems remain up and running without requiring a multitude of technicians.
The Defense Automatic Addressing System Center (DAASC), Wright-Patterson Air Force Base, Ohio, relies heavily on networks to design, develop and implement logistics solutions that improve customers’ requisition processing and logistics management processes worldwide. The center, a division of the Defense Logistics Agency, receives, edits and routes more than one billion transactions each year for both the military and federal agencies. It has been using a commercial automated network monitoring technology called MAXE since 1994. MAXM, the company that developed the product, was purchased by Boole and Babbage, a firm that was later acquired by BMC Software Incorporated, Houston, Texas. BMC took the MAXE technology and made it the core component of a family of products known today as Patrol.
According to Frank R. Schweisthal, director, DAASC Information Center, the organization has been using Patrol for its Oracle database for approximately one year. “DAASC uses this tool to provide a proactive capability to recognize potential problem situations across the entire DAASC computing enterprise, including our two operating sites that are separated by 2,500 miles,” Schweisthal explains. “The DAASC enterprise consists of IBM, Compaq, HP [Hewlett Packard], Sun and Dell systems running various applications, operating systems and COTS [commercial off-the-shelf] tools. DAASC uses the tool to recognize alerts or thresholds being reached, feed the information to our problem management system and, based upon the severity level, page the proper support people. These processes are completed without human intervention.”
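The flow Schweisthal describes (recognize a threshold, log the problem, page staff by severity) can be sketched in a few lines of Python. This is an illustrative sketch only, not Patrol's actual interface: the severity names, the threshold values, the `ON_CALL` routing table and every function name here are assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical routing table: which support group is paged at each severity.
ON_CALL = {"warning": "dba-team", "critical": "duty-officer"}


@dataclass
class Alert:
    host: str
    metric: str
    value: float
    severity: str


def classify(value, warn_at, crit_at):
    """Map a measured value to a severity level using two thresholds."""
    if value >= crit_at:
        return "critical"
    if value >= warn_at:
        return "warning"
    return "info"


def handle_reading(host, metric, value, warn_at, crit_at, tickets, pages):
    """Record a ticket for any threshold breach and page staff by severity.

    `tickets` stands in for the problem management system and `pages`
    for the paging service; both are plain lists in this sketch.
    """
    severity = classify(value, warn_at, crit_at)
    if severity == "info":
        return None  # below every threshold, nothing to report
    alert = Alert(host, metric, value, severity)
    tickets.append(alert)
    pages.append((ON_CALL[severity], alert))
    return alert
```

A reading of 97 percent against thresholds of 80 and 95 would produce one ticket and one page to the hypothetical duty officer, with no person involved in the decision.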
Dean Mericka, regional manager, national Defense Department agencies, BMC Software, who works out of the company’s McLean, Virginia, office, says Patrol was originally designed to help systems administrators monitor and manage servers in various locations. During the past several years, the family of products has grown and matured into an enterprise management solution set. Patrol features an application-centric rather than system-centric design. It provides service-level management by allowing systems administrators to find the cause of a performance slowdown quickly. This capability is particularly beneficial, Mericka says, because information technology personnel say that 80 percent of the time they devote to problem resolutions is spent identifying and locating a problem and only 20 percent is spent on fixing it.
Schweisthal agrees that the benefits of an automated monitoring tool set are twofold. “The primary goal of implementing this type of capability was to provide a very reliable and high-quality service to our customers. A secondary goal was to reduce the number of work years required to manage the DAASC’s 24-hours-a-day, seven-days-a-week operations. Through a combination of COTS tools and operations culture changes, DAASC has exceeded its goals. We have been able to automate the event recognition process and reduce the work years required to perform operational monitoring,” he says.
The savings have been substantial. According to Schweisthal, since the capability was implemented in 1994, the center has reduced the cost of operational monitoring by 25 work years.
“Prior to using this tool, DAASC had staff at both operating sites monitoring each site’s operation individually 24 hours per day, seven days a week. Since the implementation, DAASC monitors both sites’ operations from one location. We actually split the day so some monitoring is completed at each site. This provides trained people at either site in case of an emergency,” Schweisthal relates.
Beyond saving time by taking over monotonous tasks, Patrol improves the quality of the monitoring. While humans could easily miss a problem, this commercial product performs flawlessly every day, he adds.
Mericka explains that initially Patrol conducts foundation monitoring of the systems. In this role, the software examines the basics of an enterprise to determine the health of the various components through instrumentation of each component or layer of the technology stack, including the operating system, databases, middleware and applications.
The suite of software products helps technical personnel monitor enterprisewide systems from a central point. Because the software conducts the numerous operations required to ensure that an organization’s information technology is working at peak efficiency, fewer personnel are required to keep systems up and running. This is a valuable asset as the shortage of qualified technical personnel persists, Mericka offers.
Patrol agents and knowledge modules (KMs), which are expert libraries of rules that are loaded into the system, work in tandem. The agents receive instructions about how to manage an application from the KMs. The intelligent, autonomous agents discover applications and objects in an enterprise, automatically monitor parameters and statistics, detect events, gather information, initiate corrective action and notify administrators when events occur that require attention.
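The division of labor between agents and KMs can be pictured as an agent that holds no application knowledge of its own and simply applies whatever rules have been loaded into it. The Python sketch below is a rough illustration under that assumption; the class names, rule format and notification strings are hypothetical and do not reflect how Patrol is actually implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Rule:
    parameter: str                          # name of the monitored statistic
    is_breached: Callable[[float], bool]    # test applied to each reading
    corrective_action: Optional[Callable[[], str]] = None


@dataclass
class KnowledgeModule:
    application: str
    rules: List[Rule]


class Agent:
    """Applies loaded knowledge modules to readings from one machine."""

    def __init__(self):
        self.modules: Dict[str, KnowledgeModule] = {}
        self.notifications: List[str] = []

    def load(self, km: KnowledgeModule) -> None:
        self.modules[km.application] = km

    def check(self, application: str, readings: Dict[str, float]) -> List[str]:
        """Run every rule; fix what can be fixed, notify about the rest."""
        actions_taken = []
        for rule in self.modules[application].rules:
            value = readings.get(rule.parameter)
            if value is None or not rule.is_breached(value):
                continue
            if rule.corrective_action is not None:
                actions_taken.append(rule.corrective_action())
            else:
                self.notifications.append(
                    f"{application}: {rule.parameter}={value}"
                )
        return actions_taken
```

In this arrangement, supporting a new application means loading another module rather than changing the agent, which is one plausible reason to keep the rules in separate libraries.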
Approximately 300 off-the-shelf KMs are available that scan various types of systems and network activity. For example, a Microsoft Exchange module features several hundred metrics about Exchange activity. With the assistance of other KMs, system administrators can conduct capacity planning and modeling as well as root cause analysis and isolation, Mericka explains.
Patrol’s Knowledge Module Deployment Server enhances KM version control by facilitating the deployment and installation of KMs to multiple nodes simultaneously. It also allows all KMs to be managed from a single location.
Systems administrators monitor the status of enterprisewide systems from the central console component of the Patrol suite. They receive real-time event data and alerts about all applications, computers, local area and wide area networks, and communications devices throughout an enterprise. In addition, recovery actions can be initiated from the console.
Patrol Enterprise Manager conducts event correlation. By coordinating enterprise infrastructure status data, the software centralizes management, filtering and notification capabilities. This information is presented on the Patrol Explorer console in a graphical format. The console also provides access to underlying systems.
Other members of the Patrol family complement the tool set's primary monitoring functions with additional capabilities.
Patrol for Performance Management supplies current and historical analysis information. This data is presented graphically, which allows systems administrators to drill down through various levels and determine the source of a problem. With this solution, systems administrators can perform workload characterization and modeling for server consolidation, budget planning and acquisition. Because the technology is World Wide Web-enabled, performance reports can be distributed throughout an organization. The product is available for AS/400, Informix, Microsoft Exchange, OpenVMS, Oracle, R/3, Sybase, UNIX and Windows 2000 Server.
When a problem occurs, Patrol for Diagnostic Management enables technicians to perform root cause analysis in near real time. The software automatically isolates potential causes and conducts automated tests to determine how to eliminate them. When the diagnosis is complete, users receive a message through the central console that indicates how the problem will affect applications. Versions of this tool are currently available for Microsoft Exchange, Exchange Server-Diagnose and Windows 2000 Server and Server-Diagnose.
Patrol Prediction and Capacity Management tools take monitoring one step further by providing advanced modeling and analysis of hardware, applications and transaction rate changes. The software predicts the impact that changes in demand will have on performance and allows users to plan ahead. In addition, organizations can test various solutions to performance problems prior to purchase. This information not only prevents problems but also helps determine problem response times.
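The kind of question such modeling answers can be illustrated with the simplest textbook queueing model, in which a single server's average response time is its service time divided by one minus its utilization. The Python sketch below uses that formula purely as an example; the article does not say which models Patrol uses, and the function name and figures are assumptions.

```python
def predicted_response_time(service_time_s, arrival_rate_per_s):
    """Estimate average response time for a single-server queue (M/M/1).

    utilization   = service time x arrival rate
    response time = service time / (1 - utilization)
    """
    utilization = service_time_s * arrival_rate_per_s
    if utilization >= 1.0:
        raise ValueError("demand exceeds capacity; the queue grows without bound")
    return service_time_s / (1.0 - utilization)


# What-if comparison: the same server under current and doubled demand.
current = predicted_response_time(0.05, 8)    # 40 percent utilized
doubled = predicted_response_time(0.05, 16)   # 80 percent utilized
```

Under these assumed figures, doubling the transaction rate triples the predicted response time rather than doubling it, the sort of nonlinear effect that makes it worthwhile to model a change before committing to it.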
The increasing value of information sharing in the battlespace has all of the armed forces examining technologies such as Patrol that ensure the reliability of their networks. Patrol was one of only two systems chosen as Gold Nuggets from last year’s Joint Warrior Interoperability Demonstration, or JWID (SIGNAL, October 2000, page 71).
According to Col. James W. Dowis, USAF, director, JWID Joint Project Office, Hampton, Virginia, Patrol was selected as a Gold Nugget out of all the Tier 1 technologies demonstrated because it was the best low-cost, low-risk technology that met an immediate joint warfighter need.
All of the technologies that participated in JWID had to address at least one of 29 objectives that were organized under five capstone statements. Patrol—which during JWID was called the Reliability, Performance and Situational Awareness of Command, Control, Communications, Computers, Intelligence, Surveillance and Reconnaissance Systems—fulfilled the requirement to demonstrate enhanced information superiority technologies in a combined/coalition environment. “Patrol clearly showed it could detect and help resolve problems that can affect the ability to disseminate information such as e-mail between combined warfighter sites,” Col. Dowis remarks.
“The problem Patrol solves is that it provides the military user with the capability to automatically and proactively monitor systems applications, such as e-mail, Web tools, and others. Most systems used today do a good job of monitoring network backbone availability but do not monitor the applications that ride the network information highway. Patrol has the capability to identify and resolve problems before they interrupt applications used by the warfighters to pass information and decisions,” the colonel adds.
The ability to monitor the networks and applications is important, he points out, because these capabilities must be constantly available to pass crucial information and decisions in support of time-critical military operations.
The JWID Joint Project Office and its doctrine and concepts of operations (CONOPs) working group are now developing CONOPs and standard operating procedures for both Patrol and SilentRunner, the other JWID-designated Gold Nugget (SIGNAL, February, page 57). A U.S. Air Force Space Command representative chairs the working group. CONOPs verification for Patrol is scheduled for this month at Space Command.
“At the same time we conduct the CONOPs verification, we are working with representatives from the unified commands to determine their field intent for one or both of the Gold Nuggets,” Col. Dowis says. Acquisition activities to purchase limited quantities of both Patrol and SilentRunner are scheduled to begin later this month, and the technologies will be fielded between June and September 2001.