Attackers Placed At Scene of Crime Before They Arrive

December 2001
By Sharon Berry

Network security detect-and-react model evolves into a system that forecasts and neutralizes cyberassaults.

Defensive information warfare posturing traditionally has taken the form of security: passwords, firewalls and locked doors. But with less than 100 percent confidence that intruders can be kept out of information systems, a U.S. Air Force and industry team is developing a fundamentally different defensive approach. They are creating a prototype that provides advance warning of attacks on U.S. Defense Department systems so that defenders are prepared when security is breached.

Emphasizing proactive rather than reactive operations, developers of a network early warning system are identifying activities that precede a coordinated attack. Their methodology combines several artificial intelligence technologies (neural networks, fuzzy logic and Bayesian belief networks) to prioritize the analyst’s workload and produce predictive estimates of future attacks.

In 1997, the Air Force Research Laboratory selected Logicon, which was recently rebranded as Northrop Grumman Information Technology, after the company submitted a concept paper on algorithms that could help protect military computers from organized Internet-based attacks. In recent months, the Pentagon has reported numerous cyberattacks. According to John C. Whitson, a cybertechnology specialist with Northrop Grumman Information Technology, Bethpage, New York, the company won an initial contract to devise a methodology to answer these questions: “Can large-scale, coordinated cyberattacks be forecasted? If they can, how?”

The contract, which ran until 1999, resulted in both a methodology and a concept demonstrator—the Network Early Warning System (NEWS) 1.0—that was reviewed by more than 55 military and government organizations. In a follow-up survey, feedback indicated that technology development should continue and a prototype should be developed, Whitson says. Last August, the research laboratory awarded Northrop Grumman Information Technology an 18-month contract to proceed with the work.

“There are a lot of tools on the COTS [commercial off-the-shelf] market that address intrusion detection—basically protection and detection devices,” Whitson explains. “We have just-in-time intrusion detection. There is a real gap on the recovery side. After you’ve been hit by an attack, how do you get back on your feet?” He contends that the NEWS forecasting methodology will lessen the impact of an attack and make recovery easier.

NEWS looks for a variety of attributes that may indicate that a network event such as a contact or log-on session is a precursor to a cyberassault. Network headers, time of visit, the length of time between keystrokes and the source of the online activity are among the attributes analyzed.

Whitson shares that network traffic data enables two capabilities that are just beginning to be explored. The first is that network traffic data reveals the identity of the originator despite spoofing techniques. “There are some things you just cannot change in Internet traffic and still have your traffic delivered,” he notes. However, originators are recognizable only if they are previously encountered entities—“attackers who revisit the scene of the cybercrime whether they spoof their origin or not,” he adds.

Second, by examining traffic data, NEWS can determine the intended targets of an attack. For example, an attack signature containing a string such as CMD.EXE would indicate that the Microsoft Windows platform is the target. By determining which machines are the intended victims, NEWS will be able to hone the forecast to include only Windows machines.
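The target-narrowing idea can be illustrated with a minimal Python sketch. It is not the NEWS implementation: the table, function names, host inventory and sample request are hypothetical, and only the CMD.EXE marker comes from the example above.

```python
# Hypothetical marker-to-platform table. Only CMD.EXE is taken from the
# article; the second entry is an assumption added for contrast.
SIGNATURE_PLATFORMS = {
    "CMD.EXE": "windows",
    "/BIN/SH": "unix",
}


def infer_target_platforms(payload):
    """Return the set of platforms whose marker strings appear in the payload."""
    upper = payload.upper()
    return {plat for marker, plat in SIGNATURE_PLATFORMS.items() if marker in upper}


def narrow_forecast(hosts, payload):
    """Keep only the hosts whose platform matches an inferred target."""
    targets = infer_target_platforms(payload)
    return sorted(name for name, plat in hosts.items() if plat in targets)


# Hypothetical inventory and request line, for illustration only.
hosts = {"web01": "windows", "db01": "unix", "mail01": "windows"}
at_risk = narrow_forecast(hosts, "GET /scripts/../winnt/system32/cmd.exe?/c+dir")
print(at_risk)
```

Here the forecast would be limited to the two Windows hosts, leaving the Unix host out of the alert.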

NEWS also uses these characteristics to correlate individual network security incidents to help analysts get a bigger picture and determine if the site is under a large-scale or coordinated attack. Once incident correlation has taken place, the system can identify the stage of the attack.

“There is a schedule of events that an attacker has to go through to achieve an IW [information warfare] attack,” Whitson says. By looking at the incident evidence left by attackers and the network traffic information that they cannot spoof, NEWS can gauge where intruders are in the attack process and make a forecast. It looks at where attackers have been and what they have done as well as where the attackers have not been and what they have not done. This information fits into a timeline, which allows the system to forecast what the perpetrator is going to do next.

Logicon has defined seven stages of highly structured attacks. The two stages that occur earliest on the timeline are legal activities—network mapping and host probing—which are often overlooked by conventional detection systems. They are essential to the NEWS approach because they help uncover sophisticated assailants who use low-and-slow stealthy intrusion techniques. NEWS’ precursor detection abilities monitor these types of legal activities. The other five stages—user access, fortification, root access, damage and surveillance, and cover up—are typically illegal and detected by conventional systems.
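One way to picture the seven-stage timeline is as an ordered enumeration. The Python sketch below assumes that the stages occur strictly in the order listed above and that the next stage is simply the one after the latest stage observed; both are simplifying assumptions for illustration, and all names are hypothetical.

```python
from enum import IntEnum


class Stage(IntEnum):
    """The seven stages named in the article, in the order they are listed."""
    NETWORK_MAPPING = 1
    HOST_PROBING = 2
    USER_ACCESS = 3
    FORTIFICATION = 4
    ROOT_ACCESS = 5
    DAMAGE_AND_SURVEILLANCE = 6
    COVER_UP = 7


# The two early, legal stages that conventional systems often overlook.
PRECURSOR_STAGES = {Stage.NETWORK_MAPPING, Stage.HOST_PROBING}


def next_stage(observed):
    """Guess the next stage as the one following the latest stage seen.

    Assumes a strictly linear progression, which is a simplification.
    Returns None once the final stage has been observed.
    """
    if not observed:
        return Stage.NETWORK_MAPPING
    latest = max(observed)
    return None if latest is Stage.COVER_UP else Stage(latest + 1)


seen = {Stage.NETWORK_MAPPING, Stage.HOST_PROBING}
print(next_stage(seen).name)
```

With only the two precursor stages observed, this sketch would forecast user access as the next step.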

Artificial intelligence technologies are applied to information gathered at each of the seven stages. Whitson says that the development team is using Bayesian belief networks, which provide two critical abilities. “One is the ability to model causal relationships as in ‘a’ implies ‘b,’” he explains. “If I have a 50 percent chance of seeing ‘a,’ that might mean I have a 20 percent chance of seeing ‘b.’ When intrusions occur, you can determine whether they contribute to a coordinated attack.”
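Whitson's "a implies b" example can be worked through with the two rules a belief network applies at each link: the law of total probability and Bayes' rule. In the Python sketch below, the 50 percent and 20 percent figures come from the quote; the two conditional probabilities are assumed values chosen so that the numbers agree.

```python
def marginal_b(p_a, p_b_given_a, p_b_given_not_a):
    """Law of total probability: P(b) = P(b|a)P(a) + P(b|not a)P(not a)."""
    return p_b_given_a * p_a + p_b_given_not_a * (1.0 - p_a)


def posterior_a_given_b(p_a, p_b_given_a, p_b_given_not_a):
    """Bayes' rule: P(a|b) = P(b|a)P(a) / P(b)."""
    return p_b_given_a * p_a / marginal_b(p_a, p_b_given_a, p_b_given_not_a)


P_A = 0.5               # from the quote: 50 percent chance of seeing 'a'
P_B_GIVEN_A = 0.35      # assumed for illustration
P_B_GIVEN_NOT_A = 0.05  # assumed for illustration

p_b = marginal_b(P_A, P_B_GIVEN_A, P_B_GIVEN_NOT_A)
p_a_given_b = posterior_a_given_b(P_A, P_B_GIVEN_A, P_B_GIVEN_NOT_A)
print(round(p_b, 3), round(p_a_given_b, 3))
```

Under these assumed conditionals, the chance of seeing 'b' works out to about 20 percent, and observing 'b' raises the belief in 'a' to roughly 88 percent. That second number is the kind of inference that lets an observed intrusion be tied back to a coordinated attack.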

The technology also helps the system determine a temporal relationship among events by noting the probability of one event happening after another. “A simple case would be a Gaussian curve where the forecasted ideal attack point is at the top of the curve,” Whitson states. “There is a decreasing probability [that an attack could occur] before and after that point, but it is still a significant one.” For example, over a five-minute time period there may be a 20 percent chance an attack will occur four minutes before the forecast, a 50 percent chance the attack will happen on time, and a 30 percent chance it will happen two minutes late. “You get a rise-and-fall arrangement. You can tie those into Bayesian belief networks,” he adds.
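The rise-and-fall arrangement can be sketched as a probability table over time offsets from the forecast. The Python example below builds a normalized Gaussian window and also encodes the three-point example from the paragraph above. The function name, window width and sigma value are assumptions for illustration, not details of NEWS.

```python
import math


def gaussian_window(offsets, sigma):
    """Normalized Gaussian weights over discrete offsets (minutes from forecast)."""
    offsets = list(offsets)
    weights = [math.exp(-0.5 * (t / sigma) ** 2) for t in offsets]
    total = sum(weights)
    return {t: w / total for t, w in zip(offsets, weights)}


# Assumed window: three minutes either side of the forecast, sigma of 1.5 minutes.
window = gaussian_window(range(-3, 4), sigma=1.5)
peak = max(window, key=window.get)  # offset with the highest probability

# The article's example: offset in minutes mapped to probability.
article_example = {-4: 0.20, 0: 0.50, 2: 0.30}
expected_offset = sum(t * p for t, p in article_example.items())
print(peak, round(expected_offset, 2))
```

The window peaks at the forecast time and falls away symmetrically. The article's asymmetric example has a mean offset slightly earlier than the forecast, which suggests why a table of this kind can be more useful to an analyst than a single predicted time.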

Dempster-Shafer evidential reasoning is another approach being employed. It gives developers a function similar to Bayesian belief functions except that it does not require a full set of probabilities. “You don’t have to know the probabilities of the whole system,” Whitson offers. “You are taking a relative evidential reasoning approach and may say ‘this is better than that’ rather than ‘there is a 20 percent chance this is going to happen and 10 percent chance that is going to happen.’ It is important because it is hard to gather certain data; one would need to use a Bayesian network. By using a hybrid approach, we use what’s appropriate when it’s needed.”
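For readers unfamiliar with the technique, the Python sketch below applies Dempster's rule of combination, the standard way to fuse two bodies of evidence in Dempster-Shafer theory. The two sensors and their belief masses are invented for illustration; the article does not describe how NEWS assigns them. Mass placed on the "either" set expresses ignorance, which is how the approach avoids needing a probability for every outcome.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to a belief mass,
    and the masses in each function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        overlap = a & b
        if overlap:
            combined[overlap] = combined.get(overlap, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to contradictory conclusions
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


ATTACK = frozenset({"attack"})
BENIGN = frozenset({"benign"})
EITHER = ATTACK | BENIGN  # "cannot tell": mass left uncommitted

# Hypothetical evidence from two sensors.
sensor_1 = {ATTACK: 0.6, EITHER: 0.4}
sensor_2 = {ATTACK: 0.5, BENIGN: 0.2, EITHER: 0.3}

fused = combine(sensor_1, sensor_2)
print({"/".join(sorted(k)): round(v, 3) for k, v in fused.items()})
```

Combining the two sensors concentrates belief on "attack" (about 0.77) while still leaving some mass uncommitted.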

Because every site being monitored is different, NEWS uses neural networks to find the patterns that exist in each site’s data streams. “This is a similar function to many systems on the market now except they traditionally don’t use an adaptive method such as a neural network; they use a statistical method,” he relates.

Fuzzy logic is combined with these technologies. It allows expert knowledge to be leveraged in an imprecise way to account for gray areas. “We have a great analogy about a pile of sand. If you take one grain away, is it still a pile of sand? If you keep taking grains away, at what point does it stop being a pile of sand? Traditional expert systems cannot deal with that concept. They’re Boolean: things are or things are not. With fuzzy logic, we can, on a graded level, say if it resembles a pile of sand. This type of ability is critical in analyzing network traffic,” he maintains.
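The pile-of-sand analogy maps directly onto a fuzzy membership function, which returns a degree between 0 and 1 instead of a yes-or-no answer. The Python sketch below uses a simple linear ramp; the thresholds of 10 and 1,000 grains are arbitrary assumptions, and the min/max operators are the common Zadeh definitions of fuzzy AND and OR.

```python
def pile_membership(grains, not_pile_below=10, pile_above=1000):
    """Degree in [0, 1] to which a grain count resembles a pile (linear ramp).

    The two thresholds are assumed values chosen only for illustration.
    """
    if grains <= not_pile_below:
        return 0.0
    if grains >= pile_above:
        return 1.0
    return (grains - not_pile_below) / (pile_above - not_pile_below)


def fuzzy_and(a, b):
    """Zadeh AND: the weaker of the two degrees."""
    return min(a, b)


def fuzzy_or(a, b):
    """Zadeh OR: the stronger of the two degrees."""
    return max(a, b)


for grains in (5, 505, 2000):
    print(grains, pile_membership(grains))
```

A Boolean rule would need a single cutoff at which the pile stops being a pile. The graded function avoids that by letting the degree of membership decline gradually as grains are removed.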

Using these technologies to look at profiles, precursors and correlated intrusions, NEWS extrapolates and forecasts future scenarios for progressive attacks. If enough of an attacker’s profile and attack signature are present, a forecast may also indicate the type of intrusion that will occur next.

“When we actually get a forecast that something is going to happen, NEWS will produce an alert to the operator/analyst fully qualifying everything it can report about the state of attack,” Whitson says. The report will include where the intruder has visited, whether an identity has been spoofed, and where an attack originated. It also will generate a timeline that forecasts successive assaults.

Whitson says the system will help analysts more efficiently sift through voluminous data. “For anyone who has observed a lot of these attacks, or even a few, there is a large amount of data. For example, one recent attack, the Code Red worm, brought in reports of hundreds of thousands of incidents per hour. There is no analyst who can keep up with that,” he remarks.

By using NEWS, analysts will have access to summaries of correlated network incidents. Instead of seeing 100,000 identical incidents, operators might see a summary showing one incident that was generated by 30,000 hosts. “It is something the operators can digest and respond to a lot quicker than if they just watch the streams of data go by,” Whitson notes.
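The kind of summary described here can be sketched as a grouping step. The Python example below collapses raw reports into one line per signature, with a report count and a count of distinct source hosts. The record layout of (source host, signature) pairs, the sample data and the function name are assumptions for illustration.

```python
from collections import defaultdict


def summarize(incidents):
    """Collapse (source_host, signature) reports into one row per signature."""
    hosts = defaultdict(set)
    reports = defaultdict(int)
    for host, signature in incidents:
        hosts[signature].add(host)
        reports[signature] += 1
    # Most frequently reported signatures first.
    ordered = sorted(reports, key=reports.get, reverse=True)
    return [
        {"signature": s, "reports": reports[s], "hosts": len(hosts[s])}
        for s in ordered
    ]


# Hypothetical raw feed: repeated reports from a handful of hosts.
raw = [
    ("10.0.0.1", "worm-probe"),
    ("10.0.0.2", "worm-probe"),
    ("10.0.0.1", "worm-probe"),
    ("10.0.0.3", "port-scan"),
]
for row in summarize(raw):
    print(row)
```

Four raw reports become two summary rows here. At the scale Whitson describes, the same step would turn a stream of identical incidents into a single line that names the incident and the number of hosts behind it.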

Once an intruder is anticipated and/or caught, outside technologies such as the Data Resiliency in Information Warfare (DRIW) program come into play. The objective becomes recovery to maintain information resiliency. One technique is called a honey pot, Whitson shares. “In the cybercommunity, it means a basin of attraction or a virtual network. You can redirect a network session into a benign environment on-the-fly. It’s a technique to gauge and analyze attackers to find what their motives or modes of operation are or what they’re looking for. If you can forecast early enough, you can take that type of action rather than letting them wander around your networks.”

The system itself must provide a reasonable forecasting capability, Whitson concludes. “It must be able to digest and correlate incident reports from a number of different sensors and reduce an operator’s workload. These are the two key bullets of NEWS. It will be there as a fusion and correlation product and make forecasts about what the analyst may have to deal with in the near future.”
