
DOD Agency Propelling the Evolution of AI Red Teaming

Defense Advanced Research Projects Agency crews are developing tools and techniques to better find vulnerabilities in AI-based systems.

As artificial intelligence (AI)-based systems become more prevalent and sophisticated than ever before, defense officials are exploring new ways to ensure that these models are protected against adversary attacks. One way that Defense Advanced Research Projects Agency (DARPA) officials are addressing this need is through the Securing Artificial Intelligence for Battlefield Effective Robustness (SABER) program. The initiative is designed to build an exemplar AI red team and equip it with counter-AI training, tools and techniques so that it can realistically play the role of the adversary and launch simulated, real-world attacks against AI-based systems, according to DARPA personnel.


This comes as crews are finding it difficult to test and evaluate AI-based systems for potential vulnerabilities after these capabilities have been integrated into battlefield operations. Because of this, Department of Defense officials are striving to adopt a proactive rather than reactive approach, creating practical simulations to identify potential areas for improvement. Furthermore, DARPA leaders acknowledge that there are several areas of an AI-based system that could be exploited by the enemy, though traditional thinking focuses solely on the model itself, according to Nathaniel Bastian, a program manager in the Information Innovation Office at DARPA.

“If you actually wanted to defeat or protect and safeguard an AI-enabled system against these types of attacks, there are more areas of vulnerability than just the model,” Bastian explained during an exclusive interview with SIGNAL Media. “You have the operational environment itself that could be manipulated; you have different sensors; you have data pipeline; you have communication between; [you have] feeds and data links and things like that; you have the actual developmental model; you have when that model is called to serve a particular, in this case, that inference time to be used.”

“And so, not only are the real-world risks about these sorts of AI systems known, nor is there a way to really quantify that,” Bastian added. “It’s been overly focused on the model. The deployment of these AI-enabled systems is you have to consider the whole pipeline, both the development and deployment pipelines of an AI-enabled system.”
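Bastian's point about looking beyond the model can be sketched in code, purely as an illustration. The surface names below are paraphrased from his interview comments, not an official SABER taxonomy, and the helper function is hypothetical:

```python
# Illustrative sketch (not DARPA/SABER code): the attack surfaces of an
# AI-enabled system extend well beyond the trained model itself.
ATTACK_SURFACES = [
    "operational environment",       # the scene itself can be manipulated
    "sensors",                       # spoofed or degraded sensor input
    "data pipeline",                 # poisoned training or telemetry data
    "communications and data links", # feeds between components
    "model development",             # tampering during training/build
    "inference time",                # evasion attacks when the model is called
]

def coverage_gaps(assessed: set) -> list:
    """Return surfaces a red-team assessment has not yet exercised."""
    return [s for s in ATTACK_SURFACES if s not in assessed]

# A model-only assessment leaves most of the pipeline untested:
print(coverage_gaps({"model development", "inference time"}))
```

The point of the sketch is simply that a model-centric assessment, the "overly focused" approach Bastian describes, leaves four of the six surfaces untouched.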

SABER program crews intend to address several categories of AI-enabled technologies, especially autonomous systems. From unmanned ground vehicles to aerial systems and drones, demand for these tools is rising because they help warfighters accomplish tasks more quickly and conveniently while keeping them out of danger. Examples include casualty evacuations and logistics resupply, according to Bastian. But the process of guaranteeing that the technologies performing these duties are protected has fallen behind.

“These trucks that drive themselves are great for logistics and things like that, but now we have these rugged unmanned ground vehicles that can drive over military terrain to get from point A to point B with different sensors and machine learning modules that allow them to navigate on their own,” Bastian said. “Well, this is again another threat surface that needs to be explored, and SABER is going to do that.”

The program arrives at a critical time in the overall timeline of the deployment of these autonomous systems. This comes as the U.S. Army strives to implement human-machine formations into its operations so that soldiers and robotic systems can work together to attain the best possible outcome on the battlefield, according to U.S. Army officials. However, this innovative formation cannot be put into action until warfighters confirm that the technologies are protected against adversary threats, which is the assurance SABER aims to provide.

“Before we can move forward with the full-scale deployment of unmanned ground vehicles, and for the Army, that directly connects to human-machine formations,” Bastian said. Human-machine formations are a top priority for Gen. James Rainey, Army Futures Command commander, and other Army leaders. “[In] human-machine formations, a big part of the machine is unmanned ground vehicles, both ground and air, so if we’re going to start deploying these into formations, we need to make sure they can be trusted. Part of that comes down to ensuring they’re thoroughly red team [tested] from an operational perspective.”

The SABER program is scheduled to last about two years, with officials devoting three-quarters of that time to building the necessary team and technologies and conducting assessments. The program is broken into four stages.

 

Defense Advanced Research Projects Agency (DARPA) officials’ new way to address attacks on an AI-based system is the Securing Artificial Intelligence for Battlefield Effective Robustness program. Credit: DARPA

 

The first stage lasts four months, during which officials will finalize the plan for the program and the relevant experiments. The second stage, called SABER operational exercise one, takes place over a nine-month period.

During this time, leaders plan to conduct four different exercises, zeroing in on creating metric baselines for blue and red teams. Officials will also perform iterative tests on unmanned ground vehicles to improve AI red team effectiveness. Following the experiments, teams will use the results to identify areas for improvement, which can lead to the development of new tactics, techniques and procedures. Stage three, called SABER operational exercise two, repeats stage two but with drones instead of unmanned ground vehicles. Finally, stage four is the ramp-down phase, in which SABER personnel will review lessons learned and solidify new documentation of the tactics, techniques and procedures, according to Bastian.
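In outline, the four-stage structure reads like the following sketch. Field names are illustrative, the durations come from the reported schedule, and durations for the final two stages were not specified:

```python
# Illustrative outline of the reported SABER schedule (not an official plan).
SABER_STAGES = [
    {"stage": 1, "name": "planning", "months": 4,
     "focus": "finalize the program plan and relevant experiments"},
    {"stage": 2, "name": "operational exercise one", "months": 9,
     "platform": "unmanned ground vehicles",
     "focus": "four exercises; baseline blue- and red-team metrics"},
    {"stage": 3, "name": "operational exercise two",
     "platform": "drones",
     "focus": "repeat stage two with aerial systems"},
    {"stage": 4, "name": "ramp-down",
     "focus": "review lessons learned; document new TTPs"},
]

# The two operational-exercise stages differ only in platform:
platforms = [s["platform"] for s in SABER_STAGES if "platform" in s]
print(platforms)
```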

SABER project leaders are working with several partners to ensure that the program addresses the military’s main concerns. Officials with the Office of the Director, Operational Test and Evaluation; U.S. Cyber Command; the National Security Agency; and offices within the Program Executive Office for Simulation, Training and Instrumentation, among others, are all players in this development, according to Bastian.

Furthermore, officials are pleasantly surprised by how many groups are showing interest in getting involved with the program and using its results. In addition to the Army and the DOD as a whole, private sector personnel and individuals in academia are signaling that they want to become associated with SABER as well.

“Even with the proposal abstracts, I was blown away by the amount of interest,” Bastian said. “In terms of government partners, this is not DARPA alone. There’s a lot of interest across the DOD, the different services and the intelligence community, which have been doing a lot of talking about this in the last couple of years.”

“There’s a lot of interest for both the performer base and the stakeholder base,” Bastian added. “At the same time, industry is continuing to move forward with AI red teaming from a generative AI perspective and large language models, foundation models, whether it be system-level safeguards or different techniques and tools, but they’re not focused on classic machine learning as we would call it and how they’re integrated into not just applications but systems being used on the battlefield, so that’s good from a DARPA perspective.”
