Industry and Academia Emphasize Vital Role of T&E
Members of industry and academia called attention to the crucial role that regular test and evaluation (T&E) plays in the effectiveness of artificial intelligence (AI) models and the safety of their users. While stressing the importance of T&E during the TechNet Emergence conference on Tuesday, they also warned of incidents that have already occurred and of consequences that could arise when AI systems go unassessed.
For example, Missy Cummings, a professor at George Mason University, director of the Mason Autonomy and Robotics Center and former F/A-18 pilot with the U.S. Navy, strongly advised against inserting generative AI inside a weapon, particularly when the government organization that conducts T&E has been reorganized and reduced.
“Do not tell me that you want to put generative AI inside the weapon,” Cummings said. “That is the dumbest idea I have ever heard. You are going to kill someone by putting generative AI inside of a military system. It’s outrageous. We cannot let these companies put generative AI inside of any safety-critical system, especially if we have no branch of the government that then can evaluate that system to make sure that it at least meets its basic requirements.”
Cummings noted that during her time with the Navy, 36 fighter pilots died over a three-year stretch—all due to problems that could have been detected with T&E.
“They were always because of some problem with interaction in the plane, or they didn’t understand pilot workload, or the system never should have been designed like that,” Cummings said during a panel called “Securing American AI Dominance: Innovation, Autonomy and Trust” at TechNet Emergence in Reston, Virginia.
“If we don’t get some capability back inside the Department of Defense, people are going to die; warfighters are going to die,” Cummings warned.
Another panelist, Jane Pinelis, chief AI engineer of the Applied Information Sciences Branch at Johns Hopkins University’s Applied Physics Laboratory, said the government should enforce a set of testing policies for AI systems. Models need to be evaluated early in the development process so that crews can catch and fix mistakes before building on top of them, Pinelis explained. She added that AI systems need to be tested often and in a repeatable way.
Crews also need to develop the necessary infrastructure to accompany the policy, Pinelis said. She called for the creation of live, virtual and constructive testing simulations and methods to store the data that emerges from T&E.
In addition to creating policy and infrastructure, organizations need the right people to conduct the tests: personnel who understand the algorithms and how to test them, said fellow panelist Steven Meier, a senior technology executive.
Lastly, to make consistent progress toward safe and successful AI systems, people must keep having these discussions and accept differing perspectives, said Jennifer Sample, chief technology officer at Empower AI.
“The only way we’re going to get there is if we become comfortable with this tension,” Sample said. “This is our new norm, and we have to find a way through.”
TechNet Emergence is organized by AFCEA and supported by the U.S. Department of Defense, The MITRE Corporation and the National Science Foundation. SIGNAL Media is the official media of AFCEA International.