The Paramount Need for Accurate Data for Artificial Intelligence
Artificial intelligence innovators are identifying the components they need to build the most effective systems.
During an interview for SIGNAL Media’s Executive Video Series, Carolyn Duby, a field chief technology officer at Cloudera, described the two different types of data that researchers and producers need to construct machine learning programs.
Developers need historical data to train the algorithms. Once the algorithms are trained, they also need current data to make predictions, according to Duby. That presents two distinct challenges: obtaining the data and preparing it for use, and then making sure the data is up to date when it is time to make a prediction or decision.
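As a rough illustration of the second challenge, the sketch below refuses to make a prediction when the input data is too old. It is a minimal example under stated assumptions, not anything Duby or Cloudera described: the one-hour `MAX_AGE` limit, the `is_fresh` and `predict_if_fresh` helpers, and the idea of passing the model in as a plain callable are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical freshness limit; the article does not specify one.
MAX_AGE = timedelta(hours=1)


def is_fresh(observed_at, now, max_age=MAX_AGE):
    """Return True if the observation is not from the future and is no older than max_age."""
    age = now - observed_at
    return timedelta(0) <= age <= max_age


def predict_if_fresh(model, features, observed_at, now=None):
    """Call the model only on current data; return None when the data is stale.

    `model` is any callable standing in for an algorithm that was
    already trained on historical data (an assumption for this sketch).
    """
    if now is None:
        now = datetime.now(timezone.utc)
    if not is_fresh(observed_at, now):
        return None  # stale input: decline to predict rather than guess
    return model(features)
```

A caller would pass the timestamp at which the data was observed alongside the features, so that a reading from the previous day is rejected instead of silently driving a decision.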
“It’s super important when you’re in the intelligence community or cyber community to be able to make that kind of accurate decision and to be able to make it quickly,” Duby said. “So in order to do that, you have to have the most recent data. If you’re making a decision based on where your adversary was yesterday, then you’re not going to be able to make that best decision.”
Furthermore, Duby said the government’s ability to share data between different departments and to share it securely is essential to securing the nation.
“If you have a decision and one person has the data here and another person has the data there, but they’re not able to share it, and you’re not able to link that up and connect the dots, then you’re not really going to be able to have the fullest defense of the country,” Duby said.
“This is of critical importance, but we also need to make sure that we’re not oversharing,” Duby added. “We’re sharing just enough; we’re sharing it securely so that it’s not leaked from place to place, and it’s also going to be the right data.”
Achieving this will provide protection for the United States and its allies, according to Duby.
Looking ahead, Duby said she is thrilled about Cloudera’s role in helping advance the nation through improvements to the machine learning and data science industries.
“Data is a strategic asset for our nation’s cyber and intelligence community and defense in general,” Duby said. “I’m really pleased that we can help with this mission to help make our nation stronger and to help manage this data asset that we have.”