National Lab Makes Science and AI-Enabled Cybersecurity Move Faster
The Department of Energy’s Pacific Northwest National Laboratory (PNNL) has created an artificial intelligence tool that combines data from disparate cyber threat databases as well as open-source information, allowing users to build a complex network of information, ask questions and rapidly retrieve answers.
The tool, known as MERU (Multimodal Entity Relationship Unification), started just a few years ago as basic research but is now being employed in PNNL’s own cybersecurity operations. It was made available for licensing in January through PNNL’s Office of Collaboration and Commercialization.
“There are a zillion research projects everywhere—new findings, promising results, hopefully to be implemented in the future,” PNNL spokesman Tom Rickey told SIGNAL Media in an email pitching the article. “What I like so much about this story is that the initial basic research findings done a couple years ago at PNNL are being brought to life across campus, in our operations. That’s pretty fast to go from basic research to application. Also, the reality is that PNNL is one of the most sophisticated cyber protection outfits out there. The new research must be pretty good if our operations people are choosing to add it to our existing abilities.”
MERU uses graph theory to “look at how these vulnerabilities are connected to weaknesses, which are connected to attack patterns, and which are connected back to specific TTPs [tactics, techniques and procedures] so it all can be represented as a graph,” Mahantesh Halappanavar, a chief computer scientist at PNNL, explained during a Microsoft Teams meeting with SIGNAL Media.
Defenders can access information in MERU in multiple ways. “If you ask very simple questions, such as retrieving the path from a vulnerability to corresponding attack techniques, you can retrieve answers using graph query language in the form of graph paths,” Halappanavar said. “For complex questions, we support other ways such as conversational style access using human language—English—searching using semantic embeddings, and relationship embedding approaches.”
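The simple path query Halappanavar describes can be illustrated with a minimal sketch. It assumes a toy graph stored as a Python dictionary and uses a plain breadth-first search in place of a graph query language; the node identifiers are placeholders invented for this example, not real database records, and nothing here reflects MERU's actual implementation.

```python
from collections import deque

# Toy edges: vulnerability -> weakness -> attack pattern -> technique.
# All identifiers are illustrative placeholders, not real records.
EDGES = {
    "CVE-A": ["CWE-X"],
    "CWE-X": ["CAPEC-1", "CAPEC-2"],
    "CAPEC-1": ["T-100"],
    "CAPEC-2": ["T-200"],
}

def find_paths(graph, start, is_target):
    """Breadth-first search returning every simple path from start
    to a node accepted by is_target."""
    paths = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if is_target(node):
            paths.append(path)
            continue
        for nxt in graph.get(node, []):
            if nxt not in path:  # avoid revisiting nodes on this path
                queue.append(path + [nxt])
    return paths

# Retrieve every path from one vulnerability to an attack technique.
paths = find_paths(EDGES, "CVE-A", lambda n: n.startswith("T-"))
for p in paths:
    print(" -> ".join(p))
```

Each returned path is one chain of evidence linking the vulnerability to a technique, which is the "graph path" form of answer mentioned in the quote.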
Another approach, he added, is to use semantic information. For example, users can ask for information that is semantically similar to a given item and find similar or related data. “Both of these approaches work really well when there is information in the database. But we also know that a lot of information does not really exist in the database. It’s a missing information problem.”
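A semantic lookup of the kind described above could be approximated as follows. This is a sketch under loud assumptions: word-count vectors stand in for learned semantic embeddings, and the weakness descriptions are hypothetical text written for this example. The article does not say which embedding model MERU uses.

```python
import math
from collections import Counter

# Hypothetical descriptions, written for this example only.
DESCRIPTIONS = {
    "weakness-1": "improper validation of user input allows command injection",
    "weakness-2": "buffer overflow when copying data without checking length",
    "weakness-3": "missing encryption of sensitive data in transit",
}

def embed(text):
    """Word-count vector; a crude stand-in for a learned embedding."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def most_similar(query, corpus):
    """Return the key of the description closest to the query."""
    q = embed(query)
    return max(corpus, key=lambda k: cosine(q, embed(corpus[k])))

best = most_similar("injection through unvalidated user input", DESCRIPTIONS)
print(best)
```

A real system would replace `embed` with a trained language model so that paraphrases with no shared words still score as similar.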
MERU also uses natural language processing and supervised learning to bridge information in four databases, taking advantage of a blizzard of data available to defenders, all of it updated regularly:
The National Vulnerability Database contains information on more than 330,000 specific entry points for a cyber attack.
The Common Weakness Enumeration database sorts and classifies those bugs into about 1,000 categories with detailed descriptions and prevention techniques.
The Common Attack Pattern Enumeration and Classification database draws on both of those resources to spell out how bugs and weaknesses might be exploited and includes more than 500 entries of specific attack patterns.
The MITRE ATT&CK database of “adversarial tactics, techniques & common knowledge” contains more than 250 likely attack patterns based on real-world observations.
Those are the databases PNNL chose. Other users can tailor MERU to draw from whichever databases they deem best for their organizations. “That’s the model. You have the public graph and the private graph, and then the ability to connect those together without having to share your internal, nonpublic data,” reported Joseph Aguayo, PNNL deputy chief information security officer, who evaluated and employed the technology to support PNNL’s own cybersecurity.
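The public-graph and private-graph model Aguayo describes can be sketched as a local join on shared vulnerability identifiers. The host names, identifiers and data layout below are assumptions made for illustration; the article does not describe how MERU stores or links the two graphs.

```python
# Public graph: vulnerability -> techniques, as might be derived from
# open databases. All identifiers are placeholders.
PUBLIC_GRAPH = {
    "CVE-A": {"T-100"},
    "CVE-B": {"T-200", "T-300"},
    "CVE-C": {"T-100"},
}

# Private graph: internal hosts -> vulnerabilities found on them.
# This data stays inside the organization.
PRIVATE_GRAPH = {
    "payroll-server": {"CVE-B"},
    "build-agent": {"CVE-A", "CVE-Z"},  # CVE-Z has no public entry
}

def techniques_by_host(private, public):
    """Join the two graphs on vulnerability ID, computed locally so
    the private side is never uploaded or shared."""
    exposure = {}
    for host, vulns in private.items():
        techniques = set()
        for vuln in vulns:
            techniques |= public.get(vuln, set())
        exposure[host] = techniques
    return exposure

exposure = techniques_by_host(PRIVATE_GRAPH, PUBLIC_GRAPH)
for host in sorted(exposure):
    print(host, sorted(exposure[host]))
```

Only the public data crosses the organizational boundary, and only inbound; the join itself runs where the private inventory lives.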
News reports and other internet sources can also be added. “You might have, like Mahantesh was saying, the [common vulnerabilities and exposures]—the vulnerabilities, the weaknesses, the attack patterns, things like that,” Aguayo said. “Then you also have threat intelligence. You have all these different news sources and things from the internet that you can consume. So, what the knowledge graph really gets you is the ability to take that external information and then blend it together with your internal network data to look for trends and patterns and measure relevancy and impact. It kind of gives you that filtering process to weed through some of the public information and find out what really matters to you at a given moment in time.”
However, Aguayo cautioned against using too many information sources, which could slow down the information-gathering process, or using less trustworthy sources that might impact accuracy. “It’s a scalable model, so ideally, you would want to figure out your own filters. So like, these news articles or vendor notifications might be a higher reputation than other ones. There’s that level of filtering,” he said. “And then there’s also the applicability of it to your environment. If you introduce a bunch of lower-quality sources, then it can lower your accuracy.”
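The reputation filtering Aguayo mentions might look something like the sketch below. The source names, scores and threshold are hypothetical values chosen for illustration; each organization would presumably set its own.

```python
# Hypothetical reputation scores in [0, 1]; not taken from the article.
REPUTATION = {
    "vendor-advisory": 0.9,
    "national-cert": 0.95,
    "anonymous-forum": 0.2,
}

def filter_reports(reports, reputation, threshold=0.5):
    """Keep reports whose source meets the threshold.
    Sources with no known score are treated as 0 and dropped."""
    return [
        r for r in reports
        if reputation.get(r["source"], 0.0) >= threshold
    ]

reports = [
    {"source": "vendor-advisory", "text": "Patch released for VPN appliance"},
    {"source": "anonymous-forum", "text": "Rumor of a new exploit"},
    {"source": "unknown-blog", "text": "Unverified claim"},
]

kept = filter_reports(reports, REPUTATION)
print([r["source"] for r in kept])
```

Filtering before ingestion keeps low-quality items out of the graph entirely, which addresses both the speed and the accuracy concerns raised above.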
In addition to being available through the PNNL Office of Collaboration and Commercialization, MERU has been licensed to NJSecure, a New Jersey-based company specializing in enterprise-grade cyber defense for community banks.
Satya Vithala, NJSecure Founder, explained in an email exchange that the company integrates MERU into its Operational Risk Business Intelligence Twin (Orbit) solution. “We utilize MERU to interconnect complex threat intelligence data from national-level sources. It provides the ‘knowledge substrate’ upon which we have built agentic AI-based, end-to-end workflows. Our goal is to bridge the gap between high-level intelligence and community banks, ensuring they have access to the best available data to defend their institutions.”
The company has been recognized for its CyberGraph product, which is currently in the MVP stage as part of the New Jersey AI Innovation Challenge program, having been selected by the state as one of 10 funded projects.
Vithala praised the PNNL team, saying that working with them provided access to world-class research and development and allowed the company to build upon foundational science that would take years to develop independently. “The PNNL team provides deep domain expertise that ensures our threat intelligence knowledge substrate is robust,” he said. He added that the PNNL team helped his company navigate the rigorous federal licensing process.
Halappanavar reported that others within PNNL use the technology for government-sponsored work, and that, along with PNNL and the Department of Energy, the Defense Department has provided some funding for the research.
He outlined planned improvements from both research and operations perspectives, including tackling that missing information problem he mentioned earlier in the interview. “We are also developing tools where we can embed it in a different space and ask these questions that can still find you answers when there is missing information.”
It works by embedding entities, such as vulnerabilities and weaknesses, as well as relationships. “We can say that this attack mechanism exploits a given vulnerability. So, the exploit is the relationship, and the attack mechanism and vulnerabilities will be the entities. We take all of this information and embed it in some space where we can still ask the same question, and even though we did not have information, you can still extract good information back.”
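One common way to embed entities and relationships so that missing links can be inferred is a translation-style model, in which the vector for a head entity plus the vector for a relationship should land near the vector for the tail entity. The sketch below uses that idea purely as an illustration; the article does not name the embedding method MERU uses, and the two-dimensional vectors are hand-picked placeholders rather than learned values.

```python
import math

# Hand-picked 2-D vectors so the arithmetic is easy to follow.
# Real embeddings would be learned from the graph.
ENTITY = {
    "attack-1": (1.0, 0.0),
    "attack-2": (0.0, 1.0),
    "vuln-1": (1.0, 2.0),
    "vuln-2": (0.0, 3.0),
    "vuln-3": (5.0, 5.0),
}
RELATION = {"exploits": (0.0, 2.0)}

def score(head, relation, tail):
    """Negative Euclidean distance between (head + relation) and tail.
    Higher scores mean the triple is more plausible."""
    h, r, t = ENTITY[head], RELATION[relation], ENTITY[tail]
    squared = sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))
    return -math.sqrt(squared)

def rank_tails(head, relation, candidates):
    """Order candidate tail entities from most to least plausible."""
    return sorted(
        candidates,
        key=lambda tail: score(head, relation, tail),
        reverse=True,
    )

# Suppose the graph has no stored edge for attack-2; the embedding
# still suggests which vulnerability it most plausibly exploits.
candidates = [name for name in ENTITY if name.startswith("vuln-")]
ranking = rank_tails("attack-2", "exploits", candidates)
print(ranking)
```

The top-ranked candidate is a suggestion for an analyst to verify, not a confirmed fact, which matches the "missing information" framing in the quote.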
That is the research side. On the operations side, the goal is to continue integrating MERU into PNNL’s cyber operations. “The next main thing is to actually deploy it in production and have it being used on a regular basis,” Halappanavar said.
Additionally, the PNNL team has developed a graph retrieval-augmented generation interface, often referred to as graph RAG, which allows more conversational questions. “You can ask questions in simple conversational style, English or a high-level language, and it would go back and convert it into these queries and come back with the information, and it will build you a structure around so you can have a conversation with the database itself,” Halappanavar added. “We have that fully working as well, and we are trying to add in new information to different sources and trying to keep it updated as we go forward.”
Asked at the end of the interview what they would like to add, Aguayo noted that AI lowers the skill level attackers need and accelerates their timelines, so AI-enabled defense tools can help defenders keep pace with the threat but do not change the need for basic cybersecurity. “You still have to manage your attack surface, have an inventory, check things regularly, fix things regularly, enforce zero-trust principles. The fundamentals of cybersecurity don’t change, but this type of approach allows you to speed up and to keep pace with the rate of change.”
And Halappanavar stressed that defenders should remain aware of the various threats and address the most severe concerns. “A human could perform only so many tasks in a day, but now with AI automation, it’s becoming much broader and faster. What really helps is to know your own systems really well, what vulnerabilities we have, and know what threats are being active right now.”
The PNNL program knits together thousands of data points into a stream of data that protects computing systems. Illustration courtesy of Mahantesh Halappanavar | Pacific Northwest National Laboratory