Coping With the Big Data Quagmire
Researchers at one of the premier national laboratories in the United States are prepared to hand the Defense Department a prototype system that compresses imagery without losing the quality of vital data. The system reduces the volume of information; allows imagery to be transmitted long distances, even across faulty communications links; and allows the data to be analyzed more efficiently and effectively.
The Persistics computational system developed at Lawrence Livermore National Laboratory (LLNL) derives its name from the combination of two words: persistent surveillance. The system is designed to revolutionize the collection, communication and analysis of intelligence, surveillance and reconnaissance (ISR) data so that warfighters do not find themselves drowning in a swamp of too much information. The ground-based system has demonstrated 1,000-fold compression of raw wide-area video collections from manned and unmanned aircraft and a tenfold reduction of pre-processed images. Standard video compression can achieve only a 30-fold data reduction.
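To put those ratios in perspective, the short calculation below estimates how long one collection would take to cross a constrained link at each compression level. It is only a back-of-the-envelope sketch: the 1-terabyte collection size and the 10-megabit-per-second link rate are illustrative assumptions, not figures from LLNL, and the function name is hypothetical.

```python
def transfer_seconds(raw_bytes, link_bits_per_second, compression_ratio):
    """Seconds needed to send raw_bytes after compressing by compression_ratio."""
    compressed_bits = raw_bytes * 8 / compression_ratio
    return compressed_bits / link_bits_per_second

# Assumed values for illustration only (not from the article).
RAW_BYTES = 1e12   # a 1 TB raw collection
LINK_BPS = 10e6    # a 10 Mbit/s long-haul link

for label, ratio in [("uncompressed", 1), ("standard video, 30x", 30), ("Persistics, 1,000x", 1000)]:
    hours = transfer_seconds(RAW_BYTES, LINK_BPS, ratio) / 3600
    print(f"{label}: {hours:.2f} hours")
```

Under these assumptions, the difference between a 30-fold and a 1,000-fold reduction is the difference between hours and minutes on the same link.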
The existing data processing infrastructure for national security is not designed for the amounts of information being generated by unmanned aerial systems and other platforms. In addition, air-to-ground communication bandwidth and archive storage capacity are too limited to support fast-turnaround human analysis, according to LLNL researchers. “These [ISR] cameras are picking up more data than we know what to do with, and there are not enough humans on the ground to analyze every pixel,” explains Sheila Vaidya, deputy program director, defense programs, Office of Strategic Outcomes, LLNL.
The Persistics concept is relatively simple—to compress the data so it is more manageable without losing image quality that would prevent warfighters from spotting suspicious activity. But meeting the challenge has been anything but simple. “Livermore got involved about 10 years ago in a small research initiative, specifically with these large-format motion imagery video cameras, which are the largest culprit from a data perspective, to help mitigate the burden on the ground analyst,” Vaidya reports. “And it has expanded from just looking at wide-area motion imagery, which can cover larger swaths of territory in a single image, to including other forms of sensing as well. Persistics is not about the camera—it’s all about the data.”
Imagery must be compressed to be quickly and efficiently transmitted, and that compression results in a loss of data quality. Vaidya compares ISR imagery to Internet videos. “If you look at video, for example, on YouTube, it’s compressed because video has so much data in it, but the user is willing to accept that compression because the information has quality that the user is happy with. But in surveillance video, you really want to look at the needle in the haystack, so you cannot compress that and lose detail in noisy environments,” Vaidya explains.
The genius behind Persistics is that it compresses only the irrelevant data, such as nonmoving background images, the jitter and movement of the camera or of the airborne platform the camera rides on, and atmospheric aberrations, including smoke and, to some extent, clouds. Compressing or eliminating the irrelevant data allows the system to maintain the image quality on everything that matters to the warfighter. “Persistics comes up with a revolutionary capability that allows you to save useful information that is pertinent but to compress irrelevant information so that the next data product is much smaller in volume and can be communicated across the globe, if necessary, along standard data links,” Vaidya says. “It reduces the data only where you don’t care.”
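The idea of spending bits only on what changes can be illustrated with a toy example. The sketch below is not the Persistics algorithm, which the article does not describe in detail; it substitutes a simple per-pixel median as the background model and assumes the frames are already stabilized. All function names and the threshold value are hypothetical.

```python
from statistics import median

def median_background(frames):
    """Per-pixel median across frames: a simple stand-in for the static background."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[median(f[r][c] for f in frames) for c in range(cols)]
            for r in range(rows)]

def encode(frames, threshold=10):
    """Store the background once, plus only the pixels that differ from it."""
    background = median_background(frames)
    changes = []  # (frame_index, row, col, value) for pixels treated as moving
    for i, frame in enumerate(frames):
        for r, row in enumerate(frame):
            for c, value in enumerate(row):
                if abs(value - background[r][c]) > threshold:
                    changes.append((i, r, c, value))
    return background, changes

def decode(background, changes, n_frames):
    """Rebuild every frame from the shared background and the sparse changes."""
    frames = [[row[:] for row in background] for _ in range(n_frames)]
    for i, r, c, value in changes:
        frames[i][r][c] = value
    return frames

# Toy scene: a flat 8x8 background with one bright "vehicle" pixel moving diagonally.
frames = []
for i in range(5):
    frame = [[50] * 8 for _ in range(8)]
    frame[i][i] = 200
    frames.append(frame)

background, changes = encode(frames)
raw_values = 5 * 8 * 8                     # every pixel of every frame
stored_values = 8 * 8 + 4 * len(changes)   # one background + 4 numbers per change
print(f"raw: {raw_values} values, stored: {stored_values} values")
```

In this toy scene the moving pixel is reconstructed exactly, while the unchanging background is stored only once. With real, noisy imagery, background variation below the threshold would be discarded, which is the lossy part of the trade.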
The system uses a technique called pixel-level dense image correspondence to stabilize video; compress background; eliminate parallax, the slight differences in the apparent position of objects viewed from different cameras; and provide superb subpixel resolution of moving objects of interest. “The next product is an order of magnitude or more reduced from what it was at the sensor. It can be communicated across narrower pipes—long-haul links that are inherently high-latency and corrupt and have all kinds of dropped bits. It compensates for that. And it gives a product to the end user that allows the use of machine learning and automation algorithm for analysis,” Vaidya says. “So, let’s say you’re trying to track a guy on a motorcycle who is going on roads that are not necessarily mainstream roads, and he is going to do something odd or abnormal. We’ve got algorithms now that look for normalcy versus the suspicious.”
LLNL has worked with the National Geospatial-Intelligence Agency, the Defense Advanced Research Projects Agency, the Air Force, the Army and several military laboratories to incorporate the Persistics pipeline into data processing ground stations receiving video feeds from Constant Hawk cameras aboard both manned and unmanned aircraft. Persistics also has been integrated into the Air Force Research Laboratory-developed Pursuer viewer, which allows analysts to pan, zoom, rewind, query and overlay maps and other metadata, according to LLNL documentation. Analysts can ask to see all of the vehicles stopping at a specific location during a particular time frame, for example.
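A query of that kind amounts to filtering tracked-vehicle observations by time, place and speed. The sketch below shows the general shape of such a filter. It does not reflect the Pursuer viewer's actual interface or data model, which the article does not describe; the Observation record, its fields and the vehicles_stopped_near function are all hypothetical.

```python
from dataclasses import dataclass
from math import hypot

@dataclass(frozen=True)
class Observation:
    """One hypothetical track sample for a vehicle."""
    vehicle_id: str
    t: float      # seconds since the start of the collection
    x: float      # metres east of a local origin
    y: float      # metres north of a local origin
    speed: float  # metres per second

def vehicles_stopped_near(observations, center, radius, t_start, t_end, max_speed=0.5):
    """IDs of vehicles seen (nearly) stationary within radius of center during the window."""
    cx, cy = center
    return sorted({
        o.vehicle_id for o in observations
        if t_start <= o.t <= t_end
        and o.speed <= max_speed
        and hypot(o.x - cx, o.y - cy) <= radius
    })

observations = [
    Observation("A", 10, 0, 0, 0.0),      # stopped at the location, inside the window
    Observation("B", 12, 3, 4, 8.0),      # at the location but driving through
    Observation("C", 500, 1, 1, 0.0),     # stopped there, but outside the window
    Observation("D", 20, 300, 300, 0.0),  # stopped somewhere else
]

result = vehicles_stopped_near(observations, center=(0, 0), radius=10, t_start=0, t_end=60)
print(result)
```

Only vehicle "A" satisfies all three conditions in this sample data, so it is the only ID returned.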
Vaidya cannot say exactly when the system will transition to the Defense Department, in part because of budget uncertainties. But it will be “soon,” she vows. It will transition through an intermediary that will support, maintain and sustain the system. “That’s where contractors come in,” she points out. Furthermore, researchers believe it could assist in homeland security missions, such as border patrol and countering illegal immigration.
Meanwhile, LLNL researchers already are planning improvements to the system. The prototype is ground-based, but they say they intend to reduce its size, weight and power so that it can be deployed on manned or unmanned aerial platforms. “The ultimate goal is to fly it in the air. Right now, Persistics is a ground cluster, so the data is brought to the ground, but the goal is to do all that we do in the air,” Vaidya says. “The size, weight and power requirements will depend upon which platform it flies on. If it is on a Predator, that has a certain size, weight and power capability; and if it is on a bigger cargo plane, that will have a different requirement.”
Although Persistics is considered revolutionary, Vaidya says there is always more to be done. “Compression by a factor of 10 or a factor of 100 or even 1,000 is not good enough because the volume of data collected by sensors is increasing exponentially. So, we have to keep improving our trajectory so that we can take several orders of magnitude leaps in what we do with the data collected,” she declares.
And today’s technology can only do so much. “No machine can really solve the full problem yet. Maybe some day we will have HAL sitting there telling us what to do,” she says, referring to the fictional computer in Arthur C. Clarke’s 2001: A Space Odyssey. “But we don’t have that yet. We can only help the end user by making his job easier, and that is what Persistics is providing.”
Going forward, the ISR community must learn to collect smaller amounts of data. “The next step will be to collect smartly—not to collect every bit of information. Let’s be clever about what we ask our sensors to do so that in the end, when we have to process it all, it is all relevant information,” Vaidya proposes. “Smart collection is part of the projection of ISR because, otherwise, we’re never going to get out of this large data quagmire.”