Thinking Hard About Hard Drives
Open-source intelligence (OSINT) is a growing field, not only in terms of people who want to learn and use it but also in terms of how much room it takes up in data centers.
“When we complete an investigation, we may end up with 25 terabytes of data from a single investigation,” said José David Ortiz Cruz, an OSINT expert with the Civil Guard, Spain’s national law enforcement agency, who also teaches with Interpol.
Spain alone has about half a dozen national agencies conducting OSINT investigations to fight crime and cross-border threats. Storage space runs out fast, according to Ortiz Cruz.
“Once we start building up an investigation, the data we consider uninteresting for our investigation, we eliminate it. We really do it because of our need to store, and it’s not a good practice,” Ortiz Cruz explained.
Storage space is a limitation of OSINT activities, as data centers—both commercial and governmental—run out of space under increasing demands from investigators.
As operations become more complex, using video and other assets, space goes from necessary to critical.
“The volume of available data could potentially be offset by new data storage and processing capabilities if the [intelligence community] is effectively positioned for such an evolution,” states a report by the RAND Corporation, a global-policy think tank.
This means that a series of technologies must be adopted to allow greater storage capacities, but regardless of what operators use—cloud or local—the final repository is a hard drive.
Emerging technologies seek to replace the current alternatives with greater power and smaller size. This demand grew from diminishing returns on storage technologies, which “have grown at rates as high as 100% per year, but by 2010, these rates had dropped to more like 10%,” Roger Wood, an academic at Washington State University, wrote in a research paper.
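To put those growth rates in perspective, the time needed for capacity to double follows from the compound growth formula. The sketch below is a back-of-the-envelope illustration only. It assumes the quoted figures are steady compound annual rates, and the function name `doubling_time_years` is made up for this example.

```python
import math


def doubling_time_years(annual_growth_rate: float) -> float:
    """Years needed to double at a steady compound annual growth rate.

    annual_growth_rate is a fraction: 1.0 means 100% per year, 0.10 means 10%.
    Assumption: growth compounds once per year at a constant rate.
    """
    if annual_growth_rate <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / math.log(1 + annual_growth_rate)


fast = doubling_time_years(1.00)  # the "100% per year" era
slow = doubling_time_years(0.10)  # the "more like 10%" era

print(f"At 100% per year, capacity doubles every {fast:.1f} years")
print(f"At 10% per year, capacity doubles every {slow:.1f} years")
```

Under these assumptions, a doubling that once took about a year now takes a little over seven.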
Engineers worldwide are working on potential solutions that involve both improving existing technology and coming up with original ideas. Among the latter is using DNA, the same substance that dictates that a journalist typed this story with 10 fingers, not 12 or eight, to store the next family picture.
DNA is a substance that carries the code to build all living beings, from unicellular organisms to elephants. Every protein and how it should be placed in the body is stored in strands in each cell. Therefore, all the data needed to make an elephant or a human being is efficiently stored.
“Our genetic code is millions of times more efficient at storing data than existing solutions, which are costly and use immense amounts of energy and space,” wrote Win Reynolds, research science and engineering editor at Northwestern University.
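To illustrate how digital data can map onto DNA at all, here is a minimal sketch that turns bytes into a string of the four bases A, C, G and T and back again. The two-bits-per-base table is an assumption chosen for simplicity, and `encode` and `decode` are hypothetical names. Real research systems add error correction and avoid sequences that are hard to synthesize, so this is not the method used by any of the researchers quoted here.

```python
# Assumed mapping for illustration only: each pair of bits becomes one base.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a strand of bases, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a strand of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


message = b"OSINT"
strand = encode(message)

# The round trip should give back exactly what went in.
assert decode(strand) == message
print(f"{len(message)} bytes -> {len(strand)} bases: {strand}")
```

With this mapping every byte costs four bases, so the length of the strand grows in step with the size of the file.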
Genetic material is very efficient at self-copying, as the coronavirus underscored, but starting a DNA strand from scratch is different. It takes hours to “write” data into DNA chains outside cells, and scientists are trying to shorten these lead times, which can run to 10 hours, according to Reynolds.
Current research has cut those delays to only minutes. But that is still far from the fractions of a second necessary for proper computer operation.
Still, scientists at Columbia University managed to store in DNA a full computer operating system, an 1895 French film, a $50 Amazon gift card and a computer virus, according to a release.
“We believe this is the highest-density data-storage device ever created,” Yaniv Erlich, a computer science professor at Columbia Engineering, said in the release.
While results were promising, efficiency was lacking. The release stated that it costs $7,000 to synthesize DNA for archiving two megabytes of data and another $2,000 to read it.
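A rough calculation shows how far those prices are from practical use. The sketch below scales the release's figures to the 25 terabytes that Ortiz Cruz said a single investigation can produce. It assumes costs grow linearly with data size and that one terabyte equals one million megabytes. Both are simplifications, and the function name `dna_cost_usd` is made up for this example.

```python
# Figures quoted in the Columbia release.
WRITE_COST_USD = 7_000  # to synthesize the DNA for the archive
READ_COST_USD = 2_000   # to read the archive back
ARCHIVE_SIZE_MB = 2

# Assumption: decimal units, 1 terabyte = 1,000,000 megabytes.
MB_PER_TB = 1_000_000


def dna_cost_usd(size_tb: float) -> tuple[float, float]:
    """Return (write_cost, read_cost) in dollars for size_tb terabytes.

    Assumption: cost scales linearly with size, with no bulk discount.
    """
    size_mb = size_tb * MB_PER_TB
    write = size_mb * WRITE_COST_USD / ARCHIVE_SIZE_MB
    read = size_mb * READ_COST_USD / ARCHIVE_SIZE_MB
    return write, read


write_cost, read_cost = dna_cost_usd(25)  # one 25-terabyte investigation
print(f"Write: ${write_cost:,.0f}")
print(f"Read:  ${read_cost:,.0f}")
```

At $3,500 per megabyte to write, archiving one such investigation would cost tens of billions of dollars under these assumptions.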
This technology has a long way to go before it is commercially viable.
Another way of storing files is reminiscent of compact discs.
5D optical storage employs silica glass and laser beams. Data can be written to the glass, and what is recorded can remain in the unit for centuries.
“While cloud-based systems are designed more for temporary data, we believe that 5D data storage in glass could be useful for longer-term data storage for national archives, museums, libraries or private organizations,” said doctoral researcher Yuhao Lei from the University of Southampton in the United Kingdom.
According to Lei, the tiny silica structure can hold data at up to 10,000 times the density of a Blu-ray disc.
These units can be written at speeds above 100 pages of text per second. Researchers in Europe are currently working to increase speeds.
The technology works by “burning” nanostructures of data on the silica with a laser beam. The structures measure 50 by 500 nanometers each. These structures are cramped and can occupy several levels in the device, thus adding a demand on the materials and the laser beams employed.
This results in one of the greatest challenges: overheating the surface as lasers work on it, according to Huijun Wang and Lei.
Using this method, a CD-sized unit would hold 500 terabytes of data.
Among the technologies that are improving data centers are heat-assisted magnetic recording (HAMR), shingled magnetic recording (SMR) and bit-patterned media (BPM).
The first technology, HAMR, works by manipulating the temperature of the disk surface as data is written, allowing more storage and rewriting.
“To increase hard drive capacity, engineers try to fit more data bits, or ‘grains,’ onto each disk platter; they increase the density of bits crammed into each square inch of surface space. More bits on a disk means more data can be stored,” explained Seagate, the company involved in this research, in a release.
As these grains draw closer, their magnetic fields influence one another and the grains become unstable. Engineers have stabilized them using new materials, but to rewrite the surface, the temperature must be raised for the extremely short time during which the data is altered.
“To write new data, a small laser diode attached to each recording head momentarily heats a tiny spot on the disk, which enables the recording head to flip the magnetic polarity of a single bit at a time, enabling data to be written. Each bit is heated and cools down in a nanosecond, so the HAMR laser has no impact at all on drive temperature, or on the temperature, stability or reliability of the media overall,” the company explained.
“The decades of development that have led us to HAMR productization are even more important today as highly cost-efficient mass capacity storage will be a competitive enabler in a world where data is rapidly growing and increasing in value,” William David Mosley, the company’s CEO, told journalists during a press conference.
Seagate has already shipped its first 30-terabyte drives to data centers, and more are expected in the future, according to analysts.
The second development, shingled magnetic recording, or SMR, overlaps tracks of magnetic data on the disk’s surface, like shingles on a roof. It allows more data to fit on the magnetic surface, but it creates a series of problems.
One problem is that the top layer dominates. “The overlap region is dominated by whichever track is written last,” according to Wood. Therefore, industry has chosen this innovation mainly for backups, not everyday use.
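The consequence of that overlap can be shown with a toy model. In the sketch below, writing any track also wipes out the track that overlaps it, so changing one track in the middle of a group forces every later track to be rewritten too. The class `ShingledBand` and its methods are hypothetical names invented for this illustration. Real drives manage overlap in firmware in ways this sketch does not attempt to reproduce.

```python
class ShingledBand:
    """Toy model of one group of overlapping tracks (illustrative assumption)."""

    def __init__(self, track_count: int):
        self.tracks = [None] * track_count
        self.physical_writes = 0  # how many track writes the drive performed

    def _write_track(self, index, data):
        """Write one track; the write also destroys the next, overlapping track."""
        self.tracks[index] = data
        self.physical_writes += 1
        if index + 1 < len(self.tracks):
            self.tracks[index + 1] = None  # overlap: the last write dominates

    def fill(self, items):
        """Write tracks in order, as a backup job would."""
        for index, data in enumerate(items):
            self._write_track(index, data)

    def update(self, index, data):
        """Change one track, then restore every later track it disturbed."""
        later_tracks = self.tracks[index + 1:]  # copy before they are damaged
        self._write_track(index, data)
        for position, old in enumerate(later_tracks, start=index + 1):
            self._write_track(position, old)


band = ShingledBand(5)
band.fill(["a", "b", "c", "d", "e"])
writes_before = band.physical_writes

band.update(1, "B")  # change a single track near the start
print(band.tracks)
print("track writes needed for one update:", band.physical_writes - writes_before)
```

Sequential writing costs one write per track in this model, while a single in-place change costs several, which is consistent with the technology being favored for backups rather than everyday use.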
The third storage method in development is bit-patterned media, or BPM. This consists of using a magnetic surface printed with “islands” where data will be stored. This solves problems derived from the polarity of the “dots” used to store information and allows an expansion of 20 to 300 times versus the capacity of current hard drives, according to a research paper by Rhys Alun Griffiths and other authors, published by the Journal of Physics.
This hardware is not yet available, as “fabrication of BPM is viewed as the greatest challenge for its commercialization,” according to Thomas Albrecht, a researcher at Germany’s Friedrich Alexander University.
Some critics point out that the unstable magnetic conditions could make these hard drives unreliable and prone to losing data. Among them is Mojtaba Ranjbar, an academic at the University of California, Riverside.
Still, major hard drive manufacturers vow to continue research and employ them in the not-so-distant future.
While these storage technologies reflect research under way around the world, the common direction is toward making storage smaller and faster. Size matters, as the real estate that hosts servers is an associated issue.
Tackling capacity is also a growing concern among government entities in defense and law enforcement. Spanish regulations restrict armed forces and police to using only computers located in the country. “We must use our own systems for obvious security issues,” Ortiz Cruz said.
Access to affordable and significant storage capacity expansion touches on national security, and many countries worldwide are waiting for better ways to store more data to protect their citizens.