Scientists develop an Internet-scale persistent data store with the advantages of collaborative computing.
Researchers at the University of California–Berkeley have developed an approach to information security and sharing that combines the power of the Internet with a memory-sharing system, creating a globally distributed hard drive accessible to millions of users. The information would remain intact even when servers fail, natural disasters strike, malicious attacks are launched, or all three occur simultaneously.
Computers are rapidly being integrated into more aspects of daily living. Their size is shrinking, and they are being adapted to work with everyday objects, from clothing to books to the walls of a building. This trend of ubiquitous computing has led many people to ask where information will be stored, especially in an age of disposable devices.
Known as OceanStore, the university team’s storage technology is based on a cooperative utility model where consumers pay a monthly fee to store and access persistent data. Individuals or organizations make up the federation that provides the services. A subscriber’s data would be quickly accessible from anywhere in the network and kept secure via replication and encryption techniques for disaster recovery. Any computer could join the infrastructure, contributing storage and bandwidth in exchange for compensation.
“When we looked at the utility model, we also became interested in the types of systems that support multiple service providers simultaneously interacting with each other,” explains John D. Kubiatowicz, an assistant professor of computer science at Berkeley. “I might pay Pacific Bell for my service, but in fact, my data is stored in servers at IBM and Sprint and several other companies. There’s a protocol working to allow them to cooperate and give me service. To that end, we were interested in producing an OceanStore prototype that has many of the flexible components that would allow our vision to become reality.”
OceanStore is being built on top of a technology layer that helps the system adapt to failures. This low-level layer is based on Tapestry, a routing infrastructure developed by Berkeley researchers that provides location-independent routing of messages directly to the closest copy of an object or service. “Right now in IP [Internet protocol], you route messages to particular servers that have a given IP address,” the professor explains. “With Tapestry, you’re routing to object names. You address the name of an object in a packet, and a packet is routed through the network until it finds an object with that name. That is very different from routing to a particular server.”
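The contrast can be sketched in a few lines of Python. This toy lookup (the function names and four-digit identifiers are invented for illustration; Tapestry's actual routing mesh is far more elaborate) hashes an object's name and selects the node whose identifier shares the longest suffix with that hash, the suffix-matching idea on which Tapestry routing is built:

```python
import hashlib

def node_id(name: str, digits: int = 4) -> str:
    """Derive a short hex identifier from a name (4 hex digits for readability)."""
    return hashlib.sha256(name.encode()).hexdigest()[:digits]

def shared_suffix_len(a: str, b: str) -> int:
    """Length of the common suffix of two identifiers."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def route(object_name: str, nodes: list) -> str:
    """Pick the node whose ID shares the longest suffix with the object's
    hashed name -- the node responsible for locating that object."""
    target = node_id(object_name)
    return max(nodes, key=lambda n: (shared_suffix_len(n, target), n))
```

Because the message is addressed by the object's name rather than a server's address, the same `route` call keeps working when the responsible node changes, which is exactly the flexibility the article describes next.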
This method makes the system more resilient to failures: if a server that stores an object fails, the object can be located on another server. The goal of the project is to allow data to be cached anywhere, at any time, by members of the federation to optimize locality and availability.
“Locality is important to us,” Kubiatowicz says. “If the object you’re routing to is close to you, then you’d like to take the shortest path to it.” Frequently requested items might be moved to a server closer to a user in the United States so that routed messages find a short path and do not unnecessarily involve system resources in other regions—for example, in Europe.
“Many people are very intrigued with this global-scale, collaborative model of storage,” Kubiatowicz notes. “The disaster recovery angle is an interesting one to look at in OceanStore. You own a company whose data is out there on the Internet and not in one particular place. A disaster may destroy individual components, but the data remains accurate and accessible. The vision is pretty extreme in its viewpoint of having data spread to many servers.”
Changes to an object made in OceanStore spawn a new version of that object. “This is powerful because every version is read-only,” Kubiatowicz shares. “They’re very easy to preserve. In OceanStore, we encode data using erasure codes, which are like data holograms in which we take a given piece of data and produce a number of fragments.”
For example, if an object were divided into 64 fragments, any 16 of them would be enough to reconstruct it. Because the fragments are dispersed to 64 servers around the world, the data is extremely difficult to destroy: 49 of the fragments would have to be lost, leaving fewer than the 16 required, before the object could no longer be reconstructed.
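The scheme Kubiatowicz describes is an (n, k) erasure code. As a hedged illustration, the sketch below implements one classical construction, Reed–Solomon coding via Lagrange interpolation over a prime field, for a single 16-byte block. OceanStore's actual codes and parameters differ, and production systems work over GF(2^8) with arbitrary block sizes:

```python
P = 257  # prime field just large enough to hold every byte value

def _lagrange_eval(points, x):
    """Evaluate the unique polynomial of degree < len(points) passing
    through `points` at position x, with all arithmetic mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P  # Fermat inverse
    return total

def encode(data: bytes, n: int, k: int):
    """Spread a k-byte block across n fragments; any k fragments suffice."""
    assert len(data) == k
    points = list(enumerate(data, start=1))  # message = values at x = 1..k
    return [(x, _lagrange_eval(points, x)) for x in range(1, n + 1)]

def decode(fragments, k: int) -> bytes:
    """Reconstruct the original k bytes from any k surviving fragments."""
    pts = list(fragments)[:k]
    return bytes(_lagrange_eval(pts, x) for x in range(1, k + 1))
```

With n = 64 and k = 16, any 16 of the 64 (x, y) fragment pairs pin down the same degree-15 polynomial, so the original bytes can be read back off it; destroying 48 fragments still leaves 16 and changes nothing.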
Additionally, the scientists use a cryptographic hash technology to name all data. Any corruption or malicious data alteration is recognized because the data no longer matches the cryptographic hash. The read-only aspect enables the system to throw out corrupted data.
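Content-based naming is straightforward to sketch. In this toy example (SHA-256 and the helper names are our assumptions, not the project's actual choices), a block's name is simply the hash of its bytes, so any corruption makes the name and the contents disagree:

```python
import hashlib

def name_of(data: bytes) -> str:
    """A block's name is the cryptographic hash of its contents."""
    return hashlib.sha256(data).hexdigest()

def verify(name: str, data: bytes) -> bool:
    """Detect corruption or tampering: stored bytes must re-hash to their name."""
    return hashlib.sha256(data).hexdigest() == name
```

Because every version is read-only, a block that fails verification can simply be discarded and re-fetched; there is no risk of throwing away a legitimate in-place update.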
“You can think of failures as introducing entropy into the system, and we have active processes continually repairing the data and reducing that entropy,” he says. If the system notices that a server holding onto a fragment has failed, the data is reconstructed, and the fragments are regenerated.
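A repair sweep of this kind can be sketched as follows. For simplicity, this hedged example tracks whole-object replicas rather than the erasure-code fragments OceanStore actually regenerates; one pass discards data held on failed servers and re-copies anything that has fallen below its target count:

```python
def repair(servers: dict, alive: set, copies: int = 3) -> None:
    """One pass of a repair sweep. `servers` maps server name -> set of
    object ids it holds. Drop failed servers, then re-copy any
    under-replicated object onto live servers that lack it."""
    for s in list(servers):
        if s not in alive:
            del servers[s]  # a failed server's data is gone
    # count surviving copies of each object
    counts = {}
    for held in servers.values():
        for obj in held:
            counts[obj] = counts.get(obj, 0) + 1
    # restore each object's replica count on live servers
    for obj, have in counts.items():
        for s in sorted(servers):
            if have >= copies:
                break
            if obj not in servers[s]:
                servers[s].add(obj)  # regenerate a copy here
                have += 1
```

Run continuously, a sweep like this is the "active process" Kubiatowicz describes: every pass pushes the system back toward its target redundancy, reducing the entropy that failures introduce.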
The technology is being designed to be self-organizing as well as self-repairing. “OceanStore preserves data for the very long term,” Kubiatowicz adds. A challenging part of using a versioning system is determining how to handle data modification. The system must determine whether a user is authorized to change data and ascertain that the data is being modified correctly even when pieces of the system may be faulty.
One of the unique aspects of the OceanStore system is that it will be constructed from untrusted infrastructure, meaning that one or more of the servers in the system may crash or may not be secure. However, instead of employing servers merely as passive data repositories, the system enables them to participate in decision making.
“We’ve had to do a bit of work to choose the right model, and we’ve decided on something called Byzantine agreement,” Kubiatowicz says. “This technology allows you to take a set of servers and give them the capability to make a collaborative decision even if a few are compromised.” For example, a set of 13 servers would decide as an aggregate whether to accept updates even though three of the servers have been compromised. “No one server is responsible for important decisions,” he explains.
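The arithmetic behind such quorums can be shown in a short sketch. With n = 3f + 1 servers, a value endorsed by at least 2f + 1 of them must be backed by a majority of the honest servers; 13 servers (f = 4) can therefore tolerate the three compromised members in the example, with one fault to spare. The function below is an illustrative tally, not OceanStore's actual agreement protocol:

```python
from collections import Counter

def byzantine_accept(votes: list, f: int):
    """Accept the value endorsed by at least 2f+1 of n >= 3f+1 servers.
    Even if f voters are compromised and lying, a 2f+1 quorum still
    contains at least f+1 honest servers, more than the f liars."""
    assert len(votes) >= 3 * f + 1, "need at least 3f+1 servers"
    value, count = Counter(votes).most_common(1)[0]
    return value if count >= 2 * f + 1 else None
```

If no value reaches the quorum threshold, the update is simply not accepted, so no single server, and no coalition of f servers, can force a bad decision through.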
Kubiatowicz notes that data security efforts also extend to privacy. “How do you encrypt data so that people in the infrastructure cannot observe it or get access to it while still allowing the infrastructure to commit the data, cache the data and perform other tasks?” he asks. “We’re still working on that.”
Intel Research, Santa Clara, California, recently donated equipment to the university to allow OceanStore to be tested on a worldwide testbed. PlanetLab, a global overlay network for developing and accessing new network services, consists of 100 machines located at 42 sites around the world. “This is a general collaborative testbed for a variety of institutions such as universities, industry and intelligence laboratories,” Kubiatowicz shares. “What’s nice about PlanetLab is that it lets us look at how OceanStore and other peer-to-peer systems behave in a variety of network environments. It lets us test things like fault tolerance, continuous data repair and security on a global scale.”
Although the OceanStore team is still developing the technology, within six months users will be able to plug into a stable component that runs continuously. "If you look at our prototype, we have a lot of pieces running," Kubiatowicz says. "Right now the trick is to get the self-repairing aspects running to keep it always available."
Additional information on the University of California–Berkeley’s OceanStore project is available on the World Wide Web at http://oceanstore.cs.berkeley.edu.