Semantic Web Ready for Prime Time

April 2006
By Henry S. Kenyon

To fully realize its goals for a network-centric force, the U.S. Defense Department is examining the use of semantic web technology, which permits information to be shared, stored and reused across application, enterprise and community boundaries. Highly scalable, it also allows legacy software tools to interoperate, enhancing joint and coalition operations.
System paves the way for network-centric infrastructure, applications.

A World Wide Web-enabled technology is on the verge of dramatically changing the way people and computers interact and share information. It provides a common architecture that permits data to be communicated and reused across application, enterprise and community boundaries. This automated context mapping capability will allow complex network-centric systems to reach their full potential and to scale beyond present systems.

Current computer networks are reaching the limits of their ability to process information efficiently. This is not an issue of microchip speeds but of programming frameworks. Web searches that bring back thousands of topics are of little use if the data has no context. By creating a methodology to tag and arrange information, researchers are developing an automated system that allows data to be retrieved, cataloged and shared to meet an individual user’s requirements.

Semantic web technology is based on work done by the Defense Advanced Research Projects Agency (DARPA) and international standards bodies such as the World Wide Web Consortium. Key to semantic data searches is the Web Ontology Language, known as OWL. OWL is derived from the DARPA Agent Markup Language (DAML) and the European Commission’s Ontology Interchange Language. It was approved as an international standard in 2004.

According to DARPA, most of the material on the Internet is represented with the Hypertext Markup Language (HTML). Designed to provide information in formatted, human-readable pages displayed on browsers, HTML is limited in describing documents in ways that software can locate and interpret. The other major language used in home pages—Extensible Markup Language (XML)—tags information on Web pages, but it cannot mark the relationships between individual data points.

OWL creates a machine-readable structure for information stored on the Web. It also permits computers to organize the data on OWL-marked pages. The language uses networks of hyperlinked ontologies to represent data. An ontology defines the vocabulary used to describe and represent an area of knowledge. OWL-based ontologies semantically integrate data and automatically bypass interoperability issues between separately developed legacy applications.
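To make the idea of machine-readable relationships concrete, here is a minimal sketch in plain Python. It is not OWL syntax and does not use any OWL tooling; it only assumes that an ontology can be approximated as subject-predicate-object statements. The class names, the individual `convoy7_lead` and the helper functions are all hypothetical, chosen for illustration.

```python
# Hypothetical mini-ontology stored as (subject, predicate, object) triples.
# The vocabulary (Truck, Vehicle, subClassOf, type) is illustrative only.
TRIPLES = {
    ("Truck", "subClassOf", "GroundVehicle"),
    ("Bus", "subClassOf", "GroundVehicle"),
    ("GroundVehicle", "subClassOf", "Vehicle"),
    ("convoy7_lead", "type", "Truck"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls by following subClassOf links."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def instances_of(cls, triples):
    """Return individuals whose asserted type is cls or any subclass of cls."""
    result = set()
    for s, p, o in triples:
        if p == "type" and (o == cls or cls in superclasses(o, triples)):
            result.add(s)
    return result
```

Nothing in the data says outright that `convoy7_lead` is a vehicle, yet `instances_of("Vehicle", TRIPLES)` finds it by following the declared class relationships. That kind of inference is what plain tagging alone does not provide.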

DAML and OWL will allow the U.S. government to develop ontologies that can be distributed across multiple systems. The languages can scale up for use in large, complex military applications and allow commercial networks to compose massive, user-directed virtual processes automatically and dynamically.

But OWL represents only the early stage of semantic technology, explains Mills Davis, managing director of Project 10x, a Washington, D.C.-based research, education and consultation organization. He notes that achieving entirely network-centric capabilities involves moving legacy networks, systems and devices to a semantic infrastructure. Developments such as peer-to-peer services, grid computing, radio frequency identification systems and Internet protocol version 6 (IPv6) are all steps toward a fully semantic architecture.

“We are moving from one system development environment with a few big applications to systems that are built up from hundreds of thousands of components. They are going to be based on services and open source applications. It’s a much more complex, richer landscape in which we’re going to be developing things,” he says.

The new systems and applications also will be too complex to manage manually. Instead, they must perform autonomously with little direct human intervention. Current networks need manual maintenance and upgrades, but Davis insists that to create truly robust systems, enough knowledge and awareness must be installed so that they can operate and adapt. He adds that semantic technology will help the information technology industry achieve one of its long-term goals: autonomics—the ability to create self-healing, self-integrating and self-optimizing systems that can scale.

Davis observes that the fastest growing segment of the computer industry is automated systems that are designed to support networks. This trend is important because current hardware and software technologies have plateaued in their ability to cope with the massive increases of scale and complexity that are on the horizon. “We need solutions that are designed for this era of distributed intelligence,” he shares.

Semantic technology already is finding applications in enterprise architecture solutions such as tools and data repositories. The information is available in the form of metamodels that can be represented in graphic and textual modes. These models are executable and integrate directly with the applications and information sources with no additional coding. Data in the models can be linked with other types of application areas.

Because enterprise architecture modernization is a major U.S. government effort, Davis notes, semantic systems may find a ready customer base. However, some challenges exist in this area. Current federal initiatives are determined by a capital planning and investment control process focused on budgets and information technology strategy plans and adhering to the E-Government Act. These requirements create an overhead of additional activities for administrators to carry out. This pressure forces large government agencies to integrate budget and performance and to align information technology and business line priorities. Such an environment makes traditional ontological modeling difficult. “How do you get line of sight on these models, and how do you hook up your performance metrics so that you’re actually pulling them back into the model?” Davis asks.

Semantic web technology meets these requirements by linking models together. Davis adds that besides eliminating much of the manual labor of network maintenance and management, semantic technology empowers concepts such as business process re-engineering and knowledge management. These ideas previously had no technology to realize them fully. “You are at a point now where you have a technology that can implement these management, operation and network-centric concepts. It requires mature tool sets, but mostly it’s a change of mindset,” he says.

A number of government and commercial organizations began developing systems and applications based on semantic technology after OWL became a standard. Davis notes that although semantic web search tools are mature, they have not yet been deployed on a very large scale. He predicts that self-referencing applications will appear first in infrastructure programs and that they will proliferate with the widespread use of IPv6 and military network-centric systems. Next-generation Web services, business applications, enterprise applications, grid and peer-to-peer computing all will be autonomic.

Davis says that companies such as Oracle and Cisco Systems Incorporated have produced policy and support applications with semantic capabilities. Software AG provides its customers with an entire information technology stack that features semantic business process-level and enterprise information integration applications. He adds that the technology has not yet reached the composite application layer, but that it will follow soon. Commercial firms are working on infrastructure-level semantic systems, which will allow them to sell customers entire software suites, he says.

For example, Digital Harbor produces applications for the intelligence community. The company specializes in open standards platforms that allow users to create composite applications quickly. Its products also enable the development of sophisticated user dashboards so existing applications can be exposed and run in a browser window.

An ontology layer also allows a variety of software tools to operate simultaneously. “You can have this location in a database that you can click on to open a document. It gets built very fast and is very maintainable. You’re essentially federating searches across tens if not hundreds of sources. But you have only one universal interface because the business ontology is what maps the vocabularies together, so you don’t have to know six different ways to search,” Davis says.
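The federated search described in the quotation can be sketched roughly as follows. This is an illustration under stated assumptions, not a description of any vendor's product: the source names, field names, records and the `federated_search` function are all hypothetical. The point is only that one shared concept is mapped to each source's local vocabulary, so the user asks a single question.

```python
# Hypothetical mapping: shared concept -> each source's local field name.
ONTOLOGY_MAP = {
    "location": {
        "maps_db": "place_name",
        "reports_db": "loc",
        "docs_index": "site",
    },
}

# Toy stand-ins for separately developed data sources.
SOURCES = {
    "maps_db": [{"place_name": "Cherry Hill", "grid": "18S"}],
    "reports_db": [
        {"loc": "Cherry Hill", "title": "Site survey"},
        {"loc": "Norfolk", "title": "Port report"},
    ],
    "docs_index": [{"site": "Norfolk", "doc": "memo-12"}],
}


def federated_search(concept, value):
    """Query every source for one shared concept, using each source's own field."""
    hits = []
    for source, field in ONTOLOGY_MAP[concept].items():
        for record in SOURCES[source]:
            if record.get(field) == value:
                hits.append((source, record))
    return hits
```

A call such as `federated_search("location", "Cherry Hill")` searches all three sources without the caller knowing that one stores the value under `place_name` and another under `loc`.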

Another area where semantic technologies will be heavily applied is in information-intensive applications where data must be accessed in context. Davis predicts that key tools will be composite applications and semantic search, collaboration and portals, which will support semantic web logs and online encyclopedias. He adds that the World Wide Web Consortium and the international semantic web development community are committed to changing the network. These groups are designing lightweight, user-friendly ontologies that can be distributed widely. As the ontologies proliferate, they also will be applied to individual computers for organizing user applications. “The notion of personal information management is becoming a semantic-level reality,” Davis says.

The World Bank conducted a study on enterprise search capabilities that concluded that semantic search technology is the best way to access information in the organization, Davis notes. Because the bank is a very large multinational entity, its users must search in the context of overriding policy positions and task-oriented needs. The report indicated that this data retrieval and association capability also would cross-reference the six official working languages spoken by World Bank personnel.

Semantic technologies will allow enterprisewide information searches to shift from data retrieval to discovery and intelligence gathering within a specific context. This work also involves communities such as robotics researchers developing machine intelligences that can ask questions and seek answers from their environment. Davis explains that a key research goal is to develop intelligent behaviors for computers.

The technology is changing organizational behavior. Data searches are more efficient if the system knows the details of the subject. Davis notes that this aspect of semantic systems will operate similarly to business intelligence and other corporate applications. This will lead to the development of semantic search and collaboration tools that can identify and link ontological models to seek information.

Additional areas that semantic technologies will influence include systems supporting complex work, automation systems and complex reasoning tools. As information technology systems become more elaborate, it becomes increasingly necessary to automate many functions, he says. New applications include virtual manufacturing, policy guidance applications for government systems and knowledge-enabled functionality.

Some of the most immediate applications involve executable knowledge and reasoning models. These systems will create a need for interoperable, executable knowledge that can be placed in modeling ontologies. The ultimate evolution of semantic technology will be in intelligent systems. Davis notes that DARPA is continuing its research into robust, adaptive, autonomous and autonomic system behaviors with the goal of developing systems that emulate human reasoning.

Data Translation System Emerges

A number of organizations are developing semantic web-based technologies to meet the U.S. Defense Department’s requirements for ontology reading systems. One example is the Lockheed Martin Company, which has created an ontology translation protocol called Ontrapro. According to Dr. Todd Hughes, technical manager at Lockheed Martin’s Advanced Technology Laboratories, Cherry Hill, New Jersey, Ontrapro is a prototype technology demonstrator that is being integrated into some of the firm’s experimental platforms.

Hughes notes that for semantic web systems to support warfighters and decision makers, it is necessary for different data representations, or ontologies, to understand each other. For example, a U.S. transportation database with references for trucks and buses would have difficulty communicating with a British database that refers to these vehicles as lorries and coaches. “The semantic web is supposed to provide the ontological description of these data sources, but since those ontologies reside locally in many cases, they’re going to have idiosyncratic naming conventions and taxonomies associated with them. So there needs to be some translation capability,” he says.
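The truck-and-lorry mismatch can be illustrated with a small sketch. It assumes, purely for illustration, that each database's terms have already been tied by hand to a shared concept identifier; the concept names and the `translate` function are hypothetical. This is not how Ontrapro works internally, which the article does not detail.

```python
# Each local vocabulary maps its own terms to a shared concept identifier.
# The concept names are hypothetical, invented for this example.
US_TERMS = {"truck": "CargoRoadVehicle", "bus": "PassengerRoadVehicle"}
UK_TERMS = {"lorry": "CargoRoadVehicle", "coach": "PassengerRoadVehicle"}


def translate(term, source_vocab, target_vocab):
    """Translate a term by way of the shared concept; return None if unmapped."""
    concept = source_vocab.get(term)
    if concept is None:
        return None
    for target_term, target_concept in target_vocab.items():
        if target_concept == concept:
            return target_term
    return None
```

With these tables, `translate("truck", US_TERMS, UK_TERMS)` yields `"lorry"`, and a term that neither side has mapped comes back as `None`. The hard part in practice is building the mappings themselves, and that is the step an automated translation capability would have to handle.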

Ontrapro is designed to be a key part of a larger toolkit for enabling rapid translation. It automatically examines both the taxonomies and ontologies of terms and systems to discover their relationships. Hughes explains that Ontrapro translates material in a matter of hours, automating a process that may take a software engineer several weeks to accomplish manually.

Lockheed Martin began work on Ontrapro in 2003 because it anticipated that the Defense Department would soon become interested in ontologies. At that time, this area of computer research was limited to studies by the Defense Advanced Research Projects Agency and the intelligence community. “Nobody was talking about these abstract, formal representations of data that could be used to integrate systems,” Hughes explains.

Now that the U.S. military is developing the technology, the Defense Department has identified data heterogeneity as an ontology mapping problem that must be solved, he says. But because many government systems do not have ontological descriptions of their data, it is difficult to gather information for testing, evaluation and research purposes. These challenges are helping the company to focus its research by proving the validity of theories about semantic data. “We are now trying to orient the technology toward the unique data representation problems associated with actual Defense Department systems,” he relates.



