Word-Spotting System Catches Wireless Data

September 2007
By Henry S. Kenyon

The WordSpotter identification system is designed to monitor live and recorded wireless communications for predetermined keywords. It can be set to identify words in up to 60 languages and to recognize individual speakers and their regional accents. Besides intelligence applications, the technology can be used to monitor friendly troops’ communications for security breaches and to verify that orders have been received and carried out.
Automated identification tool flags key terms in monitored communications for quick review.

A language analysis system is making it easier for intelligence organizations to identify and track suspicious conversations on military and civilian voice communications networks. The technology identifies keywords in a target language and also can be triggered by a specific regional accent.

Designed to monitor wireless voice communications, WordSpotter can recognize words in some 60 languages and flag conversations for human analysis. It can monitor both live and recorded speech and identifies four parameters: keywords, languages, accents and specific speakers. WordSpotter consists of a language-processing core, developed by Natural Speech Communication Limited, Rishon LeZion, Israel; and a radio noise reduction technology provided by Tadiran Communications, Kiriat Arieh, Israel.

WordSpotter evolved from commercial speech-recognition technologies designed to enhance and streamline the human-machine interface for automated telephone services, explains Shmuel Katz, Tadiran’s vice president for marketing. Katz notes that although the use of speech recognition for keyword identification has been discussed for years, it was the need for government security applications that finally initiated the technology’s development.

An important challenge faced by voice and word recognition systems used for surveillance is that they must constantly sift through massive amounts of data. Katz explains that the volume of information precludes efficient and timely human analysis. WordSpotter is designed to scan and prioritize voice communications so that the most significant conversations can be directed to analysts. “It can be used to highlight suspicious calls and radio transmissions in real time, unveil hidden trends and categorize calls according to keywords for later human listening. The Spotter allows fast, automatic monitoring of thousands of hours of audio streams taken from noisy radio channels to help human analysts focus on those transmissions that need immediate attention,” he says.

The system combines a hardware platform with a software engine. The platform is a digital signal processor (DSP)-based blade server with a peripheral component interconnect (PCI) interface that runs the word-spotting technology. Each blade can support tens of channels and contains the necessary processors and memory to conduct speech analysis. The software package resides on the blade server and contains the necessary language information to perform keyword recognition in a specific tongue.

The keyword recognition technology consists of software models representing a variety of common accents in a target language, including variations according to the speaker’s gender, age and environment. When the system is set to support a specific language, it will recognize specific words in that language regardless of the speaker. WordSpotter scans for words from a pre-defined list that can contain up to several hundred words. Katz says that it locates words by seeking the best possible match in a conversation for the word’s phonetic transcription.
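The idea of seeking the best match for a keyword's phonetic transcription can be illustrated with a minimal sketch. It is not WordSpotter's actual algorithm, which the article does not describe; it assumes the conversation has already been reduced to a sequence of phoneme symbols and uses plain edit distance, whereas a production system would more likely score acoustic models statistically. The function names, phoneme symbols and the max_cost threshold are all hypothetical.

```python
def best_match(keyword, stream):
    """Return (cost, end) for the stretch of `stream` closest to `keyword`.

    Approximate substring matching by edit distance: the match may start
    anywhere in the stream, so the first row of the table is all zeros.
    `end` is the number of stream phonemes consumed when the match ends.
    """
    prev = [0] * (len(stream) + 1)
    for i, kp in enumerate(keyword, start=1):
        curr = [i] + [0] * len(stream)
        for j, sp in enumerate(stream, start=1):
            curr[j] = min(
                prev[j - 1] + (kp != sp),  # match or substitution
                prev[j] + 1,               # keyword phoneme missing from stream
                curr[j - 1] + 1,           # extra phoneme in stream
            )
        prev = curr
    end = min(range(len(prev)), key=prev.__getitem__)
    return prev[end], end


def spot(word_list, stream, max_cost=1):
    """Report every listed keyword whose best match costs at most max_cost."""
    hits = {}
    for word, phones in word_list.items():
        cost, end = best_match(phones, stream)
        if cost <= max_cost:
            hits[word] = (cost, end)
    return hits


# Hypothetical word list: keyword -> phonetic transcription.
word_list = {
    "convoy": ["k", "ao", "n", "v", "oy"],
    "bridge": ["b", "r", "ih", "jh"],
}
clean = ["dh", "ax", "k", "ao", "n", "v", "oy", "l", "iy", "v", "z"]
noisy = ["dh", "ax", "k", "ao", "m", "v", "oy"]  # one phoneme misheard

clean_hits = spot(word_list, clean)
noisy_hits = spot(word_list, noisy)
```

Allowing a small nonzero cost lets a keyword still be flagged when one phoneme is misheard, which is one simple way to picture tolerance for accents and channel noise.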

WordSpotter can operate in a variety of environments. For example, Katz notes that it can be used to analyze and mine recorded conversations in call center databases, multimedia archives or live intercepted communications. The technology can extract information from sources such as ultrahigh frequency, very high frequency and high frequency radio channels; satellite telephone and cellular transmissions; and live microphone feeds.

The key difference between WordSpotter and other voice recognition technologies is that it is not designed to translate languages or to convert speech into text. Katz says that the system identifies only specific user-defined keywords from a conversation. Because it looks for specific words, he maintains that WordSpotter provides analysts with more focused data than other systems. The system also performs simultaneous real-time word identification across many channels.

Because it can be used to monitor wireless communications, the system features a noise reduction application designed to mitigate radio channel effects such as white or colored noise, multipath delays and Doppler shift and spread. Katz relates that reducing noise and distortion in a recording significantly increases the possibility that the system will identify a word from its list. Raw voice data is processed through the noise reduction tool before running through WordSpotter.

Intelligence gathering is a common application for the system because of its ability to filter thousands of hours of recorded audio data and flag the most relevant conversations for human analysts. Other potential uses include monitoring radio communications in real time. Katz explains that military applications can range from scanning soldiers’ wireless conversations to ensure that sensitive information is not being mentioned to monitoring communications during exercises to verify that orders are being followed. WordSpotter also can be used for public safety functions such as handling 911 calls, in which emergencies must be tracked in real time.

Katz says that WordSpotter is mainly intended for use in a centralized communications facility where large volumes of audio data can be gathered and studied by a team of human analysts. However, because the technology is embedded in the blade’s chip set, it can be used in tactical interception equipment for offline or online analysis. He notes that the system’s language models and keyword information can operate without any outside connectivity. The ability to function in a stand-alone environment permits analysts to work independently and flexibly in a variety of operational circumstances.

When the system is used across a large network, administrators can integrate the technology into the existing operating environment and tailor a dedicated-user interface for it. In small-scale implementations, a simple user interface is provided to run both offline and real-time word recognition for live conversations and recordings. The basic interface can be used to edit the word list and perform system evaluation and calibration functions.

One of the challenges of developing WordSpotter is that it must overcome obstacles that do not exist for speech recognition in human-machine interfaces. Katz says that the operating issues include providing an unlimited time window for the recognition process to work in; scanning spontaneous, unstructured speech that often includes words and terms from different languages; managing a high ratio of speech to keywords across a short time period; and addressing noisy environments and low-quality speech. He explains that overcoming these difficulties involves developing robust noise-filtering models and advanced statistical algorithms to identify word patterns.

WordSpotter is modular and can be expanded to meet increased operational needs. Users can add more word-spotting channels by adding more blade servers. Blades also can be added to existing platforms via a PCI interface. Katz notes that the system supports a variety of distributed architecture models. Each of the DSPs on the WordSpotter platform is independent of the others, so each can be configured separately to recognize a specific word list. He says that in theory, one blade could be configured to run word recognition tasks in 30 different languages, for 30 parallel calls, with 30 different word lists. “The end user can basically adjust the product, on a daily basis, to different operational scenarios and needs,” he says.
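The per-DSP independence Katz describes can be pictured with a short configuration sketch. This is an illustration only: the class names, fields and methods are hypothetical and do not represent the product's interface, and the mapping of one DSP to one call is an assumption drawn from the theoretical 30-language example rather than a published specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DspConfig:
    """Settings for one DSP: a target language and its keyword list."""
    language: str
    word_list: List[str]


@dataclass
class Blade:
    """A blade whose DSPs are each configured independently of the others."""
    num_dsps: int
    configs: Dict[int, DspConfig] = field(default_factory=dict)

    def configure(self, dsp_index, language, word_list):
        if not 0 <= dsp_index < self.num_dsps:
            raise IndexError(f"blade has no DSP {dsp_index}")
        # Replacing one DSP's settings leaves every other DSP untouched.
        self.configs[dsp_index] = DspConfig(language, list(word_list))


# Assumed layout: 30 DSPs, one call each, every DSP with its own
# language and word list (placeholder names, not real language codes).
blade = Blade(num_dsps=30)
for i in range(30):
    blade.configure(i, f"language-{i}", [f"keyword-{i}-a", f"keyword-{i}-b"])

# Daily retasking: change DSP 0 without disturbing the rest.
blade.configure(0, "language-new", ["checkpoint"])
```

Keeping each DSP's settings in a separate record is what makes day-to-day retasking cheap: one channel can be pointed at a new language or word list while the others keep running.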

In the future, additional capabilities will be added such as a tool to identify specific speakers biometrically and a language identification technology that will recognize a spoken language and automatically apply the appropriate spotting models to the call. Katz adds that all of the planned new features will run simultaneously on the same DSP-based platform to provide a comprehensive speech processing system.

WordSpotter was introduced in 2004 and has been used in several small deployments by intelligence agencies and public safety organizations around the world. Although Katz cannot name the product’s customers, citing security concerns, he notes that in July WordSpotter was selected for a multiyear project by a large European country’s ministry of defense.

Web Resources
Tadiran Communications: www.tadcomm.com
Natural Speech Communication: www.nscspeech.com

