Computers Converse Around Language Barriers

November 2006
By Rita Boland

Speech-to-speech translation software, such as SRI International’s IraqComm system, allows people speaking different languages to communicate without a human interpreter. IraqComm translates spoken English into spoken Iraqi Arabic and has undergone an investigative fielding in Iraq.
A technology undergoing investigation provides communications capabilities without human translators.

U.S. troops in Iraq are performing investigative fieldings of instant speech-to-speech translators as a result of efforts by several government organizations and private companies. The language barriers between U.S. forces and Iraqis inhibit training and routine operations. As Operation Iraqi Freedom continues, better communication between U.S. troops and Iraqi soldiers and civilians is becoming increasingly important.

The current search for a process to translate from English to Iraqi Arabic and vice versa began when Multinational Security Transition Command–Iraq (MNSTC-I) presented the U.S. Joint Forces Command (JFCOM), Norfolk, Virginia, with an urgent request for two-way speech capability. The JFCOM Capabilities Division, which provides resources to combatant commands when no program of record exists, found that the Defense Advanced Research Projects Agency (DARPA), Arlington, Virginia, had a program for developing speech-to-speech translation.

Officials at JFCOM queried personnel at DARPA to determine whether the translation technology was mature enough for evaluation in an operational environment. DARPA provided JFCOM staff with the results of the studies they had conducted so command personnel could decide which products best suited their need.

Wayne Richards, branch chief, JFCOM Capabilities Division, says he specifies Iraqi Arabic for the translations because most citizens in Iraq do not speak Modern Standard Arabic, which is used mainly for writing and by the highly educated. For U.S. forces to communicate effectively in Iraq, they need technology programmed specifically for Iraqi Arabic.

Based on DARPA’s evaluations of the translation products in its laboratories and limited field utility assessments in the United States, JFCOM identified three translators that showed promise. “We took those three systems into Iraq so that the users who initiated the urgent requirement could give us their opinions of the three devices,” Richards states.

JFCOM provided the systems to the Civilian Police Assistance Training Team, a branch of MNSTC-I, to test during the training they provide to Iraqi civil security forces. Performing the evaluations in this kind of environment has advantages. The evaluations are conducted in a benign environment with a targeted group of users who can study the device under controls that limit the chance of harm to soldiers. “We couldn’t think of a better place to put them,” Richards notes. In addition to training, JFCOM has an interest in using these translators in the medical field.

Of the systems they tested, training team members selected IraqComm, developed by SRI International, Menlo Park, California, for further evaluation. The assessments conducted at JFCOM also supported IraqComm as the best choice for more study.

Richards explains that IraqComm’s limited translation library is a good reason for testing the product in a training environment. The system is programmed to communicate specifically about training needs, and the same program is not yet robust enough to handle all situations. “The translations engines are not that mature yet where you can do free-flow conversational speech,” Richards says.

JFCOM personnel returned to Iraq in April 2006 with 32 IraqComm systems to continue investigative fielding. While troops work with IraqComm, the command is preparing to launch an investigative fielding of another product, the Multilingual Automatic Speech-to-Speech Translator (Mastor) created by IBM Research, Yorktown Heights, New York.

Richards emphasizes that the choice to explore these systems further does not mean that they will be selected for acquisition or that they will prove the best products in the future. In addition, field experiments do not constitute an endorsement by the command. JFCOM provides all the feedback it receives from the evaluations to DARPA so that agency can continue to steer improvements of the speech-to-speech translators. Results from the IraqComm fielding will be available to all companies, not just SRI International, because they can affect the development of all the systems. The U.S. Army Test and Evaluation Command is preparing a report on the fielding of IraqComm. “It’s a continual feedback process,” Richards states.

In addition to the 32 IraqComm systems brought to Iraq in April, JFCOM recently provided an additional 25 units to the 3rd Brigade Combat Team, 25th Infantry Division, stationed in theater. JFCOM also sent personnel to provide technical support. They can familiarize troops with the technology, which also comes with a training video installed on the laptop provided with the software and on DVDs. For its purposes, JFCOM ordered complete systems, which include software and all necessary components already loaded onto laptops. At the most basic level of instruction, troops receive a small laminated card with directions to help them begin.

Components of the IraqComm technology have undergone several decades of development, according to Kristin Precoda, director of the speech technology and research laboratory at SRI International. The company worked on a similar product to perform translation between English and Pashtu for medical personnel in Afghanistan. Adjusting the system for use in Iraq presented developers with the challenge of addressing the many dialects of Arabic. “With Iraqi Arabic one of our challenges is learning enough about the language to know what we need to handle,” Precoda states.

IraqComm incorporates three software technologies: automatic speech recognition, machine translation and text-to-speech synthesis. A user speaks into the microphone, and the system records the voice. The automatic speech recognition module processes the recording and displays the speech onscreen. The machine translation component translates the phrase into the target language, and the text-to-speech component produces an audible rendition. In addition, the translation can be viewed on the computer screen.
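The three-stage flow described above can be pictured as a simple chain of functions. The sketch below is only an illustration under stated assumptions: the function names, the `TranslationResult` type and the one-entry phrasebook are hypothetical stand-ins and do not reflect SRI's actual software, and the toy stages merely pass text through so the example can run without real speech components.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TranslationResult:
    recognized_text: str  # what the recognizer heard; shown onscreen
    translated_text: str  # target-language text; also shown onscreen
    audio: bytes          # synthesized speech for playback


def speech_to_speech(
    recording: bytes,
    recognize: Callable[[bytes], str],
    translate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> TranslationResult:
    """Chain the three stages: recognition, translation, synthesis."""
    text = recognize(recording)        # automatic speech recognition
    translated = translate(text)       # machine translation
    audio = synthesize(translated)     # text-to-speech synthesis
    return TranslationResult(text, translated, audio)


# Hypothetical toy stages. A real system would use trained models here.
PHRASEBOOK = {"open the gate": "<Iraqi Arabic for 'open the gate'>"}


def toy_recognize(recording: bytes) -> str:
    # Pretend the recording is already text; normalize it.
    return recording.decode("utf-8").strip().lower()


def toy_translate(text: str) -> str:
    # Limited, domain-specific lookup with a fallback for unknown phrases.
    return PHRASEBOOK.get(text, "<no translation available>")


def toy_synthesize(text: str) -> bytes:
    # Stand-in for audio: just the encoded text.
    return text.encode("utf-8")


result = speech_to_speech(
    b"Open the gate", toy_recognize, toy_translate, toy_synthesize
)
print(result.recognized_text)
print(result.translated_text)
```

Passing the stages in as functions keeps each one replaceable, which matches the article's description of three separate software technologies working in sequence.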

The U.S. Joint Forces Command has chosen two speech-to-speech translators for investigative fielding in Iraq. To use the technology, personnel speak into a microphone. The software translates from English to Iraqi Arabic and vice versa. The translations are spoken aloud and displayed on a computer screen.

To produce accurate translations, developers had to factor in variables such as accents and background sound. “Environmental noise is something that we have put a fair amount of effort into handling because we don’t know what the environment might be,” Precoda explains. The translator also supports a broad range of speaking styles. During demonstrations that Precoda has conducted in the United States, IraqComm has not demonstrated sensitivity to accents and has worked with men, women and non-native English speakers. In Iraq, the product has been used in the north, in Mosul and in other locations without operational problems.

SRI designed the system for tactical use, and Precoda states it can translate speech about topics in the areas of force protection and civil affairs and about some very basic medical issues. The software runs on various laptops, including the Panasonic Toughbook CF-18, which weighs about four pounds. Precoda’s vision is to install the software on personal digital assistants or other devices smaller than a laptop and better suited for nontraining scenarios. “It’s not a one-size-fits-all solution,” Precoda says. Other future considerations include making the program hands- and eyes-free and increasing battery life.

SRI also is looking at ways to make the product more accurate and more robust in different conditions and is researching the system’s capacity to translate other languages.

At IBM, where the Mastor program was developed, researchers have been studying speech translation for more than 30 years. Five years ago they directed their efforts toward speech-to-speech translation. The company focused on domain-specific translation, which has applications in many industries, including the military, health care and tourism. According to David Nahamoo, speech chief technology officer, IBM Research, “We built our technology using meaning as the interlingual.”

The company’s technology is designed to interpret the meaning of words and to reconstruct sentences based on that meaning. By using domain- or application-specific expressions, users can avoid the problems with idiomatic phrases that accompany many translations. Nahamoo explains that by using meaning to translate, the system does not need a perfect understanding of a language to produce an effective, accurate translation. This facet of the system would assist troops using the software to train or to communicate with the average Iraqi citizen on the street or at an entry gate.

Mastor operates in much the same way as IraqComm. A user speaks in English or Iraqi Arabic; the spoken word becomes text; the text is translated into the target language; and the software speaks the translated words. Mastor offers an additional user interface that provides alternative translation choices to the speaker in the speaker’s language. Users can select one of the other phrases if it better conveys their meaning, and the device will speak that option.
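The alternative-choice interface described above can be sketched as a list of candidates, each pairing a target-language translation with a rendering in the speaker's own language. This is a minimal illustration, not IBM's implementation: the `Candidate` type, the `pick_translation` function and the placeholder strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Candidate:
    paraphrase: str   # shown to the speaker in the speaker's language
    translation: str  # target-language text the device would speak


def pick_translation(candidates: Sequence[Candidate], selected: int = 0) -> str:
    """Return the translation to speak.

    Index 0 is the system's first choice; the speaker may select another
    candidate if its paraphrase conveys the intended meaning better.
    """
    if not candidates:
        raise ValueError("no translation candidates available")
    if not 0 <= selected < len(candidates):
        raise IndexError("selected candidate does not exist")
    return candidates[selected].translation


# Hypothetical candidates for one spoken English request.
options = [
    Candidate("Show me your identification.", "<translation A>"),
    Candidate("Do you have identification?", "<translation B>"),
]

# The speaker reads the paraphrases and prefers the second one.
to_speak = pick_translation(options, selected=1)
print(to_speak)
```

Showing the options in the speaker's language lets a user who cannot read the target language still check what the device is about to say.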

IBM is looking ahead to the possibility of selling this technology to the general public for purposes such as travel. Nahamoo describes Mastor in its current state as somewhere between prototype and a deployable system ready for use by several thousand people.

For now, the company offers it as a tailored product to interested parties. For JFCOM or other organizations within the U.S. Defense Department, Mastor comes as a hardware and software package that has been ruggedized. When IBM fields Mastor with JFCOM, the command’s personnel receive training on the product from IBM so they can support it in theater.

Nahamoo also mentions another possible delivery mode for the translator system—remote access. The technology would reside on a server, which people could access through their computers. In the future, telephone companies could offer it. A person in the United States could call someone in China, and without either party understanding the other’s language, they could communicate using speech-to-speech translation.

Both Precoda and Nahamoo stress that the thrust behind their products is a desire to improve communications among different people. They believe the results of improved communication will have a social impact and lead to better collaboration and understanding.

The automatic translation systems are not intended to replace human translators but instead to augment the small pool of interpreters available. Richards explains that because interpreters cannot be in all places at all times, machine translation tools give personnel with no foreign language training the capability to communicate with someone who does not speak their language.

Richards says JFCOM has demonstrated speech-to-speech translation to U.S. Secretary of State Dr. Condoleezza Rice and to the commander and deputy commander of the U.S. Central Command. Even so, Richards believes much remains to be accomplished in terms of improving speech-to-speech translators. “A lot of that work is happening because we have the opportunity to put these prototypes in the field in these controlled environments,” he states. Some of this work involves capturing conversations between U.S. and Iraqi forces and sending those recordings to the Defense Language Institute Foreign Language Center in Monterey, California, to build a data translation library.

Richards says that data in the library is government property, so when the technology is completely ready and provided to U.S. forces, the military will have the information it needs to use the software effectively. Product developers hope the research and effort invested in this capability will lead to better communication for everyone. “If you really look at it, the dream is about the ability to open up communications between people of multiple languages,” Nahamoo shares.



