Military personnel abroad converse with locals through trainable interpreting tool.
A Star Trek-like communications instrument promises to help penetrate the language barrier by providing automated near-real-time translations. The mobile, lightweight device, which is the size of a cellular telephone and can be clipped to a belt, will translate between English and Spanish, Korean, Chinese, Arabic, Albanian and Thai as well as major European languages.
By studying neural, perceptual and cognitive levels of language organization, researchers have created a device that translates in only a few seconds and can understand idioms, accents and words based on context. Because it can be constantly within reach and is more cost-effective than language training, this system could become the military’s translator of choice during missions, whether troops are on assignment in a remote tropical village or a busy Asian city.
Minneapolis-based ViA Incorporated was awarded the translator project contract as a part of the congressionally mandated Small Business Innovative Research program. Working with the Office of Naval Research (ONR), Arlington, Virginia, the company is nearing completion of the $750,000 second phase of the enterprise.
Robert Palmquist, ViA’s vice president of innovative technologies, shares that his interest in translation technologies partly stems from a tragedy in St. Paul, Minnesota. “I was reading a newspaper story and saw that there was a tragic fire where several Vietnamese perished,” he says. “There was a comment made that the firefighters weren’t able to communicate effectively. The Vietnamese took a right and walked into the fire, whereas if they had made a left they would have been able to exit the building. I asked why we couldn’t use ViA’s wearable computer to allow people to communicate in different languages. That was the seed of thought for making a voice-to-voice, hands-free translator.”
The translation device looks like the company’s wearable computer, ViA II, a platform that weighs 23 ounces, measures 9.75 inches long by 3.13 inches wide and can be worn on a belt. It is constructed primarily of commercial off-the-shelf products available from a variety of vendors and is battery powered. Transmeta Corporation designed the processing unit, which runs at 600 megahertz, and the device provides 128 megabytes of random access memory. “It has a lot of disk space,” Palmquist offers. “Twenty gigabytes is more than sufficient. That’s impressive for a 23-ounce platform worn on the body.”
When a user speaks into a small microphone, his or her voice is detected by the interpreter’s recognition software. A computing engine then converts the words to text, performs the translation process and outputs the audio version. The process takes approximately five seconds, but once the system starts speaking, the user can simultaneously enter the next phrase, a capability known as full duplex. The technology is bidirectional, so the user can capture the interviewee’s statements as well.
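The three stages described above (recognition, translation, audio output) can be pictured as a simple chain of functions. The sketch below is only an illustration under stated assumptions: the names `make_pipeline`, `fake_recognize` and `fake_synthesize` and the two-entry word lists are hypothetical stand-ins, not ViA’s software, and text encoded as bytes takes the place of real audio.

```python
from typing import Callable, Dict

# Hypothetical stage signatures; the real engines are not described in detail.
Recognizer = Callable[[bytes], str]   # audio -> recognized text
Translator = Callable[[str], str]     # source text -> target text
Synthesizer = Callable[[str], bytes]  # target text -> audio


def make_pipeline(recognize: Recognizer,
                  translate: Translator,
                  synthesize: Synthesizer) -> Callable[[bytes], bytes]:
    """Chain the three stages into one speech-to-speech function."""
    def run(audio: bytes) -> bytes:
        text = recognize(audio)        # 1. convert the spoken words to text
        translated = translate(text)   # 2. translate the text
        return synthesize(translated)  # 3. output the audio version
    return run


# Stand-ins for illustration only: "audio" is just UTF-8 encoded text.
def fake_recognize(audio: bytes) -> str:
    return audio.decode("utf-8").lower()


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


EN_ES: Dict[str, str] = {"hello": "hola", "thank you": "gracias"}
ES_EN: Dict[str, str] = {spanish: english for english, spanish in EN_ES.items()}

# Bidirectional use: one pipeline per direction of the conversation.
english_to_spanish = make_pipeline(fake_recognize,
                                   lambda t: EN_ES.get(t, t),
                                   fake_synthesize)
spanish_to_english = make_pipeline(fake_recognize,
                                   lambda t: ES_EN.get(t, t),
                                   fake_synthesize)
```

Keeping one pipeline per direction mirrors the bidirectional use described above. The sketch does not model full-duplex operation, which would require the output stage to run while the next phrase is being captured.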
While other translation machines exist, they are limited to a pre-established list of expressions. Joel Davis, program officer, Human Systems Science and Technology Department, ONR, explains that such devices are phrase-based and are usually limited to approximately 500 phrases. “This type of machine is smart enough to know which phrase you want but only within certain limits,” he says. Phrases are concentrated in specialized areas such as the medical and military fields, and users must learn pre-programmed phrases to guarantee a successful translation.
An alternative approach is continuous speech translation, also known as free speech. “It is much more flexible and less brittle,” Davis explains. ViA’s device is equipped with a standard base dictionary of 100,000 words. Stackable dictionaries such as personal or medical word lists augment the tool’s vocabulary.
A person could pick up a newspaper and start reading it with the continuous speech method, Palmquist says. “We are just trying to translate the gist of the sentence. We want to convey the meaning. The sentence may not be translated exactly the same, but you’re getting the meaning across, and we view that as success. When we go out to the trials and someone will say that a translation sounds funny or will give me a strange look, I will rephrase the question,” he adds. “I am able to communicate in foreign languages and do it effectively, but sometimes it takes two tries.”
The main advantage of the pre-programmed phrase technique is a 100 percent accurate translation. “This is because someone has sat down and figured it out ahead of time,” Palmquist shares. However, such a system is unlikely to understand a phrase that does not match a pre-programmed one.
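The trade-off of the phrase-based technique can be shown with a small lookup table. This is a minimal sketch, not any fielded product: `PHRASEBOOK`, `normalize` and `phrase_lookup` are hypothetical names, and the two Spanish entries are illustrative translations chosen for the example.

```python
from typing import Dict, Optional

# Hypothetical phrasebook; real phrase-based devices reportedly hold
# approximately 500 entries prepared ahead of time by human translators.
PHRASEBOOK: Dict[str, str] = {
    "have you seen any troops in the area": "¿Ha visto tropas en la zona?",
    "where does it hurt": "¿Dónde le duele?",
}


def normalize(utterance: str) -> str:
    """Lowercase, trim surrounding punctuation and collapse whitespace."""
    return " ".join(utterance.lower().strip(" ?!.").split())


def phrase_lookup(utterance: str) -> Optional[str]:
    """Return the stored translation, or None when no phrase matches."""
    return PHRASEBOOK.get(normalize(utterance))
```

A stored phrase such as “Where does it hurt?” comes back exactly as the translator wrote it, while a reworded question such as “Where is the pain?” finds no match and returns nothing, which is why users of such devices must learn the pre-programmed wording.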
With the free speech method, the interpreter takes into account the context of what is being said. It distinguishes between words that sound the same but have different meanings. If a speaker says, “Write your name on the right side of the paper,” the device is programmed to translate “write” and “right” correctly. Stackable dictionaries also assist the translation process. For example, a “click” to the military is a distance measurement, but to civilians, a click is a short sound. The person talking does not have to slavishly follow a list of provided phrases, Davis adds.
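One way to picture stackable dictionaries is as layered lookup tables in which a specialized word list is consulted before the standard base. The sketch below uses Python’s standard `collections.ChainMap` for the layering. It assumes nothing about how ViA’s device actually stores its vocabulary, and the word lists and the `stack` helper are hypothetical.

```python
from collections import ChainMap
from typing import Dict

# Hypothetical word lists; the meanings stand in for target-language entries.
BASE: Dict[str, str] = {
    "click": "short sound",
    "bridge": "structure spanning a gap",
}
MILITARY: Dict[str, str] = {
    "click": "distance measurement",
}


def stack(*dictionaries: Dict[str, str]) -> ChainMap:
    """Layer word lists so that earlier ones take priority over later ones."""
    return ChainMap(*dictionaries)


civilian = stack(BASE)            # base dictionary only
marine = stack(MILITARY, BASE)    # military list stacked on top of the base
```

With these layers, `marine["click"]` resolves to the military sense, `civilian["click"]` to the everyday one, and a word found only in the base, such as “bridge,” is still available to both.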
Trainability, which increases efficiency, is a key feature of the invention. Initial setup requires the assistance of a small, portable touch screen where profiles for interviewer and interviewee are entered. These include characteristics such as gender and age. If a speaker is male, for example, the device’s audio output will also be in a male voice. Setup can take between five and 30 minutes.
Users can further train the system to duplicate their own voices and accents as well as add new words and idiomatic expressions. The touch screen can then be put away and the system can function solely by voice activation, although the speaker also can choose to keep using the touch screen and watch a transcription as it is being processed. “We don’t want the operator to have to have a computer science degree to be able to run this system,” Palmquist says. “We’re making the system easy enough to use that a person doesn’t have to know what’s going on under the hood.”
There are limitations to the technology, Davis acknowledges. “If a Marine is in Korea and he comes across a Buddhist monk, he’s not going to be able to engage in a conversation about the transcendental quality of dharma,” he shares. “Conversation is going to be about, ‘Have you seen any troops in the area?’ or ‘Can this bridge support the weight of tanks?’”
Military staff from various specialty areas at Camp Lejeune, North Carolina, have tested the technology. “The people who were more interested in foreign language as an intelligence tool were less knocked out about this,” Davis says. An ideal device for the intelligence community would detect nuances of language and analyze voice patterns to detect innuendo or whether a speaker is lying. “The tool is not designed for that,” he admits. “Those people have to continue to rely on native speakers, and that can be tricky because you don’t know where their loyalties lie. Or you can rely on a cadre of warfighters who have gone through the long, involved and expensive language training at the language school in Monterey. Then they have to recycle themselves every few years to maintain their capabilities. This [new tool] is a cost efficient way to solve military problems face to face with foreign nationals.”
The military is looking at using the system for many of its operations in foreign countries, Palmquist notes. “It is very intimidating when a Marine carrying a gun comes up to a civilian and asks a question and the civilian can’t understand it,” he says. “If you could more easily communicate with that person, a lot of tension is relieved. There is a certain benefit when the military is able to communicate with the local populace.” Additional trials are scheduled in the Pacific, South America and Saudi Arabia this summer.
Soldiers of various ranks at Fort Polk, Louisiana, also tested the translator, and the preliminary evaluation has been positive. On a scale from -4, representing dissatisfied, to +4, meaning outstanding, the technology scored 3.5 for perceived potential, 3.0 for perceived operational effectiveness and 3.2 for potential utility. Speed of operation, reflecting the translation lag time of five seconds, received a score of 2.4.
Currently, the target audience of the project is the military, but civilian uses also abound. Border patrol officers, telephone operators, customs agents and airport personnel are all potential users. “Airport operations is a classic example,” Palmquist says. “You have people coming in from many different countries, and you have to communicate with someone who does not speak English. In order to get a human translator to the site, it takes time and money. To be able to use the machine translator is much easier.”
Next generations will see a continuous increase in capabilities, Davis predicts. “And this is just from a linguistics point of view,” he says. “I also think that as the hardware gets better and better, we will be able to do more interesting things with the translator, including faster translations.
“My interest in the next generation is to be able to do this wireless,” he adds. “As you can carry on a remote conversation, envision the interpreter as a cellular telephone operation. You’re speaking English and the person on the other end is speaking a different language. You have more leeway in the computational capabilities because there is a server that sits somewhere and you don’t have to worry about batteries or weight. Maybe even a supercomputer could be used.” The translator is scheduled to be on the market in late fall.