Innovations from the world’s largest software company deliver and dazzle.
The LucidTouch is a see-through touchscreen for mobile devices that enables users to interact with the information by touching the back rather than the front of the screen. As a result, users do not obscure the specific point they are trying to view on the screen.
Sights, sounds and searches will undergo vast improvements for computer users in the coming years as researchers’ imaginations and know-how take flight and new capabilities hit the marketplace. Novel data visualization tools will enable users to create photo compilations that produce three-dimensional virtual tours of a location. Communications devices will be embedded with arrays of microphones and speakers that craft sound bubbles. And tomorrow’s versions of today’s word processing software will lend a techno-helping hand by automatically searching out previously composed materials and making them available at the click of a mouse. In fact, plans for new man-machine interfaces may make even the mouse-click itself obsolete.
Researchers at Microsoft Corporation,
The company currently employs more than 800 researchers working in five centers located around the world; it is scheduled to open a sixth center in
The numbers are impressive on other levels, too. As much as 17 percent of the company’s total annual revenues are earmarked for research and development. Unlike other companies where individual project groups must fund research projects, Microsoft funds research at the corporate level.
Henrique Malvar, managing director and distinguished engineer, Microsoft Research Redmond, explains that there is a method behind what could be perceived as madness in Microsoft’s approach to research. “In simple terms, we’re hedging our bets because we don’t know where the next super crazy cool idea will come up. So we cast in many, many different areas: in computer interfaces, databases, machine translation, natural language processing, signal processing computations, wireless networking, networking with wires, sensor networks,” he says.
In addition to being diversified in subject matter, research is spread out in terms of risk factors. While some of the research will result in products in the short term—some almost immediately, others during the next two years—others are considered more medium-term and will not be available for three to five years. Still other projects are viewed as long-term possibilities that may take six, eight or even 10 years to move from the drawing board into production. “And a few are completely crazy, and we have no idea [when they will be ready]. Some are super high-risk, and if they ever go somewhere we’ll say ‘Wow.’ Those are some of the more secretive ones as you can imagine,” Malvar says.
Research projects range from capabilities that solve current problems to those that aim more at enhancing users’ experiences with computers—and, in some cases, both. One pervasive problem that is being addressed through new visualization techniques is data overload. “We are all drowning in data, and it’s getting incredibly worse,” Malvar states.
Most people do not even know what is on their own hard drives, even though they themselves put it there, Malvar points out. They may be storing literally thousands or sometimes tens of thousands of pictures, for example, but they run into trouble when they want to find and view certain photos. “How do you actually see your pictures if you have 20,000 of them? You need new kinds of interfaces, new kinds of visualization techniques,” he proposes.
Microsoft researchers have been working on several projects that solve this problem in interesting ways. One solution is the WorldWide Telescope (WWT), which is a visualization environment that works like a virtual telescope by assembling imagery from ground- and space-based telescopes located throughout the world. Using new interfaces, users can then pan and zoom, and a rich navigation infrastructure allows them to add personal comments that can be shared with others.
“Or you can literally create tours. We actually have a nice example of a tour that was created by a six-year-old boy about things that he likes about the universe. And you look at something like that and you say, ‘Wow, if a little kid can do something like that, then even the professionals can do it!’ Even for the astronomers who know this stuff, just the fact that they can see it all at once in front of them and have this whole context surrounding them as they navigate is really powerful,” Malvar shares.
HD View is another data visualization solution Microsoft researchers have developed. Still in the beta stage, HD View enables users to combine many pictures and create a gigantic picture with a great deal of information. “HD View allows you to pick up let’s say a 4 gigapixel to 5 gigapixel picture instead of a megapixel picture. So you now have pictures with billions of pixels in them,” Malvar explains.
Because display technology that would allow users to see billions of pixels simultaneously has not yet been created, HD View provides panning and zooming information, he adds. And when the photographer has taken a host of pictures of the same area from different angles, HD View combines them automatically to create a panoramic picture that includes a multitude of details even when the photos were taken in no particular sequence.
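To illustrate why panning and zooming make a multi-gigapixel picture practical to view, here is a minimal sketch of the arithmetic behind a tiled image pyramid, a common approach for very large images. The article does not describe HD View's internals, so the 256-pixel tile size, the halving scheme and the function names are all assumptions chosen for illustration.

```python
import math

def pyramid_levels(width, height, tile=256):
    """Count zoom levels, halving the image until it fits in one tile (assumed scheme)."""
    levels = 1
    while width > tile or height > tile:
        width, height = math.ceil(width / 2), math.ceil(height / 2)
        levels += 1
    return levels

def tiles_for_view(screen_w, screen_h, tile=256):
    """Upper bound on tiles needed to cover the screen at any single zoom level."""
    # One extra row and column allows for a view that straddles tile boundaries.
    return (math.ceil(screen_w / tile) + 1) * (math.ceil(screen_h / tile) + 1)

# A hypothetical 4-gigapixel picture: 80,000 x 50,000 pixels.
levels = pyramid_levels(80_000, 50_000)
tiles = tiles_for_view(1920, 1080)
print(levels, tiles)
```

Under these assumptions, the viewer never needs more than a few dozen small tiles at a time, regardless of how many billions of pixels the full picture contains.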
Taking this technology a step further, the company’s researchers created Photosynth, which will enable the melding not only of an individual’s photographs but also of photographs taken by several individuals. Using Photosynth, the user can pick out randomly taken photographs, and the software extracts the structure of the picture. As a result, a three-dimensional (3-D) picture is created that allows viewers to navigate through a location. If, for example, someone took a photograph of a small café down the street from a well-known tourist attraction, a user can navigate through the 3-D picture to take a closer look.
Malvar says that initial versions of these capabilities are already being shipped and improved versions are considered a short-term project. However, in the long term, the next research step will be to match images from different kinds of bird’s-eye views. This technology would enable users to navigate fully in three dimensions all the way down to street level. “That’s very difficult to do, so it may take many years until we can do it in a completely smooth way, but it will happen. You will be able to actually sit on a virtual helicopter and fly around and see things,” Malvar predicts.
The capability may even change shopping experiences, he adds, as people are able to take pictures of a product and then tap into the power of the Web to deliver information about it. “But that’s going to take a while until we put all of these pieces together,” Malvar says.
In addition to enhancing computer users’ viewing experiences, Microsoft researchers are creating capabilities that will enrich their communication experiences as well. By employing an array of three or four microphones in computer equipment, users would be able to participate in meetings over the Web even while sitting in a noisy location—an airport lounge, for example—without other meeting participants being distracted by the background environment.
“So we’re trying to work with, for example, mobile phone companies including and especially those that use our Windows Mobile Systems to make sure that instead of having one little microphone in a telephone we have two or three to allow those technologies,” Malvar explains.
This capability also could advance progress in other technologies. For example, better voice capture would improve the accuracy of voice recognition software and allow people to conduct Internet searches on their cell phones without using the keypad. Malvar believes the improvements in audio-capturing technologies will continue to progress over time.
The WorldWide Telescope pulls together astronomical imagery from ground- and space-based telescopes and presents it on one screen. Users can then pan and zoom to virtually explore the universe.
“The next thing would be to use the same speaker array to cancel out your own voice, so that as you’re speaking we attenuate your voice, and the people near you can’t hear it. That we haven’t figured out yet, but it’s in the plans,” he says.
This technology is considered long-term because the goal is to determine a way someone’s computer could “hear and respond” to its owner’s voice. “Once we figure that out, then we could really provide let’s say a cubicle with full live voice environments without disturbing others. It’s almost as if we have virtual sound walls between people. We’re not quite there yet, but the part where we localize the playback sound we can do already and surprisingly well,” Malvar maintains.
But seeing new sights and hearing sounds better are not very useful if the ability to communicate does not improve as well. That’s why Microsoft researchers also have been working to perfect language processing. Some of the technology already is available through Windows Live, although Malvar admits the quality is not very good. However, he predicts that within a few years the advances made in this area will impress users.
In what Malvar calls a different but related area, scientists are working on ways to assist anyone who writes, whether it is a proposal, a grant request or, yes, even a magazine article. In addition to recommending better ways to write an entire paragraph, the new Microsoft Word could make locating previously written material a breeze. Researchers are exploring ways for a software program to point writers to materials they have written before on the same topic. This capability would facilitate locating documents and presentations scattered throughout a hard drive as well as on the Web, so a writer could refer back to a piece or use part of it again.
“The idea is that we would basically look at the content of what you’re writing and try to correlate that on-the-fly. Today, you actually have to do a search. In the future, we would implicitly find it for you, for both Word documents and PowerPoint presentations. That could be really powerful,” Malvar relates.
The goal would be to introduce this feature in a scalable fashion. The first step would be finding and pointing to documents on a computer; the next, locating and delivering documents within an intranet; and the final step, connecting to similar material on the Internet. Malvar categorizes this as a “not really near-term” capability.
Related to this capability would be the ability to cluster related documents so users could ask more high-level questions. With the knowledge that documents on a certain topic exist, a user would be able to ask a general question and the computer would deliver all the relevant items. “So you want to think of that in fuzzy terms and the computer will be able to find it for you because it clusters the documents in the first place. It could show you documents that may be related to some key words that it picked up as well as other documents that the computer determined are related to the first document, even if they don’t contain the terms you’re searching for,” Malvar explains.
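The idea of surfacing documents that are related to a keyword hit, even when they lack the search term, can be sketched with a simple word-overlap measure. This is only an illustration under stated assumptions: the cosine-similarity method, the 0.2 threshold, the function names and the sample documents are hypothetical and are not taken from Microsoft's research.

```python
import math
import re
from collections import Counter

def word_counts(text):
    """Lowercase bag-of-words vector for a piece of text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def find_related(query, docs, threshold=0.2):
    """Return direct keyword hits, plus documents similar to those hits."""
    vectors = {name: word_counts(text) for name, text in docs.items()}
    q = word_counts(query)
    hits = [n for n, v in vectors.items() if any(w in v for w in q)]
    related = [
        n for n, v in vectors.items()
        if n not in hits and any(cosine(v, vectors[h]) >= threshold for h in hits)
    ]
    return hits, related

# Hypothetical documents on a hard drive.
docs = {
    "proposal.doc": "sensor network grant proposal budget and deployment plan",
    "slides.ppt": "deployment plan and budget for humidity and temperature nodes",
    "recipe.doc": "bread flour yeast water salt",
}
hits, related = find_related("sensor", docs)
print(hits, related)
```

In this toy example the presentation never mentions the word "sensor," yet it is returned as related because it shares vocabulary with the proposal that does.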
The Microsoft research team already has made extensive progress in the capabilities of its Live Search product. While admitting that its competitors were doing a better job in the Web search arena than Microsoft, Malvar relates that the improvements users see today are the direct result of one researcher who suggested that searches should be data-driven. The scientist proposed asking several people to judge the content of Web pages, then employing a neural network that would learn from that information to compute relevance.
“At first we had doubts about whether that would scale to the whole Web, but it did. Today, we run our relevance calculations based on neural network technology, and the cool thing is that once it was clear that it would scale, the product team said ‘Okay, we want to use this now.’ We worked really hard on architecting it, and in less than a year, they have the new system, and the quality of our results has now skyrocketed,” Malvar says.
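The data-driven approach described above, in which people judge pages and a model learns relevance from those judgments, can be sketched at toy scale. The example below trains a single logistic unit, the simplest possible stand-in for a neural network. The two features, the judged examples and all function names are hypothetical, and nothing here reflects the actual Live Search system.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=2000, lr=0.5):
    """Fit one logistic unit to (features, judged_relevant) pairs by gradient descent."""
    n = len(examples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        for features, label in examples:
            pred = sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)
            err = pred - label  # gradient of the log loss with respect to the pre-activation
            weights = [w - lr * err * x for w, x in zip(weights, features)]
            bias -= lr * err
    return weights, bias

def score(model, features):
    """Estimated probability that a page is relevant."""
    weights, bias = model
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)

# Hypothetical features: (share of query terms in title, share of query terms in body).
# Labels are made-up human judgments: 1 = relevant, 0 = not relevant.
judged = [
    ((1.0, 0.9), 1), ((0.8, 0.7), 1), ((0.9, 0.4), 1),
    ((0.1, 0.3), 0), ((0.0, 0.1), 0), ((0.2, 0.0), 0),
]
model = train(judged)
print(score(model, (0.9, 0.8)) > score(model, (0.05, 0.1)))
```

Once trained on the judged examples, the model can score pages no person has rated, which is what would allow such an approach to extend beyond the hand-labeled set.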
These capabilities lead to other ideas Microsoft scientists are exploring that are “somewhat long term,” such as the Semantic Web. They involve bringing structure to information on the Web and delivering a richer experience to searchers. For example, when users go to Live Search to look for a product, they would get not only a list of Web pages that relate to their search term but also a picture of the product as well as ratings based on owner reviews. “Rather than individual reviews, what we would show for a review is an aggregated rating because we’re able to dig into the sites and identify an opinion. We would have this aggregated opinion to present to you, and that’s potentially more powerful,” Malvar says. This capability is “not quite working 100 percent” yet because it is difficult to evaluate an opinion since sometimes negative words are used to describe a positive attribute of a product, he adds.
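The difficulty Malvar describes, where negative words are used to praise a product, shows up quickly in a naive word-list approach to aggregating opinions. The sketch below is a deliberately simplistic illustration; the word lists, scoring rule and sample reviews are invented for this example and do not represent Microsoft's method.

```python
# Hypothetical word lists for a naive opinion scorer.
POSITIVE = {"great", "love", "excellent", "sharp"}
NEGATIVE = {"bad", "poor", "killer", "insane", "wicked"}

def review_score(text):
    """Return +1, 0 or -1 depending on which word list is matched more often."""
    words = text.lower().replace(",", " ").replace(".", " ").split()
    balance = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return (balance > 0) - (balance < 0)

def aggregate(reviews):
    """Average the per-review scores into one rating between -1 and 1."""
    scores = [review_score(r) for r in reviews]
    return sum(scores) / len(scores)

reviews = [
    "Great screen, love the battery.",
    "Poor build quality.",
    "The camera is wicked, insane zoom.",  # praise written with 'negative' words
]
print([review_score(r) for r in reviews], aggregate(reviews))
```

A human reader would count two favorable reviews out of three, but the word-list scorer misreads the third review as negative and drags the aggregated rating below zero.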
Just because Microsoft is not involved in hardware development or manufacturing does not mean that the company does not explore technologies that involve new devices. One area that Malvar classifies as “a bit more speculative” is sensor networks. “The idea is that we should not only look at the technologies’ algorithms for manipulating information, but we also should go down all the way to how they actually capture the information. And we have, for example, arrays of somewhat standardized sensors that measure things like position, temperature, humidity. The question is: can we make a platform out of those sensors such that it becomes easy to deploy hundreds of them and then there are layers, almost like an operating system for sensors,” he shares.
This idea has its problems, Malvar admits. Among the issues that would need to be resolved are how to design low-cost sensors and then how to process and visualize massive amounts of data. “That’s one particular area where I would say the military is ahead of us. But we look at it and say we could go beyond military applications and really look at commercial applications,” he says. One possible use would be embedding these sensors in vehicles to gather super-fine details about traffic situations, he proposes.
In terms of handheld equipment, the researchers are exploring new ways of coupling devices that could synchronize display screens or facilitate connections between appliances. For the former, Microsoft scientists are interested in developing a way to conjoin small cell phone screens to create a larger one so that, for example, a more comprehensive picture of a map can be viewed.
In the case of the latter, the researchers are reviewing one proposal to develop a “shaking protocol.” Malvar explains that if this protocol could be developed, owners of multiple devices would simply shake one device, such as a Bluetooth headset, next to another one, such as a cell phone, to establish a connection. This technique would be much simpler than the current process, which requires owners to read through pages of instruction books. In addition, it would improve security because it could replace easily guessed passwords, he notes.
Microsoft Corporation: www.microsoft.com
WorldWide Telescope: http://research.microsoft.com/projects/wwt
HD View: http://research.microsoft.com/ivm/hdview.htm
Photosynth demonstration: http://labs.live.com/photosynth
Windows Live Translator: http://translator.live.com/?FORM=LIVSOP&mkt=en-US
Windows Live Search: www.live.com