Clicking through the menu on your iPod demands a significant amount of visual attention, which can be a hassle (while jogging) and even dangerous (while driving). But engineers at the University of Toronto and Microsoft Research are working on software that could make it possible to navigate the menus of gadgets that use circular touch pads, like the iPod, without looking at them--only audio cues would be used.

The researchers have designed an auditory menu technique--called earPod--that provides audio feedback when a person drags his or her finger around the touch pad. Although it's not ready to replace the expansive menus on real iPods, the results are encouraging, says Patrick Baudisch, a research scientist at Microsoft Research, in Seattle, who worked on the project. Within 30 minutes of beginning to use the technology, people can navigate two levels of earPod menus faster than traditional visual menus, and just as accurately.

"Requiring constant visual attention while using a PC is reasonable," says Baudisch, "but if you're using an iPod on the road, [constant visual attention] is unreasonable." In addition to giving people back their eyes, he says, audio menus could help gadgets save battery life by not wasting energy on a screen, and they could add functions to the screen-free devices such as the iPod shuffle.

The idea of using audio menus isn't new. Auditory interfaces can, after all, be found in touch-tone phone menus and in various assistive technologies for visually impaired users. But historically, handheld consumer gadgets haven't widely used audio menus. There are a few reasons for this, says Bruce Walker, a professor in the school of psychology and college of computing at the Georgia Institute of Technology. One reason, he says, is that audio hardware and software have been resource intensive, requiring significant amounts of computation and energy. In addition, audio software has been difficult to program.

But computing power is becoming cheaper, and there is an increasing need to find different ways to interact with handheld devices, says Walker. Within the past 10 years, he says, the ubiquity of mobile devices with small displays "has made us all visually impaired." Currently there are only a handful of researchers who are systematically looking at ways to make better audio interfaces for various devices, but Walker expects the ranks to grow in the coming years.

This first earPod prototype has a two-level menu hierarchy with eight items per category, for a total of 64 items. To test how well people use the system, the researchers assigned to the first menu level a random assortment of categories: "clothing," "fish," "instrument," "color," and four others. The next level contained eight examples of each category. On an iPod, the analogous structure is the opening menu, which includes "music," "extras," and "settings," followed by lower menus that include "playlists," "artists," and "albums," for instance. The earPod approach could be extended to read off a limited number of names of artists and songs as well.

EarPod was designed specifically for gadgets with circular touch pads, says Baudisch. The circular touch pad is evenly divided into eight sectors: it's cut like pieces of a pie, with one menu item assigned to each piece. When a person touches the dial of an earPod-equipped gadget, the audio menu responds with a prerecorded human voice. If a person puts his or her finger at 12 o'clock on the touch pad, the voice might say "Color," indicating that the finger is on the color sector. When the finger crosses one of the invisible boundaries between sectors, the user hears a clicking sound, and the menu item for the new sector is announced. To select an item and go to the next menu level, the user lifts his or her finger and hears a "camera-shutter" sound, which indicates that an item has been chosen.
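The interaction described above can be sketched in a few lines of Python. This is an illustration only, not the researchers' software: the function and class names are hypothetical, the placement of "color" at 12 o'clock follows the example in the text, and the positions of the other named categories and the four placeholder labels are assumptions. A list of strings stands in for the audio output.

```python
import math

# First-level labels. Only the first four names come from the study; their
# order around the dial and the four placeholders are assumptions.
CATEGORIES = ["color", "clothing", "fish", "instrument",
              "item 5", "item 6", "item 7", "item 8"]


def sector_for_touch(x, y, sectors=8):
    """Map a touch point to a sector index (pad center at 0,0, y pointing up).

    Assumption: sector 0 is centered on 12 o'clock and indices run clockwise.
    """
    # atan2(x, y) gives the angle measured clockwise from 12 o'clock.
    angle = math.degrees(math.atan2(x, y)) % 360.0
    width = 360.0 / sectors
    # Shift by half a sector so 12 o'clock sits in the middle of sector 0.
    return int(((angle + width / 2) % 360.0) // width)


class MenuLevel:
    """One ring of menu items; `events` stands in for the audio output."""

    def __init__(self, items):
        self.items = items
        self.current = None  # sector under the finger, or None when lifted
        self.events = []

    def touch(self, x, y):
        sector = sector_for_touch(x, y, len(self.items))
        if sector == self.current:
            return
        if self.current is not None:
            self.events.append("click")  # finger crossed into a new sector
        self.current = sector
        self.events.append("say " + self.items[sector])

    def lift(self):
        if self.current is None:
            return None
        chosen = self.items[self.current]
        self.events.append("shutter")  # lifting the finger selects the item
        self.current = None
        return chosen


level = MenuLevel(CATEGORIES)
level.touch(0.0, 1.0)   # finger lands at 12 o'clock
level.touch(1.0, 0.0)   # finger is next sampled at 3 o'clock
choice = level.lift()   # lifting selects whatever is under the finger
```

In this run the simulated audio log is "say color", "click", "say fish", "shutter", and the selected item is "fish". A real device would sample the finger far more often, so a slide from 12 o'clock to 3 o'clock would also pass through the sector in between.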

Because the touch pad is divided into portions, says Baudisch, people can easily learn where menu items are and quickly jump to certain items without having to scroll through a list, as with an iPod. Another feature of earPod, he says, is that a user doesn't need to wait until a menu item is read before moving on to another. When a finger moves to a new sector, the audio is interrupted and the new item is announced.
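The interruption behavior can be illustrated with a small sketch. Nothing here reflects how earPod actually plays sound: the class name, the 0.6-second clip length, and the use of timestamps in place of real audio playback are all assumptions made for the example.

```python
class Announcer:
    """Tracks which spoken label is playing; all times are in seconds."""

    def __init__(self, clip_length=0.6):
        self.clip_length = clip_length  # assumed length of each recording
        self.label = None
        self.ends_at = 0.0
        self.log = []

    def announce(self, label, now):
        # If the previous recording has not finished, cut it short
        # instead of making the user wait for it.
        if self.label is not None and now < self.ends_at:
            self.log.append("cut off " + self.label)
        self.label = label
        self.ends_at = now + self.clip_length
        self.log.append("start " + label)


voice = Announcer()
voice.announce("color", now=0.0)
voice.announce("clothing", now=0.2)  # finger moved before "color" finished
voice.announce("fish", now=1.0)      # "clothing" had already finished
```

Here the log reads "start color", "cut off color", "start clothing", "start fish": only the first announcement is interrupted, because the finger left its sector before the recording ended.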

In the earPod usability study, led by Shengdong Zhao, a doctoral student at the University of Toronto and the project lead, the researchers found that people who had no experience using either an iPod or an earPod-equipped device used the two with comparable accuracy. EarPod was 92.1 percent accurate, while the visual system was 93.9 percent accurate, a difference that was not statistically significant. It took people longer to grow accustomed to earPod, but with experience, users' performance on the audio menu became faster. After 30 minutes of training on both devices, subjects could navigate two levels of menu with earPod in 2.1 seconds, as opposed to 2.5 seconds with the visual menu.

Georgia Tech's Walker is impressed with the earPod approach and results. "My overall impression is that this is great ... It was inevitable: trying to look at how to take an interface that is purely visual on the iPod and turn it into an interface that's purely auditory, because, after all, the iPod's an auditory device. Why should a person have to pull their player out while they're jogging to look at it?"

Currently, however, earPod could not be a complete replacement for an iPod menu, Walker notes. One reason is that earPod doesn't lend itself to menu flexibility. Once a person learns the position of the menu items, he or she might become frustrated if those positions need to change due to a software update or added playlist. In particular, the approach would not work well for menus such as mobile-phone address books, Walker says.

In addition, adds Baudisch, because the circular track pad is divided into sectors, there are a limited number of menu items that a person can access. If there are 8 sectors, each with 8 menu items, then there are only 64 total items accessible on the device, and this wouldn't be good enough for iPods that hold hundreds of playlists and thousands of songs. However, Baudisch suspects that future prototypes will provide ways to get around the problem. He and his team are exploring how people respond to faster audio output (speeding up the recorded voice) and how people use audio and visual cues simultaneously. Developing an all-encompassing interface for eyes-free operations on auditory devices is still a future project, he says.
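The arithmetic behind that limit is easy to check. The sketch below is a back-of-the-envelope calculation, not a design the researchers propose: the function names are hypothetical, and the 5,000-song library is an assumed figure chosen only to stand in for "thousands of songs."

```python
def menu_capacity(sectors, levels):
    """Number of items reachable when every level offers `sectors` choices."""
    return sectors ** levels


def levels_needed(items, sectors):
    """Smallest number of menu levels whose capacity covers `items`."""
    levels = 0
    capacity = 1
    while capacity < items:
        capacity *= sectors
        levels += 1
    return levels


prototype_items = menu_capacity(8, 2)      # 8 sectors, 2 levels -> 64
deep_menu_levels = levels_needed(5000, 8)  # assumed 5,000-song library -> 5
```

With eight sectors, four levels cover only 4,096 items, so a library of that assumed size would take five selections per song if the menu were simply made deeper.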