earPod Homepage


The visual modality has long dominated research in human-computer interaction. However, interest in applications for mobile users and for users with visual disabilities is leading to the development of audio-based, eyes-free interfaces.

Designing mobile audio eyes-free interfaces is challenging. Speech is serial and slow, which can make audio-based interaction techniques painful to use. In addition, mobile applications are often used in multitasking scenarios in dynamic environments, where users distribute physical or mental resources among multiple, possibly concurrent, tasks. This contrasts with the static eyes-free interfaces typically designed for the visually impaired, where users can operate the device in a stable environment with dedicated attention.

earPod Design


Figure 1: Using earPod. (a, b) Sliding the thumb on the circular touchpad allows discovery of menu items; (c) the desired item is selected by lifting the thumb; (d) faster finger motions cause partial playback of audio.

The earPod project investigates the design of a common interface component, hierarchical menus, in mobile eyes-free scenarios. Both the input and the output are designed for this setting, and the resulting prototype is referred to as earPod.

The earPod technique is designed for an auditory device controlled by a circular touchpad whose output is experienced via a headset, as is found, for example, on an Apple iPod. Figure 3 shows how the touchpad area is functionally divided into an inner disc and an outer track called the dial. The dial is divided evenly into sectors, similar to a Pie or Marking Menu.
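The division of the touchpad into an inner disc and evenly sized dial sectors can be sketched as a small function. This is only an illustration, not the earPod implementation: the function name `locate_touch`, the radii, the coordinate convention (origin at the pad center, y pointing up), and the choice to start sector 0 at 12 o'clock and count clockwise are all assumptions made for the example.

```python
import math

def locate_touch(x, y, num_items, inner_radius=0.4, outer_radius=1.0):
    """Map a touch point to a touchpad region.

    Returns "center" for the inner disc, a sector index in
    0..num_items-1 for the dial, or None if the point is off the pad.
    Radii and sector orientation are assumed values for illustration.
    """
    r = math.hypot(x, y)
    if r > outer_radius:
        return None
    if r < inner_radius:
        return "center"
    # Angle measured clockwise from 12 o'clock, in [0, 2*pi).
    angle = math.atan2(x, y) % (2 * math.pi)
    sector_width = 2 * math.pi / num_items
    return int(angle / sector_width) % num_items
```

With eight items, for example, `locate_touch(0.3, 0.7, 8)` falls in the first sector to the right of 12 o'clock, and any point close to the origin is reported as `"center"`.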

Our technique is illustrated in Figure 1. When a user touches the dial, the audio menu responds by saying the name of the menu item located under the finger (Figure 1a). Users may continue to press their finger on the touch surface, or initiate an exploratory gesture on the dial (Figure 1b). Whenever the finger enters a new sector on the dial, playback of the previous menu item is aborted. Boundary crossing is reinforced by a click sound, after which the new menu item is played. Once a desired menu item has been reached, users select it by lifting the operating finger, which is confirmed by a “camera-shutter” sound (Figure 1c). Users can abort item selections by moving their finger to the center of the touchpad and releasing it. If a selected item has submenus, users repeat the above process to drill down the hierarchy, until they reach a desired leaf item. Users can skip items rapidly using fast dialing gestures (Figure 1d). earPod is designed to allow fast expert usage. As users gain knowledge of the menu configuration through practice, they tend to use brief corrective gestures (Figure 1b) instead of large exploratory ones (Figure 1d). Eventually, as users remember the exact locations of desired menu items, they select these items by directly tapping on them.
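The interaction described above can be summarized as a small event-driven model. The sketch below is a simplified, single-level illustration and not the actual earPod code: the class name `EarPodMenu`, its methods, and the use of a text log in place of real audio playback are hypothetical. It also assumes that the click sound is played whenever the finger moves into a sector from another touched region.

```python
class EarPodMenu:
    """Toy model of one menu level; audio actions are recorded as strings."""

    def __init__(self, items):
        self.items = items    # item names, one per dial sector
        self.current = None   # region under the finger: None, "center", or a sector index
        self.log = []         # audio actions a real device would play

    def touch(self, region):
        """Finger is down on `region` (a sector index or "center")."""
        if region == self.current:
            return  # same region: let the current playback continue
        if isinstance(self.current, int):
            self.log.append("abort playback")  # stop the previous item's name
        if self.current is not None and isinstance(region, int):
            self.log.append("click")           # reinforce the boundary crossing
        self.current = region
        if isinstance(region, int):
            self.log.append("say " + self.items[region])

    def lift(self):
        """Finger is lifted: select the item under it, or cancel over the center."""
        region, self.current = self.current, None
        if isinstance(region, int):
            self.log.append("shutter")         # selection confirmation sound
            return self.items[region]
        return None                            # lifted over the center: no selection
```

For example, touching sector 0, sliding to sector 1, and lifting selects the second item, while sliding to `"center"` before lifting returns `None`. Submenus would repeat the same process with a new item list for the selected entry.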

earPod is not a simple replacement of written words with audible speech. It is designed to be fast and easy to learn. Speech is serial and slow, so using it for user feedback can tremendously slow down interaction. earPod uses a number of strategies to overcome these problems.


The earPod prototype is evaluated first against the iPod interface and then against a fuller set of competing techniques that cover dual vs. single modality presentations, audio vs. visual modalities, and absolute vs. relative mappings.

Our first user study indicates that earPod is efficient to use and relatively easy to learn. For static menus of reasonable sizes, earPod is comparable in both speed and accuracy with an iPod-like visual menu selection technique. Although initially slower, earPod outperformed the visual technique within 30 minutes of training (Figure 2).

Figure 2: Selection times for the two techniques by number of menu levels and time period (1 time period = 5 contiguous blocks of trials).

Ongoing & Future Work

We are currently looking into comparing earPod with the iPod in a driving scenario, and into potential improvements to Interactive Voice Response (IVR) systems using the earPod approach.

Shengdong Zhao: Ph.D. student, Computer Science Department, University of Toronto, Canada (earPod is my Ph.D. thesis research)
Pierre Dragicevic: Research Scientist, INRIA, France
Mark Chignell: Professor, Mechanical and Industrial Engineering Department, University of Toronto, Canada
Ravin Balakrishnan: Professor, Computer Science Department, University of Toronto, Canada
Patrick Baudisch: Research Scientist, Microsoft Research, Redmond, U.S.A. 

Shengdong Zhao, Pierre Dragicevic, Mark Chignell, Ravin Balakrishnan, Patrick Baudisch (2007). earPod: Eyes-free Menu Selection with Touch Input and Reactive Audio Feedback. Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI). pp. 1395-1404.

Video & Slides
movie (4 MB wma, 720x480, 29.970 fps, 1:04)
PPT slides of my presentation at CHI 2007.

Selected press coverage:
English: MIT Technology Review, May 2007.
Chinese: PCOnline article, May 2007.

Shengdong Zhao [sszhao (at) dgp.toronto.edu]
Mark Chignell [chignell (at) mie.toronto.edu]

Note: this page was created by Shengdong Zhao on August 5, 2007.