Cybersquare
Computer, Vol. 30, No. 2, February 1997

Wearable Computing: A First Step Toward Personal Imaging

Steve Mann
Massachusetts Institute of Technology, Building E15-383, Cambridge, MA 02139
Author currently with University of Toronto, 10 King's College Road, Room 2001, Toronto, Ontario, M5S 3G4; mann@eecg.toronto.edu
(c) 1997 IEEE


Miniaturization of components has enabled systems that are wearable and nearly invisible, so that individuals can move about and interact freely, supported by their personal information domain.

Can you imagine hauling around a large, light-tight wooden trunk containing a co-worker or an assistant whom you take out only for occasional, brief interaction? For each session, you would have to open the box, wake up (boot) the assistant, and afterward seal him back in the box. Human dynamics aside, wouldn't that person seem like more of a burden than a help? In some ways, today's multimedia portables are just as burdensome.

Let's imagine a new approach to computing in which the apparatus is always ready for use because it is worn like clothing. The computer screen, which also serves as a viewfinder, is visible at all times and performs multimodal computing (text and images).

With the screen moved off the lap and up to the eyes, you can simultaneously talk to someone and take notes without breaking eye contact. Miniaturized into an otherwise normal pair of eyeglasses, such an apparatus is unobtrusive and useful in business meetings and social situations.

Clothing is with us nearly all the time and thus seems like the natural way to carry our computing devices. Once personal imaging is incorporated into our wardrobe and used consistently, our computer system will share our first-person perspective and will begin to take on the role of an independent processor, much like a second brain--or a portable assistant that is no longer carted about in a box. As it "sees" the world from our perspective, the system will learn from us, even when we are not consciously using it.

Such computer assistance is not as far in the future as it might seem. Researchers were experimenting in related areas well before the late seventies, when I first became interested in wearable computing devices. Much of our progress is due to the computer industry's huge strides in miniaturization. My current wearable prototype,1 equipped with head-mounted display, cameras, and wireless communications, enables computer-assisted forms of interaction in ordinary situations--for example, while walking, shopping, or meeting people--and it is hardly noticeable.

DEVELOPING COMPUTERS TO WEAR

In 1968 Ivan Sutherland described a head-mounted display with half-silvered mirrors that let the wearer see a virtual world superimposed on reality.2, 3 His work, as well as subsequent work by others,4 entailed a serious limitation: Because the wearer was tethered to a workstation, generally powered from an ac outlet, the apparatus was confined to a lab or some other fixed location.

My experiments in attaching a computer, radio equipment, and other devices to myself culminated in a tetherless system that lets me roam about the city. I can receive e-mail and enjoy various other capabilities exceeding those available on a desktop multimedia computer. For example, family members watching remotely can see exactly what I see and, while I am at the bank, remind me by e-mail to get extra cash. Or I can initiate communication, using RTTY (radioteletype), to ask what else I should pick up on the way home.


Figure 1. (a) The unique capabilities of a wearable personal computer-imaging system and lighting kit let me create expressive images that transcend the boundaries of photography, painting, and computer graphics. (b) The system consisted of a battery-powered computer with wireless communications capability, so that I was free to roam untethered.

This new approach to computing arose from my interest in the visual arts--particularly still-life and landscape imaging in which multiple exposures of a static scene could be combined and illuminated by a variety of light sources. Figure 1a shows an image made using this original "light-painting" application.

To explore such new concepts in imaging and lighting, I designed and built the wearable personal imaging system shown in Figure 1b. At the time (around 1980, while I was still in high school), battery-operated tetherless computing was a new modality, as the laptop computer had not yet been invented. My invention differed from present-day laptops and personal digital assistants in that I could keep an eye on the screen while walking around and doing other things. A CRT on the helmet presented both text and images, and a light similar to a miner's lamp helped me find my way around in the dark. I carried an electronic flash lamp that let me capture images in total darkness. An array of push-button switches on the flash-lamp head controlled the computer, camera, and so forth.

The incredible shrinking computer

Even 10 years later, during my experiments in the early 1990s, the computational power required to perform general-purpose manipulation of color video streams came in packages too unwieldy to be worn in comfortable clothing. I was forced to use special-purpose hardware with good video processing capability remotely, by establishing a full-duplex video communications channel between the clothing and the host computer or computers. I used a high-quality communications link to send video from the cameras to the remote computer(s) and a lower quality communications link to carry the processed signal from the computer back to the head-mounted display. Figure 2 diagrams this apparatus, which let me explore applications that will become possible when miniaturization puts today's supercomputer power into tomorrow's clothing.


Figure 2. An experimental apparatus for wearable, tetherless, computer-mediated reality. The camera sends video to a remote supercomputing facility over a high-quality microwave communications link. The computing facility sends back the processed image over a UHF communications link. "Visual filter" refers to the process(es) that mediates the visual reality and that may insert virtual objects into the visual stream.

When I brought my apparatus to MIT in 1991, I installed two antennas on the roof of the tallest building in the city. Later I found that if I moved one of the antennas to another rooftop, the inbound/outbound channel separation improved dramatically. The apparatus provided good coverage on the university campus, moderate coverage over a good part of the city, and some coverage in a nearby city.

Advances in miniaturization helped to streamline the equipment over the years. In Figure 3a I am wearing a 1980 prototype of the experimental system. The 1.5-inch CRT was unwieldy and required a well-fitted helmet to support its weight. (For viewing the CRT, I have used, in various embodiments of the head gear, a lens and mirror at a 45-degree angle, a partly silvered mirror, and reflections off eyeglasses.) Two antennas, operating in different frequency bands, allowed simultaneous transmission and reception of data, voice, or video. Alternative versions of the communications apparatus included a slightly less cumbersome clothing-based antenna array (hanging behind me at the upper right in Figure 3a) comprising wires sewn directly into the clothing. Substituting this clothing-based array let me clear doorways and ceilings during indoor use.


Figure 3. Progressive miniaturization by the computer industry has enabled wearable devices to become less obtrusive over the past 16 years. (a) 1980 prototype with a 1.5-inch CRT; (b) late 1980s multimedia computer with a 0.6-inch CRT; (c) a more recent commercially available display; (d) a current, nearly undetectable, prototype consisting of eyeglasses, a handheld control, and a computer worn in back under the shirt.


With the advent of consumer camcorders, miniature CRTs became available, making possible the late 1980s eyeglass-mounted multimedia computer shown in Figure 3b. Here I am using a 0.6-inch CRT facing down (angled back to stay close to the forehead). This apparatus was later transferred to optics salvaged from an early 1990s television set. Though still somewhat cumbersome, the unit could be worn comfortably for several hours at a time. An Internet connection through the small hat-based whip antenna used TCP/IP with AX.25 (the standard packet protocol for ham radio operators).

The prototype in Figure 3c incorporates a modern commercial display product made by Kopin, an American manufacturer of head-mounted displays, along with commercially available cellular communications. With the advent of cellular and other commercial communications options, a radio license is no longer needed to experience "online living." Unlike my earlier prototypes, this system was assembled from off-the-shelf components. Though it is much improved, I expect to do even better: The prototype shown in Figure 3d--still under development--is nearly undetectable.

APPLICATIONS

Just as computers have come to serve as organizational and personal information repositories, computer clothing, when worn regularly, could become a "visual memory prosthetic" and perception enhancer.

Edgertonian eyes

Early on, I experimented with a variety of visual filters5 as I walked around. Each of these filters provided a different visual reality. One filter applied a repeating freeze-frame effect to the WearCam (with the cameras' own shutters set to 1/10,000 second). This video sample-and-hold technique let me see the unseeable: writing on moving automobile tires and the blades on a spinning airplane propeller. Depending on the sampling rate of my apparatus, the blades would appear to rotate slowly backward or forward, much as objects do under Harold Edgerton's stroboscopic lights.6
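The slow backward or forward drift follows from ordinary sampling arithmetic: a motion that repeats f times per second, viewed as s frozen frames per second, appears to repeat at f folded into the range from -s/2 to s/2. The Python sketch below illustrates only that arithmetic. The function name and the figure of 30 frames per second are assumptions chosen for illustration, not parameters of the WearCam.

```python
def apparent_rate(true_hz: float, sample_hz: float) -> float:
    """Return the apparent repetition rate (Hz) of a periodic motion
    when it is viewed as a sequence of frozen frames.

    A positive result means the motion seems to run forward, a negative
    result means it seems to run backward, and zero means it looks frozen.
    """
    if sample_hz <= 0:
        raise ValueError("sample_hz must be positive")
    # Fold the true rate into the interval [-sample_hz / 2, sample_hz / 2).
    half = sample_hz / 2.0
    return (true_hz + half) % sample_hz - half


if __name__ == "__main__":
    # Hypothetical numbers: frames frozen 30 times per second.
    for true_hz in (29.0, 30.0, 31.0):
        print(true_hz, "Hz appears as", apparent_rate(true_hz, 30.0), "Hz")
```

For a propeller with several identical blades, the visible pattern repeats once per blade spacing, so true_hz would be the revolutions per second multiplied by the number of blades.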

Beyond just enabling me to see things I would otherwise have missed, the effect would sometimes cause me to remember certain things better. There is something very visceral about having an image frozen in space in front of your eyes. I found, for example, that I would often remember faces better, because a frozen image tended to remain in my memory much longer than a moving one. Perhaps intelligent eyeglasses of the future will anticipate what is important to us and select the sampling rate accordingly to reveal salient details.

Finding our way around

We've all been lost at one time or another. Perhaps, at the end of a long day in a new city or a large shopping complex, you can't find your car or subway stop. One way I guard against such lapses is by transmitting a sequence of images to my WWW page. Then if (when) I get lost, I browse my WWW page to find my way back. An advantage of having the image stream on the Web is that friends and relatives with wearable Web browsers can see where I have been and catch up with me. This constitutes a type of shared visual memory.

Footwear offers yet another opportunity for help in orientation. Mark Weiser of Xerox PARC, commenting on IBM computer scientist Tom Zimmerman's computerized shoes, predicts that someday customers walking into a store will pick up floor-plan data from their shoes that will guide them to the merchandise they're shopping for.7

Zimmerman was not the first to propose shoe-based computing. In the late 1970s, a group of researchers known as the Eudaemons were building shoe-based computers for use in physical modeling of chaotic phenomena8--or more specifically, for bettering their odds at roulette. One person would enter data (clicking with the toes) while watching the ball; another person would receive the data and try to predict the octant the ball would land in.

Figure 4. Six frames of low-resolution video from a processed image sequence. My computer recognizes the cashier and superimposes a previously entered shopping list on her image. When I turn my head to the right, the list moves to the left on my screen, following the flow-field of the video imagery coming from my camera. Note that the tracking (initially triggered by automatic face recognition) continues even when the cashier's face is completely outside my visual field, because the tracking is sustained by other objects in the room. Thus, the list of items I have purchased from this cashier appears to be attached to her. This functionality can provide a clear recollection of facts during a refund transaction.

Homographic modeling

Recently I reported on a wearable apparatus9 that can help us identify faces by comparing an incoming image to a collection of stored faces. Once the wearer confirms a match, the "video orbits" algorithm10 that I developed enables the system to insert a virtual image11 into the wearer's field of view, creating the illusion that a person is wearing a name tag. As Figure 4 shows, the name tag will stabilize on the person even though the image field moves. The homography of the plane is estimated and tracked throughout, so that even when the objects being recognized fall outside the camera's field of view, tracking continues by the homography alone.
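To make the role of the homography concrete, the following Python sketch shows only the coordinate bookkeeping: a virtual label anchored in one frame is carried into later frames by composing frame-to-frame homographies. It does not estimate the homographies from video, which is the job of the "video orbits" algorithm. The helper names and the pure-translation matrices are assumptions made for illustration.

```python
from typing import List, Tuple

Matrix = List[List[float]]   # 3x3 homography in homogeneous coordinates
Point = Tuple[float, float]  # pixel position (x, y)


def translation(dx: float, dy: float) -> Matrix:
    """A homography that only shifts the image (hypothetical test motion)."""
    return [[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]]


def compose(a: Matrix, b: Matrix) -> Matrix:
    """Return the matrix product a * b, i.e. apply b first, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply_homography(h: Matrix, p: Point) -> Point:
    """Map a point through h and divide out the homogeneous coordinate."""
    x, y = p
    xs = h[0][0] * x + h[0][1] * y + h[0][2]
    ys = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (xs / w, ys / w)


if __name__ == "__main__":
    label_in_frame0 = (100.0, 50.0)
    # The head turns right twice, so the scene shifts left on the screen.
    frame0_to_1 = translation(-40.0, 0.0)
    frame1_to_2 = translation(-80.0, 0.0)
    frame0_to_2 = compose(frame1_to_2, frame0_to_1)
    # The label position is still known even though x is now negative,
    # that is, off the left edge of the screen.
    print(apply_homography(frame0_to_2, label_in_frame0))
```

Because the accumulated matrix is kept up to date from the rest of the scene, the label's position remains defined while the face is off screen and returns to the right place when the face comes back into view.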

Personal visual assistant for the visually challenged

With its spatial filtering capability, a head-mounted apparatus can assist partially sighted individuals.12 Worn over the eyes, it computationally augments, diminishes, or alters visual perception in day-to-day situations in real time.5 For example, Figure 5 shows how a visual filter might assist in reading. The portable system, made from a battery-powered color stereo display having 480 lines of resolution, is shown in Figure 6. The wearer's only visual stimulus comes from the computer screens, since in this case the glasses are totally opaque. I call this experience fully mediated reality.5


Figure 5. Using a visual filter such as this in the personal visual assistant may help a person with very poor vision to read. The central portion of the visual field is hyperfoveated for a high degree of magnification while allowing good peripheral vision.
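One way to picture a hyperfoveated filter is as a radial remapping of the image: each display pixel at normalized radius r from the center shows the scene at a smaller radius, so the center is enlarged while the edge of the field stays in view. The Python sketch below uses one simple mapping with that property. The formula, the function name, and the magnification value are assumptions for illustration and are not taken from the filter shown in Figure 5.

```python
def source_radius(r: float, magnification: float) -> float:
    """Map a display radius r in [0, 1] to the scene radius to sample.

    Near the center the scene appears enlarged by about `magnification`;
    at r = 1 the mapping returns 1, so the edge of the field is retained.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    if magnification < 1.0:
        raise ValueError("magnification must be at least 1")
    # Assumed mapping: slope 1/magnification at the center, reaching 1 at the edge.
    return r / (magnification - (magnification - 1.0) * r)


if __name__ == "__main__":
    # Hypothetical setting: roughly fourfold enlargement at the center.
    for r in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(r, "->", round(source_radius(r, 4.0), 3))
```

With this mapping the inner half of the display is filled by only the inner fifth of the scene when the magnification is 4, and the remaining scene is compressed into the outer half rather than cropped.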


Figure 6. My early 1990s apparatus for wearable, tetherless, computer-mediated reality included a color stereo head-mounted display with two cameras. The intercamera distance and field of view approximately matched my interocular distance and field of view with the apparatus removed. Communications equipment was worn around the waist. Antennas, transmitter, and so forth, were at the back of the head-mount to balance the weight of the cameras in front.

SOCIAL ASPECTS

The early prototypes were quite obtrusive and often made people ill at ease, but more recently the apparatus has been gaining social acceptance. I attribute this partly to miniaturization, which has allowed me to build much smaller units, and partly to dramatic changes in people's attitudes toward personal electronics. With the advent of cellular phones, pagers, and so forth, such devices may even be considered fashionable.

When equipped with truly portable computing, including a wireless Internet connection and an input device like those pictured in Figure 7, I find that people I talk with aren't even distracted by my looking at the screen. In fact, they cannot discern whether I am looking at my screen or at them, because the two are aligned on exactly the same axis. The problem of focal length can generally be managed by setting it so that the screen and anyone I'm talking with are in the same depth plane.

With enough light present, images can be incorporated into the note-taking process in a natural manner, without distracting the other person. Even in low light--for example, while talking with someone outdoors after dark--a small flash, shown in Figure 7a, can be used during a conversation without breaking eye contact. The only distraction is the light from the flash itself, which may startle people at first. (An infrared flash would be less obtrusive.)

Some years after I developed the keyboard/control system in Figure 7a, a commercial product--the mouse shown in Figure 7b--appeared. Its numerous buttons could be used to type or to control various other functions. In the future, of course, we will not need keyboards and mice at all. A goal of personal imaging is to use the camera as the input device. A rough prototype of a "finger mouse," shown in Figure 7c, has already been developed, and it isn't hard to envision a system for inputting data using hand gestures.

Dark glasses mean leave me alone

In the early eighties, my greatest impediment to social interaction while wearing the apparatus was its obtrusiveness. Although the bulky computational and electronics hardware only minimally hindered social interaction when carried in a backpack, my display continued to create a tremendous social barrier. I began experimenting with some of the social protocols likely to be adopted once the display technology became less obtrusive and use of the equipment became more common. For example, I found that the darkness of my eyeglasses could be used to indicate whether or not I'm in the mood for idle chat. When I am sitting on the subway, I set my glasses to a wine-dark opacity, though I can still see through them. This is intended to let others know that I do not wish to be disturbed.



Figure 7. Hand-held keyboards, mice, and controls. (a) My early prototype (incorporating one microswitch for each finger and three possible microswitches for the thumb) was built into the handle of an electronic flash lamp and allowed simultaneous one-handed control of computer, camera, and flash lamp. (b) Modern off-the-shelf mouse/keyboard combination made by Handykey Corp. The mouse consists of a tilt sensor inside the housing. (c) Virtual mouse. Camera in eyeglasses tracks finger, which controls a cursor, allowing the user to look at a Luxo lamp through the glasses and draw its outline on the computer screen.

Seeing eye-to-eye

People often disagree because they fail to see something exactly the same way--from the same viewpoint. In the most literal sense, this needn't always be a problem. Two people equipped with clothing-based multimedia computers can not only stay in touch, sending data, voice, and video to each other, but can also exchange perfectly accurate viewpoints. Each person can see exactly what the other person sees overlaid on a portion of his or her own screen.

Safety net

Suppose that instead of just two people we have a networked community of individuals wearing computer clothing. This could be either a homogeneous community, all wearing the same form of the apparatus, or a heterogeneous community wearing several variations. People would most likely focus primarily on their own surroundings, but they could occasionally receive an image from someone sending an important signal. For example, someone anticipating danger might trigger a distress signal to others nearby over the Internet. Alternatively, the clothing itself could trigger the signal. For example, I have a heart rate monitor in my clothing and a footstep activity meter in my shoes. Heart rate divided by the rate of footsteps could register on a "saliency index" that might warn of danger. If someone approaches you on the street, pulls out a gun, and asks for cash, most likely your heart rate would increase and your footsteps slow down, which is contrary to the usual patterns of physiological behavior. A community of individuals networked in this way could look out for each other much like a neighborhood watch.
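The "saliency index" described above is a simple ratio, and a minimal Python sketch of it follows. The floor on the step rate (to avoid dividing by zero when the wearer stands still), the alert threshold, and the sample readings are all assumptions made for illustration. The article specifies only the ratio of heart rate to footstep rate.

```python
def saliency_index(heart_bpm: float, steps_per_min: float,
                   min_steps: float = 1.0) -> float:
    """Heart rate divided by footstep rate.

    `min_steps` is an assumed floor that keeps the ratio finite when the
    wearer is standing still.
    """
    if heart_bpm < 0 or steps_per_min < 0:
        raise ValueError("rates must be non-negative")
    return heart_bpm / max(steps_per_min, min_steps)


def should_alert(heart_bpm: float, steps_per_min: float,
                 threshold: float = 5.0) -> bool:
    """Return True when the index exceeds a hypothetical threshold."""
    return saliency_index(heart_bpm, steps_per_min) > threshold


if __name__ == "__main__":
    # Hypothetical readings: a brisk walk, then a racing heart with few steps.
    print(saliency_index(90.0, 100.0), should_alert(90.0, 100.0))
    print(saliency_index(150.0, 10.0), should_alert(150.0, 10.0))
```

During exercise both rates rise together and the ratio stays modest. The ratio climbs sharply only when the heart rate rises while the footsteps slow, which is the unusual pattern the paragraph describes.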

Such a networked community offers an alternative to the proliferation of government surveillance cameras throughout many cities, particularly in the UK. Even in the US, the city of Baltimore, Maryland, is experimenting with ubiquitous video surveillance to watch over citizens' activities. Two hundred cameras are being installed in the downtown business district as an experiment in crime prevention. Such government surveillance is reminiscent of George Orwell's 1984, with cameras and microphones distributed throughout the environment and two-way television sets watching us as we watch them. Science fiction writer David Brin warns that cameras are coming one way or another and that privacy as we know it will disappear. He argues that the kind of privacy loss one experiences in a small town is less evil than that experienced in an Orwellian society. Thus, citizens would be better off looking out for one another using clothing-based Internet-connected computing. This would require fewer tax dollars and provide a future more like that described in Brin's novel Earth, in which citizens wearing cameras are networked in the cyberspace equivalent of a small town. Wouldn't safety nets be better than surveillance?

Naturally, smart clothing must be owned, operated, and controlled by the individual wearers. A potentially sinister variation--smart uniforms--could entail totalitarian control beyond anything Orwell imagined.

Dependence on computer clothing

Some people fear that we'll become dependent on wearable computing, but I think this fear is unjustified. Wasn't it once said that compilers, assemblers, and even pocket calculators would cause our brains to atrophy? Long ago I could do arithmetic quickly by hand, but now I would be a little slow in doing something as simple as finding the square root of 2 with pencil and paper. I'd also find it hard to program in 6502 machine code, as I did for my first wearable computer system, without the help of an assembler or a compiler. Freeing ourselves from mundane tasks like arithmetic or hand assembly of computer instructions lets us think at a higher level. Tools such as pocket calculators, assemblers, and compilers have greatly extended our capabilities, enabling us to develop a whole new set of higher level abilities.

Indeed, we probably will develop a dependence on readily accessible computing, just as we have developed a dependence on wash-and-wear clothing--and desktop computers, for that matter. The fact that some primitive societies can still survive quite well without clothing while we've probably lost our ability to survive naked in the wilderness in all but the warmest of climates doesn't support the argument that we should do without clothing.

Someday, when we've become accustomed to clothing-based computing, we will no doubt feel naked, confused, and lost without a computer screen hovering in front of our eyes to guide us. By that time, fortunately, increased reliability will be an important part of the design. Just as we would not want our shirt buttons to pop off or our trousers to fall down, we will demand that our computer clothing not go down either.

Although past prototypes have been cumbersome, and even present prototypes remain somewhat obtrusive, miniaturization continues to pack a greater level of functionality into a smaller space. The enabling apparatus will soon be invisible as it disappears into ordinary clothing and eyeglasses. Efforts have already been made to produce wearable computers commercially. Such commercial interest is bound to add impetus to further miniaturization.

Some of my rough prototypes are getting so small that eye movements are the only indication that the wearer may be online. The eye movements of someone reading a virtual image appear somewhat unusual, even though the apparatus itself is almost invisible. Development and commercialization of these more natural-looking systems will overcome initial reluctance among potential users and gradually create a broader user base, just as more powerful and easier-to-use PCs made their way into offices and homes.

Clothing-based computing with personal imaging will blur the boundaries between seeing and viewing and between remembering and recording. Rather than narrowing our focus, living within our own personal information domain will enlarge our scope through shared visual memory that enables us to "remember" something or someone we have never seen.

With computers as close as the shirts on our backs, interaction will become more natural. This will improve our ability to do traditional computing tasks while standing or walking, letting future computing systems function much like a second brain. A computer that is constantly attentive to our environment may develop situational awareness, perceptual intelligence, and an ability to see from the wearer's perspective and thereby assist in day-to-day activities.

Of course, these far-reaching goals will require years of research. Nevertheless, we can expect entirely new modes of human-computer interaction to arise, along with a whole new set of technical, scientific, and social needs that will have to be addressed as we take our first steps toward personal imaging.


Acknowledgments


I thank Thad Starner and Flavia Sparacino for helping me get the cursor-control software running with X Windows on the SGI Reality Engine for finger-tracking, and Joe Paradiso for suggesting I write this "experiential first-person account." I also thank Roz Picard, Hiroshi Ishii, Neil Gershenfeld, Sandy Pentland, Ted Adelson, Jennifer Healey, and many others too numerous to mention here for many interesting and important discussions; Matt Reynolds, KB2ACE, for help in upgrading the outbound ATV system; and Steve Roberts, N4RVE, for many useful suggestions. Thanks to Larry Smarr of the University of Illinois for use of the NCSA supercomputing facility, to Chris Barnhart for special-purpose processing hardware, and to Bran Ferren of Disney. HP Labs, ProComp, VirtualVision, Compaq, Kopin, Colorlink, Ed Gritz, Miyota, BelTronics, M/A-Com, and Virtual Research also deserve thanks for lending or donating additional equipment that made my experiments possible.

Hewlett-Packard Labs, Palo Alto, California, supported the part of my research performed at MIT.

References

  1. S. Mann, "Wearable Wireless Webcam," personal WWW page, http://wearcam.org (http://n1nlf-1.media.mit.edu), 1994.
  2. R.A. Earnshaw, M.A. Gigante, and H. Jones, Virtual Reality Systems, Academic Press, San Diego, Calif., 1993.
  3. I. Sutherland, "A Head-Mounted Three Dimensional Display," Proc. Fall Joint Computer Conf., IEEE CS Press, Los Alamitos, Calif., 1968, pp. 757-764.
  4. S. Feiner, B. MacIntyre, and D. Seligmann, "Knowledge-based Augmented Reality," Comm. ACM, July 1993.
  5. S. Mann, Mediated Reality, Tech. Report TR 260, MIT Media Lab Perceptual Computing Section, Cambridge, Mass., 1994.
  6. H.E. Edgerton, Electronic Flash, Strobe, MIT Press, Cambridge, Mass., 1979.
  7. J. Pitta, "The Soul of the New Machine," Los Angeles Times, Nov. 18, 1996, p. D1.
  8. T. Bass, The Eudaemonic Pie, Houghton Mifflin, Boston, 1985.
  9. S. Mann, "Smart Clothing: 'Wearable Multimedia and Personal Imaging' to Restore the Balance Between People and Their Intelligent Environments," Proc. ACM Multimedia 96, ACM Press, New York, 1996.
  10. S. Mann and R.W. Picard, Video Orbits of the Projective Group: A Simple Approach to Featureless Estimation of Parameters, Tech. Report TR 338, MIT Media Lab Perceptual Computing Section, Cambridge, Mass., 1995.
  11. T.S. Huang and A.N. Netravali, "Motion and Structure from Feature Correspondences: A Review," Proc. IEEE, Feb. 1984.
  12. S. Mann, Wearable, Tetherless Computer-Mediated Reality: Wearcam as a Wearable Face-Recognizer, and Other Applications for the Disabled, Tech. Report TR 361, MIT Media Lab Perceptual Computing Section, Cambridge, Mass., 1996.

Steve Mann cofounded the Wearable Computing Project at the MIT Media Lab, where he is completing his PhD in personal imaging. In addition to this work and his interest in "online living," which grew out of a high school hobby, he has a related interest in amateur radio. In his research he has explored a way to characterize the response of objects to arbitrary lighting, created a self-linearizing camera calibration procedure (with photometric image-based modeling), and formulated the first true projective image mosaicking/compositing algorithm. He holds degrees in physics and electrical engineering from McMaster University in Canada.

Contact Mann at MIT E15-338, 20 Ames St., Cambridge, MA 02139; steve@media.mit.edu. In 1998, Professor Mann will be at University of Toronto, 10 King's College Road, Toronto, Ontario, Canada, M5S 3G4; mann@eecg.toronto.edu.



Copyright © 1997 Institute of Electrical and Electronics Engineers, Inc. All rights reserved.

Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE. By choosing to view this document, you agree to all provisions of the copyright laws protecting it.