Steve Mann,
steve@media.mit.edu, N1NLF, Tel. (416) 946-3387, Fax. (416) 971-2326
MIT E15-389, 20 Ames St., Cambridge MA 02139;
Author currently with University of Toronto,
Elec. Eng. Dept.
We are entering a pivotal era in which we will become inextricably intertwined with computational technology that will become part of our everyday lives in a much more immediate and intimate way than in the past. The recent explosion of interest in so-called ``wearable computers'' is indicative of this general trend. The purpose of this paper is to provide an historical account of my wearable computer effort, from the 1970s (WearComp0) to present (WearComp7), with emphasis on a particular variation whose origins were in imaging applications. This application, known as `personal imaging', originated as a computerized photographer's assistant which I developed for what many regarded as an obscure photographic technique. However, it later evolved into a more diverse apparatus and methodology, combining machine vision and computer graphics, in a wearable tetherless apparatus, useful in day-to-day living. Personal imaging, at the intersection of art, science, and technology, has given rise to a new outlook on photography, videography, augmented reality, and `mediated reality', as well as new theories of human perception and human-machine interaction. My current personal imaging apparatus, based on a camera and display built within an ordinary pair of sunglasses, together with a powerful multimedia computer built into ordinary clothing, points to a new possibility for the mass-market.
In this paper, I describe a particular physical arrangement of a computer system, which I call `WearComp'. It has the following characteristics [1]:
It is often sufficient that the interface (input and output) alone satisfy this criterion so that some of the computational resources can be remotely located if desired.
Two early wearable computers were built independently. During the 1970s, I built a system (Fig 1(a-c)) for experimental photographic applications, unaware of the wearable computers (Fig 1(d)) that were being built by a group of West Coast physicists known as the Eudaemons [4].
Figure 1:
Two early wearable computers
developed in the 1970s.
(a) A ``photographer's assistant''
system comprising my late 1970s
wearable computer (pictured here with 1980 display).
I integrated the computer into a welded-steel frame worn on my
shoulders (note the belt around my waist, which directed much of the
weight onto my hips). A power converter hung over one shoulder.
Antennae (originally 3, later reduced to 2),
operating at different frequencies,
allowed simultaneous transmission and reception of computer data,
voice, or video. This system allowed for complete interaction
while walking around doing other things.
(b) Close-up of the uppermost end of the lightpaintbrush
handle (the end held in my right hand).
The collection of six
spring-lever switches, one for each finger and two for the thumb,
permits input of data, as well as control of the lightpainting
programs.
(c) Close-up of my 1980 display.
(d) The Eudaemons' shoe-based computer, which used a vibrotactile
display (described in [5])
as its sole output modality.
Thanks to Doyne Farmer for the loan of the shoe computer from which I took
this picture.
It is interesting to contrast these two early wearable computing efforts.
In the late 1970s, the Eudaemons designed and built various wearable computers to assist a roulette player in predicting where the ball would land. Rather than attempting to predict the exact number on which the ball would land, they divided the wheel into eight octants and attempted to predict the octant in which the ball would land. It was not necessary to predict this outcome with high accuracy -- it was sufficient to know, occasionally and with probability only slightly better than pure chance, which octant the ball would land in. In this manner, the apparatus succeeded in providing a small but sufficient improvement in betting odds, for purposes of winning money at roulette.
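To make concrete how a ``small but sufficient increase in betting odds'' can pay off, the following minimal sketch (with purely illustrative numbers that are not from the Eudaemons' work) computes the expected profit per spin when one unit is bet on each number in the predicted octant of a 38-pocket American wheel:

```python
# Hypothetical illustration only: expected profit per spin when betting one
# unit on each number in a predicted octant (straight-up bets pay 35:1).

def expected_return(p_octant, numbers_per_octant=5, payout=35):
    """Expected profit per spin when the predicted octant is correct
    with probability p_octant. numbers_per_octant is an assumed,
    illustrative figure (roughly 38/8 pockets per octant)."""
    stake = numbers_per_octant                 # one unit on each number
    win = payout - (numbers_per_octant - 1)    # winning bet pays 35, the rest lose
    return p_octant * win - (1 - p_octant) * stake

chance = 5 / 38                       # guessing an octant of ~5 pockets at random
print(expected_return(chance))        # slightly negative: the house edge
print(expected_return(chance * 1.3))  # a modest prediction edge turns it positive
```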
In various implementations, two players would collaborate: one would watch the ball and click a footswitch each time it passed, and this timing information would be transmitted wirelessly to the second person, who placed the bets. The shoe-based computer (Fig 1(d)), using a physical model of the wheel based on nonlinear dynamics, would indicate one of nine possibilities: a particular octant, or a ninth suggestion that no bet be placed. One of these nine possibilities was presented to the bottom of the foot in the form of vibrations of three solenoids, each programmed to vibrate at one of three possible rates (slow, medium, or fast). The person placing the bets needed to memorize the numerical values in each octant of the wheel, as well as learn the nine different vibration patterns that the three solenoids could produce.
During the 1970s, I envisioned, designed, and built the first WearComp (called WearComp0, which evolved into WearComp1) to function as an experimental ``photographer's assistant'' for a new imaging technique. The goal of this technique was to characterize the manner in which scenes or objects responded to light, through the process of gathering differently exposed/illuminated pictures. The procedure resulted in a richer scene description, which allowed more expressive/artistic images to be rendered. This technique was recently described in [6][7]. (See Fig 2.)
Figure 2:
(a) Image from one of my recent exhibitions,
which I generated from data taken
using my personal imaging system.
The unique capabilities of a completely tetherless
wearable {computer,imaging,graphics} system
facilitate a new
expressive capability with images that could not be created by
any other means. The images transcend the boundary between
photography, painting, and computer graphics.
(b,c) Data sets are collected based on response to various
forms of illumination.
(b) Personal imaging computer in use with
40000 Joule flash lamp operating at 4000V DC with 24kV
trigger.
(c) With 2400 Joule
flash lamp operating at 480V with 6kV trigger.
The photographer's assistant system comprised two portions: a WearComp with light sources (as peripherals), and a base station with imaging apparatus. The imaging technique, referred to as `lightspace', involved determining the response of a typically static scene or object to various forms of illumination, and recording the response measurements (originally in analog form on photographic emulsion) at the base station.
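The lightspace idea can be illustrated with a simplified linear model (a sketch of the concept only, not the author's original analog implementation): each differently illuminated exposure is treated as the scene's response to one light source, and new, expressive lightings are rendered as weighted combinations of those responses.

```python
import numpy as np

# Minimal sketch of the lightspace idea: each image records the scene's
# response to one light source; assuming a linearized camera response,
# new lightings can be rendered as weighted sums of these responses.

def render(lightvectors, weights):
    """lightvectors: list of float arrays (H, W[, 3]), one per light source,
    already linearized; weights: relative strength of each source."""
    out = np.zeros_like(lightvectors[0])
    for img, w in zip(lightvectors, weights):
        out += w * img
    return np.clip(out, 0.0, 1.0)

# e.g. emphasize one flash-lit exposure, keep a little ambient fill:
# rendering = render([ambient, flash_left, flash_right], [0.2, 1.0, 0.5])
```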
This project evolved into a system called WearComp2, which also had some degree of graphics capability, digital sound recording and playback capability, and music synthesis capability.
I built WearComp2 (a 6502-based wearable computer system completed in 1981) into a metal frame pack (welded-steel pipe construction), and powered it by lead-acid batteries. The power distribution system was constructed to also provide power for radio communications, both inbound and outbound, as well as for electronic flash (900 volts). I achieved power inversion by a combination of astable multivibrators (manufactured by Oaks and Dormitzer), and a matched pair of germanium power transistors salvaged from the inverter section of a tube-type automotive taxicab radio. At this time, a widespread practical use for battery-operated computers had not yet been envisioned, and therefore no such ``portable'' computers were available. The first portables (such as the Osborne computer, a unit about the size of a large toolbox) would not appear until the early 1980s.
Input comprised a collection of pushbutton switches (typically located on the handle of one of my specialized light sources, each one having its own ``keyboard'') which could be used to enter program instructions, as well as to take down notes (e.g. details about a particular photographic situation).
In addition to simple control of imaging commands, I also desired some automated functionality. For example, I wanted the base station to automatically know my whereabouts, so that additional information about each exposure could be kept for later use, as well as for immediate use.
Ideally, a tape measure would be used to measure the location on the ground where I was standing, and the height of the flash lamp above the ground, and instruments would be used to determine its azimuth and elevation (direction of aim). However, due to time constraints, it was more typical that I would count the distance in `paces' (number of footsteps), and report these numbers verbally (using two-way radio) back to my assistant at the base station, who took notes. Because this procedure required the attention of an assistant, it was itself undesirable, so I included a pedometer in my shoes to automate the counting process (initially on an electro-mechanical counter manufactured by Veeder Root, and later on the wearable computer, with the Veeder Root counter as a backup to occasionally verify my count). Due to count errors (in both the computer and the backup), I also experimented with the use of ceramic phono cartridges, piezo elements, and resistive elements, using some hysteresis in the counting circuitry.
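The hysteresis mentioned above can be sketched as follows (thresholds are illustrative, not the original circuit values): a step is counted only when the sensor signal crosses an upper threshold after previously having fallen below a lower one, so small fluctuations around a single threshold do not produce spurious counts.

```python
def count_steps(samples, low=0.3, high=0.7):
    """Count steps from a footstep-sensor signal using hysteresis.
    Thresholds are illustrative; a real sensor would be calibrated."""
    steps = 0
    armed = True             # ready to count the next rising crossing
    for x in samples:
        if armed and x > high:
            steps += 1
            armed = False    # ignore further highs until the signal falls back
        elif not armed and x < low:
            armed = True
    return steps
```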
Experiments in replacing the pedometers with radiolocation were not particularly successful (this was, of course, long before the days of the Global Positioning System (GPS)). One approach that was somewhat successful was the use of radar for estimating short distances and rough velocities with respect to the ground. For this purpose, I designed and built a small wearable radar system. This invention proved more useful later for other projects, such as a safety device to warn if someone might be sneaking up from behind, and later as an assistant to the visually impaired, who could use it to ``feel'' objects ``pressing'' against them before contact. Some years later, in the late 1980s, I presented wearable radar to the Canadian National Institute for the Blind, but it was never widely adopted, mainly due to the poor performance arising from the limited capabilities of my wearable computers of this era. With today's more sophisticated wearable computers, capable of implementing variations of the chirplet transform [8] in realtime (to obtain radar acceleration signatures of objects in the wearer's vicinity), I hope to revive my ``BlindVision'' project of the 1980s [9].
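As a rough sketch of why chirplet analysis suits radar acceleration signatures (a toy illustration, not the chirplet transform of [8] nor the original hardware): a target approaching with constant acceleration returns a Doppler tone whose frequency sweeps linearly, so correlating the baseband return against linear chirps over a grid of chirp rates reveals both velocity and acceleration.

```python
import numpy as np

# Toy sketch: estimate the Doppler chirp rate (proportional to the target's
# acceleration) of a baseband radar return by matched filtering against
# linear chirps. Grids and parameters are illustrative assumptions.

def chirp_rate_signature(x, fs, f0_grid, c_grid):
    """x: complex baseband return; fs: sample rate;
    f0_grid: candidate start frequencies (Hz); c_grid: chirp rates (Hz/s)."""
    t = np.arange(len(x)) / fs
    sig = np.empty((len(f0_grid), len(c_grid)))
    for i, f0 in enumerate(f0_grid):
        for j, c in enumerate(c_grid):
            ref = np.exp(2j * np.pi * (f0 * t + 0.5 * c * t**2))
            sig[i, j] = abs(np.vdot(ref, x))   # correlation magnitude
    return sig   # the peak location gives (start frequency, chirp rate)
```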
I designed the analog-to-digital and digital-to-analog converters for WearComp2 from resistive networks (and later, in the early 1980s, using more expensive Analog Devices chips). In both cases, I designed these with sufficient throughput for radar, voice, and music.
Music synthesis capability was envisioned as a means of providing information (such as light levels, exposure settings, etc.) in the form of sound, but also evolved into a portable self-entertainment system (a predecessor of the SONY Walkman).
Voice digitization and playback capability was included in WearComp for experimental purposes, with the ultimate objective of taking down voicenotes pertaining to exposure information and the like. Unfortunately, the system only had 4k of RAM (later expanded to 48k), so the voice digitization capability was of little real practical use in the field. However, because I built the analog-to-digital and digital-to-analog converters as separate units, facilitating the possibility of full-duplex audio, I used this capability for simple voice processing, such as reverberation and the like. Computer programs and data (and later commercial software, including an assembler) were stored on audio cassettes. In 1981, a battery-powered audio cassette recorder served as the only non-volatile digital storage medium for the wearable computer system. The cassette drive also proved useful for storage of voicenotes in analog form. (Later, an 8-inch floppy drive, and subsequently a 5.25-inch floppy drive, were incorporated.)
Because of the limited technology of the era, the system was ``hybrid'' (part digital and part analog) in many regards. In addition to the audio cassette recorder (used to record analog voice as well as digital programs and data), the communications channel was also hybrid.
Communications comprised a total of four voice transceivers. On the body was a radio transmitting constantly on one channel, and another radio receiving constantly on a different channel. At the base station, the situation was the same, but with the channels reversed.
The modems I constructed, from simple free-running (non-coherent) audio frequency oscillators and corresponding filters, operated at a data rate of approximately 45 bits per second. (Another early communication attempt averaged 1500 bps but was somewhat unreliable.)
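The receive side of such a non-coherent modem can be sketched as follows (the tone frequencies, sample rate, and bit rate here are illustrative assumptions, not the original circuit values): the energy near each of the two tone frequencies is measured over one bit period, and the larger of the two decides the bit.

```python
import numpy as np

# Minimal sketch of non-coherent FSK demodulation, in the spirit of the
# tone-oscillator-and-filter modems described above.

def tone_power(x, f, fs):
    """Power of signal x near frequency f (a single-bin DFT probe)."""
    n = np.arange(len(x))
    return abs(np.sum(x * np.exp(-2j * np.pi * f * n / fs))) ** 2

def demodulate(signal, fs=8000, baud=45, f_mark=1270.0, f_space=1070.0):
    spb = int(fs / baud)                       # samples per bit period
    bits = []
    for i in range(0, len(signal) - spb + 1, spb):
        chunk = signal[i:i + spb]
        bits.append(1 if tone_power(chunk, f_mark, fs) >
                         tone_power(chunk, f_space, fs) else 0)
    return bits
```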
The hybrid communications channel was capable of sending and receiving full-duplex voice (e.g. to talk to an assistant at the base station) or data (e.g. to control the imaging apparatus at the base station). Video capability was also included in the system, but at a much higher frequency (not going through the radio channel that was used for voice and data). Video was an important aspect of the system, as I desired to see what the scene looked like from the perspective of the base station.
Less important, but still useful, was the desire that, during times at which there was an assistant at the base station, the assistant be able to experience my point of view (for example, to see how well the flashlamp provided coverage of a particular object, and to observe the nature of this illumination). I developed a device, known as the `aremac' (camera spelled backwards), comprising an electronic flash which had a viewfinder to preview its field of coverage. The situation in which I could look through the imaging apparatus at the base station while the assistant at the base station could look through my ``eyes'' (aremac) was found to be useful as a communications aid, especially when combined with full-duplex audio communication. I referred to this mode of operation as `seeing eye-to-eye'.
My wearable computer efforts of the 1970s were a success in some ways, and a failure in others. The success was in demonstrating the functionality of an early prototype of a wearable computer system, and in formulating a practical application domain for wearable computing. However, there were many technical failures. In particular, the bulky nature of the apparatus often rendered it more of a photographer's burden than a photographer's assistant. Furthermore, the reliability problems associated with so many different system components were particularly troublesome, given the nature of typical usage patterns: walking around on rough terrain where wiring and the like would be shaken or pulled loose. Interconnections between components were found to be one of the major hurdles.
However, much was learned from these early efforts, and I decided that a new generation of wearable computers would be necessary if the apparatus were to be of real benefit. In particular, I decided that the apparatus needed to be more like clothing than like a backpack. To this end, I attempted spreading components somewhat uniformly over ordinary clothing (Fig 3). This effort succeeded in providing a truly helpful photographer's assistant, which later helped me win ``Best Color Entry'', two years in a row (1986 and 1987), in the National Fuji Film Photography Competition.
Figure 3:
Generation-2 WearComps were characterized by
distributed components, with wires sewn into clothing.
(a) In my dressing room, testing flash sync
with the waist-worn television set as display medium.
Note the array of pushbutton switches on
the handle of the electronic flash which comprise a
``keyboard'' (data-entry device) of sorts.
Here I am partially dressed (a black jacket
with the remainder of the communications antennas sewn into
it has yet to be put on). Note the absence of the backpack,
which has been replaced by a more distributed notion of
wearable computing.
(b) Completely dressed, in the field. Note
my newer generation-2 flash system (homogeneous
array of 8 small lightweight electronic flashlamps)
in keeping with the
distributed philosophy of generation-2 WearComp.
(Figure courtesy of Campus Canada, 1986 and
Kent Nickerson, 1985)
Generation-2 WearComp (I referred to Gen-2 and Gen-3 as `smart clothing') was characterized by a far greater degree of comfort, with a tradeoff in setup time (e.g. the increased comfort was attained at the expense of taking longer to get into and out of the apparatus). Typical `smart clothing' comprised special pants, a special vest, and a special jacket, which I connected together, such that much time and care were required to assemble everything into a working outfit. A side effect of generation-2 WearComp was that it was more inextricably intertwined with the wearer. For example, instead of running wires from the sensors in the shoes up to a backpack, the shoes could simply be plugged into special pants that had a wiring harness already sewn in. These pants were then plugged into the compute vest, which was in turn plugged into the radio jacket, establishing a wireless connection from the shoes to the base station without the tangled mess of wires that was characteristic of my generation-1 WearComp. Later, other experiments were facilitated by the clothing-based computing framework (for example, electrical stimulation of the muscles, attempts at monitoring the heart and other physiological parameters, etc.) which, it was hoped, might add a new dimension to WearComp.
Because of the more involved dressing and undressing procedures, I built a special dressing area, comprising rows of hangers to hang up the clothing, and floor-to-ceiling shelves (visible in the background of Fig 3(a)). Much of my generation-1 apparatus remained, and my generation-2 components were made in such a way that they continued to be compatible with generation-1 components. For example, NTSC, as well as certain variations of NTSC, remained the dominant computer display format. In Fig 3(a), a mixture of generation-1 and generation-2 components is being tested. Note the waist-mounted display, which was found to be more comfortable than the display of Fig 1(a). In particular, the older generation of display was very uncomfortable due to its front-heavy nature and its large moment of inertia.
The new generation of display was found to be much more comfortable (e.g. could be worn for several hours at a time), but it lacked a certain kind of interactional constancy that can best be described as cyborgian [10]. Although it was always on during operation, the fact that light from the display was not always entering the eye was found to detract from its utility in certain applications. Use of the waist-mounted television was somewhat reminiscent of a camera with a waist-level viewfinder (e.g. the old Brownie Hawkeye, the Rolleiflex, or the more modern Hasselblad).
With the advent of the consumer video camera, the consumer electronics industry created a newer generation of miniature CRTs for camera viewfinders. These were particularly suitable for eyeglass-based wearable computing displays [11], allowing the waist-mounted television display to be abandoned.
Table 1 provides a comparison of my generation-1 and generation-2 wearable computers together with present and hypothetical future generations.
Table 1: Generations of author's `WearComp' built for personal imaging: past, present, and predicted future. Note that Gen-2 and Gen-3 overlap substantially. Gen-4 is completely hypothetical.
Ivan Sutherland described a head-mounted display with half-silvered mirrors so that the wearer could see a virtual world superimposed on reality [12]. Sutherland's work, as well as more recent work by others [13], was characterized by its tethered nature. Because the wearer was tethered to a workstation, which was generally powered from an AC outlet, the apparatus was confined to a lab or other fixed location. Another reason that augmented reality systems were (and still are) confined to some limited space is the use of two-part trackers that require that the user, wearing one part, be near the other part, which is typically heavy and often carefully positioned relative to surrounding objects [13].
Generation-1 and generation-2 WearComp were typically characterized by a display over only one eye. Early wearable displays (such as the one in Fig 1(c)) were constructed with a CRT above the eye, aimed down, with a mirror at a 45-degree angle to direct the light through a lens and into the eye. In some cases, the lens was placed between the mirror and the CRT, and a partially silvered mirror was used. Regardless of whether a see-through display or a display that blocked one eye completely was used, the perceptual effect was to see the computer screen as though it were overlaid on reality. This resulted in an augmented reality experience, but differed from previous work by others in the sense that I was free to roam about untethered while engaged in this experience.
Much more recently, Reflection Technology introduced a display called the ``Private Eye'', making it possible to put together a wearable computer from off-the-shelf board-level components, giving rise to a number of independent wearable computing research efforts appearing simultaneously in the early 1990s [14]. Thad Starner, one of those who began wearable computing work in the 1990s, refers to the experience of seeing text on a non-see-through display as a form of augmented reality [15], because, even though the text completely blocks part of the vision in one eye, there is an illusion of overlay on account of the way the brain perceives input from both eyes as a combined image.
Unfortunately, the Private Eye-based wearable computers, unlike my earlier video-based systems, were primarily text-based systems. The Private Eye display consisted of a row of red LEDs that could be switched on and off rapidly, together with a vibrating mirror that would create an apparent image, but this could not, of course, produce a good continuous-tone greyscale image. Thus I did not adopt this technology, but instead envisioned, designed, and built the third generation of WearComp, based on the notion that it should support personal imaging, yet be at least as small and unobtrusive as the generation of off-the-shelf solutions built around the Private Eye.
My current wearable computer/personal imaging systems (see, for example, Fig 4) are characterized by their nearly unobtrusive nature (they go visually undetected by a large number of people).
Figure 4: Current state of the WearComp/WearCam
invention comprises a complete multimedia computer, with
cameras, microphones, and earphones, all built into
an ordinary pair of sunglasses except for some of the
electronics items sewn into the clothing.
This system is typical of generation-3 of my
WearComp project, and is suitable for wearing in
just about any situation (other than bathing or during
heavy rainfall). I've even fallen asleep with the unit on
from time to time.
With the system pictured here,
for fully-mediated reality environments,
I needed to close one eye, though I have built other similar
two-eyed units.
This rig is currently running the Linux 2.0 operating
system, with XFree86 (variant of X-windows), and has
a realtime bi-directional connection to the Internet.
The most recent WearComp prototype [16], equipped with head-mounted display, camera(s), and wireless communications, enables computer-assisted forms of interaction in ordinary day-to-day situations, such as while walking, shopping, or meeting people.
While past generations were very cumbersome and obtrusive, the current functionality has ``disappeared'' from view and been subsumed into ordinary clothing and ordinary sunglasses.
In the early 1980s, I had already been experimenting with some unobtrusive radio communications systems based on conductive threads, as well as clothing-based computers, such as a speech-controlled LED lightpaintbrush (Fig 5(d)), which I also wore to high-school dances, and the like, as a fashion item. Currently, I am trying to improve this approach of using clothing itself as a connectivity medium. I experimented with two approaches to making ``smart fabric'': additive and subtractive. In the additive approach, I start with ordinary cloth and sew fine wires or conductive threads into the clothing. I implemented the subtractive form using conductive cloth, of which I have identified four kinds, which I call BC1, IC1, BC2, and IC2 (B = bare, I = insulated; 1 = conductive in one direction, 2 = conductive in both directions). See Fig 5(a). Some of these have been used in certain kinds of drapery for many years, the conductive members woven in for appearance and stiffness rather than electrical functionality. Ordinary cloth I call C0 (conductors in zero directions). Smart clothing may have multiple layers, e.g. BC2 as RF shield, followed by one of the following possibilities:
Figure 5: An early smart clothing effort as possible future generation
of WearComp. (a) Four kinds of conductive fabric (see main text
of article for description). (b) Back of LED shirt showing
where one of the LEDs is soldered directly to type-BC1 fabric
(the joint has been strengthened with a blob of glue).
Note the absence of wires leading to or from the glue blob,
since the fabric itself acts as conductor.
Typically one layer of BC1
is put inside the shirt, while the other
is outside the shirt. Alternatively, either an undergarment is
used, or a spacer of type-C0 is placed between the two layers.
(c) Three LEDs on type-BC1 fabric, bottom two lit, top one off.
(d) LED shirt driven by wearable computer. (C) 1985 by Steve Mann;
thanks to Renatta Bererra for assistance.
The compact, unobtrusive nature of the apparatus, and the corresponding ability for long-term wear, have led to a new genre of cinematography, and the possibility of personal documentary exhibited in real-time. Wearable Wireless Webcam (the author's Internet-connected personal imaging workstation transmitting to an online gallery) was one example of a long-term (two-year) personal documentary of day-to-day experience, transmitted, for realtime consumption, to a remote audience [16].
Mediated reality [18] differs from augmented reality in the sense that not only can visual material be ``added'' to augment the real world experience, but reality may also be diminished or otherwise altered if desired. One example application, the correction of visual deficiencies, was presented in [19].
In addition to further insights into human perception, some new inventions arose out of this work. For example, the ``life through the screen'' experience, over a three-year period, caused the author to visually evolve into a 2-D thought paradigm. (Others have reported ``living in video-mediated reality'' [20], but only over short time periods, perhaps due to the more cumbersome nature of their apparatus, which also did not contain any computational capability.)
``Life through the screen'' gave rise to a manner of pointing at objects in which the author's finger would be aligned with the object on the screen, yet others who were watching the author point at something (such as an exit sign or a surveillance camera up in the air) would indicate that, from their vantage point, the author's finger appeared to be pointing in another direction.
This ``life through the screen'' [18] resulted in some new observations, among them that the finger would make a useful pointing device for a personal imaging system. This pointing device, called the fingermouse, was reported in [11] and [15].
The motivation for homographic modeling arose through the process of marking a reference frame [21] with text or simple graphics, where it was noted that, by calculating and matching homographies of the plane, an illusory rigid planar patch appeared to hover upon objects in the real world, giving rise to a form of computer-mediated reality described in [11].
In homographic modeling, a spouse may leave a virtual ``Post-it'' note upon the entrance to a grocery store (Fig 6),
Figure 6:
Recipient of the virtual ``Post-It'' note approaches
entrance to the grocery store and sees
the message through his ``smart sunglasses''.
Illusion of attachment (registration);
the virtual note appears to be truly attached to the ``Star
Market'' sign. (Thanks to Jeffrey Levine for
ongoing collaboration on this project.
)
or one might leave a grocery list on the refrigerator, destined for a particular individual (not everyone wearing the special glasses would see the message -- only those wearing the glasses and on the list of intended recipients).
Note that the message is sent right away, but ``lives'' dormant on the recipient's WearComp until an image of the desired planar surface ``wakes up'' the message, at which point it appears in frames that lie within the same orbit of the projective group of coordinate transformations in which it was placed.
When the incoming video falls into the orbit (as defined by the projectivity + gain group of coordinate transformations [22]) of one of the templates, the comparison switches modes from comparing each incoming frame with all of the templates to comparing each incoming frame with just the one that has been identified as being in the orbit.
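A minimal sketch of the homographic attachment described above, written with OpenCV for brevity (which is not what the original system used): point correspondences between a stored reference view of the planar surface and the incoming frame yield a homography, and warping the virtual note through that homography makes it appear rigidly attached to the plane.

```python
import cv2
import numpy as np

# Sketch only: attach a virtual note to a planar surface via a homography.
# ref_img is a stored reference view of the plane; note_corners_ref gives the
# note's four corners (top-left, top-right, bottom-right, bottom-left) in that
# reference view; note_img is assumed to be a color (H, W, 3) image.

def overlay_note(frame, ref_img, note_img, note_corners_ref):
    g_ref = cv2.cvtColor(ref_img, cv2.COLOR_BGR2GRAY)
    g_frm = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    orb = cv2.ORB_create()
    k1, d1 = orb.detectAndCompute(g_ref, None)
    k2, d2 = orb.detectAndCompute(g_frm, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)   # reference -> frame

    h, w = note_img.shape[:2]
    note_src = np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2)
    ref_dst = np.float32(note_corners_ref).reshape(-1, 1, 2)
    H_note = cv2.getPerspectiveTransform(note_src, ref_dst)  # note -> reference
    warped = cv2.warpPerspective(note_img, H @ H_note,
                                 (frame.shape[1], frame.shape[0]))
    mask = warped.sum(axis=2) > 0          # paste the note where it has content
    out = frame.copy()
    out[mask] = warped[mask]
    return out
```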
Personal imaging suggests that the boundaries between seeing and viewing, and between remembering and recording will blur. Shared visual memory will begin to enlarge the scope of what the visual memory currently provides, for it may be possible to `remember' something or someone that one never saw.
Personal imaging has evolved beyond being a useful photographer's assistant, toward new paradigms in cinematography, wearable tetherless computer-mediated reality, and a graphical user interface on reality.
Special thanks are due to Rosalind Picard for suggesting that I write this detailed historical account of my `personal imaging' efforts and experiences, and to Steven Feiner for much help in getting it all organized; both Picard and Feiner were instrumental in causing this 20-year effort to come together in this document. I'd also like to thank Hiroshi Ishii, Neil Gershenfeld, Sandy Pentland, and Ted Adelson of MIT, as well as Chuck Carter and Simon Haykin of McMaster University, and many others, too numerous to mention here, for various insightful discussions germane to this work. Thanks also to Thad Starner, Jeffrey Levine, Flavia Sparacino, and Ken Russell for various collaborative efforts, to Matt Reynolds (KB2ACE) for help in upgrading the outbound ATV system, to Steve Roberts (N4RVE) for much useful feedback and suggestions, and to Kent Nickerson for help with much of the earlier tone-decoders, radar systems, and the like. Additional thanks to VirtualVision, HP Labs, Compaq, Kopin, Colorlink, Ed Gritz, Miyota, Chuck Carter, and Thought Technologies Limited for lending or donating additional equipment that made these experiments possible. Finally, I thank my brother, Richard, for long and detailed discussions from which the term `personal imaging' and its general framework emerged.