T2RERC  

Forum Proceedings

Stakeholder Forum on Communication Enhancement

Input Technologies: White Paper

 

Technology Area

Numerous consumers, researchers, and manufacturers identified improved input technology as a high-priority need in the Augmentative and Alternative Communication (AAC) device industry. The characteristics and capabilities of these technologies are a critical determinant of communication rate and clarity. The need for appropriate input technology is especially important for persons with severe disabilities, specifically AAC users with limited motoric abilities, some of whom find it difficult to use augmentative communication devices with current input technology. Improved input technology would enhance interpersonal interactions, naturalness, and ease of communication. Factors to consider when designing improved input systems include the user, the device, the method of inputting information, and the environment in which the device is used. Input performance affects both the training of the device user by a Speech Language Pathologist (SLP) and day-to-day use by the augmented communicator.

Market Needs

People with a wide variety of physical conditions benefit from the use of AAC devices. The needs of this population vary widely based on condition and capabilities. Numerous communication devices are currently available to meet these needs. The number and variety of these devices continue to grow as computer technology becomes smaller and less expensive. While the device technology continues to evolve, input technology has remained relatively simple. Most AAC users employ some sort of pointing (physical pressure or non-contact pointing) or switch technology to interface with their device. [1]

For augmented communicators with full use of their hand(s), input methods are not an issue. They are able to use a keyboard, mouse, touch screen, or a combination of these devices. In general, as the motor capabilities of an augmented communicator decrease or become unreliable, their input options also decrease. Many of the input devices available to people with limited motor control use relatively simple technology. For example, 4% of augmented communicators use head wands to perform direct selection [2]. These devices can be cumbersome and have the potential to cause repetitive stress injuries. Significant opportunities exist to use new technology to improve the quality and performance of the input interface for this population.

A number of technologies have potential for improving the input interface to an AAC device. These technologies may be used by themselves or in combination to allow the user to have faster and more accurate input into their device. Promising technology includes:

  • Dysarthric Speech Recognition
  • Gesture Recognition
  • Eye Gaze or Eye Tracking
  • Virtual Reality Controls (e.g. virtual reality gloves)
  • Neurophysiological Controls (e.g. brain wave controls)
  • Multi-modal Input

Many AAC users would benefit from dysarthric speech recognition technology. Commercially available speech recognition technology is designed for people with no speech impairments, although some of these systems may be able to recognize the speech of individuals with mild impairments. [3] Dysarthric speech poses a unique challenge for speech recognition systems that may be overcome by a combination of improved hardware and software. The population that would benefit from these improvements is relatively small, so to this point very few commercial resources have been committed to this effort.

The population that would benefit from the other input methods depends largely on the quality of the technology. High-quality technology could have applications for the general population as well as AAC users. For example, high-quality eye gaze or speech recognition technologies may be used to make selections on a personal digital assistant (PDA) such as a Palm Pilot. If these applications were widely used in the general population, the quality of such technology would continue to improve while the price would decrease. The AAC population that will benefit from new input technology will be determined by the functionality of the device and the ease with which the technology can be adapted by clinicians to meet the needs of the AAC user.

State-of-the-Practice

The means by which augmented communicators provide input to an AAC system is broadly divided into direct access (e.g. keyboard, mouse, touch screen, joystick, infrared head pointer, etc.) and indirect access methods (e.g. switch based scanning). In general, augmented communicators with good strength, stamina and precision are more likely to use direct access interfaces. Augmented communicators with more impaired motoric capabilities are more likely to use electronically assisted direct access interfaces or indirect, switch-based scanning interfaces. Whatever interface is being employed, the augmented communicator uses input to select from among items on the AAC display.

AAC system displays are static, dynamic or mixed (some characteristics of both static and dynamic displays). For static displays the user selects from among a fixed set of items. Static displays typically offer from as few as 4 to more than 128 items from which to select. AAC devices with static displays often employ static display overlays that can be replaced. For dynamic displays, selection choices change continuously as the user inputs and selects items. The user often navigates through pages of displayed items. Selection of a page item can result (for example) in a new page being displayed (branching), a phrase or sentence being further constructed or voice output being produced.
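The three outcomes of a selection on a dynamic display (branching, message construction, voice output) can be illustrated with a small sketch. The page layout, item names and the `run_selections` function below are hypothetical and are not taken from any particular AAC device.

```python
# Hypothetical page layout: each displayed item maps to an (action, value) pair.
PAGES = {
    "home": {
        "food": ("branch", "food"),
        "I want": ("append", "I want"),
        "speak": ("speak", None),
    },
    "food": {
        "pizza": ("append", "pizza"),
        "back": ("branch", "home"),
    },
}


def run_selections(selections, start_page="home"):
    """Apply a sequence of item selections; return (final page, spoken messages)."""
    page, message, spoken = start_page, [], []
    for item in selections:
        action, value = PAGES[page][item]
        if action == "branch":      # display a different page
            page = value
        elif action == "append":    # extend the message under construction
            message.append(value)
        elif action == "speak":     # produce output and start a new message
            spoken.append(" ".join(message))
            message = []
    return page, spoken
```

Here `run_selections(["I want", "food", "pizza", "back", "speak"])` ends on the "home" page with the single spoken message "I want pizza".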

Overall rate of communication is dependent upon 1) the rate at which items are selected, 2) the rate at which the AAC device transforms these inputs into acceptable linguistic units (letters, words, phrases, sentences) and 3) the rate at which voice output is produced. Increasing the selection rate will improve overall communication rate. Input technologies that are easy to use and allow the user to maintain a comfortable body posture will reduce fatigue and strain. Input technologies that allow the augmented communicator to maintain eye contact with the interlocutor will provide non-verbal communication cues and improve the quality and naturalness of communication.
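As a rough illustration of how the three rates combine, the sketch below assumes that selection, processing and voice output happen one after another for each message. That serial model, the function name and the example numbers are assumptions made for illustration; they are not measurements from the literature.

```python
def words_per_minute(words, selections, selections_per_min,
                     processing_sec, speech_words_per_min):
    """Overall rate for one message, assuming the three stages run in sequence."""
    select_min = selections / selections_per_min   # time spent selecting items
    process_min = processing_sec / 60.0            # time the device spends transforming input
    speak_min = words / speech_words_per_min       # time spent producing voice output
    return words / (select_min + process_min + speak_min)


# A 6-word message built from 12 selections, spoken at 120 words per minute.
slow = words_per_minute(6, 12, selections_per_min=12,
                        processing_sec=0, speech_words_per_min=120)
fast = words_per_minute(6, 12, selections_per_min=24,
                        processing_sec=0, speech_words_per_min=120)
```

Under these assumptions, doubling the selection rate raises the overall rate from about 5.7 to about 10.9 words per minute, because selection time dominates the total.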

Selecting an input technology for an augmented communicator should be performance based. That is, given the physical and cognitive capabilities of the augmented communicator, the input technology should optimize such factors as communication rate, ease of use, comfort, and ability to maintain eye contact. In addition, user preferences and the environment in which the device is used (e.g. word boards for home vs. electronic devices for conferences) must always be taken into account. In many cases, an augmented communicator will use different input technologies depending upon their location, physical status (e.g. fatigue) and activity.

Augmented communicators having good strength, precision, range of motion and endurance generally use direct selection technologies such as keyboards, touch screens, mice and joysticks. However, persons with tremor and spasticity (e.g. many individuals with cerebral palsy) often find it very difficult to use either a mouse or a joystick. Most mice and joysticks familiar to augmented communicators are isotonic - controlled by movement, direction and speed under approximately constant force. Isometric joysticks - controlled by directed force without movement - are available for gaming and industrial applications. Early research has compared cursor positioning with isotonic and isometric joysticks by persons with cerebral palsy. This research suggests that for some persons, the use of isometric joysticks can improve cursor-positioning performance [4,5].

Other direct selection technologies such as infrared head pointers and eye gaze systems are sometimes employed by augmented communicators who lack the range of motion, endurance, or strength required to use more physically demanding input technologies operated with the hands, feet, and/or head (e.g. a head wand). In addition, eye gaze or infrared systems may be considered more socially acceptable to use in group situations.

Eye gaze systems are constantly being modified and improved. The incorporation of eye gaze systems allows augmented communicators to access their AAC device using discrete eye movements. Eye gaze systems can employ galvanometric sensors (which measure voltages across the eye) or video image processors that examine optical images of the eye. [6]

Eye gaze systems are broadly divided into two categories: head mounted and remote. Remote systems are easier to use and less obtrusive because nothing needs to be attached to the user; remotely mounted cameras measure eye movement. Eye gaze systems generally work by directing an infrared light at the eye, centered on the surface of the cornea, which creates a reflection off the retina. The camera records this reflection, and the computer calculates the person's gaze point in relation to the display screen. This information is then used to access the device through a human-computer interaction system. [6] Systems are becoming more refined: they can be accurate to within 1 cm, can locate the eye 60 times a second, and can interface with other commercially available computer software. [7]

Most infrared head pointer systems have an infrared transmitter whose light is bounced off a reflective dot worn on the forehead of the user. The reflected light is tracked by a set of infrared receivers. Information from these receivers is used to triangulate the head orientation relative to the display and to establish cursor position. The relationship between the user's head orientation and display cursor position is typically established by having the user move their head left, right, up and down to (or beyond) the display's border. Other technologies use ultrasonic sound in a similar fashion.
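The calibration step described above can be sketched as a simple linear mapping. The example assumes that the receivers have already been reduced to a yaw and a pitch reading and that the mapping between the calibrated extremes is linear; the function and parameter names are illustrative and do not describe any specific product.

```python
def make_cursor_mapper(left, right, up, down, width, height):
    """Return a function mapping (yaw, pitch) readings to (x, y) pixel positions.

    left/right are the yaw readings and up/down the pitch readings recorded
    while the user faced each display border during calibration.
    """
    def to_pixel(value, lo, hi, size):
        fraction = (value - lo) / (hi - lo)
        fraction = min(1.0, max(0.0, fraction))   # clamp readings beyond the border
        return round(fraction * (size - 1))

    def mapper(yaw, pitch):
        return (to_pixel(yaw, left, right, width),
                to_pixel(pitch, up, down, height))

    return mapper


# Hypothetical calibration readings (degrees) for an 800 x 600 display.
mapper = make_cursor_mapper(left=-20, right=20, up=15, down=-15,
                            width=800, height=600)
```

Readings taken beyond the display border are clamped to the nearest edge, which matches the "to (or beyond)" calibration procedure: `mapper(40, 0)` places the cursor in the right-most pixel column.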

The signal produced by an infrared head pointer mimics the signal produced by a standard mouse, so pointer systems can provide full access to the graphical user interface (GUI) of a personal computer or compatible AAC device. For example, slow directed head movement might be used to position the cursor while a quick up-down nod might signify a "right mouse click." More commonly, however, the infrared pointer is used to position the cursor and a left-right push button emulates the mouse buttons. Infrared head pointer systems typically connect through a serial or USB port. The sun is a powerful source of infrared light that can overwhelm the signal from the infrared transmitter. For this reason infrared systems generally cannot be used in daylight. [8]

Indirect selection methods such as single-switch scanning, auditory scanning, and message encoding (e.g. Morse code, chart-based, etc.) are often used by augmented communicators who have severe functional limitations (e.g. limited range of motion, precision, endurance, or strength). In single-switch scanning, the device scans items on a display screen and the user selects the desired item using a single switch. This method of input can be tedious and slow for an augmented communicator.

Encoding allows users, by selecting display items in an appropriate sequence, to generate a large number of stored messages from just a few "hits", access longer messages and enhance their rates of communication. Some form of encoding is available across all AAC device categories. An important clinical indicator for language encoding is the user's ability to learn (and recall) the codes. Clinical indicators for encoding are determined through the AAC assessment process and relate to the type and number of codes to be memorized. Encoding methods are generally implemented in three ways: memory-based, chart-based, and display-based. Memory-based coding requires the user to memorize the codes for the communication device, which may become very difficult. Chart-based coding eliminates the need for memorization by listing the codes on the AAC display. In display-based encoding, the display presents possible items from which to choose. The user responds to a display rather than sending a code that is memorized or selected from a chart. [9]
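The idea of generating stored messages from a few "hits" can be sketched as a lookup table. The two-selection codes and the messages below are invented for illustration; real devices define their own codes and vocabulary.

```python
# Hypothetical code table: short selection sequences mapped to stored messages.
CODES = {
    ("H", "H"): "Hello, how are you?",
    ("T", "Y"): "Thank you very much.",
    ("N", "H"): "I need help, please.",
}


def expand(selections, code_length=2):
    """Split the selections into fixed-length codes and look up each message."""
    messages = []
    for i in range(0, len(selections), code_length):
        code = tuple(selections[i:i + code_length])
        # An unknown code is spelled out as selected rather than discarded.
        messages.append(CODES.get(code, "".join(code)))
    return messages
```

With this table, four selections (`["H", "H", "T", "Y"]`) produce two complete sentences, which is where the gain in communication rate comes from. The cost is that the user must recall, or look up on a chart, which code stands for which message.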

Auditory scanning is a special technique for communicating with people who are unable to speak and who also have severely limited visual and motor skills. In auditory scanning, a list of vocabulary items is presented auditorily (e.g., read aloud) to the person with a disability. [10]

Morse code, one method of coding, is an international symbol system that represents letters, numbers, and punctuation marks using a series of dots and dashes. Morse code is becoming an increasingly popular input method for individuals with disabilities because it can be very fast. [11]
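A minimal decoding sketch follows. It assumes the dots and dashes have already been captured as text, with a space between letters and " / " between words; how a given device captures and separates them is not covered here. Only the 26 letters are included.

```python
# International Morse code for the letters A-Z.
MORSE = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E",
    "..-.": "F", "--.": "G", "....": "H", "..": "I", ".---": "J",
    "-.-": "K", ".-..": "L", "--": "M", "-.": "N", "---": "O",
    ".--.": "P", "--.-": "Q", ".-.": "R", "...": "S", "-": "T",
    "..-": "U", "...-": "V", ".--": "W", "-..-": "X", "-.--": "Y",
    "--..": "Z",
}


def decode(signal):
    """Decode a Morse string; unknown codes are shown as '?'."""
    words = []
    for word in signal.strip().split(" / "):
        words.append("".join(MORSE.get(code, "?") for code in word.split()))
    return " ".join(words)
```

For example, `decode(".... . .-.. .-.. --- / .- .- -.-.")` returns "HELLO AAC".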

In "linear scanning" and "circular scanning," items in the selection set are presented to the user one at a time. In linear scanning, once the user selects an item, scanning begins again with the first item. In circular scanning, scanning instead continues with the next item in the scan sequence. Linear and circular scanning are the slowest of all possible scanning methods; therefore, they are primarily used for training, for assessment, by individuals with a limited vocabulary, or by those who need more processing time. [10]

"Multi-dimensional (group-item) scanning" was developed to increase selection rate. In group-item scanning, there are several items in each group and several groups to select from. The groups are presented one at a time and the user selects the group having the desired item. The items within this group are then scanned one-at-a-time until the user selects the desired item. The most common type of group-item scanning is row-column (so-called two-dimensional) scanning. In row-column scanning items are arranged in rows and columns. Rows are presented one-at-a-time until the user selects the row with the desired item. The row items are then scanned until the desired item is selected. [10]
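The gain in selection rate can be estimated by counting scan steps. The sketch below assumes that every highlighted item or row costs one step, that scanning always starts from the first row or item, and that all items are equally likely to be wanted; these are simplifying assumptions, not properties of any particular device.

```python
def linear_steps(row, col, cols):
    """Scan steps needed to reach an item when items are presented one at a time."""
    return row * cols + col + 1


def row_column_steps(row, col):
    """Scan steps to reach the row, plus steps to reach the item within that row."""
    return (row + 1) + (col + 1)


def average_steps(rows, cols):
    """Average steps per selection for both methods over a rows x cols grid."""
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    linear = sum(linear_steps(r, c, cols) for r, c in cells) / len(cells)
    grouped = sum(row_column_steps(r, c) for r, c in cells) / len(cells)
    return linear, grouped


print(average_steps(8, 8))   # (32.5, 9.0)
```

For a 64-item display arranged as 8 rows by 8 columns, item-by-item scanning averages 32.5 steps per selection, while row-column scanning averages 9.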

New input technologies for AAC devices include: neurophysiological signals, virtual reality, dysarthric speech recognition, gesture recognition, and multi-modal input.

Input technologies can utilize biological signals from a combination of eye movements, facial muscle movements, and brain wave bio-potentials. These signals are amplified, digitized, and translated into commands that can be used to control the computer. [12] Signals produced by the brain, nervous system and muscles can thus be used to control computers and AAC devices. Examples include electroencephalographic (EEG) signals produced by brain activity, electrooculographic (EOG) signals produced during eye movement, and electromyographic (EMG) signals produced during muscle contraction and relaxation. These signals are used to collect specific information from various body-brain systems simultaneously for control and command of the device. The brain wave biopotentials can be used for discrete on/off control of program commands, switch closures, keyboard commands, and the function of left and/or right mouse buttons. [12]
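One simple way to turn a digitized signal into discrete on/off control is to treat each rise above a threshold as a switch closure. The sketch below illustrates that general idea only; it is not a description of how any cited system processes biopotentials, and the sample values and threshold are made up.

```python
def switch_closures(samples, threshold):
    """Return the sample indices at which the signal rises above the threshold."""
    closures = []
    above = False
    for index, value in enumerate(samples):
        if value >= threshold and not above:
            closures.append(index)   # rising edge: treat as one switch closure
        above = value >= threshold
    return closures


# Hypothetical amplified and digitized samples containing two bursts of activity.
samples = [0.1, 0.2, 0.9, 1.0, 0.3, 0.1, 0.8, 0.2]
closures = switch_closures(samples, threshold=0.5)   # [2, 6]
```

Counting only the rising edge means that a sustained burst registers as a single closure rather than as one closure per sample.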

Virtual reality technologies (e.g. virtual reality gloves, eyeglasses, etc.) are being used today in the design of web sites, for training purposes, for computer access, etc. These systems have not yet been used in conjunction with AAC devices as a system for input. They hold a great deal of potential in terms of the discrete access and feedback that they can provide for an AAC user. Virtual reality gloves are electro-mechanical devices that translate digital information into physical sensations. Various types of virtual reality gloves have been developed to perform a variety of functions, such as transforming hand and finger motions into real-time digital data (e.g. Cyberglove), providing vibro-tactile feedback (e.g. Cybertouch), and providing force feedback that uses human-computer interface technology to let the hand literally reach for and grab computer-generated objects (e.g. Cybergrasp). [13] The inclusion of various types of virtual reality gloves in AAC systems would provide a user with new ways to input information into their device, using a more natural method of access via the hand.

Research is underway to determine the effectiveness of dysarthric speech recognition systems. These systems would give a dysarthric speaker the ability to use their own voice to access an AAC device. The device would recognize the user's voice and process the input, and a speech output device would then convey the message. The ENABL system is one example of a device that is being used for dysarthric speech recognition. Processing begins when the system detects a spoken command. The command is first analyzed by the speech recognition module, which uses acoustic models, a grammar and a lexicon (vocabulary). This information is then fed to an output recognizer and parser (a program that breaks its input into component parts so that it can be interpreted), which translates the command. [14]

Other dysarthric speech recognition research focuses on creating teachable interfaces for individuals with dysarthric speech and other severe physical disabilities, which would be capable of translating unintelligible vocalizations into effective actions or clearly articulated synthesized speech (e.g. Toco the Toucan). Current systems that use speech recognition may be adapted to dysarthric speech by incorporating an adaptation process based upon a repertoire of dysarthric speakers' voices organized by severity and type of dysarthria. Further research and technology development are needed so that this approach can be incorporated into advanced input interfaces for AAC devices. This technology stands to benefit not just dysarthric speakers but also people with other communication impairments. [3]

The development of gesture recognition systems in AAC devices is being explored and has recently gained popularity. It is assumed that this method of device input is useful since gestures are readily used as a method of communication, which makes them both efficient and intuitive for the AAC user. Gesture recognition devices can provide access to a variety of computer technologies, such as PC programs, the Internet, etc. Various types of gestures (continuous vs. discrete) can be used and read by the devices (e.g. head gestures, hand gestures), which in turn are translated by the system for device access. Results from research show that with a short training time gesture recognition rates are 80%-90%. The Head Gesture Recognition System (HGRS) and JesterMouse are two examples of gesture recognition devices that have been developed for individuals with disabilities to use in order to access computer software technology. [15,16]

Multi-modal technology development has emerged in the field of AAC in order to address the needs of a variety of users. Multi-modal systems would provide a user with numerous methods of input for their device interfaces. In addition, the incorporation of wireless technologies into multi-modal access systems would increase the number of additional input systems that the user could access and the ease of accessing them. A combination of several input systems such as speech recognition, gesture recognition, eye gaze, infrared, etc. can be used to access AAC devices. This would allow an AAC user to modify their device access based on environment (noise level), context (classroom, home), and access needs in order to improve the rate and efficiency of communication.

Researchers are looking to combine multiple simultaneous gestural inputs with other access techniques (e.g. voice input, switches) in order to improve device input and access for the user. [15] The combination of direct manipulation plus speech is intended to use the strengths of one modality to overcome the weaknesses of the other. [17]

Systems are being developed (e.g. Stanford University's Archimedes Project) that use multi-modal input to address two crucial access problems: 1) the problem of a particular individual's access to one computer, and 2) the problem of that individual accessing any computer. The system created is called the Total Access System (TAS). This system consists of two main components, the Personal Accessor and the Total Access Port (TAP). Personal Accessors vary from person to person according to the user's abilities and preferences. TAPs link the Personal Accessor to any host computer that the user wants to work on. The Accessor can serve as a communication aid for face-to-face conversation by transferring the user's inputs to an output device such as a speech synthesizer or by connecting directly with another Accessor used by a conversational participant. [18] Programs such as the Archimedes Project are creating multi-modal input systems in which one input modality can compensate for another, with additional benefits in speed, efficiency, and naturalness of communication for the user.

Issues to Consider

The Need

  • What are the important, unmet (or poorly met) user needs related to input interfaces for AAC devices?
  • What populations or demographics (severity of communication impairment, type of disability, level of cognition) are most affected regarding advanced input interface technology use for AAC devices?
  • In which environments would advanced input interfaces be beneficial for the AAC user?

State-of-the-Practice

  • What types of input systems can best benefit AAC users in their communicative interactions (infrared, eye gaze, gesture recognition, etc.)?
  • What are the strengths of advanced input interfaces in AAC devices in terms of performance, cost, etc.?
  • What are the weaknesses of advanced input interfaces in AAC devices in terms of performance, cost, etc.?

Future Technology and Products

  • What type of technology needs to be developed or implemented into AAC devices to improve input interfaces?
  • What technical barriers must be overcome in order to incorporate advanced input interfaces into AAC devices?
  • What breakthrough technologies that are currently not on the market might better address the needs and problems for input interfaces?

References

  1. Beukelman, David R. & Mirenda, Pat (1998). Augmentative and Alternative Communication. 2nd Edition. Paul H. Brookes Publishing Co. Baltimore MD
  2. Murphy, Joan et al. "Augmentative and Alternative Communication Systems Used by People with Cerebral Palsy in Scotland: Demographic Survey" AAC Augmentative and Alternative Communication Volume 11, March 1995
  3. Patel, Rupal & Roy, Deb. Teachable Interfaces for Individuals with Dysarthric Speech and Severe Physical Disabilities. [Online: http://dkroy.www.media.mit.edu/people/dkroy/papers/pdf/aaai_assistive98.pdf]
  4. T.A. Stapleford, R.M. Mahoney, "Improvement in Computer Cursor Positioning Performance for People with Cerebral Palsy," RESNA'97 June 20-24, 1997, pages 321-323.
  5. R.S. Rao, S. Rami, T. Rahman, P. Benvunuto, "Evaluation of an Isometric Joystick as an Interface Device for Children with CP," RESNA'97 June 20-24, 1997, pages 327-329.
  6. Cleveland, Nancy. (1994) Eyegaze Human-Computer Interface for People with Disabilities. [Online: www.eyegaze.com/doc.cathuniv.htm]
  7. Department of Systems Engineering at the University of Virginia. (5/4/01) [Online: http://www.sys.virginia.edu/research/erica.html]
  8. Boost Technology. (2001) The Boost Tracer. [online]. Available: http://www.boosttechnology.com/home.html. (April 27, 2001)
  9. Wright State University College of Engineering and Computer Science. (1999). Selection Systems. [online]. Available: http://www.cs.wright.edu/bie/rehabengr/AAC/selectmethod.htm. (January 24, 2001)
  10. Penn State, School of Education, Department of Educational and School Psychology and Special Education. (1999). Auditory Scanning & Alternative and Augmentative Communication. [online]. Available: http://espse.ed.psu.edu/SPLED/McN/auditoryscanning/home.html. (February 1, 2001)
  11. Wright State University College of Engineering and Computer Science. (1999). Selection Set. [online] Available: http://www.cs.wright.edu/bie/rehabengr/AAC/selectionset.htm. (January 24, 2001)
  12. Brain Actuated Technologies, Inc. (5/4/01) [Online: http://www.brainfingers.com/featured_research.htm]
  13. Immersion. (5/4/01) [Online: http://www.immersion.com/products.html]
  14. Rosengren, Elisabet. (2000). "Perceptual analysis of dysarthric speech in the ENABL project" TMH-QPSR, KTH 1/2000, pgs. 13-18.
  15. Keates & Perricos. Gestures as a Means of Computer Access. (5/4/01) [Online: http://rehab-www.eng.cam.ac.uk/papers/1sk12/cm96/]
  16. Keates, Potter, Perricos, & Robinson. Gesture Recognition- Research and Clinical Perspectives. (5/4/01) [online: http://rehab-www.eng.cam.ac.uk/papers/1sk12/resna97/resnag.htm]
  17. Grasso & Finin. Task Integration in Multimodal Speech Recognition Environments. (5/4/01) [online: http://www.csee.umbc.edu/~mikeg/papers/report02.html]
  18. Stanford University: Archimedes Project. (5/10/01). [online: http://archimedes.stanford.edu//arch.html]
