T2RERC  


Forum Proceedings

Stakeholder Forum on Communication Enhancement

Communication Processing: Forum Data

 


The following is the raw data collected during the T2RERC's Stakeholder Forum. It reflects the comments and needs as expressed by the forum participants.

1. Needs (Unmet needs of consumers, clinicians, etc.)

GENERAL

  • AAC users are extremely diverse: readers/non-readers; spellers/non-spellers; highly literate/acquiring language; cognitively intact/impaired; motor intact/impaired; interest/disinterest in communicating; high personal achievements/isolated and little education/highly educated. User diversity makes the design of AAC devices very difficult.
  • Language (knowledge and understanding of) may not be needed to use an AAC system.
  • Communication (as a general concept) is too broad - should narrow focus down to the specific communication requirements of persons who use computer-based AAC systems, their families and other communication partners.
  • Need to clarify distinction between language (e.g. literacy) and communication skills.
  • AAC user and interlocutor need to maintain eye contact during dialogue. Focusing on the AAC device or having the AAC device physically between the user and interlocutor interferes with communication.
  • Need to be able to follow dialogue and quickly participate in this dialogue. (E.g. it is not natural to follow conversation, break your attention away to compose speech and then output this speech. The discussion may have passed on to another topic or listeners may become distracted and lose their train of thought.)
  • Need AAC system to provide appropriate flexible capabilities across communication situations. In some situations (e.g. social setting with familiar persons) person may augment their speech with the AAC system. In other situations (e.g. formal presentation to an unfamiliar audience) the AAC system may be the dominant means of speech production.
  • Optimal language processing probably needs to include a mix of generative and pre-stored capabilities.
  • Need language processing to be flexible to "do whatever the person needs to do" (e.g. presentations, discussion, etc) quickly (accomplish each task as fast as possible at an appropriate level of precision).
  • Need language processing to prevent problems - shift burden of responsibility from user to device (e.g. for error correction, correcting verb tense, etc).
  • Need language processing tools that allow users to produce speech both quickly and precisely (i.e. In some situations speed is critical. In other situations precision is critical. Therefore, AAC system must support both.).
  • Need AAC system (language processing) that performs well when the user is anxious or under stress - often when there is an urgent need to communicate (e.g. under time pressure, in emergencies).
  • Need to minimize the meta-linguistic processing demand. For example, dynamic displays are not part of "normal" language processing. Ideal AAC device would not require any meta-linguistic processing. User should not have to integrate too much information from the device. If meta-linguistic processing demands are high, user may find device too hard to use and give up.
  • Need to augment and support user abilities (e.g. residual ability to speak, gestures, ability to use hands and fingers).
  • Need to minimize the cognitive loading required to use an input modality (e.g. memory, attention, vigilance, complex reasoning, etc.).
  • Need to optimize language representation/manipulation strategies (i.e. Physical limitations often make it difficult for a user to access the selection set - for example, scanning is generally very slow).
  • Need symbolic representation of language. Some question as to whether communication can be accomplished without the use of symbols (e.g. icons, pictures and text are all 'symbols')
  • Need transparency - the ability to use AAC system "out-of-the-box" and be able to say what you want. Basic capabilities should be available immediately. It should not be required to learn advanced features in order to use the device. It should be a natural progression to learn advanced capabilities when user has the time or need for more advanced capabilities. It was noted: "no one would buy a computer if they had to be familiar with 20% of the features in order to use the computer" and "people often use only a small number of the features available from a word processor."
  • Need high automaticity - that is, AAC systems should be very "natural to use" (i.e. the difference between riding a bike after years of practice versus learning to ride a bike. Includes such things as low vigilance, low precision, low cognitive burden, low stress, etc).
  • Automaticity can cause problems (e.g. if the user is very weak and the system produces communication that requires frequent correction - may increase overall workload and diminish efficiency).
  • Need codependence between AAC system (e.g. input interface, language processing) and user's abilities (e.g. physical, sensory, cognitive) in order to maximize the speed, efficiency and effectiveness of language production.
  • Need language processing that allows the user to transparently switch from one input device (eye gaze, joystick, touch screen, etc) to another (e.g. language processing should perform equally well with any type of input. Note: This does not imply that the rate of selection is unchanged.)

LANGUAGE STORAGE AND RETRIEVAL

  • Need different language processing tools for different situations - a mix of pre-stored and other capabilities are best.
  • Need language processing to adjust or adapt as the child acquires language skills. During the early stages of language acquisition, storage and retrieval strategies that support basic communication may be most appropriate. (Note: may apply for any individual gaining or reacquiring language skills)
  • Need improved storage and retrieval techniques in order for whole language approaches to be most effective.
  • Need to be able to navigate efficiently through pre-stored text in order to mimic natural conversation (i.e. quickly get to speech/language that is used "over and over again").
  • Need to be able to improve and refine (customize) stored speech - way of producing generative speech.
  • Need to be able to efficiently retrieve pre-stored text. (i.e. Use of pre-stored text can speed communication rate provided that stored text can be retrieved efficiently.)
  • Need to explore other approaches for constructing communication. Language processing could generate prototypical sentences with word slots and word alternatives for these slots as selections. Prototypical sentences might be context dependent.
  • Need language processing to have strong generative capabilities - a person can "say whatever they want to say." Language construction should not be constrained.
  • Need to simplify error correction and shift the burden of error correction from the user to the AAC system (e.g. in spelled text, the AAC system should identify the error location and provide a means for the user to quickly get to and correct this error).

CONTEXT RECOGNITION

  • Need language processing to adapt in response to interlocutor speech (i.e. use of speech recognition to establish context) but should not cause the augmented communicator to lose control of conversation.
  • Need language processing to use information from the environmental context to make word prediction more powerful (e.g. the words commonly used in a chemistry lab are likely to be much different than words commonly used around the dinner table).
  • Need language processing to adapt to the contextual information embedded in the user's speech output (e.g. A person may be at a football game but talk about chemistry class. Note: How much dialogue must be tracked in order to establish that you are speaking about chemistry class rather than football?).
  • Word prediction should reflect the AAC user's communication "intent" (e.g. prior word usage, context of communication).
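
The context-dependent word prediction described in the bullets above can be illustrated with a small sketch. It is only one possible reading of the participants' comments, not a description of any existing AAC product. The topic vocabularies, base frequencies, boost factor and function names are all hypothetical: recent words (the user's own or the interlocutor's) vote for a topic, and words belonging to that topic are ranked higher in the prediction list.

```python
# Hypothetical topic vocabularies and base word frequencies (illustrative only).
TOPIC_WORDS = {
    "chemistry": {"beaker", "acid", "base", "lab", "bond"},
    "dinner": {"bread", "butter", "salt", "plate", "bowl"},
}
BASE_FREQ = {"bread": 30, "beaker": 5, "base": 10,
             "bowl": 20, "bond": 8, "butter": 25}


def infer_topic(recent_words):
    """Pick the topic whose vocabulary overlaps most with recent words."""
    scores = {topic: sum(w in vocab for w in recent_words)
              for topic, vocab in TOPIC_WORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None  # None: no usable context


def predict(prefix, recent_words, boost=10.0, n=3):
    """Rank words starting with `prefix`, boosting words from the inferred topic."""
    topic = infer_topic(recent_words)

    def score(word):
        s = BASE_FREQ[word]
        if topic is not None and word in TOPIC_WORDS[topic]:
            s *= boost  # assumed multiplicative boost for on-topic words
        return s

    matches = [w for w in BASE_FREQ if w.startswith(prefix)]
    return sorted(matches, key=score, reverse=True)[:n]


print(predict("b", []))               # no context: ranked by base frequency
print(predict("b", ["acid", "lab"]))  # chemistry context re-ranks the list
```

Note that the topic is inferred from what is said rather than from where the user is, which is one way to handle the "football game but talking about chemistry class" case raised above.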

TRAINING AND SYSTEM CUSTOMIZATION

  • Need ability to adjust and optimize language-processing capabilities easily (e.g. quick, simple, intuitive) and transparently (i.e. requiring little or no attention from the user).
  • Need persons using AAC systems to be able to do more of the setup and optimization themselves (increase independence). Unlikely that system users will be able to do everything themselves and different users will need different levels of support.
  • Need AAC systems (language processing capabilities) to support educational goals (e.g. why not take the specific need of the specific user and recognize that different people need different degrees of support).
  • AAC system capabilities should smoothly transition (ramp up, become available) with the user's ability to utilize more advanced capabilities.
  • AAC systems need an input interface that adapts (or can quickly be adapted) to accommodate changing user abilities or preferences.
  • Need language processing capabilities that are easy to learn, use and access through a variety of input interfaces.
  • Need language processing to assist development of capabilities (tools) wanted by the user - rather than telling the user how to use the system (e.g. If a person wants to use codes but has difficulty with recalling them, language processing could provide/suggest strategies to help the user recall codes).
  • Need to evaluate meta-linguistic skills (e.g. logical awareness) during language development. Tools exist to measure these skills, including through observation of the user's ability to tune in to others' meta-linguistic signals (i.e. facial expression, tone of voice and body language).
  • Need language processing to provide smart training for people learning how to use the AAC system (e.g. recognize problems and mistakes; provide feedback and guidance on how to correct these mistakes; perhaps some sort of a wizard that would help you when you get lost).
  • Need training program that teaches language in a stepwise fashion (e.g. "We are giving a programming language to the user and expecting them to learn it. Provide a program that focuses on learning the first 20 nouns.")
  • Need language processing to be user friendly - especially during training (e.g. encourage use, explore capabilities without severe "penalties").
  • Need to provide "natural language assistance" to users - AAC system explains the language to the user (e.g. "verb tense is used to.")
  • Need AAC device to provide a learning environment in which the user can develop his or her communication skills. Device should allow the user to improve old skills or utilize new skills. Direct construction of language (i.e. spell out words) is not possible for many users. Learning might involve non-spelling based skills.

PERFORMANCE MONITORING

  • Need performance monitoring to evaluate the rate and quality of communication and language production.
  • Need active performance monitoring - sense performance and mistakes and adjust (language processing, display) parameters to reduce errors and improve performance.
  • Need language processing to help prevent errors during training (e.g. guide the user by providing reasonable alternatives rather than requiring the user to correct errors after they've already been made).


2. State-of-the-Practice (current technology, strengths, weaknesses, etc.)

GENERAL

  • Few augmented communicators are currently utilizing all of their abilities or realizing the full potential of the AAC technology.
  • AAC users have a wide range of physical skills, cognitive capabilities, language capabilities and meta-linguistic skills.
  • In order to utilize (most) AAC systems a person must: have physical control over some body part (e.g. finger, head, eyes); understand how to act upon the AAC system; find the use of the AAC system interesting or compelling; and have a desire to communicate.
  • The attention required to make selections is related to a user's physical abilities. Fast direct-selectors may require less conscious effort than persons who aren't able to access the device as quickly (e.g. indirect-selectors).
  • Ability to speak may decrease dependency on the AAC system. If a person can communicate with residual speech, gestures, facial expressions, etc., these modes of communication may be more efficient and perhaps preferable to AAC assisted communication.
  • A participant stated that 80% of disabled users also have a learning disability - impacts ability to learn and utilize AAC devices.
  • Two distinct populations use AAC systems: 1) persons who are cognitively intact but have motor impairments, and 2) persons who are motorically intact but have cognitive impairments.
  • The communication partner often "corrects speech errors" - some users accept this "filling-in" as a reasonable compensatory strategy while others prefer to correct errors themselves.
  • AAC devices become more transparent as training, instruction and practice increase.
  • Communication can be linguistic (oral) or non-linguistic (e.g. gestures, motions - point at a picture to get something).
  • Communication includes any systematic form for representing, manipulating, or transferring information. Communication may be conventional or non-conventional.
  • Communication is the ability to convey meaning to establish mutual understanding for some purpose.
  • Communication speed of persons using AAC systems is "correlated" to perceptions of competence by the system user (self-perception), their communication partner(s) and observers.
  • Communication rate is affected by cultural and situational conventions (e.g. slang and dialect specific words won't show up in "standard" word prediction lists which may force person using the AAC system to employ alternative means of composition).
  • Communication quality degrades as the time needed to compose a sentence increases (e.g. the AAC user can lose their train of thought; the communication partner may become distracted or impatient).
  • Communication is the attribution of meaning to messages; is what people interpret a message to mean; is establishing and moving common ground; is a mutual understanding; is an exchange of information between people to serve a purpose (not limited to transactions); is a 2-way understanding (a 1-way message may be interpreted in a way not intended by the person producing the message).
  • Language is the most powerful form of communication.
  • Natural Language is the language that we learned from our parents. It is conventional.
  • Natural language draws upon a lot of knowledge that is not associated with words (e.g. body posture, expressions, context, history, etc.).
  • Literacy is the ability to build complex, complete, logical, detailed and grammatically correct verbal and written dialogue; is the ability to generate, read and comprehend text; may be symbol based rather than "standard" text. (Note: standard text is a type of symbols).
  • People lacking linguistic competence show great variation in their linguistic and non-linguistic skills. Not a uniform problem.
  • Some people lacking linguistic competence can access and use AAC systems based on procedure (visual, spatial, actions) rather than knowledge of and competency using language. People lacking linguistic competency include children, first time learners and adults who have never learned to read or write and do not have an "intact language system". [Note: also includes persons who have lost linguistic competence due to trauma or disease.]
  • Participants disagreed as to whether behaviors and memorized actions are communication. Some participants stated that behaviors are communicative. Other participants stated that behaviors are not communicative and that memorized actions cannot be used to produce generative communication. (E.g. A teacher asks a student to "point to the yellow object." The student points at the correct object and gets rewarded by the teacher "very good, that's correct!" This is an example of behavior. Is it also communication?)
  • Difficult for individuals lacking linguistic competency to learn and utilize AAC systems.
  • It is possible to have speech loss without language loss and vice versa.
  • People lacking linguistic competency sometimes communicate by "circumlocution" - they get to the topic with "signals and clues."
  • Participants disagreed about the relative merits of generative and whole language approaches.
  • Communication focus for children is often less on language and more on engaging the child in social settings.
  • Many rate enhancement tools (e.g. spell checking, grammar checking, punctuation management, text string search) are already part of word processing software.
  • Persons using PC-based systems have access to Thesaurus, but it is not part of AAC device capabilities. Capability is well developed in non-AAC products.
  • Some participants find current rate enhancement tools (e.g. abbreviation expansion, word prediction) to be "mechanical" (i.e. unnatural, uncomfortable). These participants preferred to spell out words.
  • Some participants questioned the effectiveness of current rate enhancement tools (e.g. word prediction can be slower than spelling).
  • Some participants felt that typing out words was the only effective way to "say what you want to say."
  • Some participants thought that abbreviation expansion was not effective. May be a "training issue" (i.e. users are unfamiliar or unpracticed with technique). Overall, abbreviation expansion is "a very simple technique."
  • Many "high-end" AAC users construct communication with a combination of tools that include spelling, word prediction and symbols (E.g. One participant stated that they effectively use word prediction in conjunction with Minspeak to construct communication.)
  • Efficiency of error correction and text editing is limited by the user's selection rate.
  • The Language Sampling Library is a resource for LP research. [1]

MESSAGE CREATION

  • In 'natural communication,' the listener uses their natural language processing capabilities to expand upon and understand a person's telegraphic expression. For speedwriting and abbreviation expansion, the AAC system must expand upon and understand the telegraphic expression.
  • Systems employing telegraphic communication need to find the right words or fewest words to accomplish their communication task. Finding the right or fewest words is difficult when the communication task is constantly changing.
  • Compansion (compression/expansion) is one important area of NLP research. Telegraphic input is expanded into complete phrases. (E.g. VG = very good; I went store = I went to the store yesterday). A variation on compansion is flexible ("on the fly") abbreviation expansion (Enkidu). [Note: Compansion (COMpressed message exPANSION) is a language processing technique in which telegraphic speech ("John Eat Apple") is transformed into well-formed sentences ("John has eaten the apple"). Compansion may support higher communication rates without requiring a great deal of cognitive effort.]
  • Compansion has been used as a writing, therapy or learning tool for different user populations (e.g. expand icons/abbreviations to make phrases).
  • Telegraphic speech may not be appropriate for some situations (e.g. A conference presentation requires complete accurate statements. This is very different than a conversation with a family member at dinner.)
  • Speedwriting - sentences constructed with contractions, abbreviations and missing words (e.g. articles like 'the' and 'a') on the fly and expanded into complete phrases and sentences by system. Might also provide syntactic information (e.g. correct verb tense, plurals).
  • Speedwriting is related to abbreviation expansion - around since the 1980s.
  • Abbreviation expansion is similar in concept to macros used in Microsoft Word
  • Speedwriting might be a better writing tool than conversational tool.
  • Speedwriting might be useful in some conversational contexts - for example, producing comments like "tough, oh!" "I'm sorry, etc." However, this is more retrieval than generation.
  • Speedwriting has a trade-off between precision and speed.
  • Speedwriting must be reviewed and modified by the user (e.g. to correct and modify). There is a cost for the user to read, comprehend, and do this modification. Phrase modification could lose these time gains.
  • Speedwriting "strains working memory." Extra cognitive burden distracts from communication.
  • Some people may find the extra difficulties associated with using speedwriting worth it. Others may not.
  • Speedwriting might require a different kind of interface.
  • AAC systems could have two modes of operation: selection mode (standard interface) and generation mode (e.g. for speedwriting).
  • Disambiguation can be letter-by-letter or word-level, with multiple letters or symbols on each selectable item. For word-level disambiguation, as items are selected the system attempts to predict the target word based upon relative word frequency. Selecting additional items eliminates possible target words. For example, selecting 'a-b-c-d' could mean 'dog' or 'duck,' whereas selecting 'a-b-c-d' followed by 'l-m-n-o' can only mean 'dog'.
  • Disambiguation could combine letters and symbols (icons).
  • Disambiguation now requires that words be spelled correctly (i.e. no error correction - 'someb' would not be recognized as 'something').
  • People can recognize words that are misspelled or incomplete (e.g. one or two letters are wrong or missing) because these words have excess (redundant) information that still differentiates them from other possible words.
  • Disambiguation can be used to reduce the physical distance between selections (important when person using AAC system has limited range of motion).
  • Disambiguation allows bigger keys to be placed on smaller displays with the same selection set, reducing the selection precision required.
  • Disambiguation interface has fewer items to select from and may be easier to memorize.
  • Disambiguation with word prediction or abbreviation expansion can reduce keystrokes.
  • When disambiguation is the input strategy, the user may be forced to choose from amongst viable alternative words (phrases). This takes time and may also confuse the user.
  • Disambiguation is available on some phones.
  • Disambiguation is available on some AAC devices (see Enkidu products).
  • A large disambiguation system can be used as an accessible writing tool for someone who's just learning.
  • In AAC systems employing iconic statements, the user selects an icon sequence and the AAC device completes the sentence. User gets to words through the iconic sequence.
  • Minspeak Application Programs (MAPs) are prearranged vocabulary sets using icon sequences used in some AAC devices rather than programming in vocabulary for each individual user.
  • Keystroke savings are more beneficial to some users (i.e. persons utilizing very slow indirect selection techniques) than others (i.e. persons utilizing fast direct selection techniques)
  • Scan-based selection is very slow and requires the person to follow the cursor position (using visual or auditory cues).
  • Scanning is an exhausting process for persons with ALS (common users of scanning interfaces). Scan-based selection is even more difficult for those with additional auditory problems.
  • Scanning generally utilizes items arranged on a grid (items arranged in rows and columns).
  • Higher dimensional scanning. "Two dimensional scanning" - under each selectable item is another grid of selectable items. "Three dimensional scanning" - under each selectable second layer item is another grid of selectable items.
  • Two and three dimensional scanning place a high cognitive and memory burden on the user.
  • A Forum participant noted that one 'client' memorized 3 levels (items have a grid or list of items under them) with auditory scanning. A great memory burden - but impressive.
  • "Adaptive scanning" might be problematic. (E.g. A 'correct' but low probability item could be 'placed' farther away. For two or three-dimensional scanning, items are found by remembering where they are. How can an item's position be changed in a way that still allows it to be easily found?)
  • Speedwriting with fixed scanning tables may not provide significant benefits because of the increased cognitive and memory burden and the reading and correction required to use this interface.
  • Speedwriting with adaptive scanning tables may add additional cognitive/memory and reading correction burden beyond fixed scanning tables.
  • Research on optimization of word and letter boards for scanning (first done by Zygo).
  • Lots of research for "computers understanding language." AAC systems are a practical application of this research.
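
The word-level disambiguation described in the bullets above can be sketched in a few lines. This is a minimal illustration only, not the method used by any particular phone or AAC device. The key layout, the three-word lexicon and its frequencies are hypothetical; the layout was chosen so that 'd' falls on an 'a-b-c-d' key and 'o' on an 'l-m-n-o' key, matching the 'dog'/'duck' example.

```python
# Hypothetical layout: each selectable item carries several letters.
KEYS = ["abcd", "efgh", "ijk", "lmno", "pqrs", "tuv", "wxyz"]
LETTER_TO_KEY = {ch: i for i, group in enumerate(KEYS) for ch in group}

# Hypothetical lexicon with made-up relative word frequencies.
LEXICON = {"dog": 50, "duck": 20, "fig": 10}


def key_sequence(word):
    """Translate a word into the sequence of key indices that would spell it."""
    return tuple(LETTER_TO_KEY[ch] for ch in word)


def candidates(selected_keys, lexicon=LEXICON):
    """Words consistent with the keys selected so far, most frequent first."""
    prefix = tuple(selected_keys)
    matches = [w for w in lexicon
               if key_sequence(w)[:len(prefix)] == prefix]
    return sorted(matches, key=lambda w: lexicon[w], reverse=True)


print(candidates([0]))     # 'a-b-c-d' selected: 'dog' and 'duck' both remain
print(candidates([0, 3]))  # then 'l-m-n-o': only 'dog' remains
```

With a larger lexicon, several words can share the same key sequence, in which case the user still has to choose among the remaining candidates, as noted above.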

MESSAGE REFINEMENT

  • Real-time message refinement is practically impossible with scanning, Morse code and other switch-based input.
  • Error correction slows down communication rate. The navigational issue is a big part of it, not to mention actually making the change. Getting to the spot where the repair needs to be made would be useful.
  • Some AAC devices provide some support for spelling and grammar checking.

LANGUAGE STORAGE AND RETRIEVAL

  • Pre-stored text (on AAC systems) extends user memory. In natural language, people don't generate novel utterances all the time. They often recall utterances or fragments of utterances for modification and reuse.
  • One participant stated that in social conversation (by any person) a large proportion of overall speech is predictable (e.g. reused, formalisms).
  • Research was conducted on an AAC system with both pre-stored text and generative capabilities. Pre-stored text was used 93% of the time while new text was generated 7% of the time.
  • One participant stated that with current AAC systems, "high-end" users rely on pre-stored vocabulary and sentences less than 10% of the time.
  • One participant stated that AAC systems with pre-stored text have been tried many times and people generally aren't constructing speech using pre-stored text capabilities (exceptions - stylistic or scripted speech).
  • Some value in a limited set of fixed alternative phrases (e.g. "wait a second", "let me think a moment"). This type of communication is non-generative but could help to hold attention, pace and control conversation.
  • Some AAC systems allow the user to store and retrieve speech.
  • Stored text on AAC systems is not systematically organized (e.g. by context, by relationships, etc).
  • Retrieving speech stored on an AAC system is similar to completing a file search on a personal computer.
  • For AAC systems that do support speech storage, the speech doesn't remain in memory for long (i.e. systems frequently "flush stored speech from the buffer").
  • Prediction capabilities of AAC devices are currently limited to letters and words.
  • Current letter and word prediction uses little language processing (e.g. predicting the third letter probability given the occurrence of the first two letters.)
  • Character prediction can be based upon frequency of use (e.g. given the letter sequence 'th' what is the likelihood that the next letter is an 'e' or 'q'?)
  • Word prediction is a (simple) form of text storage and retrieval.
  • Improved word prediction is being developed that utilizes information from recent phrases and conversational context.
  • Word prediction can be based on independent frequency of word usage, word usage given previous 1-3 words, word usage dependent upon discussion content, etc. Prediction reduces keystrokes but doesn't necessarily increase rate (e.g. time is also required to view the word list, decide upon the best word and choose that word. If a word is not present in the list, it must be constructed in some way.)
  • Use of standardized message structures (i.e. canonical forms) could reduce the complexity of the AAC interface and enhance communication rate.
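
The frequency-based character prediction mentioned in the bullets above (the likelihood of the next letter given the previous two) can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, and the one-line training text stands in for the much larger language sample a real system would need.

```python
from collections import Counter, defaultdict


def train_char_model(text, order=2):
    """Count which letter follows each `order`-letter context in the text."""
    model = defaultdict(Counter)
    for i in range(len(text) - order):
        model[text[i:i + order]][text[i + order]] += 1
    return model


def next_letter_probability(model, context, letter):
    """Relative frequency of `letter` after `context` (0.0 if context unseen)."""
    counts = model.get(context)
    if not counts:
        return 0.0
    return counts[letter] / sum(counts.values())


def predict_letters(model, context, n=3):
    """The n most frequent next letters for a context, most likely first."""
    return [ch for ch, _ in model.get(context, Counter()).most_common(n)]


# Toy training text (an assumption for illustration only).
model = train_char_model("the other theme then that")
print(next_letter_probability(model, "th", "e"))
print(next_letter_probability(model, "th", "q"))
print(predict_letters(model, "th"))
```

The same counting approach extends to word prediction by using the previous one to three words as the context. As the participants note, fewer keystrokes do not by themselves guarantee a higher communication rate, since the user still has to read and choose from the list.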

CONTEXT RECOGNITION

  • Combining multiple streams of contextual information (i.e. own speech, interlocutor speech, time, place, etc) may be very useful for persons who have great difficulty producing communication.
  • Context recognition (derived from the body of composed text) may be very useful for literacy (writing) applications where contextual information may change slowly and systematically.
  • Language processing (e.g. recognition and adaptation based upon contextual information) could support phrase and sentence prediction. [Note: Would also improve letter and word prediction.]
  • Conversational context can shift quickly. There may not be enough usable information (no matter how context is obtained) to adapt language processing before the conversation shifts to the next topic.
  • Context recognition has applications besides word prediction (e.g. make different pages available - conditional tree structure, create pages "on the fly" populated with items that reflect the communication context - visit to the doctor's office).
  • Contextual information from interlocutor speech supports real-time adaptation of language processing, and is an important source of contextual information. Research with limited vocabulary has demonstrated this.
  • Interlocutor speech could provide contextual information (e.g. through speech recognition) that would place constraints on language processing (e.g. identify topic and alter word or phrase prediction in response).
  • Speech recognition technology can be used to follow interlocutor speech. Speech recognition is sufficiently advanced to begin achieving this now.
  • Speech recognition performance depends upon environmental factors (e.g. single versus multiple speakers, quiet versus noisy, speaker position and orientation relative to the microphone, speech quality etc) and the speech recognition engine (hardware, software, microphone).
  • Developments in image recognition technology may aid in context recognition (e.g. interlocutor recognition, environmental recognition).
  • The Global Positioning System (GPS) can provide time and location information. Current GPS technology is only suitable outside of buildings (i.e. GPS signal is interfered with by buildings, geographical features, etc).
  • Physical environment (time, place from GPS) provides little information about the language environment (e.g. talking about a hair cut at McDonalds).
  • Research on dysarthric speech recognition is related to both speech recognition to establish context and language processing generally.
  • Dynamic pages on current AAC systems are not context sensitive (i.e. branching and page content does not change with context).
  • Work is taking place in AAC industry to improve the performance of context sensitive word prediction.
  • Participants disagreed as to whether contextual information improves keystroke efficiency on current word-based AAC systems (e.g. context dependent word prediction).
  • Use of contextual information for word or phrase prediction is similar to an email filter where messages are sorted by source, date, etc.
  • AAC systems should utilize microphones to adjust the output sound level to the ambient noise. Car stereos currently adjust volume in this way.
  • Casi-Reader is a proximity card and key reader that has a union point for the complete wiring of microcontrollers, readers, and door controls. The box standardizes and simplifies wiring connections by bringing all reader, microcontroller, door lock, and digital input connections to one point.

VOICE OUTPUT

  • AAC voice output is very unnatural - information carried in "normal" speech (prosody for instance) is missing. Voice output is frequently misinterpreted or misunderstood. AAC voice output does not reflect discourse and pragmatics.
  • Word pronunciation is problematic - even when words are spelled correctly. Some words need to be misspelled in order to be pronounced correctly (e.g. 'horsez' - sounds fine but 'horses' - does not).
  • Some AAC systems support the use of homographic words (i.e. one of two or more words with the same spelling but which differ in origin, meaning, and sometimes pronunciation).
  • Words must often be pronounced differently depending upon their context of use. AAC systems do not recognize contexts and modify word pronunciation.
  • Automatic text-to-speech synthesis that uses word context to determine pronunciation is being employed outside of the AAC industry.
  • Some AAC systems allow the user to add words to their dictionary and customize word pronunciation. Most people currently don't use these capabilities and that's a problem. Requires user to be a programmer and can't be done in "real time."
  • It is possible to add words or change word spelling and pronunciation in the AAC dictionary, but participants stated that it is a heavy burden on clinicians and users.

TRAINING AND SYSTEM CUSTOMIZATION

  • There exists a continuum between being illiterate and being literate.
  • AAC systems do not do a good job of supporting language acquisition. Some systems are trying to be language teachers for restricted language settings.
  • Mainstream education has writing to read programs to support emerging literacy (teaches critical thinking in literary skills, phonemic awareness, writing, composing, etc.)- some children develop that, so there are kids who could get a sound approximation. [4]
  • Communication rate is closely related to the amount of learning (practice) a user has on the AAC device and the amount of instruction provided for the use of this AAC device.
  • Language representation and acquisition with AAC systems does not work very well for 2 to 3 year old children.
  • Children are introduced into the classroom environment too soon (i.e. before they have sufficient mastery of language).
  • Icon-based language is especially difficult to learn (e.g. Multiple-meanings for the same picture. Meaning changes with context in which it appears. User must memorize and understand all this). No effective strategies available to teach icon-based language.
  • Some children acquiring language prefer letter spelling as opposed to use of pictures. "Lingraphica" [5] is an assistive and therapeutic device for persons with aphasia that utilizes a graphical interface map. Lingraphica provides novel, important, and useful support for improving communication in real-life situations. It does this by providing access to an extensive database of more than 2000 word concepts, each of which can display meaning in graphics, text, and voice.
  • Clinicians often find AAC systems too complicated (e.g. clinician focus may be on learning how to set up and optimize an AAC system rather than how to use the AAC system as a clinical intervention).
  • Manufacturers "train" clinicians and users on how to optimize the performance of their (the manufacturer's) systems. Training doesn't generalize from one manufacturer's products to another's.
  • Clinicians use AAC programs/capabilities as provided by the manufacturer. Clinicians often don't know how to customize programs/capabilities in a way they feel is most appropriate for their client. Some participants believed that "this was not the problem of AAC device, but rather a clinician training issue."

PERFORMANCE MONITORING

  • Performance monitoring (e.g. Language Activity Monitor) is a recent innovation in AAC devices (e.g. track keystrokes, page changes, output, etc). LAM has an external switch to automatically send data.
  • AAC systems should support remote performance evaluation (e.g. via modem and the Internet).
  • Language Activity Monitor (LAM) - language activity events yield summary report. Some families are using LAM without clinical intervention. Useful for identifying core versus extended vocabulary. Core vocabulary consistent across (all) environments and times. Useful for optimizing access to different words.
  • Language Activity Monitor (LAM). Tracks language usage events, language context and the time to generate this context (core vocabulary, extended vocabulary, etc). Assists clinical process by monitoring communication as it varies with time, location, age, etc.
  • LAM does not currently assist directly with language generation.
  • LAM could be a valuable clinical and research tool.
  • LAM can be used to flag communication problems (e.g. absence of 'do' family of words, no wh- questions).
  • LAM can identify the types of syntactic structures people are using to communicate.
  • LAM can identify areas of weakness in language generation.
  • LAM can suggest the kind of intervention needed to improve language generation.
  • Communication performance can be predicted based upon a LAM profile.
  • LAM can track communication performance in different situations, activities and environments.
  • LAM information can be used to help optimize device usage (and setup).
  • LAM can provide basic performance data for artificial intelligence and language processing researchers.
  • Performance monitoring is currently available only for higher end devices with synthesized speech.
  • Log files should be encrypted and under the user's control (e.g. user can erase or edit log file). Use of interlocutor speech in the log file also has related privacy concerns.
  • AAC systems should have log files for performance analysis.
  • LAM can provide performance data (Some manufacturers are working with ASHA to develop performance standards for AAC systems.)
  • AAC industry is working towards a standardized data format for performance log files [6]
  • Software (ACQUA) has been developed to analyze performance data. [7]
  • An orthographic language library has been established [1]
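
As a rough illustration of how logged language events might be used to separate core from extended vocabulary, the following Python sketch treats a word as "core" when it appears in every logged environment. The log layout (environment, word pairs), the sample entries and the function name are assumptions made for illustration; they are not the LAM format or the standardized logfile protocol.

```python
from collections import defaultdict

def split_core_extended(events):
    """Split logged words into core (seen in every environment) and extended.

    `events` is an iterable of (environment, word) pairs. This layout is a
    hypothetical stand-in for a real performance log.
    """
    envs_by_word = defaultdict(set)
    all_envs = set()
    for env, word in events:
        envs_by_word[word.lower()].add(env)
        all_envs.add(env)
    # A word counts as core only if it was used in all observed environments.
    core = {w for w, envs in envs_by_word.items() if envs == all_envs}
    extended = set(envs_by_word) - core
    return core, extended

# Hypothetical log entries: (environment, word).
log = [
    ("home", "I"), ("home", "want"), ("home", "juice"),
    ("school", "I"), ("school", "want"), ("school", "recess"),
]
core, extended = split_core_extended(log)
print(sorted(core), sorted(extended))
```

A real analysis would probably also weight words by frequency and time; this sketch only shows the environment-consistency idea mentioned above.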


3. Needed Technology (refinements, innovations, etc.)

GENERAL

  • Need codependence between physical ability and language representation / management strategies (i.e. language representation is "adjusted" to reflect physical ability).
  • Need dynamic codependence - language representation/management adapts as the user's physical abilities change (i.e. through the day, throughout life, following the progression of a disease).
  • Need language processing to be transparently "general purpose" so that it works well across tasks and environments.
  • Effective communication may be carried out with poorly formed sentence fragments. What does the communication partner accept as adequate? What strategies do users employ to achieve a good communication rate? AAC capabilities should take advantage of these strategies.
  • Generative language processing approaches may be more useful for literacy while pre-stored/whole language approaches may be more useful for communication.
  • Need to have generative language capabilities in order to build literacy. Some participants stated that AAC devices based upon "whole language" do not facilitate language acquisition.
  • Need flexible communication and literacy options available to the user for different environments and tasks especially when communicating with unfamiliar communication partners.
  • Language processing capabilities should not introduce delays in communication or literate writing.
  • Need AAC system to provide range of capabilities associated with a PC (e.g. word processing, databases, modems, full Internet access, etc.). Capabilities would support becoming a better writer, enhance marketable computer skills and increase access to communication and knowledge resources.
  • Need AAC system to provide phonebook, organizer and calendar and related capabilities.
  • Need capability of training "story telling" to AAC user.
  • May be possible to construct AAC that are accessed by procedural memory (action sequence) in the spatial or visual domain and map this input to the verbal domain. Unknown how to accomplish this mapping however.
  • An "ideal AAC device" is a complete AI system that knows (recognizes, reflects, is optimized for) you, utilizes context and information (own speech, time, place, interlocutor). For short conversations (or conversations whose topic quickly changes), you will not have enough information to identify the context.

MESSAGE CREATION

  • Disambiguation could reduce input interface complexity.
  • Disambiguation could significantly increase selection rate for many selection techniques (e.g. scanning rates could be '4 times faster' if there were four symbols per item).
  • Disambiguation could have great potential with eye gaze (i.e. It is difficult to select small items with eye gaze. Larger items are easier to select making eye gaze more practical. Enables eye gaze to be utilized with smaller displays. Example of codependence between physical abilities [input technique] and language processing capabilities [disambiguation]).
  • Icon based disambiguation should be investigated (e.g. as icons are selected possible meaning of this icon sequence narrows).
  • Icon-based compansion (now text-based, telegraphic sentence production) should be investigated (short sequence of icons expanded into a complete sentence).
  • Disambiguation interfaces are relatively simple now. They need to have more sophisticated capabilities in order to be attractive to a wider population of persons using AAC systems.
  • Possible language processing application is speedwriting - sentences constructed with contractions, abbreviations and missing words (e.g. articles like 'the' and 'a') on the fly and expanded into complete phrases and sentences by system. Might also provide syntactic information (e.g. correct verb tense, plurals).
  • Need language processing to automatically correct spelling errors, expand contractions, provide punctuation, fill in common omissions, and correct verb tense. The user should be able to see and accept/reject the corrections. Too much burden is placed upon the user to do these things.
  • Need support for message creation (e.g. thesaurus, archive messages, context-based storage and recovery of messages)
  • Need to be able to find synonyms (to improve quality and clarity of speech).
  • Need to have access to a Thesaurus (to improve quality and clarity of speech)
  • Need a way to quickly elaborate on a topic. (E.g. Like a web site - underlined words have "hyperlinks" to word definitions or concept explanations. "Click" on underlined word to access the elaboration. Very useful when interlocutor is not familiar with a topic.).
  • Need language processing to recognize and reflect the communication goal. Word choice and sentence structure should differ depending upon whether a person is asking or answering a question.
  • Need language processing to keep track of a user's conversation and offer help in selecting the next thing to say.
  • Need language processing to recognize your linguistic intent (creating a question, speaking in past tense, etc) and support sentence construction accordingly.
  • Ideally, messages would be created in a way that reflects the concept you are trying to communicate and have the proper syntax (noun, verb, punctuation, etc).
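
The disambiguation idea raised above (several symbols per selectable item, with language processing resolving which word was meant) can be sketched roughly as follows. The key layout, the tiny word list and the function names are all assumptions for illustration and do not describe any particular AAC product.

```python
# Hypothetical layout: each selectable key carries several letters.
KEYS = {
    "1": "abcd", "2": "efgh", "3": "ijkl", "4": "mnop",
    "5": "qrst", "6": "uvwx", "7": "yz",
}
LETTER_TO_KEY = {ch: key for key, letters in KEYS.items() for ch in letters}

def encode(word):
    """Return the key sequence a user would select to enter `word`.

    Assumes the word contains only the letters a-z.
    """
    return "".join(LETTER_TO_KEY[ch] for ch in word.lower())

def candidates(key_sequence, lexicon):
    """Return every lexicon word whose key sequence matches the input."""
    return [w for w in lexicon if encode(w) == key_sequence]

# Hypothetical lexicon; a real system would use a large, ranked vocabulary.
lexicon = ["hat", "eat", "cat", "dog", "help"]
matches = candidates("215", lexicon)
print(matches)
```

Here the sequence 2-1-5 is ambiguous between "hat" and "eat", so the user (or a word-frequency or context model) would still have to choose between them. Fewer, larger targets are traded for an occasional extra selection.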

MESSAGE REFINEMENT

  • Need language processing to support smart semantic editing (e.g. change verb to reflect ownership or past/present tense)
  • Language processing should identify the error location and "take" the user to that location (e.g. eliminate time spent backspacing to the error).
  • Language processing should identify errors and offer alternative selections/corrections (e.g. eliminate less efficient methods of error correction).
  • It should be easy to change sentences midway (e.g. be able to participate in two Forums running simultaneously. It should not be necessary for the user to reformulate the whole message.).
  • Need to adapt to very poor spelling (e.g. letter reversal, dyslexia, idiosyncratic). System should recognize individual spelling (misspelling) tendencies. User should not have to be a great speller in order to use the AAC device.
  • Need support for message refinement (e.g. on-the-fly/real-time spelling error correction, efficient editing of stored messages)
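
One possible way to tolerate poor or idiosyncratic spelling is sketched below: a per-user table of known misspellings is consulted first, then a generic closest-match search from Python's standard `difflib` module. The vocabulary, the personal table and the `suggest` function are hypothetical; this is a sketch under those assumptions, not a description of an existing AAC feature.

```python
import difflib

def suggest(typed, vocabulary, personal_fixes=None, n=3):
    """Offer corrections for a possibly misspelled word.

    `personal_fixes` maps a user's habitual misspellings to intended words
    (a hypothetical per-user table). Anything else falls back to a
    similarity search over the vocabulary.
    """
    typed = typed.lower()
    if personal_fixes and typed in personal_fixes:
        return [personal_fixes[typed]]
    # get_close_matches returns up to n words, most similar first.
    return difflib.get_close_matches(typed, vocabulary, n=n, cutoff=0.6)

vocab = ["friend", "front", "fried", "the", "tea"]
print(suggest("teh", vocab, personal_fixes={"teh": "the"}))
print(suggest("frend", vocab))
```

In keeping with the comments above, suggestions like these would be offered to the user to accept or reject rather than applied silently.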

LANGUAGE STORAGE AND RETRIEVAL

  • Need to have a logged history of what is said and make available for processing/selection during conversation. Should be transparent.
  • Need language processing to reduce, recycle and retrieve frequently used language.
  • Need to be able to store, recall and edit dialogue
  • Need to be able to store speech for later use and refinement.
  • Need to reduce storage demands by eliminating some of the previously stored speech (e.g. as more messages are created and stored it may become more difficult recovering the most appropriate stored speech).
  • Need tracking system for stored speech (e.g. store speech by its context and history of use).
  • Need to be able to efficiently locate and retrieve stored speech (e.g. "on line" knowledge representation that automatically creates hyperlinks to stored speech).
  • Need standard phrases to hold people's attention that can be output while sentence construction is still ongoing (e.g. "Just a moment please")
  • Need smart language editing for stored speech (e.g. "touch a button to change statement from present tense to past tense").

CONTEXT RECOGNITION

  • Need language processing to provide candidate words and phrases to the AAC user based upon the dialogue (e.g. conversational context suggests information or communication needed)
  • Need to combine the various streams of contextual information (e.g. interlocutor speech, own speech, location, activity, time, day, season, etc) in order to have the greatest impact on language processing and communication.
  • Perhaps the user could manually identify context and language processing would adapt based upon this input. How to best use this contextual information is a difficult problem.
  • Interlocutor speech is an important source of contextual information for language processing. There are two 'parts' to the problem. First, speech recognition is improving rapidly but isn't quite good enough for all environments and circumstances. Second, once speech is recognized, need to determine how to employ it for language processing.
  • Speech recognition must be capable of isolating the communication partner(s) from background speech. [Note: Adaptive beam forming microphones have been employed to improve performance of commercial speech recognition software.]
  • Interlocutor speech can be used to predict phrases (e.g. find noun traces in the interlocutor's speech and generate open ended questions based upon these noun traces).
  • Need interlocutor speech recognition to prime AAC devices to provide appropriate responses in certain settings (e.g. provide a person with 4 choices (phrases utterances) to keep them in a conversation. If person is talking about Mary then it may prompt things about Mary.)
  • Need to recognize time and location and adapt language processing (e.g. if you're in the kitchen, in the morning, then breakfast foods are predicted) [Note: participants disagreed on the impact context recognition would have on word prediction performance as measured by keystroke efficiency.]
  • Need language processing to adapt to environments and situations (Child versus adult, at home, at work, weather conditions, time of day).
  • Need language processing to adapt to time of day/week context (e.g. Monday, not a holiday, 11 AM - work dialogue is used).
  • Need to sense ambient noise level through use of microphone and adjust output volume (louder room/car/outdoor setting, louder speech).
  • Need to sense ambient light levels (e.g. have a light sensor for this purpose) and adapt brightness and contrast.
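
The time-and-place example above (kitchen, morning, breakfast foods) might be approximated by boosting the scores of context-tagged words in an otherwise frequency-based prediction list. The word counts, boost factors and context labels below are invented for illustration, and whether such boosting actually improves keystroke efficiency is exactly what participants disagreed on.

```python
def predict(prefix, base_counts, context_boosts, context, n=3):
    """Rank words starting with `prefix`, boosting words tagged for `context`.

    `base_counts` maps word -> usage count; `context_boosts` maps a context
    key -> {word: multiplier}. Both structures are hypothetical.
    """
    boosts = context_boosts.get(context, {})
    scored = {
        word: count * boosts.get(word, 1.0)
        for word, count in base_counts.items()
        if word.startswith(prefix)
    }
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scored, key=lambda w: (-scored[w], w))[:n]

base = {"bread": 5, "bus": 8, "bacon": 3, "book": 6}
boosts = {("kitchen", "morning"): {"bacon": 4.0, "bread": 2.0}}

print(predict("b", base, boosts, ("kitchen", "morning")))
print(predict("b", base, boosts, ("office", "morning")))
```

The context key could equally come from a manual selection by the user, as suggested above, rather than from sensors.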

VOICE OUTPUT

  • Need language processing to control speech synthesis. The interface between the message (language processing) and the speech synthesizer (speech output) needs to be refined.
  • Need language processing to control speech output - insert pauses (e.g. in response to punctuation, separate phrases), control prosody (intensity, first formant, stress) based upon discourse and pragmatics.
  • AAC device should be programmable "on the fly" (e.g. add a word to the dictionary with pronunciation and definition).
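
As a minimal sketch of inserting pauses in response to punctuation, the function below wraps a message in SSML (the W3C Speech Synthesis Markup Language) and adds a `<break>` element after each punctuation mark. The pause durations are arbitrary assumptions, and whether a given AAC synthesizer accepts SSML would need to be checked. Prosody driven by discourse and pragmatics is well beyond this sketch.

```python
import re
from xml.sax.saxutils import escape

# Assumed pause lengths in milliseconds; real values would need tuning.
PAUSE_MS = {",": 200, ";": 300, ".": 500, "?": 500, "!": 500}

def to_ssml(text):
    """Return `text` as SSML with a break inserted after each punctuation mark."""
    parts = re.split(r"([,;.?!])", text)
    out = []
    for part in parts:
        if part in PAUSE_MS:
            out.append(f'{part}<break time="{PAUSE_MS[part]}ms"/>')
        else:
            # Escape &, < and > so the message text stays valid XML.
            out.append(escape(part))
    return "<speak>" + "".join(out) + "</speak>"

ssml = to_ssml("Hello, world.")
print(ssml)
```

Note that this naive split also pauses inside numbers such as "3.5", which is one reason the interface between the message and the synthesizer needs refinement.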

USER TRAINING AND SYSTEM CUSTOMIZATION

  • Need automated (computer-based) testing (standardized performance testing with and without the intervention of AAC system).
  • Users need the ability to easily program AAC device (e.g. set up, modify, optimize, select capabilities)
  • Need service (e.g. repair, upgrades, battery replacement) for AAC devices to be cheap and readily available (i.e. easy to get device to service, quick turnaround).
  • Device should power up when the user is awake or active and power down when the user is sleeping or inactive.
  • Need intelligent language processing that recognizes problems (e.g. in set up, performance, optimization) and provides feedback to correct these problems.
  • Need to shorten the learning curve for practitioners (clinicians) by building in training strategies, tutorials, help capabilities, etc.
  • Need comprehensive training for clinicians and users to be built into the AAC device (e.g. how to set up and optimize the AAC device in order to enhance communication, literacy, education etc.).
  • Need built-in tools (for setup and optimization) beyond preprogrammed pages.
  • Need a "wizard" to help set up, optimize and correct problems (e.g. similar to the wizards that come with many Microsoft products).
  • Need special software "programs" that run on AAC devices to meet specific/targeted needs (e.g. help attain educational goals). AAC devices are general-purpose tools that may not be well suited to these purposes.
  • Need AAC systems to be flexible, support growing communication skills while remaining transparent. Transparency should not be sacrificed in order to achieve flexibility and potential growth.
  • Need AAC systems to provide scaffolding support for language acquisition. Compensate for language deficits as new language skills are acquired.
  • AAC systems should take advantage of the natural progression of literacy skills (e.g. communication based on symbols, symbols and letters, symbols letters and syllables, etc).
  • Language processing should provide a scaffolding to learn icon-based languages (e.g. add icon meanings as person becomes more skilled).
  • Need language processing to provide a smooth transition from simple symbol-based communication to powerful communication languages such as Minspeak (One participant suggested that Minspeak might be taught as a second language in schools).
  • Language processing should provide a learning environment for picture-based languages.
  • "Semantic features" should be grouped together with icons to help persons who can't remember the word they wanted.
  • Need icon based language acquisition to employ icons that "suggest" the word of interest (e.g. thermometer + sun = hot day).
  • Need to be able to recognize and adapt to individual spelling capabilities (e.g. letter reversal, dyslexia for instance).
  • Need alternative information representations for children ("same content, different register")
  • Should consider grounded language learning for AAC applications (learn the meaning of a word by hearing it and simultaneously seeing what's going on).
  • Dream idea: listen to speech and match which icon means what to one person - would be a complete AI system.
  • Need training and instruction to be built into AAC device. Advanced capabilities should be easy to learn. For example, it is easy to pick up the advanced features of Microsoft Word when you need them (e.g. built-in Help functions)
  • A separate display and input interface accessory (or a second AAC device) is needed when teaching an augmentative communicator how to use their AAC system. "[You] don't want to use his device to communicate with him - even without speech output it would be like 'using his mouth to talk to him.'"

SYSTEM MONITORING

  • AAC systems should have log files for performance analysis.
  • The mere action of writing and typing suggests a higher level of formality than occurs in conversation.
  • AAC users spend more time "writing" than producing output, but the conversation is surprisingly natural when you look at a transcript.
  • Performance monitoring could provide the basis for "smart optimization." AAC devices could identify the strengths and weaknesses of individual users and provide custom recommendations.
  • Need AAC system to track user performance (e.g. selection errors, spelling errors) and adapt or adjust features (e.g. display setup, increased word prediction list length) in response to changing user abilities (e.g. decreased physical precision, increased number of spelling errors). Should check with the user to see if they think the changes would be helpful.
  • Need to be able to turn adaptive language processing features ON and OFF (i.e. People may think that machine is doing the thinking. Use "language orthosis" at certain times and not others.).
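
The adaptation loop described above (track errors, propose a change, check with the user, allow the feature to be switched off) could look roughly like this. The threshold, step size and setting name are assumptions chosen for illustration only.

```python
def recommend_adjustment(recent_error_rate, baseline_error_rate,
                         prediction_list_length, adaptive_enabled=True,
                         threshold=1.5, max_length=10):
    """Propose (never apply) a longer word prediction list when errors rise.

    Returns a dict of suggested settings, or None when no change is
    suggested. All parameter values are hypothetical.
    """
    if not adaptive_enabled or baseline_error_rate <= 0:
        return None
    rising = recent_error_rate / baseline_error_rate >= threshold
    if rising and prediction_list_length < max_length:
        new_length = min(prediction_list_length + 2, max_length)
        return {"prediction_list_length": new_length}
    return None

proposal = recommend_adjustment(0.12, 0.05, 5)
print(proposal)
```

The function only returns a proposal, so the device can ask the user whether the change would be helpful before applying it, and `adaptive_enabled=False` corresponds to turning the feature OFF.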


4. Barriers (to obtaining technology, to developing technology, etc.)

  • AAC system (computer) capabilities place limits on the kinds of language processing capabilities that can be offered.
  • Advanced language processing capabilities (e.g. common sense, reasoning ability) will require powerful computer processing capabilities and large memory resources that are not present on current AAC systems.
  • AAC capabilities are limited by hardware and system limitations (e.g. memory, processing speed, speech recognition, etc).
  • A simple, flexible, easy to use AAC system may require very complex and powerful hardware and software capabilities.
  • As systems become more flexible they can also become more difficult to manage (e.g. where do you 'go' when you want to turn off Word processing features?)
  • Now we have simple devices and a lot of features, but we have yet to offer options as to what the tech thinks they can handle. We want all of the features, but we want it transparent. Ex- Mac is an intuitive system
  • "Apple Interface Design Book" [8] is a good source of information on interface design.
  • Context-based language processing capabilities/tools that facilitate writing (literacy) may not be useful for conversation (e.g. when writing, there is a large coherent body of contextual information that may evolve gradually and logically).
  • Communication depends upon many sources of information rather than text (e.g. non-verbal cues, history, intent, environment, etc).
  • Performance monitoring requires information from "outside of the AAC device" (e.g. non-verbal information, interlocutor speech).
  • Speech recognition works well with a trained user in a controlled environment (e.g. PC voice interface) or with a limited vocabulary of untrained speech (e.g. phone and voice portal applications). Better speech recognition technology is needed in order to recognize speech in 'uncontrolled' speech environments.
  • Speech recognition must not only identify words but must also 'understand' how the words were said. Speech recognition technology has not reached this level of sophistication.
  • Better speech recognition technology - a sentence is more than a string of words. Speech recognition technology that is only capable of recognizing individual words loses information.
  • Some participants thought that a great deal of effort would be required in order for contextual information to have a major impact on AAC performance.
  • Language processing is AI-complete: all related problems must be solved (e.g. natural language uses context, history, non-verbal cues, speaker intent, isolates the interlocutor, filters out irrelevant information, etc.). Language processing cannot approximate natural language until it has similar capabilities.
  • It is unclear which of the language processing approaches being pursued by manufacturers have the most potential to improve communication and literacy. Research needs to establish the effectiveness of these approaches before manufacturers will be willing to invest in development and implementation.
  • AAC systems need intensive testing and data gathering (to determine the effectiveness of innovations and suggest further design changes).
  • In order to improve AAC device performance, need better understanding of AAC users and how they are using their technology now. This understanding needs to be based upon quantifiable performance measures.
  • Need to understand why AAC users are only using a small portion of AAC device capabilities. What prevents them from realizing the full potential of the AAC device now?
  • Augmentative communicators currently evaluate AAC systems rather than shaping the design of these systems. Much research utilizes 'able-bodied' individuals rather than augmented communicators.
  • Difficult getting access to large groups of AAC users for testing and evaluation. Perhaps remote testing of these systems is possible.
  • Language processing that intelligently anticipates words, phrases, sentence structure, etc., might bias the choice of vocabulary and may cause the user to lose control of the conversation.
  • Language representation, language processing, rate enhancement etc are hard problems that require basic research.
  • Need to know what it means to 'represent language.'
  • Increasing evidence that representing language in a box has huge limitations and we have a lot of work to do.
  • Language processing research for AAC applications needs to draw upon other sources of knowledge and related disciplines (e.g. psycho-linguistics, Lowen-Lavelle, Herb Clarks, human factor, ergonomics, cognitive and developmental psychology).
  • Need to draw upon work being done in other rapidly evolving fields (e.g. Internet - hyper-linked information, voice portals, etc).

References

  1. AAC Institute. [Online: http://www.aacinstitute.org.]
  2. McCoy K, Pennington C, Luberoff Badman A. "Compansion: From Research Prototype to Practical Integration." Natural Language Engineering, 1998, v 4(1), pages 73-95. [Online: http://www.asel.udel.edu/natlang/nli.html.]
  3. King M, Kushler C, Glover D. "JustType™ - Efficient Communication with Eight Keys." RESNA Proceedings 1995, pages 94-96.
  4. Camden Street School. "WRITE to READ Program." [Online: http://www.nps.k12.nj.us/camden_st/write_to_read.htm]
  5. Steele, Richard. "A Journal from Concept to Commercialization - Lingraphica." OnCenter Technology Transfer News. Issue No. 5. May 1993
  6. AAC-RERC. Communication Performance Assessment (CPA). "Logfile Protocol." [Online: http://www.aac-rerc.org/performance.html#Logfile]
  7. AAC-RERC. Communication Performance Assessment (CPA). "ACQUA." [Online: http://www.aac-rerc.org/performance.html#Logfile]
  8. Apple Interface Design Book.
