The following is the raw data collected during the T2RERC's
Stakeholder Forum. It reflects the comments and needs as expressed by
the forum participants.
1. Needs (Unmet needs of consumers, clinicians, etc.)
GENERAL
- AAC users are extremely diverse: readers/non-readers; spellers/non-spellers;
highly literate/acquiring language; cognitively intact/impaired;
motor intact/impaired; interest/disinterest in communicating; high
personal achievements/isolated and little education/highly educated.
User diversity makes the design of AAC devices very difficult.
- Language (knowledge and understanding of) may not be needed to
use an AAC system.
- Communication (as a general concept) is too broad - should narrow
focus down to the specific communication requirements of persons
who use computer-based AAC systems, their families and other communication
partners.
- Need to clarify distinction between language (e.g. literacy) and
communication skills.
- AAC user and interlocutor need to maintain eye contact during dialogue.
Focusing on the AAC device or having the AAC device physically between
the user and interlocutor interferes with communication.
- Need to be able to follow dialogue and quickly participate in this
dialogue. (E.g. it is not natural to follow conversation, break
your attention away to compose speech and then output this speech.
The discussion may have passed on to another topic or listeners
may become distracted and lose their train of thought.)
- Need AAC system to provide appropriate flexible capabilities across
communication situations. In some situations (e.g. social setting
with familiar persons) person may augment their speech with the
AAC system. In other situations (e.g. formal presentation to an
unfamiliar audience) the AAC system may be the dominant means of
speech production.
- Optimal language processing probably needs to include a mix of
generative and pre-stored capabilities.
- Need language processing to be flexible to "do whatever the person
needs to do" (e.g. presentations, discussion, etc) quickly (accomplish
each task as fast as possible at an appropriate level of precision).
- Need language processing to prevent problems - shift burden of
responsibility from user to device (e.g. for error correction, correcting
verb tense, etc).
- Need language processing tools that allow users to produce speech
both quickly and precisely (i.e. In some situations speed is critical.
In other situations precision is critical. Therefore, AAC system
must support both.).
- Need AAC system (language processing) that performs well when the
user is anxious or under stress - often when there is an urgent
need to communicate (e.g. under time pressure, in emergencies).
- Need to minimize the meta-linguistic processing demand. For example,
dynamic displays are not part of "normal" language processing. Ideal
AAC device would not require any meta-linguistic processing. User
should not have to integrate too much information from the device.
If meta-linguistic processing demands are high, user may find device
too hard to use and give up.
- Need to augment and support user abilities (e.g. residual ability
to speak, gestures, ability to use hands and fingers).
- Need to minimize the cognitive loading required to use an input modality
(e.g. memory, attention, vigilance, complex reasoning, etc.).
- Need to optimize language representation/manipulation strategies
(i.e. Physical limitations often make it difficult for a user to
access the selection set - for example, scanning is generally very
slow).
- Need symbolic representation of language. Some question as to whether
communication can be accomplished without the use of symbols (e.g.
icons, pictures and text are all 'symbols')
- Need transparency - the ability to use AAC system "out-of-the-box"
and be able to say what you want. Basic capabilities should be available
immediately. It should not be required to learn advanced features
in order to use the device. It should be a natural progression to
learn advanced capabilities when user has the time or need for more
advanced capabilities. It was noted: "no one would buy a computer
if they had to be familiar with 20% of the features in order to
use the computer" and "people often use only a small number of the
features available from a word processor."
- Need high automaticity - that is, AAC systems should be very "natural
to use" (i.e. difference between riding a bike after years of practice
versus learning to ride a bike. Includes such things as low vigilance,
low precision, low cognitive burden, low stress, etc).
- Automaticity can cause problems (e.g. if the user is very weak
and the system produces communication that requires frequent correction
- may increase overall workload and diminish efficiency).
- Need codependence between AAC system (e.g. input interface, language
processing) and user's abilities (e.g. physical, sensory, cognitive)
in order to maximize the speed, efficiency and effectiveness of
language production.
- Need language processing that allows the user to transparently
switch from one input device (eye gaze, joystick, touch screen,
etc) to another (e.g. language processing should perform equally
well with any type of input. Note: This does not imply that the
rate of selection is unchanged.)
LANGUAGE STORAGE AND RETRIEVAL
- Need different language processing tools for different situations
- a mix of pre-stored and other capabilities are best.
- Need language processing to adjust or adapt as the child acquires
language skills. During the early stages of language acquisition,
storage and retrieval strategies that support basic communication
may be most appropriate. (Note: may apply for any individual gaining
or reacquiring language skills)
- Need improved storage and retrieval techniques in order for whole
language approaches to be most effective.
- Need to be able to navigate efficiently through pre-stored text
in order to mimic natural conversation (i.e. quickly get to
speech/language that is used "over and over again").
- Need to be able to improve and refine (customize) stored speech
- way of producing generative speech.
- Need to be able to efficiently retrieve pre-stored text. (i.e.
Use of pre-stored text can speed communication rate provided that
stored text can be retrieved efficiently.)
- Need to explore other approaches for constructing communication.
Language processing could generate prototypical sentences with word
slots and word alternatives for these slots as selections. Prototypical
sentences might be context dependent.
- Need language processing to have strong generative capabilities
- a person can "say whatever they want to say." Language construction
should not be constrained.
- Need to simplify error correction and shift the burden
of error correction from the user to the AAC system (e.g. in spelled
text, the AAC system should identify the error location and provide
a means for the user to quickly get to and correct this
error).
CONTEXT RECOGNITION
- Need language processing to adapt in response to interlocutor speech
(i.e. use of speech recognition to establish context) but should
not cause the augmented communicator to lose control of conversation.
- Need language processing to use information from the environmental
context to make word prediction more powerful (e.g. the words commonly
used in a chemistry lab are likely to be much different than words
commonly used around the dinner table).
- Need language processing to adapt to the contextual information
embedded in the user's speech output (e.g. A person may be at a
football game but talk about chemistry class. Note: How much dialogue
must be tracked in order to establish that you are speaking about
chemistry class rather than football?).
- Word prediction should reflect the AAC user's communication "intent" (e.g.
prior word usage, context of communication).
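The idea of using environmental context to strengthen word prediction can be illustrated with a minimal sketch. It assumes the context has already been identified by some other means, and the context names, words and counts below are hypothetical illustration data, not measurements from any AAC system.

```python
# Hypothetical per-context word counts (illustration only, not real usage data).
CONTEXT_COUNTS = {
    "chemistry lab": {"beaker": 9, "burner": 6, "bread": 1},
    "dinner table": {"bread": 8, "butter": 7, "beaker": 1},
}


def predict_in_context(prefix, context, limit=3):
    """Return words starting with `prefix`, most frequent in `context` first."""
    counts = CONTEXT_COUNTS.get(context, {})
    matches = [(count, word) for word, count in counts.items()
               if word.startswith(prefix)]
    # Sort by descending count, then alphabetically to break ties.
    matches.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in matches][:limit]


print(predict_in_context("b", "chemistry lab"))  # ['beaker', 'burner', 'bread']
print(predict_in_context("b", "dinner table"))   # ['bread', 'butter', 'beaker']
```

The same typed letter yields a different ranking in each setting. The sketch does not address the harder question raised above of how much dialogue must be tracked before the context can be trusted.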
TRAINING AND SYSTEM CUSTOMIZATION
- Need ability to adjust and optimize language-processing capabilities
easily (e.g. quick, simple, intuitive) and transparently (i.e. requiring
little or no attention from the user).
- Need persons using AAC systems to be able to do more of the setup
and optimization themselves (increase independence). Unlikely that
system users will be able to do everything themselves and different
users will need different levels of support.
- Need AAC systems (language processing capabilities) to support
educational goals (e.g. why not take the specific need of the specific
user and recognize that different people need different degrees
of support).
- AAC system capabilities should smoothly transition (ramp up, become
available) with the user's ability to utilize more advanced capabilities.
- AAC systems need an input interface that adapts (or can quickly
be adapted) to accommodate changing user abilities or preferences.
- Need language processing capabilities that are easy to learn, use
and access through a variety of input interfaces.
- Need language processing to assist development of capabilities
(tools) wanted by the user - rather than telling the user how to
use the system (e.g. If a person wants to use codes but has difficulty
with recalling them, language processing could provide/suggest strategies
to help the user recall codes).
- Need to evaluate meta-linguistic skills (e.g. logical awareness)
during language development. Tools exist to measure these skills,
including through observation of their ability to tune in to others'
meta-linguistic signals (i.e. facial expression, tone of voice and
body language).
- Need language processing to provide smart training for people learning
how to use the AAC system (e.g. recognize problems and mistakes;
provide feedback and guidance on how to correct these mistakes;
perhaps some sort of a wizard that would help you when you get lost).
- Need training program that teaches language in a stepwise fashion
(e.g. "We are giving a programming language to the user and expecting
them to learn it. Provide a program that focuses on learning the
first 20 nouns.")
- Need language processing to be user friendly - especially during
training (e.g. encourage use, explore capabilities without severe "penalties").
- Need to provide "natural language assistance" to users - AAC system
explains the language to the user (e.g. "verb tense is used to.")
- Need AAC device to provide a learning environment in
which the user can develop his or her communication skills. Device
should allow the user to improve old skills or utilize new skills.
Direct construction of language (i.e. spell out words) is not possible
for many users. Learning might involve non-spelling based skills.
PERFORMANCE MONITORING
- Need performance monitoring to evaluate the rate and quality of
communication and language production.
- Need active performance monitoring - sense performance
and mistakes and adjust (language processing, display) parameters
to reduce errors and improve performance.
- Need language processing
to help prevent errors during training (e.g. guide the user by
providing reasonable alternatives rather than requiring the user
to correct errors after they've already been made).
2. State-of-the-Practice (current technology, strengths, weaknesses, etc.)
GENERAL
- Few augmented communicators are currently utilizing all of their
abilities or realizing the full potential of the AAC technology.
- AAC users have a wide range of physical skills, cognitive capabilities,
language capabilities and meta-linguistic skills.
- In order to utilize (most) AAC systems a person must: have physical
control over some body part (e.g. finger, head, eyes); understand
how to act upon the AAC system; find the use of the AAC system interesting
or compelling; and have a desire to communicate.
- The attention required to make selections is related to a user's
physical abilities. Fast direct-selectors may require less conscious
effort than persons who aren't able to access the device as quickly
(e.g. indirect-selectors).
- Ability to speak may decrease dependency on the AAC system. If
a person can communicate with residual speech, gestures, facial
expressions, etc., these modes of communication may be more efficient
and perhaps preferable to AAC assisted communication.
- A participant stated that 80% of disabled users also have a learning
disability - impacts ability to learn and utilize AAC devices.
- Two distinct populations use AAC systems: 1) persons who are cognitively
intact but have motor impairments, and 2) persons who are motorically
intact but have cognitive impairments.
- The communication partner often "corrects speech errors" - some
users accept this "filling-in" as a reasonable
compensatory strategy while others prefer to correct
errors themselves.
- AAC devices become more transparent as training, instruction and
practice increase.
- Communication can be linguistic (oral) or non-linguistic (e.g.
gestures, motions - point at a picture to get something).
- Communication includes any systematic form for representing, manipulating,
or transferring information. Communication may be conventional or
non-conventional.
- Communication is the ability to convey meaning to establish mutual
understanding for some purpose.
- Communication speed of persons using AAC systems is "correlated" to
perceptions of competence by the system user (self-perception), their
communication partner(s) and observers.
- Communication rate is affected by cultural and situational conventions
(e.g. slang and dialect specific words won't show up in "standard" word
prediction lists which may force person using the AAC system to employ
alternative means of composition).
- Communication quality degrades as the time needed to compose
a sentence increases (e.g. the AAC user can lose their train of
thought; the communication partner may become distracted or impatient).
- Communication is the attribution of meaning to messages; is what
people interpret a message to mean; is establishing and moving common
ground; is a mutual understanding; is an exchange of information
between people to serve a purpose (not limited to transactions);
is a 2-way understanding (a 1-way message may be interpreted in
a way not intended by the person producing the message).
- Language is the most powerful form of communication.
- Natural Language is the language that we learned from our parents.
It is conventional.
- Natural language draws upon a lot of knowledge that is not associated
with words (e.g. body posture, expressions, context, history, etc.).
- Literacy is the ability to build complex, complete, logical, detailed
and grammatically correct verbal and written dialogue; is the ability
to generate, read and comprehend text; may be symbol based rather
than "standard" text. (Note: standard text is a type of symbol.)
- People lacking linguistic competence show great variation in their
linguistic and non-linguistic skills. Not a uniform problem.
- Some people lacking linguistic competence can access and use AAC
systems based on procedure (visual, spatial, actions) rather than
knowledge of and competency using language. People lacking linguistic
competency include children, first time learners and adults who have
never learned to read or write and do not have an "intact language
system". [Note: also includes persons who have
lost linguistic competence due to trauma or disease.]
- Participants disagreed as to whether behaviors and memorized actions
are communication. Some participants stated that behaviors are communicative.
Other participants stated that behaviors are not communicative and
that memorized actions cannot be used to produce generative communication.
(E.g. A teacher asks a student to "point to the yellow object."
The student points at the correct object and gets rewarded by the
teacher "very good, that's correct!" This is an example of behavior.
Is it also communication?)
- Difficult for individuals lacking linguistic competency to learn
and utilize AAC systems.
- It is possible to have speech loss without language loss and vice
versa.
- People lacking linguistic competency sometimes communicate by "circumlocution"
- they get to the topic with "signals and clues."
- Participants disagreed about the relative merits of generative
and whole language approaches.
- Communication focus for children is often less on language and
more on engaging the child in social settings.
- Many rate enhancement tools (e.g. spell checking, grammar checking,
punctuation management, text string search) are already part of
word processing software.
- Persons using PC-based systems have access to Thesaurus, but it
is not part of AAC device capabilities. Capability is well developed
in non-AAC products.
- Some participants find current rate enhancement tools (e.g. abbreviation
expansion, word prediction) to be "mechanical" (i.e. unnatural,
uncomfortable). These participants preferred to
spell out words.
- Some participants questioned the effectiveness of current rate
enhancement tools (e.g. word prediction can be slower than spelling).
- Some participants felt that typing out words was the only effective
way to "say what you want to say."
- Some participants thought that abbreviation expansion was not effective.
May be a "training issue" (i.e. users are unfamiliar or unpracticed
with technique). Overall, abbreviation expansion is "a very simple
technique."
- Many "high-end" AAC users construct communication with a combination
of tools that include spelling, word prediction and symbols (E.g.
One participant stated that they effectively use word prediction
in conjunction with Minspeak to construct communication.)
- Efficiency of error correction and text editing is limited by the
user's selection rate.
- The Language Sampling Library is a resource for LP
research. [1]
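Abbreviation expansion, which participants above called "a very simple technique," can be sketched as a table lookup. This is a minimal illustration under stated assumptions: the table contents are hypothetical, and a real AAC system would let the user define and edit their own entries.

```python
# Hypothetical abbreviation table; entries are illustrative only.
ABBREVIATIONS = {
    "vg": "very good",
    "ty": "thank you",
}


def expand_abbreviations(text, table=ABBREVIATIONS):
    """Replace each whitespace-separated token found in the table
    with its stored expansion; leave all other tokens unchanged."""
    words = []
    for token in text.split():
        words.append(table.get(token.lower(), token))
    return " ".join(words)


print(expand_abbreviations("VG ty"))  # very good thank you
```

The lookup itself is trivial. The training issue participants raised is on the user's side: each abbreviation has to be recalled before it saves any keystrokes.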
MESSAGE CREATION
- In 'natural communication,' the listener uses their natural language
processing capabilities to expand upon and understand a person's
telegraphic expression. For speedwriting and abbreviation expansion,
the AAC system must expand upon and understand the telegraphic expression.
- Systems employing telegraphic communication need to find the right
words or fewest words to accomplish their communication task. Finding
the right or fewest words is difficult when the communication task
is constantly changing.
- Compansion (compression/expansion) is one important area of NLP
research. Telegraphic input is expanded into complete phrases (e.g.
VG = very good; I went store = I went to the store yesterday). A
variation on compansion is flexible ("on the fly") abbreviation
expansion (Enkidu). [Note: Compansion (COMpressed message exPANSION) is a
language processing technique in which telegraphic speech ("John
Eat Apple") is transformed into well-formed sentences ("John has
eaten the apple"). Compansion may support higher communication rates
without requiring a great deal of cognitive effort.]
- Compansion has been used as a writing, therapy or learning tool
for different user populations (e.g. expand icons/abbreviations
to make phrases).
- Telegraphic speech may not be appropriate for some situations (e.g.
A conference presentation requires complete accurate statements.
This is very different than a conversation with a family member
at dinner.)
- Speedwriting - sentences constructed with contractions, abbreviations
and missing words (e.g. articles like 'the' and 'a') on the fly
and expanded into complete phrases and sentences by system. Might
also provide syntactic information (e.g. correct verb tense, plurals).
- Speedwriting is related to abbreviation expansion - around since
the 1980s.
- Abbreviation expansion is similar in concept to macros used in
Microsoft Word.
- Speedwriting might be a better writing tool than conversational
tool.
- Speedwriting might be useful in some conversational contexts -
for example, producing comments like "tough, oh!", "I'm sorry", etc.
However, this is more retrieval than generation.
- Speedwriting has a trade-off between precision and speed.
- Speedwriting output must be reviewed by the user and corrected where
needed. There is a cost for the user to read, comprehend and modify
the expanded text, and this cost could cancel out the time gained.
- Speedwriting "strains working memory." Extra cognitive burden
distracts from communication.
- Some people may find the extra difficulties associated with using
speedwriting worth it. Others may not.
- Speedwriting might require a different kind of interface.
- AAC systems could have two modes of operation: selection mode (standard
interface) and generation mode (e.g. for speedwriting).
- Disambiguation can be letter-by-letter or word level, with multiple
letters or symbols on each selectable item. For word-level
disambiguation, as items are selected the system attempts to predict
the target word based upon relative word frequency. Selecting
additional items eliminates possible target words. For example,
selecting 'a-b-c-d' could mean 'dog' or 'duck,' whereas selecting
'a-b-c-d' followed by 'l-m-n-o' can only mean 'dog'.
- Disambiguation could combine letters and symbols (icons).
- Disambiguation now requires that words be spelled correctly (i.e.
no error correction - 'someb' would not be recognized as 'something').
- People can recognize words that are misspelled or incomplete (e.g.
one or two letters are wrong or missing) because these words have
excess (redundant) information that still differentiates them from
other possible words.
- Disambiguation can be used to reduce the physical distance between
selections (important when person using AAC system has limited range
of motion).
- Disambiguation allows bigger keys to be placed on smaller displays
with the same selection set, reducing the selection precision required.
- Disambiguation interface has fewer items to select from and may
be easier to memorize.
- Disambiguation with word prediction or abbreviation expansion can
reduce keystrokes.
- When disambiguation is used as an input strategy, the user may be forced
to choose from amongst viable alternative words (phrases). This
takes time and may also confuse the user.
- Disambiguation is available on some phones.
- Disambiguation is available on some AAC devices (see Enkidu products).
- A large disambiguation system can be used as an accessible writing
tool for someone who's just learning.
- In AAC systems employing iconic statements, the user selects an icon
sequence and the AAC device completes the sentence. User gets to
words through the iconic sequence.
- Minspeak Application Programs (MAP's) are prearranged vocabulary
sets using icon sequences used in some AAC devices rather than programming
in vocabulary for each individual user.
- Keystroke savings are more beneficial to some users (i.e. persons
utilizing very slow indirect selection techniques) than others (i.e.
persons utilizing fast direct selection techniques)
- Scan-based selection is very slow and requires the person to follow
the cursor position (using visual or auditory cues).
- Scanning is an exhausting process for persons with ALS (common
users of scanning interfaces). Scan-based selection is even more
difficult for those with additional auditory problems.
- Scanning generally utilizes items arranged on a grid (items arranged
in rows and columns).
- Higher dimensional scanning. "Two dimensional scanning" - under
each selectable item is another grid of selectable items. "Three
dimensional scanning" - under each selectable
second layer item is another grid of selectable
items.
- Two and three dimensional scanning place a high cognitive and memory
burden on the user.
- A Forum participant noted that one 'client' memorized 3 levels
(items have a grid or list of items under them)
with auditory scanning. A great memory burden - but impressive.
- "Adaptive scanning" might
be problematic. (E.g. A 'correct' but low probability item could
be 'placed' farther away. For two or three-dimensional scanning,
items are found by remembering where they are. How can an item's
position be changed in a way that still allows it to be easily found?)
- Speedwriting with fixed scanning tables may not provide significant
benefits because of the increased cognitive, memory and reading/correction
burden of using this interface.
- Speedwriting with adaptive scanning tables may add additional cognitive/memory
and reading correction burden beyond fixed scanning tables.
- Research on optimization of word and letter boards for scanning
(first done by Zygo).
- Lots of research for "computers understanding language." AAC
systems are a practical application of this research.
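The word-level disambiguation described above ('a-b-c-d' then 'l-m-n-o' leaving only 'dog') can be sketched as follows. This is an illustration under stated assumptions, not any vendor's method: only the 'abcd' and 'lmno' keys come from the notes, while the rest of the key layout, the lexicon and its frequency counts are hypothetical.

```python
# Hypothetical key layout; only the "abcd" and "lmno" keys appear in the notes.
KEYS = ["abcd", "efgh", "ijk", "lmno", "pqrs", "tuv", "wxyz"]
LETTER_TO_KEY = {letter: index
                 for index, group in enumerate(KEYS)
                 for letter in group}

# Hypothetical lexicon of lowercase words with relative frequency counts.
LEXICON = {"dog": 50, "duck": 20, "house": 30, "milk": 10}


def candidates(presses, lexicon=LEXICON):
    """Return the words whose leading letters sit on the pressed keys,
    ordered from most to least frequent."""
    matches = []
    for word, freq in lexicon.items():
        if len(word) < len(presses):
            continue
        if all(LETTER_TO_KEY[word[i]] == key for i, key in enumerate(presses)):
            matches.append((freq, word))
    # Sort by descending frequency, then alphabetically to break ties.
    matches.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in matches]


print(candidates([0]))     # 'a-b-c-d' pressed: ['dog', 'duck']
print(candidates([0, 3]))  # then 'l-m-n-o': ['dog']
```

Each additional selection prunes the candidate list. When more than one word survives, the user must still choose among them, which is the time and confusion cost noted above.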
MESSAGE REFINEMENT
- Real-time message refinement is practically impossible with scanning,
Morse code and other switch-based input.
- Error correction slows down communication rate. The navigational
issue is a big part of it, not to mention actually making the change.
Getting to the spot where the repair needs to be made would be useful.
- Some AAC devices provide some support for spelling
and grammar checking.
LANGUAGE STORAGE AND RETRIEVAL
- Pre-stored text (on AAC systems) extends user memory. In natural
language, people don't generate novel utterances all the time. They
often recall utterances or fragments of utterances for modification
and reuse.
- One participant stated that in social conversation (by any person)
a large proportion of overall speech is predictable (e.g. reused,
formalisms).
- Research was conducted on an AAC system with both pre-stored text
and generative capabilities. Pre-stored text was used 93% of the
time while new text was generated 7% of the time.
- One participant stated that with current AAC systems, "high-end" users
rely on pre-stored vocabulary and sentences less than 10% of the
time.
- One participant stated that AAC systems with pre-stored text have
been tried many times and people generally aren't constructing speech
using pre-stored text capabilities (exceptions - stylistic or scripted
speech).
- Some value in a limited set of fixed alternative phrases (e.g. "wait a second", "let me think a moment").
This type of communication is non-generative but could help to hold
attention, pace and control conversation.
- Some AAC systems allow the user to store and retrieve speech.
- Stored text on AAC systems is not systematically organized (e.g.
by context, by relationships, etc).
- Retrieving speech stored on an AAC system is similar to completing
a file search on a personal computer.
- For AAC systems that do support speech storage, the speech doesn't
remain in memory for long (i.e. systems frequently "flush stored
speech from the buffer").
- Prediction capabilities of AAC devices are currently limited to
letters and words.
- Current letter and word prediction uses little language processing
(e.g. predicting the third letter probability given the occurrence
of the first two letters.)
- Character prediction can be based upon frequency of use (e.g. given
the letter sequence 'th' what is the likelihood that the next letter
is an 'e' or 'q'?)
- Word prediction is a (simple) form of text storage and retrieval.
- Improved word prediction is being developed that utilizes information
from recent phrases and conversational context.
- Word prediction can be based on independent frequency of word usage,
word usage given previous 1-3 words, word usage dependent upon discussion
content, etc. Prediction reduces keystrokes but doesn't necessarily
increase rate (e.g. time is also required to see the word list, decide
upon the best word and choose it. If a word is not present in the
list, it must be constructed in some way.)
- Use of standardized message structures (i.e. canonical
forms) could reduce the complexity of the AAC interface and enhance
communication rate.
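Word prediction based on "word usage given previous 1-3 words" can be sketched for the simplest case, one previous word. This is a minimal illustration: the sample sentences are made up, and the function names are hypothetical rather than taken from any AAC product.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for every word, which words follow it in the sample text."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for previous, current in zip(words, words[1:]):
            following[previous][current] += 1
    return following


def predict(following, previous_word, prefix="", limit=3):
    """Return up to `limit` likely next words after `previous_word`,
    optionally narrowed to those starting with the typed `prefix`."""
    counts = following.get(previous_word.lower(), Counter())
    # Sort by descending count, then alphabetically to break ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked if word.startswith(prefix)][:limit]


# Hypothetical sample of previously produced sentences.
SAMPLE = [
    "i went to the store",
    "i went to the doctor",
    "i went to the store yesterday",
]
model = train_bigrams(SAMPLE)
print(predict(model, "the"))       # ['store', 'doctor']
print(predict(model, "the", "d"))  # ['doctor']
```

The sketch only ranks candidates. As noted above, fewer keystrokes do not guarantee a higher rate, because the user still spends time reading the list and choosing from it.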
CONTEXT RECOGNITION
- Combining multiple streams of contextual information (i.e. own
speech, interlocutor speech, time, place, etc) may be very useful
for persons who have great difficulty producing communication.
- Context recognition (derived from the body of composed text) may
be very useful for literacy (writing) applications where contextual
information may change slowly and systematically.
- Language processing (e.g. recognition and adaptation based upon
contextual information) could support phrase
and sentence prediction. [Note: Would also improve letter and word
prediction.]
- Conversational context can shift quickly. There may not
be enough usable information (no matter how context is obtained)
to adapt language processing before the conversation shifts to the next topic.
- Context recognition has applications besides word prediction (e.g.
make different pages available - conditional tree structure, create
pages "on the fly" populated with items that reflect the communication
context - visit to the doctor's office).
- Contextual information from interlocutor speech supports real-time
adaptation of language processing, and is an important source of
contextual information. Research with limited vocabulary has demonstrated
this.
- Interlocutor speech could provide contextual information (e.g.
through speech recognition) that would place constraints on language
processing (e.g. identify topic and alter word or phrase prediction
in response).
- Speech recognition technology can be used to follow interlocutor
speech. Speech recognition is sufficiently advanced to begin achieving
this now.
- Speech recognition performance depends upon environmental factors
(e.g. single versus multiple speakers, quiet versus noisy, speaker
position and orientation relative to the microphone, speech quality
etc) and the speech recognition engine (hardware, software, microphone).
- Developments in image recognition technology may aid in context
recognition (e.g. interlocutor recognition, environmental recognition).
- The Global Positioning System (GPS) can provide time and location
information. Current GPS technology is only suitable outside of
buildings (i.e. GPS signal is interfered with by buildings, geographical
features, etc).
- Physical environment (time, place from GPS) provides little information
about the language environment (e.g. talking about a hair cut at
McDonalds).
- Research on dysarthric speech recognition is related to both speech
recognition to establish context and language processing generally.
- Dynamic pages on current AAC systems are not context sensitive
(i.e. branching and page content does not change with context).
- Work is taking place in AAC industry to improve the performance
of context sensitive word prediction.
- Participants disagreed as to whether contextual information improves
keystroke efficiency on current word-based AAC systems (e.g. context
dependent word prediction).
- Use of contextual information for word or phrase prediction is
similar to an email filter where messages are sorted by source,
date, etc.
- AAC should utilize microphones to adjust the output sound level
to adapt to the ambient noise. Car stereos currently adjust volume
in this way.
- Casi-Reader is a proximity card and key reader that
has a union point for the complete wiring of microcontrollers, readers,
and door controls. The box standardizes and simplifies wiring connections
by bringing all reader, microcontroller, door lock, and digital
input connections to one point.
VOICE OUTPUT
- AAC voice output is very unnatural - information carried in "normal" speech
(prosody for instance) is missing. Voice output is frequently misinterpreted
or misunderstood. AAC voice output does not reflect discourse and
pragmatics.
- Word pronunciation is problematic - even when words are spelled
correctly. Some words need to be misspelled in order to be pronounced
correctly (e.g. 'horsez' - sounds fine but 'horses' - does not).
- Some AAC systems support the use of homographic words (e.g. one
of two or more words with same spelling but which differ in origin,
meaning, and sometimes pronunciation).
- Words must often be pronounced differently depending upon their
context of use. AAC systems do not recognize contexts and modify
word pronunciation.
- Automatic text-to-speech synthesis that uses word context to determine
pronunciation is being employed outside of the AAC industry.
- Some AAC systems allow the user to add words to their dictionary
and customize word pronunciation. Most people currently don't use
these capabilities and that's a problem. Doing so requires the user to
be a programmer and can't be done in "real time."
- It is possible to add words or change word spelling
and pronunciation in the AAC dictionary, but participants stated
that it is a heavy burden on clinicians and users.
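
The respelling workaround described above ('horsez' for 'horses') can be sketched as a small override table applied just before text is sent to the synthesizer, so the user keeps typing the correct spelling. This is a hypothetical sketch: the table contents and the function name are assumptions, and it does not handle homographs, which would need context.

```python
import re

# Hypothetical user dictionary: written word -> respelling for the synthesizer.
PRONUNCIATION_OVERRIDES = {"horses": "horsez"}


def respell_for_synthesizer(text, overrides=PRONUNCIATION_OVERRIDES):
    """Replace whole words that have an override; leave other text untouched."""
    def swap(match):
        word = match.group(0)
        # Lookup is case-insensitive; the respelling replaces the word as typed.
        return overrides.get(word.lower(), word)

    return re.sub(r"[A-Za-z']+", swap, text)
```

The displayed message stays correctly spelled; only the string handed to the speech synthesizer is changed.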
TRAINING AND SYSTEM CUSTOMIZATION
- There exists a continuum between being illiterate and being literate.
- AAC systems do not do a good job of supporting language acquisition.
Some systems are trying to be language teachers for restricted language
settings.
- Mainstream education has write-to-read programs to support emerging
literacy (these teach critical thinking in literary skills, phonemic
awareness, writing, composing, etc.). Some children develop these skills,
so there are kids who could get a sound approximation. [4]
- Communication rate is closely related to the amount of learning
(practice) a user has on the AAC device and the amount of instruction
provided for the use of this AAC device.
- Language representation and acquisition with AAC systems does not
work very well for 2 to 3 year old children.
- Children are introduced into the classroom environment too soon
(i.e. before they have sufficient mastery of language).
- Icon-based language is especially difficult to learn (e.g. there are
multiple meanings for the same picture, the meaning changes with the
context in which it appears, and the user must memorize and understand
all this). No effective strategies are available to teach icon-based
language.
- Some children acquiring language prefer letter spelling as opposed
to use of pictures. "Lingraphica" [5] is an assistive and therapeutic
device for persons with aphasia that utilizes a graphical interface
map. Lingraphica provides novel, important, and useful support for
improving communication in real-life situations. It does this by
providing access to an extensive database of more than 2000 word
concepts, each of which can display meaning in graphics, text, and voice.
- Clinicians often find AAC systems too complicated (e.g. clinician
focus may be on learning how to set up and optimize an AAC system
rather than how to use the AAC system as a clinical intervention).
- Manufacturers "train" clinicians and users on how to optimize the
performance of their (the manufacturer's) systems. Training doesn't
generalize from one manufacturer's products to another's.
- Clinicians use AAC programs/capabilities as provided
by the manufacturer. Clinicians often don't know how to customize
programs/capabilities in a way they feel is most appropriate for
their client. Some participants believed that "this was not the problem
of AAC device, but rather a clinician training issue."
PERFORMANCE MONITORING
- Performance monitoring (e.g. Language Activity Monitor) is a recent
innovation in AAC devices (e.g. track keystrokes, page changes,
output, etc). LAM has an external switch to automatically send data.
- AAC systems should support remote performance evaluation (e.g.
via modem and the Internet).
- Language Activity Monitor (LAM) - language activity events yield
summary report. Some families are using LAM without clinical intervention.
Useful for identifying core versus extended vocabulary. Core vocabulary
is consistent across (all) environments and times. Useful for optimizing
access to different words.
- Language Activity Monitor (LAM). Tracks language usage events,
language context and the time to generate this context (core vocabulary,
extended vocabulary, etc). Assists clinical process by monitoring
communication as it varies with time, location, age, etc.
- LAM does not currently assist directly with language generation.
- LAM could be a valuable clinical and research tool.
- LAM can be used to flag communication problems (e.g. absence of
'do' family of words, no wh- questions).
- LAM can identify the types of syntactic structures people are using
to communicate.
- LAM can identify areas of weakness in language generation.
- LAM can suggest the kind of intervention needed to improve language
generation.
- Communication performance can be predicted based upon a LAM profile.
- LAM can track communication performance in different situations,
activities and environments.
- LAM information can be used to help optimize device usage (and setup).
- LAM can provide basic performance data for artificial intelligence
and language processing researchers.
- Performance monitoring is currently available only for higher end
devices with synthesized speech.
- Log files should be encrypted and under the user's control (e.g.
user can erase or edit log file). Use of interlocutor speech in
the log file also has related privacy concerns.
- AAC systems should have log files for performance analysis.
- LAM can provide performance data (Some manufacturers are working
with ASHA to develop performance standards for AAC systems.)
- AAC industry is working towards a standardized data format for
performance log files. [6]
- Software (ACQUA) has been developed to analyze performance data. [7]
- An orthographic language library has been established. [1]
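
The core versus extended vocabulary distinction mentioned above can be sketched from a usage log. The log layout here, a list of (environment, word) pairs, is an assumption made for the example and is not the LAM or AAC-RERC log file format. Following the notes, "core" is taken to mean words that appear in every logged environment.

```python
from collections import defaultdict


def split_core_and_extended(events):
    """Split logged words into core and extended vocabulary.

    events: iterable of (environment, word) pairs (hypothetical log layout).
    Core = words seen in every environment; extended = all other words.
    """
    words_by_env = defaultdict(set)
    for environment, word in events:
        words_by_env[environment].add(word.lower())
    if not words_by_env:
        return set(), set()
    all_words = set().union(*words_by_env.values())
    core = set.intersection(*words_by_env.values())
    return core, all_words - core
```

For example, a log with "want" used at both home and school, "pizza" only at home and "pencil" only at school yields "want" as core and the other two as extended vocabulary.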
3. Needed Technology (refinements, innovations, etc.)
GENERAL
- Need codependence between physical ability and language representation
/ management strategies (i.e. language representation is "adjusted" to
reflect physical ability).
- Need dynamic codependence - language representation/management
adapts as the user's physical abilities change (i.e. through the
day, throughout life, following the progression of a disease).
- Need language processing to be transparently "general purpose" so
that it works well across tasks and environments.
- Effective communication may be carried out with poorly formed
sentence fragments. What does the communication partner accept as
adequate? What strategies do users employ to achieve a good communication
rate? AAC capabilities should take advantage of these strategies.
- Generative language processing approaches may be more useful for
literacy while pre-stored/whole language approaches may be more
useful for communication.
- Need to have generative language capabilities in order to build
literacy. Some participants stated that AAC devices based upon "whole
language" do not facilitate language acquisition.
- Need flexible communication and literacy options available to the
user for different environments and tasks especially when communicating
with unfamiliar communication partners.
- Language processing capabilities should not introduce delays in
communication or literate writing.
- Need AAC system to provide range of capabilities associated with
a PC (e.g. word processing, databases, modems, full Internet access,
etc.). Capabilities would support becoming a better writer, enhance
marketable computer skills and increase access to communication
and knowledge resources.
- Need AAC system to provide phonebook, organizer and calendar and
related capabilities.
- Need capability of training "story telling" to AAC user.
- May be possible to construct AAC systems that are accessed by procedural
memory (action sequence) in the spatial or visual domain and map
this input to the verbal domain. It is unknown how to accomplish this
mapping, however.
- An "ideal AAC device" is a complete AI system that
knows (recognizes, reflects, is optimized for) you, utilizes context
and information (own speech, time, place, interlocutor). For short
conversations (or conversations whose topic quickly changes), you
will not have enough information to identify the context.
MESSAGE CREATION
- Disambiguation could reduce input interface complexity.
- Disambiguation could significantly increase selection rate for
many selection techniques (e.g. scanning rates could be '4 times
faster' if there were four symbols per item).
- Disambiguation could have great potential with eye gaze (i.e. It
is difficult to select small items with eye gaze. Larger items are
easier to select making eye gaze more practical. Enables eye gaze
to be utilized with smaller displays. Example of codependence between
physical abilities [input technique] and language processing capabilities
[disambiguation]).
- Icon based disambiguation should be investigated (e.g. as icons
are selected possible meaning of this icon sequence narrows).
- Icon-based compansion (now text-based, telegraphic sentence production)
should be investigated (short sequence of icons expanded into a
complete sentence).
- Disambiguation interfaces are relatively simple now. They need
to have more sophisticated capabilities in order to be attractive
to a wider population of persons using AAC systems.
- Possible language processing application is speedwriting - sentences
constructed with contractions, abbreviations and missing words (e.g.
articles like 'the' and 'a') on the fly and expanded into complete
phrases and sentences by system. Might also provide syntactic information
(e.g. correct verb tense, plurals).
- Need language processing to automatically correct spelling errors,
expand contractions, provide punctuation, fill in common omissions,
and correct verb tense. The user should be able to see and accept/reject
the corrections. Too much burden is placed upon the user to do these
things.
- Need support for message creation (e.g. thesaurus, archive
messages, context-based storage and recovery of messages).
- Need to be able to find synonyms (to improve quality and clarity
of speech).
- Need to have access to a Thesaurus (to improve quality and clarity
of speech)
- Need a way to quickly elaborate on a topic (e.g. like a web site,
where underlined words have "hyperlinks" to word definitions or concept
explanations: "click" on an underlined word to access the elaboration).
Very useful when the interlocutor is not familiar with a topic.
- Need language processing to recognize and reflect the communication
goal. Word choice and sentence structure should differ depending
upon whether a person is asking or answering a question.
- Need language processing to keep track of a user's conversation
and offer help in selecting the next thing to say.
- Need language processing to recognize your linguistic intent (creating
a question, speaking in past tense, etc) and support sentence construction
accordingly.
- Ideally, messages would be created in a way that reflects
the concept you are trying to communicate and have the proper syntax
(noun, verb, punctuation, etc).
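
The disambiguation idea raised at the start of this list (several symbols per selectable item, with language processing resolving the ambiguity) can be sketched as a dictionary lookup on key sequences. This is a minimal sketch under stated assumptions: the seven-key letter grouping, the function names and the word frequencies are all hypothetical, and only the letters a-z are handled.

```python
from collections import defaultdict

# Hypothetical layout: letters grouped roughly four to a key.
KEY_GROUPS = ["abcd", "efgh", "ijkl", "mnop", "qrst", "uvwx", "yz"]
LETTER_TO_KEY = {ch: str(i) for i, group in enumerate(KEY_GROUPS) for ch in group}


def key_sequence(word):
    """Return the sequence of key labels a user would select for a word."""
    return "".join(LETTER_TO_KEY[ch] for ch in word.lower())


def build_index(word_frequencies):
    """Map each key sequence to its candidate words, most frequent first."""
    index = defaultdict(list)
    for word, freq in word_frequencies.items():
        index[key_sequence(word)].append((freq, word))
    return {
        keys: [w for _, w in sorted(pairs, key=lambda fw: (-fw[0], fw[1]))]
        for keys, pairs in index.items()
    }


def candidates(index, keys):
    """Return the words that match a selected key sequence."""
    return index.get(keys, [])
```

With fewer, larger targets the user makes one selection per letter, and the remaining ambiguity (here "cat" versus "bat" on the same keys) is resolved by offering a short ranked list.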
MESSAGE REFINEMENT
- Need language processing to support smart semantic editing (e.g.
change verb to reflect ownership or past/present tense)
- Language processing should identify the error location and "take" the
user to that location (e.g. eliminate time spent backspacing to the
error).
- Language processing should identify errors and offer alternative
selections/corrections (e.g. eliminate less efficient methods of
error correction).
- It should be easy to change sentences midway (e.g. be able to participate
in two Forums running simultaneously. It should not be necessary
for the user to reformulate the whole message.).
- Need to adapt to very poor spelling (e.g. letter reversal, dyslexia,
idiosyncratic). System should recognize individual spelling (misspelling)
tendencies. User should not have to be a great speller in order
to use the AAC device.
- Need support for message refinement (e.g. on-the-fly/real-time
spelling error correction, efficient editing of stored messages)
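
The spelling-related items above (letter reversal, individual misspelling tendencies) can be sketched as a three-step suggestion routine: first a per-user table of known misspellings, then a check for two adjacent letters swapped, then a general fuzzy match. This is an illustrative sketch only. The function name, the personal table and the 0.7 similarity cutoff are assumptions, and the result is a suggestion the user should still be able to accept or reject.

```python
import difflib


def correct_word(word, vocabulary, personal_fixes=None):
    """Suggest a correction for one word.

    vocabulary: list of known words.
    personal_fixes: optional dict of this user's habitual misspellings.
    Returns the word unchanged when no suggestion is found.
    """
    word = word.lower()
    if word in vocabulary:
        return word
    if personal_fixes and word in personal_fixes:
        return personal_fixes[word]
    # Letter reversal: try swapping each pair of adjacent letters.
    for i in range(len(word) - 1):
        swapped = word[:i] + word[i + 1] + word[i] + word[i + 2:]
        if swapped in vocabulary:
            return swapped
    # Fall back to the closest word by similarity ratio.
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=0.7)
    return matches[0] if matches else word
```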
LANGUAGE STORAGE AND RETRIEVAL
- Need to have a logged history of what is said and make available
for processing/selection during conversation. Should be transparent.
- Need language processing to reduce, recycle and retrieve frequently
used language.
- Need to be able to store, recall and edit dialogue
- Need to be able to store speech for later use and refinement.
- Need to reduce storage demands by eliminating some of the previously
stored speech (e.g. as more messages are created and stored it may
become more difficult recovering the most appropriate stored speech).
- Need tracking system for stored speech (e.g. store speech by its
context and history of use).
- Need to be able to efficiently locate and retrieve stored speech
(e.g. "on line" knowledge representation that automatically creates
hyperlinks to stored speech).
- Need standard phrases to hold people's attention that can be output
while sentence construction is still ongoing (e.g. "Just a moment
please")
- Need smart language editing for stored speech (e.g. "touch
a button to change statement from present tense to past tense").
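
Several of the storage needs above (store speech by context and history of use, retrieve it efficiently, and eliminate rarely used messages) can be sketched with a small in-memory store. The class and method names are hypothetical, and ranking by use count and then by recency is one possible choice, not something the participants specified.

```python
from dataclasses import dataclass, field


@dataclass
class StoredMessage:
    text: str
    contexts: set = field(default_factory=set)
    use_count: int = 0
    last_used: int = 0  # logical clock tick of the most recent use


class MessageStore:
    """Keeps messages tagged by context and ranked by history of use."""

    def __init__(self):
        self._messages = []
        self._clock = 0

    def add(self, text, contexts=()):
        self._messages.append(StoredMessage(text, set(contexts)))

    def mark_used(self, text):
        """Record that a stored message was spoken."""
        self._clock += 1
        for message in self._messages:
            if message.text == text:
                message.use_count += 1
                message.last_used = self._clock

    def retrieve(self, context, limit=3):
        """Return messages for a context, most used and most recent first."""
        matches = [m for m in self._messages if context in m.contexts]
        matches.sort(key=lambda m: (-m.use_count, -m.last_used))
        return [m.text for m in matches[:limit]]

    def prune(self, keep):
        """Drop all but the `keep` most used messages."""
        self._messages.sort(key=lambda m: (-m.use_count, -m.last_used))
        del self._messages[keep:]
```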
CONTEXT RECOGNITION
- Need language processing to provide candidate words and phrases
to the AAC user based upon the dialogue (e.g. conversational context
suggests information or communication needed)
- Need to combine the various streams of contextual information (e.g.
interlocutor speech, own speech, location, activity, time, day,
season, etc) in order to have the greatest impact on language processing
and communication.
- Perhaps the user could manually identify context and language processing
would adapt based upon this input. How to best use this contextual
information is a difficult problem.
- Interlocutor speech is an important source of contextual information
for language processing. There are two 'parts' to the problem. First,
speech recognition is improving rapidly but isn't quite good enough
for all environments and circumstances. Second, once speech is recognized,
need to determine how to employ it for language processing.
- Speech recognition must be capable of isolating the communication
partner(s) from background speech. [Note: Adaptive beam forming
microphones have been employed to improve performance of commercial
speech recognition software.]
- Interlocutor speech can be used to predict phrases (e.g. find noun
traces in the interlocutor's speech and generate open ended questions
based upon these noun traces).
- Need interlocutor speech recognition to prime AAC devices to provide
appropriate responses in certain settings (e.g. provide a person
with 4 choices (phrases or utterances) to keep them in a conversation.
If the person is talking about Mary then it may prompt things about
Mary.)
- Need to recognize time and location and adapt language processing
(e.g. if you're in the kitchen, in the morning, then breakfast foods
are predicted) [Note: participants disagreed on the impact context
recognition would have on word prediction performance as measured
by keystroke efficiency.]
- Need language processing to adapt to environments and situations
(Child versus adult, at home, at work, weather conditions, time
of day).
- Need language processing to adapt to time of day/week context (e.g.
Monday, not a holiday, 11 AM - work dialogue is used).
- Need to sense ambient noise level through use of microphone and
adjust output volume (louder room/car/outdoor setting, louder speech).
- Need to sense ambient light levels (e.g. have a light
sensor for this purpose) and adapt brightness and contrast.
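
The kitchen-in-the-morning example above can be sketched as word prediction in which a base frequency is multiplied by a context-specific boost. Everything here is an assumption made for illustration: the function name, the (place, time of day) context key, the frequencies and the boost values. As noted above, participants disagreed on how much this kind of context would improve keystroke efficiency, so the sketch shows the mechanism and not a measured benefit.

```python
def predict(prefix, base_frequencies, context_boosts, context, limit=3):
    """Rank words starting with `prefix` by base frequency times context boost.

    context_boosts: dict mapping a context key to {word: multiplier}.
    Words without a boost in the current context keep a multiplier of 1.0.
    """
    boosts = context_boosts.get(context, {})
    scored = [
        (freq * boosts.get(word, 1.0), word)
        for word, freq in base_frequencies.items()
        if word.startswith(prefix)
    ]
    scored.sort(key=lambda sw: (-sw[0], sw[1]))
    return [word for _, word in scored[:limit]]
```

The context key could come from the user selecting it manually, or from sensed time and location.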
VOICE OUTPUT
- Need language processing to control speech synthesis. The interface
between the message (language processing) and the speech synthesizer
(speech output) needs to be refined.
- Need language processing to control speech output - insert pauses
(e.g. in response to punctuation, separate phrases), control prosody
(intensity, first formant, stress) based upon discourse and pragmatics.
- AAC device should be programmable "on the fly" (e.g. add a word to
the dictionary with pronunciation and definition).
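
The need to insert pauses in response to punctuation can be sketched by converting a message into SSML, the W3C markup that many speech synthesizers accept, using its `break` element. This is a sketch under assumptions: the pause lengths are guesses, the function name is hypothetical, and whether a given AAC synthesizer accepts SSML is not something these notes state. It also treats every period as a sentence end, so a number such as "3.5" would be split incorrectly.

```python
import re
from xml.sax.saxutils import escape

# Assumed pause lengths in milliseconds; real values would need tuning.
PAUSE_MS = {",": 200, ";": 300, ":": 300, ".": 500, "?": 500, "!": 500}


def to_ssml(text):
    """Wrap text in <speak> and add a <break/> after each punctuation mark."""
    parts = []
    # The capturing group makes re.split keep the punctuation marks.
    for token in re.split(r"([,;:.?!])", text):
        if token in PAUSE_MS:
            parts.append(f'{token}<break time="{PAUSE_MS[token]}ms"/>')
        elif token:
            parts.append(escape(token))
    return "<speak>" + "".join(parts) + "</speak>"
```

Prosody driven by discourse and pragmatics, which the participants also asked for, is a much harder problem and is not addressed here.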
USER TRAINING AND SYSTEM CUSTOMIZATION
- Need automated (computer-based) testing (standardized performance
testing with and without the intervention of AAC system).
- Users need the ability to easily program AAC device (e.g. set up,
modify, optimize, select capabilities)
- Need service (e.g. repair, upgrades, battery replacement) for AAC
devices to be cheap and readily available (i.e. easy to get device
to service, quick turnaround).
- Device should power up when the user is awake or active and power
down when the user is sleeping or inactive.
- Need intelligent language processing that recognizes problems (e.g.
in set up, performance, optimization) and provides feedback to correct
these problems.
- Need to shorten the learning curve for practitioners (clinicians)
by building in training strategies, tutorials, help capabilities,
etc.
- Need comprehensive training for clinicians and users to be built
into the AAC device (e.g. how to set up and optimize the AAC device
in order to enhance communication, literacy, education etc.).
- Need built-in tools (for setup and optimization) beyond preprogrammed
pages.
- Need a "wizard" to help set up, optimize and correct problems (e.g.
similar to the wizards that come with many Microsoft products).
- Need special software "programs" that run on AAC devices to meet
specific/targeted needs (e.g. help attain educational goals). AAC
devices are general-purpose tools that may not be well suited to
these purposes.
- Need AAC systems to be flexible, support growing communication
skills while remaining transparent. Transparency should not be sacrificed
in order to achieve flexibility and potential growth.
- Need AAC systems to provide scaffolding support for language acquisition.
Compensate for language deficits as new language skills are acquired.
- AAC systems should take advantage of the natural progression of
literacy skills (e.g. communication based on symbols, symbols and
letters, symbols letters and syllables, etc).
- Language processing should provide a scaffolding to learn icon-based
languages (e.g. add icon meanings as person becomes more skilled).
- Need language processing to provide a smooth transition from simple
symbol-based communication to powerful communication languages such
as Minspeak (One participant suggested that Minspeak might be taught
as a second language in schools).
- Language processing should provide a learning environment for picture-based
languages.
- "Semantic features" should be grouped together with icons to help
persons who can't remember the word they wanted.
- Need icon based language acquisition to employ icons that "suggest" the
word of interest (e.g. thermometer + sun = hot day).
- Need to be able to recognize and adapt to individual spelling capabilities
(e.g. letter reversal, as in dyslexia).
- Need alternative information representations for children ("same
content, different register")
- Should consider grounded language learning for AAC applications
(learn the meaning of a word by hearing it and simultaneously seeing
what's going on).
- Dream idea: listen to speech and match which icon means what to
one person - would be a complete AI system.
- Need training and instruction to be built into AAC device. Advanced
capabilities should be easy to learn. For example, it is easy to
pick up the advanced features of Microsoft Word when you need them
(e.g. built-in Help functions)
- A separate display and input interface accessory (or a second AAC
device) is needed when teaching an augmentative communicator how
to use their AAC system. "[You] don't want to use his device to
communicate with him - even without speech output it would be like
'using his mouth to talk to him.'"
SYSTEM MONITORING
- AAC systems should have log files for performance analysis.
- The mere action of writing and typing suggests a higher level of
formality than occurs in conversation.
- AAC users spend more time "writing" than producing output, but
the conversation is surprisingly natural when you look at a transcript.
- Performance monitoring could provide the basis for "smart optimization." AAC
devices could identify the strengths and weaknesses of individual
users and provide custom recommendations.
- Need AAC system to track user performance (e.g. selection errors,
spelling errors) and adapt or adjust features (e.g. display setup,
increased word prediction list length) in response to changing user
abilities (e.g. decreased physical precision, increased number of
spelling errors). Should check with the user to see if they think
the changes would be helpful.
- Need to be able to turn adaptive language processing features ON
and OFF (i.e. People may think that machine is doing the thinking.
Use "language orthosis" at certain times and not others.).
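
The adaptive behaviour described above (track errors, adjust a feature such as word prediction list length, check with the user, and allow adaptation to be switched off) can be sketched in two small functions. The error-rate thresholds, the list length limits and all names are assumptions made for the example.

```python
def propose_prediction_list_length(recent_errors, recent_selections,
                                   current_length, high_error_rate=0.2,
                                   low_error_rate=0.05, min_length=3,
                                   max_length=10):
    """Propose a new word prediction list length from recent error counts.

    More errors suggest a longer list (less typing); few errors suggest a
    shorter one. Thresholds and limits are illustrative assumptions.
    """
    if recent_selections == 0:
        return current_length
    error_rate = recent_errors / recent_selections
    if error_rate > high_error_rate:
        return min(current_length + 1, max_length)
    if error_rate < low_error_rate:
        return max(current_length - 1, min_length)
    return current_length


def apply_if_accepted(current_length, proposed_length, user_accepts,
                      adaptation_enabled=True):
    """Apply a proposed change only if adaptation is ON and the user agrees.

    user_accepts: callable that asks the user and returns True or False.
    """
    if not adaptation_enabled or proposed_length == current_length:
        return current_length
    return proposed_length if user_accepts(proposed_length) else current_length
```

Keeping the proposal separate from its application leaves the final decision with the user, in line with the concern that the machine should not appear to be doing the thinking.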
4. Barriers (to obtaining technology, to developing technology, etc.)
- AAC system (computer) capabilities place limits on the kinds of
language processing capabilities that can be offered.
- Advanced language processing capabilities (e.g. common sense, reasoning
ability) will require powerful computer processing capabilities and
large memory resources that are not present on current AAC systems.
- AAC capabilities are limited by hardware and system limitations
(e.g. memory, processing speed, speech recognition, etc).
- A simple, flexible, easy to use AAC system may require very complex
and powerful hardware and software capabilities.
- As systems become more flexible they can also become more difficult
to manage (e.g. where do you 'go' when you want to turn off Word processing
features?)
- Now we have simple devices and a lot of features, but we have yet
to offer options as to what the tech thinks they can handle. We want
all of the features, but we want them to be transparent (e.g. the Mac
is an intuitive system).
- "Apple Interface Design Book" [8] is a good source of information
on interface design.
- Context-based language processing capabilities/tools that facilitate
writing (literacy) may not be useful for conversation (e.g. when writing,
there is a large coherent body of contextual information that may evolve
gradually and logically).
- Communication depends upon many sources of information rather than
text (e.g. non-verbal cues, history, intent, environment, etc).
- Performance monitoring requires information from "outside of the
AAC device" (e.g. non-verbal information, interlocutor speech).
- Speech recognition works well with a trained user in a controlled
environment (e.g. PC voice interface) or with a limited vocabulary
of untrained speech (e.g. phone and voice portal applications). Better
speech recognition technology is needed in order to recognize speech
in 'uncontrolled' speech environments.
- Speech recognition must not only identify words but must also 'understand'
how the words were said. Speech recognition technology has not reached
this level of sophistication.
- Better speech recognition technology - a sentence is more than
a string of words. Speech recognition technology that is only capable
of recognizing individual words loses information.
- Some participants thought that a great deal of effort would be
required in order for contextual information to have a major impact
on AAC performance.
- Language processing is AI-complete: all related problems must be
solved (e.g. natural language uses context, history, non-verbal
cues, speaker intent, isolates the interlocutor, filters out irrelevant
information, etc.). Language processing cannot approximate natural
language until it has similar capabilities.
- It is unclear which of the language processing approaches being
pursued by manufacturers have the most potential to improve communication
and literacy. Research needs to establish the effectiveness of these
approaches before manufacturers will be willing to invest in development
and implementation.
- AAC systems need intensive testing and data gathering (to determine
the effectiveness of innovations and suggest further design changes).
- In order to improve AAC device performance, need better understanding
of AAC users and how they are using their technology now. This understanding
needs to be based upon quantifiable performance measures.
- Need to understand why AAC users are only using a small portion
of AAC device capabilities. What prevents them from realizing the full
potential of the AAC device now?
- Augmentative communicators currently evaluate AAC systems rather
than shaping the design of these systems. Much research utilizes
'able-bodied' individuals rather than augmented communicators.
- Difficult getting access to large groups of AAC users for testing
and evaluation. Perhaps remote testing of these systems is possible.
- Language processing that intelligently anticipates words, phrases,
sentence structure, etc., might bias the choice of vocabulary and may
cause the user to lose control of the conversation.
- Language representation, language processing, rate enhancement
etc are hard problems that require basic research.
- Need to know what it means to 'represent language.'
- Increasing evidence that representing language in a box has huge
limitations and we have a lot of work to do.
- Language processing research for AAC applications needs to draw
upon other sources of knowledge and related disciplines (e.g.
psycho-linguistics, Lowen-Lavelle, Herb Clarks, human factors, ergonomics,
cognitive and developmental psychology).
- Need to draw upon work being done in other
rapidly evolving fields (e.g. Internet - hyper-linked information,
voice portals, etc).
- [1] AAC Institute. [Online: http://www.aacinstitute.org.]
- [2] McCoy K, Pennington C, Luberoff Badman A. "Compansion: From Research
Prototype to Practical Integration." Natural Language Engineering, 1998,
v 4(1), pages 73-95. [Online: http://www.asel.udel.edu/natlang/nli.html.]
- [3] M. King, C. Kushler, D. Glover. "JustType - Efficient Communication
with Eight Keys." RESNA Proceedings 1995, pages 94-96.
- [4] Camden Street School. "WRITE to READ Program." [Online:
http://www.nps.k12.nj.us/camden_st/write_to_read.htm]
- [5] Steele, Richard. "A Journal from Concept to Commercialization
- Lingraphica." OnCenter Technology Transfer News. Issue No. 5. May 1993.
- [6] AAC-RERC. Communication Performance Assessment (CPA). "Logfile
Protocol." [Online: http://www.aac-rerc.org/performance.html#Logfile]
- [7] AAC-RERC. Communication Performance Assessment (CPA). "ACQUA."
[Online: http://www.aac-rerc.org/performance.html#Logfile]
- [8] Apple Interface Design Book.