Voice input systems

From ATWiki

Voice input computer systems (or speech recognition systems) learn how a particular user pronounces words and use information about these speech patterns to guess what words are being spoken. Voice input systems are useful for the following problems:

  • Trouble physically using the keyboard - Voice input systems can allow a person to operate a computer without using a keyboard or mouse. This can help people who are not able to use their hands at all, as well as others who can use their hands but are limited by speed, fatigue, or pain (e.g., carpal tunnel syndrome, one-handed keyboarders).
  • Trouble creating text - Voice input systems can help a person who has difficulty spelling words to create text. This can be particularly useful for some people with learning disabilities. However, the person's reading abilities need to be strong enough to recognize when the computer displays the wrong word.



How much does it cost?

Most full systems cost about $200. For exact prices, check the manufacturers' web sites listed below. There are also several systems for less than $100, but these cheaper versions may not include all of the dictation features, such as hands-free access or a text read-back feature. Check to make sure that the features you want to use are supported.

How fast can I dictate?

It depends on the user and on the task being performed. Some users now report speeds close to 100 wpm. However, some manufacturers do not count correction time in their calculations, and most people report 90-98% accuracy. Timings taken while reading a page of text are typically faster than the speeds achieved when new text is being 'thought up'. In addition, voice 'macros' that enter a whole phrase or paragraph will produce high entry rates, while tasks requiring the spelling of names and addresses will result in lower entry rates.
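To see how much correction time can matter, here is a rough sketch in Python. It is only an illustration: the function name `effective_wpm` and the figure of 6 seconds to fix each misrecognized word are assumptions made for this example, not measurements from any product.

```python
def effective_wpm(raw_wpm: float, accuracy: float, seconds_per_fix: float) -> float:
    """Estimate words per minute once correction time is included.

    raw_wpm         -- dictation speed with no corrections counted
    accuracy        -- fraction of words recognized correctly (0.0 to 1.0)
    seconds_per_fix -- assumed time to correct one wrong word (hypothetical)
    """
    if raw_wpm <= 0:
        raise ValueError("raw_wpm must be positive")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    if seconds_per_fix < 0:
        raise ValueError("seconds_per_fix must not be negative")

    # Average minutes spent per word: speaking it, plus its share of fixes.
    minutes_speaking = 1.0 / raw_wpm
    minutes_fixing = (1.0 - accuracy) * seconds_per_fix / 60.0
    return 1.0 / (minutes_speaking + minutes_fixing)


if __name__ == "__main__":
    for accuracy in (0.90, 0.95, 0.98):
        rate = effective_wpm(100, accuracy, seconds_per_fix=6)
        print(f"{accuracy:.0%} accurate: about {rate:.0f} wpm")
```

Under these assumed numbers, a quoted 100 wpm works out to roughly 50 wpm at 90% accuracy, 67 wpm at 95%, and 83 wpm at 98%. Actual correction times vary widely from user to user.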

Is a mouse required?

Voice input systems usually allow a person to activate program menus via voice commands. The Dragon products (e.g., NaturallySpeaking) are the most 'hands-free' systems currently on the market, particularly with a feature called 'MouseGrid' that permits fine control of the mouse cursor. The other systems currently require some limited use of a mouse (or alternative).

What type of microphone should I use?

Most systems come with a microphone; however, accuracy may improve after switching to a better microphone. Some recommended microphones include:

Does a person have to sit in front of the computer in order to use it?

The user needs to be able to see the monitor to find out whether words and commands are being correctly recognized. Otherwise, it is possible to use voice input while sitting, standing, or reclining. Although headset microphones are typically shipped with these products, it is also possible to use desktop microphones or other setups that don't require daily assistance with setup.

Are dictation systems difficult to learn?

Not really, but it takes a week or two of frequent use for the computer to adjust to your voice and for you to adjust to the commands. This can be an overwhelming task for a person who is also trying to learn the basics of computer operation. Our staff has always seen better results when the user already has some knowledge of computer basics.

Can a computer lab use it with several students?

Some software products are designed for a single user, so be careful which version you are buying if this is an issue. These systems can be set up for multiple users, but be aware that each set of user files takes up several MB of storage space. To avoid filling up the hard drive, it may be necessary to limit the number of users and to erase files for students who have graduated.

If users have the software at home, can they copy their voice files to a work computer?

Yes, but the files are fairly large, so use a CD or memory stick.

What are the reading requirements?

Voice input systems do not have 100% recognition. They rely on the user being able to recognize when a word is incorrectly 'guessed' and make corrections. Often, if an error is not corrected, the computer will continue to substitute the wrong word and overall accuracy will get worse. Users who frequently miss errors, or forget to correct them, should look to see if the program has an 'adapt only on correction' option so that it does not learn from those mistakes. In general, a fourth grade reading level seems to be required.

Can a voice input system be used by people with visual impairments?

It is a common assumption that people with visual impairments would benefit from voice input because they do not need to look at the keys on a keyboard. This is incorrect. Due to the need to make corrections, people who are not able to see the monitor well enough to fix the computer's errors may have trouble using this type of technology. Although voice output technology can be combined with voice input technology to 'echo' each word that is spoken (e.g., JawBone product), this can be technically challenging and mentally fatiguing. We generally recommend that voice input be used by people with visual impairments only when the individual also has a physical disability that makes standard entry methods (such as touch typing) impossible.

Can a voice input system provide a transcript of a class for a person with a learning disability? Can a voice input system 'caption' things for a person who is deaf?

People are experimenting with both ideas. Voice input systems are speaker-dependent (the speaker must go through the training period on the computer), are usually only 95-98% accurate, and require slightly slower speech. Therefore, it is not currently advisable to attach a microphone to a speaker to try to produce a transcript or carry on a conversation. In addition, since punctuation must be dictated, any transcript that is attempted ends up as one long run-on sentence!
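To put the 95-98% accuracy figure in perspective, the following Python sketch estimates how many wrong words a lecture transcript might contain. The 50-minute class length and 120 wpm speaking rate are assumptions chosen for illustration, and `expected_errors` is a hypothetical helper name.

```python
def expected_errors(minutes: float, speaking_wpm: float, accuracy: float) -> int:
    """Estimate the number of misrecognized words in a transcript.

    minutes      -- length of the talk (assumed value in the example below)
    speaking_wpm -- speaker's rate in words per minute (assumed value)
    accuracy     -- fraction of words recognized correctly (0.0 to 1.0)
    """
    if minutes < 0 or speaking_wpm < 0:
        raise ValueError("minutes and speaking_wpm must not be negative")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")

    total_words = minutes * speaking_wpm
    return round(total_words * (1.0 - accuracy))


if __name__ == "__main__":
    for accuracy in (0.95, 0.98):
        errors = expected_errors(minutes=50, speaking_wpm=120, accuracy=accuracy)
        print(f"{accuracy:.0%} accurate: about {errors} wrong words")
```

With these assumed values, the class contains about 6,000 words, so even 98% accuracy leaves around 120 wrong words and 95% leaves around 300, all of which a reader would have to spot and mentally correct.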

Can a voice input system be used by people with speech impediments?

Since voice input systems learn how individuals pronounce various words, it is possible for a person with a speech impediment (or regional accent) to use this technology. The key is the consistency of how the words are pronounced. From our experience, the discrete speech system Dragon Dictate seems to be the easiest to force into accepting non-standard pronunciation. Even with Dragon Dictate, the training period will take longer and the low initial accuracy rates may be frustrating.

Can use of a voice input system cause laryngitis?

There have been reports of some voice input users developing voice problems, primarily with the older, discrete speech systems in which people had to pause between each word. Interestingly, these cases have usually been people who were using voice input technology because they had a repetitive strain injury such as carpal tunnel syndrome. The "rules" for using voice input technology safely appear to be the following:

  • Speak softly; don't yell at the microphone. Relax!
  • Sit up; do not lean forward, since that can decrease your lung capacity.
  • Take frequent, short breaks.
  • Drink liquids (one singer has suggested avoiding caffeine); don't wait for your throat to get dry.


Computing Out-Loud (Susan Fulton): Useful site from a long-time voice input user. Provides tips for optimum use of voice input software, not necessarily related to specific products.

Author: CATEA, Georgia Tech.