IBM polishes gem of a speech recognition app
For more than a decade, continuous speech
recognition has seemed a foregone conclusion. Throw in enough research and computing
muscle, and it's just a matter of time.
As a leader in the charge, IBM Corp. has put in the most effort. It shows in ViaVoice,
the successor to IBM's Simply Speaking [GCN, Sept. 8, Page 36]. It recognizes continuous
speech and--equally important--works with leading word processors.
Natural dictation is great, but if you can't manipulate the text once it's on the
screen, what's the point?
IBM's ViaVoice is available to Lotus SmartSuite 97 and SmartSuite 98 users as an
The standalone version works hand in glove with Microsoft Word 97, Word 7.0 and Word
A recent Test Drive of Dragon Systems Inc.'s NaturallySpeaking [GCN, Aug. 25, Page 1]
lauded its on-screen voice editing, which ViaVoice does not have. What ViaVoice does have
is excellent recognition and the ability to read a document back to you.
Teaching the program to recognize your voice is arduous. A first phase helps the
software decipher most of your speech, and a second phase fine-tunes.
IBM's goal is a recognition accuracy of at least 85 percent.
I found that the second phase not only improved recognition but also helped me learn
the dos and don'ts of dictation.
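To put the 85 percent accuracy goal in concrete terms, here is a minimal Python sketch of one common way to score a dictation test: count the word-level edits needed to turn what the software typed into what you actually said, then divide by the number of words dictated. The function names and sample sentences are hypothetical, and nothing here is drawn from IBM's own measurement method, which the documentation does not describe.

```python
def word_errors(reference, hypothesis):
    """Word-level edit distance: substitutions + insertions + deletions."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] is the distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word dropped by the recognizer
                            curr[j - 1] + 1,      # extra word inserted
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Share of dictated words recognized correctly, as a fraction of 1.0."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference must contain at least one word")
    return 1.0 - word_errors(reference, hypothesis) / ref_len


# Hypothetical dictation test: ten words spoken, one misrecognized.
spoken = "the quick brown fox jumps over the lazy dog today"
typed = "the quick brown fox jumps over a lazy dog today"
accuracy = word_accuracy(spoken, typed)
print(f"accuracy: {accuracy:.0%}, meets 85 percent goal: {accuracy >= 0.85}")
```

By this measure, an 85 percent score still means roughly one wrong word in every seven, which is why the second training phase matters.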
People use the terms "continuous" and "natural" speech
interchangeably, but they're different, just as dictation for transcription is different
from conversational speech.
If you're not familiar with dictation, expect a moderate learning curve. The
documentation stated that training was unnecessary. I disagree. It takes practice to use
this package, but the rewards in accuracy make it worth the trouble.
Once you take it through the two learning phases, the software is ready to go. In the
standalone version I tested with Word 97, ViaVoice loaded whenever I started Word.
I'd like the option of starting Word with or without ViaVoice loading. Even when voice
recognition becomes commonplace, people likely won't run it all the time.
If you're working on an unusual document or the office suddenly gets noisy, you'll want
to revert to the keyboard. You can do that with ViaVoice loaded, of course, but it takes
up a healthy chunk of resources that can't be spared when running Word 97.
I tested ViaVoice on a 200-MHz Dell Computer Corp. OptiPlex Pentium Pro running Windows
NT 4.0 with 48M of RAM. Although the software was responsive, my rock-bottom recommendation
would be a 150-MHz Pentium MMX or a 166-MHz Pentium.
The ViaVoice speaker and microphone headset worked well.
Dictating with the headset was comfortable, and new possibilities were obvious--for
instance, combining voice recognition with computer telephony.
One improvement would be a kill switch on the microphone, such as those on tape
dictation devices, to manage interruptions efficiently. You could buy such a headset on
your own, but IBM should bundle dictation-style components if it wants to make this a
Unlike similar products, ViaVoice lets you finish dictating before making corrections.
You can complete your train of thought without stopping every sentence for some minor fix.
As you use ViaVoice, it gets better at understanding you. It starts with a 22,000-word
vocabulary, expandable to 64,000 with your additions. It recognizes words in context,
which prevents many common errors.
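How recognizing words in context can prevent errors is easiest to see with sound-alike words. The Python sketch below is a toy illustration only: it assumes a tiny table of word-pair counts (the numbers are invented) and picks whichever homophone most often appears next to its neighbors. IBM does not say how ViaVoice does this, so the table, the names and the scoring rule are all hypothetical.

```python
# Toy word-pair counts; the figures are invented for illustration only.
BIGRAM_COUNTS = {
    ("over", "there"): 40,
    ("over", "their"): 2,
    ("their", "house"): 35,
    ("there", "house"): 1,
}

# Words that sound alike and are therefore candidates for the same utterance.
HOMOPHONES = {
    "there": ["there", "their"],
    "their": ["there", "their"],
}


def pick_word(previous_word, heard_word, next_word):
    """Choose the sound-alike spelling that best fits the neighboring words."""
    candidates = HOMOPHONES.get(heard_word, [heard_word])

    def score(candidate):
        # Add up how often the candidate follows the previous word
        # and how often it precedes the next word.
        return (BIGRAM_COUNTS.get((previous_word, candidate), 0)
                + BIGRAM_COUNTS.get((candidate, next_word), 0))

    return max(candidates, key=score)


print(pick_word("over", "their", "now"))   # context favors "there"
print(pick_word("in", "there", "house"))   # context favors "their"
```

A word with no sound-alikes in the table passes through unchanged, so the context check only steps in where the audio alone is ambiguous.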
The feature that makes ViaVoice unique, however, is its text-to-speech function. I use
it for proofreading documents.
As people read, they have a natural tendency to skip over errors, which makes it
difficult to check one's writing. Listening to a different voice read aloud as you follow
along is a great way to spot your spelling, grammar and punctuation mistakes.
You can modify the reading voice, changing characteristics such as sex, age, speed and
inflection to fit your preferences.
Besides proofreading, I expect to discover other uses such as sending and receiving
e-mail by phone. That's the tip of the iceberg.
Voice recognition will force some changes in how you work, but ViaVoice makes the
transition fairly painless. The frustrating thing is that it's limited to dictation at the
moment. Next year's release of ViaVoice Gold will bring voice-controlled system and
editing tools. Offices with multiple word processors probably should wait for that
One complaint is that ViaVoice has the clunky, utilitarian interface common to many IBM
software products. It's a minor concern, though, because most interaction with the program
will be through the word processor.
ViaVoice is available separately for Microsoft Word or as part of an upgrade to
SmartSuite 97. The Gold edition will be available standalone as well as part of SmartSuite