Here's a telling tale of voice recognition gender-bending

When I first reviewed Voice Xpress Plus software from Lernout & Hauspie Speech
Products USA Inc. of Burlington, Mass. [GCN, June 22, 1998, Page 27],
it performed poorly even after hours of training in my voice patterns. It changed the
sentence, “There were no oranges, apples, grapes or pears in the snack bar” into
poorly punctuated gibberish: “There was no Oranges, Apple’s, rapist were pairs
in back bar.”


I gave it a D grade, which prompted a swarm of technicians and top brass from Lernout
& Hauspie to come see what was wrong.


They ran the standard product demo, making charts and graphs, dictating purchases of
Brazilian coffee stock at “five and one-half dollars per share—wait, show that
value in pesos” and performing other feats without ever touching the keyboard.


Then it was my turn.


We retreated from the conference room back to the GCN Lab, where I still had my
voice-trained software installed. A Lernout & Hauspie technician made careful
adjustments to the microphone. I took a deep breath and said, clearly and slowly:
“One small step for man, one giant leap for mankind.”


Voice Xpress Plus typed, “One squall stoop man, One leper mannequin.”


The company delegation was shocked. How could their product misunderstand the words so
badly? They huddled, they fiddled with RAM and settings, they turned knobs and tested the
room for noise. This went on for hours. In the end, though, I might as well have been
speaking Mandarin.


They all departed except for the chief technician, who was instructed to stay by my
side till the problem was fixed.


I don’t have an extra bedroom, and I’m not the world’s best cook, so I
was getting nervous that the technician might be around for a long time.


As I was estimating how many dinners I could squeeze out of a box of Tuna Helper, our
luck changed. The frustrated technician played around with the demographic questions the
package asks before training. I had assumed the questions had nothing to do with the
speech recognition capability, and I think he thought so too. We switched my profile to
that of a 15-year-old girl.


The machine stirred to life. Like that cute RCA dog hearing his master’s voice, my
system finally understood me. We tested several times, and the results fell into line with
the company’s claims of 85 percent to 90 percent recognition accuracy.


I made a suggestion to Lernout & Hauspie: Users train their machines for hours on
end anyhow, so the software should sample some of that voice data to determine the optimum
pitch for speech recognition, instead of relying on radio buttons that do not convey tone
and voice quality. Company officials recently told me they are incorporating such a change
into future versions.


Had the sampling information been in place during the initial review, I would have
graded the $99 product much higher. Based on my 15-year-old profile, I would give Voice
Xpress Plus a solid B+. It no longer puts any rapists in the snack bar.



About the Author

John Breeden II is a freelance technology writer for GCN.
