Talk to Me

 

Speech recognition apps boost accuracy and do more than take dictation

By Mark A. Kellner
Special to GCN

The idea of speech recognition conjures up the romantic notion of commanding a computer a la 'Star Trek' or the 'you talk, it types' image of early advertising for the software.

The other side of the subject may be less pleasant but in some respects more significant. Speech recognition software is often needed when typists and other frequent users of computer keyboards suffer repetitive stress injuries (RSI). When typing suddenly becomes painful, the need to use voice recognition software to control a computer and input copy becomes a reality.

Although some reports indicate that the number of RSI cases has fallen since the early 1990s, the medical clinic at the Massachusetts Institute of Technology reports seeing 300 patients a year complaining of RSI, suggesting that the problem is not going away.

In either scenario, the development and improvement of speech recognition software has been, and remains, important. Not only are there many users who look forward to a time when they can control their PCs with voice commands, but the underlying technology is important to other uses.

With the Internet spreading to handheld telephones, and personal digital assistants moving in as extensions of the enterprise, speech recognition takes on a whole new aura. It also can become a factor for agencies trying to meet Section 508 requirements for making information technology accessible to people with disabilities.

The rising importance of speech recognition comes at a time when the products are more powerful and more affordable.

Discrete and concrete

'About 10 years ago, voice recognition and speech was a very expensive solution, because it was driven by hardware,' said Krishna Nathan, director of consumer voice recognition systems for IBM Voice Systems in West Palm Beach, Fla. 'That product was sold very specifically to certain businesses and [reseller] channels; it was a product that cost $1,300. The product was "discrete" technology, where a user would have to pause between speaking each word.'

'Since then, desktop processing power has progressed dramatically, and the cost has come down a lot,' Nathan said. Now, speech recognition is 'all software. We've done away with hardware,' he said.

Today, the new frontiers for speech recognition lie, in part at least, in improving microphone technology. The leading companies in the field are either offering or planning products that work with 'array' microphones, which lie flat on a monitor or a user's desk, eliminating the need for a headset.

Another thing to look forward to later this year is the introduction of microphones that connect to portable PCs and desktop PCs via a Universal Serial Bus port. It is hoped these will deliver better sound quality and, thus, better speech recognition.

Even Linux is being included. IBM is introducing a Linux version of its ViaVoice speech recognition software, and both IBM Corp. and Lernout & Hauspie Inc. (L&H) offer software developer kits to put voice features into Linux applications.

What's more, Linux and other small-kernel operating systems could figure in bringing speech recognition to handheld platforms. Earlier this year, L&H demonstrated a prototype wireless device that has an Intel StrongARM processor and runs Linux. The prototype can check for, read and respond to e-mail by voice command and provide access to Web content.

Merge in Boston

There has never been a better time to think about using speech recognition software on the job. But, at a time of perhaps the greatest technical advances in speech recognition software, the market is being consolidated. Sort of.

On June 7, two leaders in the field became one: L&H, which has a good chunk of the market, finalized its acquisition of another speech software sultan, Dragon Systems Inc.

The move increases L&H's product line dramatically, boosts the merged company's market share, and leaves IBM Corp. as the other principal force in the marketplace.

For the next six to nine months, L&H and Dragon will operate with separate offices, Web sites and product lines. The next revision of Dragon's NaturallySpeaking software will bow this fall, and L&H just brought out Version 5.0 of Voice Xpress, its flagship product.

According to Bill DeStefanis, senior director of product management in L&H's PC Applications Division, L&H will 'keep [current] development plans in place' at both companies.

'Certainly for the next generation of products, we will start looking at potentially integrating the best of both applications,' he said.

The combination of the two companies is aimed beyond the desktop PC market, he said.

'If you look at speech more broadly than what's in the retail channel, some of the biggest opportunities are in embedded applications,' DeStefanis said. 'They [Dragon] have some of the best talent working on embedded technologies. The combination will allow us to bring products to market more quickly.'

The speech recognition software field may be the Rodney Dangerfield of applications: It gets no respect. At least, it has not in recent years. PC Data, a Reston, Va., company that tracks sales, had no speech recognition products in its top-20 sales list for April 2000.

Users of earlier versions of programs such as Dragon NaturallySpeaking and Voice Xpress often griped about the time it took to train a system to recognize the way a user spoke.

A lot of effort has gone into making the programs easier to start off with and use. According to DeStefanis, such improvements were by design.

'A new user can, in under 15 minutes, install, enroll [their voice] and have a successful experience with the software in that first half-hour exposure. As with any new technology, the first impressions are very important. People make up minds based on their first minutes, and we try to focus on that experience,' he said.

Beyond making it easier to use the software, makers are working to build greater intelligence into their speech recognition programs, IBM's Nathan said.

'A lot of what we're doing is focusing on the usability and productivity aspects,' Nathan said. 'We now get into the notion of browsing through the Internet, how you make searches less painful. There's a sister problem called natural language understanding, which is just what it says. Taking transcribed speech and acting on it. "When's next flight to Albuquerque?" You want [the software] not to type but to have the computer act on it. That's the next step, because it makes interaction more natural.'

[Image: Lernout & Hauspie's Voice Xpress includes a menu of tools and tips for improving accuracy.]

Disfluency solved

The new version of Voice Xpress displays some of the other advancements in speech recognition. The software offers significantly improved accuracy and usability, and support for e-mail, Internet browsing and other applications. Also included is a technology called Nothing But Speech, which the company says is a new disfluency filter that eliminates the 'ahh' and 'umm' sounds users make while speaking that can increase errors in dictation.

Many of these improvements come from user feedback. Hank Pokigo, product manager for Voice Xpress 5, said other improvements are aimed at making it easier to navigate the software and computer with voice commands.

The new version of Voice Xpress, for example, includes 'a sample command screen that fights the "blank screen syndrome,"' Pokigo said.

Although users understood the concept of text dictation, they needed to think about how to issue commands. 'With sample commands,' he added, 'users have a list of the top 20 or 30 commands in front of them; it's helpful to have that hand-holding right with them.'

Along with customer research, Pokigo said, the company values the new generation of processors from Intel Corp. and Advanced Micro Devices Inc. of Sunnyvale, Calif.

'We work directly with Intel and AMD to make sure we're keeping up to speed,' Pokigo said. 'It's very good for speech in general that AMD and Intel have pushed the processor. Pentium III instruction sets directly enhance the ability of speech recognition to get things done faster; because we can do things faster, it increases accuracy.'

IBM's Nathan said: 'The key thing with speech is it's come a long way. It is worth giving it a try. Second is that the nature of its use is evolving, and the applications are evolving. We used to be all about dictation on the desktop, but now there's also telephony, Internet and mobile applications.'

Mark A. Kellner is a free-lance technology writer in Marina Del Rey, Calif. Contact him via e-mail at mark@kellner2000.com.