Tuesday, July 28, 2009

Automatic Speech Recognition (ASR) Speech

Voice Recognition

Speech recognition (also referred to as voice recognition) is a process by which the elements of spoken language are recognized and analyzed, and the linguistic message they contain is transposed into a meaningful form so that a machine can respond correctly to spoken commands. Voice recognition is distinct from voice identification, which is the capability to identify a specific individual by comparing unknown recorded voices with known voice exemplars and noting similar and dissimilar characteristics.
The "holy grail" of ASR research is to allow a computer to recognize, in real time and with 100% accuracy, all words that are intelligibly spoken by any person, independent of vocabulary size, noise, speaker characteristics and accent, or channel conditions. Despite several decades of research in this area, accuracy greater than 90% is attained in commercial systems only when the task is constrained in some way.
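The text does not say how the accuracy percentages are measured. A common convention, assumed here, is word accuracy: one minus the word error rate (WER), where WER is the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. The sketch below is illustrative only; the function names `word_error_rate` and `word_accuracy` are hypothetical and do not come from any particular ASR product.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed convention: accuracy = 1 - WER (can go negative if WER > 1)."""
    return 1.0 - word_error_rate(reference, hypothesis)
```

For example, scoring the output "one two tree four" against the reference "one two three four" gives one substitution in four words, so a WER of 0.25 and a word accuracy of 0.75 under this convention.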
Different levels of performance can be attained by unclassified systems. Recognition accuracy for continuous digits over a microphone channel (small vocabulary, no noise) can exceed 99%. If the system is trained to learn an individual speaker's voice, then much larger vocabularies are possible, although accuracy drops to somewhere between 90% and 95% for commercially available systems. For large-vocabulary speech recognition of different speakers over different channels, accuracy in commercial systems is less than 90%, and processing can take hundreds of times real time.

Automatic Speech Recognition History
The earliest attempts to devise systems for automatic speech recognition by machine were made in the 1950s. Much of the early research leading to the development of speech activation and recognition technology was funded by the NSA, the NSF, and the Defense Department's DARPA. Much of that initial work, performed with NSA and NSF funding, was conducted in the 1980s.
Kurzweil was founded in 1982 and proposed to use its experience, industry knowledge, and market presence to leverage the production of the interface. In 1985, the company introduced the Kurzweil Voice System, the first 1,000-word discrete-speech recognizer. This interface, adaptable to many applications, allowed the user to control the application by voice without modifying the operating system or software. In 1987, Kurzweil introduced the first 20,000-word discrete-speech recognizer, which was incorporated into the Kurzweil Voice Report software and allowed users to create structured reports by voice.
