AUDITORY MODEL SIGNAL PROCESSING FOR SPEECH RECOGNITION

Award Information
Agency: Department of Defense
Branch: Defense Advanced Research Projects Agency
Contract: N/A
Agency Tracking Number: 13566
Amount: $46,000.00
Phase: Phase I
Program: SBIR
Awards Year: 1990
Solicitation Year: N/A
Solicitation Topic Code: N/A
Solicitation Number: N/A
Small Business Information
4487 Technology Dr, Fremont, CA, 94538
DUNS: N/A
HUBZone Owned: N
Woman Owned: N
Socially and Economically Disadvantaged: N
Principal Investigator
 Dr Stephen Gill
 (415) 490-7600
Business Contact
Phone: () -
Research Institution
N/A
Abstract
State-of-the-art voice recognition technology is based on matching spectral voice patterns (acoustic energy as a function of time and frequency). The signal processing requirements of spectral pattern matching are currently served by special-purpose filter banks or digital transforms performed on high-performance DSP chips. For several years, Votan has been performing an in-depth study of the human auditory system to obtain a better understanding of how signals are processed and speech features extracted by a human being. As part of this research, detailed mathematical models of the physics, chemistry, and neurophysiology of the auditory system have been developed and compared with available experimental data. This research has demonstrated that the signal processing and feature extraction processes in a human being are radically different from the spectral pattern approach of current voice recognition systems. The auditory system is extremely sensitive to features not present in the spectral pattern (principally phase and timing features), and conversely is insensitive to features that are prominent in the spectral pattern. These differences are of vital importance for accurate speech recognition. The objective of the proposed Phase I effort is to determine the feasibility of developing a preprocessor for speech recognition that incorporates, as accurately as possible, a model of the human auditory system. This preprocessor would perform the functions of the outer ear, the middle ear, the inner ear (cochlea), hair cell neural transduction, and neural signal processing in the cochlear nucleus. The output of the preprocessor would be acoustic features suitable for speech recognition systems using either conventional pattern matching techniques or the newer neural net techniques.
Anticipated Benefits/Potential Commercial Applications - Speech recognition is an extremely important area for both commercial and defense applications. Recognition accuracy, particularly for large-vocabulary, continuous, speaker-independent recognition over
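
The abstract's remark that phase features are absent from the spectral pattern can be illustrated with a small sketch. It is not part of the award record and does not represent Votan's models. It assumes a single analysis frame, a plain discrete Fourier transform, and two made-up test tones; the names `dft_magnitudes` and `two_tone` are hypothetical. The two frames differ only in the phase of one component, so their waveforms differ while their magnitude spectra should agree to within rounding error.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return the magnitude of each DFT bin; the phase of each bin is discarded."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(samples)))
        for k in range(n)
    ]


def two_tone(n, phase):
    """One frame holding harmonics 3 and 6; `phase` shifts only the second one."""
    return [
        math.sin(2 * math.pi * 3 * t / n)
        + math.sin(2 * math.pi * 6 * t / n + phase)
        for t in range(n)
    ]


N = 64  # illustrative frame length (an assumption, not from the abstract)
frame_a = two_tone(N, 0.0)
frame_b = two_tone(N, math.pi / 2)

mags_a = dft_magnitudes(frame_a)
mags_b = dft_magnitudes(frame_b)

# Compare the two frames in the spectral-magnitude domain and in the time domain.
max_spectral_diff = max(abs(a - b) for a, b in zip(mags_a, mags_b))
max_waveform_diff = max(abs(a - b) for a, b in zip(frame_a, frame_b))

print(f"largest magnitude-spectrum difference: {max_spectral_diff:.2e}")
print(f"largest waveform difference: {max_waveform_diff:.3f}")
```

A matcher that sees only `mags_a` and `mags_b` cannot tell these two frames apart, which is the kind of information loss the abstract attributes to spectral pattern matching. A real spectrogram uses many short overlapping frames and so retains coarse timing; this sketch covers only the single-frame case.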

* Information listed above is at the time of submission. *
