Speech Enhancement Based on Auditory Coding of Voiced Signals
Phone: (315) 443-9749
Email: carney@alum.mit.edu
Phone: (301) 405-8861
Email: nadia.sadmulaire@omni-speech.com
Address:
Type: Nonprofit College or University
DESCRIPTION (provided by applicant): The greatest challenge to auditory communication is background noise, especially in the complex acoustic environments that are experienced in daily life. This project proposes to develop a novel speech enhancement algorithm that is robust in the presence of everyday environmental interference. The algorithm is inspired by recent findings related to the neural coding of vowels in the midbrain. Recent work from our group has shown that the formant frequencies of voiced sounds are encoded by the brain on the basis of changes in low-frequency fluctuations related to voice pitch. These responses are established in the auditory periphery, and they are transformed into a robust representation of formant frequencies at the level of the auditory midbrain, wherein neurons are exquisitely sensitive to rate fluctuations in the voice-pitch frequency range. This representation of formants is degraded in the presence of background noise by the inherent fluctuations introduced by the noise masker. It is possible, however, to detect and track formants in the presence of noise with a streamlined auditory model, using a strategy based on these pitch-related fluctuations. We are taking advantage of this finding to develop an algorithm that restores the pattern of fluctuations across frequency channels, even in the presence of noise, for listeners with or without hearing loss. This restoration is accomplished by identifying and manipulating the rate fluctuations across the population of frequency channels in a manner that enhances the representation of speech. The strategy involves identifying the low formant frequencies (F1, F2, and F3), identifying the pitch (F0), and amplifying a single harmonic of F0 near each formant peak. This amplification restores the neural representation of noisy speech toward the response to speech in quiet. The speech enhancement algorithm requires a pitch extraction mechanism that operates reliably within background noise. For this project, the Carney Lab has teamed with OmniSpeech, Inc., which is developing such a pitch extraction mechanism. This project is focused upon refinement of the formant tracking and harmonic identification algorithm, using speech with known F0 and testing preference and intelligibility of processed vs. unprocessed speech in the presence of noise; refinement of the pitch extraction algorithm in the presence of noise; and the combination of the new pitch extraction and speech enhancement mechanisms. Tests will include listeners with normal hearing and listeners with mild to moderate sensorineural hearing loss. The overall goal of this Phase I project is a proof-of-concept test of the novel speech enhancement algorithm. Phase II will leverage the experience OmniSpeech is gaining in bringing speech technology to market in both cloud and embedded devices.
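As a rough illustration of the harmonic-selection step described above (pick the single harmonic of the pitch, F0, nearest each formant peak and amplify it), a minimal sketch in Python follows. It is not the applicants' implementation: the function names, the 6 dB boost, the F0 value, and the formant values are all assumptions chosen for illustration, and the pitch and formant estimates are taken as already known.

```python
def nearest_harmonic(f0_hz, formant_hz):
    """Return the harmonic number k (>= 1) whose frequency k * f0_hz
    lies closest to formant_hz."""
    if f0_hz <= 0:
        raise ValueError("f0_hz must be positive")
    return max(1, round(formant_hz / f0_hz))


def harmonic_gains(f0_hz, formants_hz, n_harmonics, boost_db=6.0):
    """Return one linear gain per harmonic (index 0 is the fundamental).

    Every harmonic keeps unit gain except the single harmonic nearest
    each formant, which is boosted by boost_db (an assumed value).
    """
    gains = [1.0] * n_harmonics
    boost = 10.0 ** (boost_db / 20.0)  # dB to linear amplitude ratio
    for formant in formants_hz:
        k = nearest_harmonic(f0_hz, formant)
        if k <= n_harmonics:
            gains[k - 1] = boost
    return gains


# Hypothetical inputs: a 100 Hz voice pitch and three formant estimates.
f0 = 100.0
formants = [520.0, 1490.0, 2510.0]

selected = [nearest_harmonic(f0, f) for f in formants]
gains = harmonic_gains(f0, formants, n_harmonics=30)
```

In this sketch the gains would be applied to the corresponding harmonic components of the noisy speech. How those components are isolated, and how F0 and the formants are estimated in noise, is the part the project itself sets out to develop.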
PUBLIC HEALTH RELEVANCE: The public health significance of the proposed work is that it will develop a speech enhancement algorithm that will assist listeners in challenging acoustic environments. Difficulty hearing in noise is the most significant problem for all listeners, including listeners with hearing loss. In this study, we will develop a novel speech enhancement algorithm and test its feasibility in listeners with and without hearing loss.
* Information listed above is at the time of submission. *