Acoustic Source Separation and Localization

Award Information
Agency: Department of Defense
Branch: Defense Advanced Research Projects Agency
Contract: W31P4Q-10-C-0041
Agency Tracking Number: 09SB2-0189
Amount: $98,971.00
Phase: Phase I
Program: SBIR
Solicitation Topic Code: SB092-009
Solicitation Number: 2009.2
Timeline
Solicitation Year: 2009
Award Year: 2009
Award Start Date (Proposal Award Date): 2010-01-06
Award End Date (Contract End Date): 2010-09-06
Small Business Information
3361 Rouse Rd. Suite 215
Orlando, FL 32817
United States
DUNS: 146066829
HUBZone Owned: No
Woman Owned: No
Socially and Economically Disadvantaged: No
Principal Investigator
Name: Rich Gombos
Title: Senior System Engineer
Phone: (407) 384-9956
Email: Rich.Gombos@Coleengineering.com
Business Contact
Name: Bryan Cole
Title: President
Phone: (407) 384-9956
Email: Bryan.Cole@Coleengineering.com
Research Institution
N/A
Abstract

This is a proposal to determine the technical feasibility of a system capable of separating and localizing intermixed sounds in an auditory scene. Our approach is to design and develop a computational auditory model that overcomes the inherent theoretical limits of the classic Fourier-based model. The Phase I effort will produce a requirements specification and design documentation for the lower two levels of a five-level computational auditory model, along with source code for the system components that we make operational. The objective five-level system is a real-time implementation of a model capable of waveform analysis analogous to that of the human ear. We call it the Waveform Information Vector (WIV) to Time-Space Translator, or “WIVEX” processor. Preliminary experiments have demonstrated the feasibility of parts of the model to encode and extract, in real time, meaningful information directly from the signal waveform. For example, it can separate environmental sounds of all kinds, including speech, while determining their individual directions of arrival. Our Phase I research objective is to formalize the requirements and design specifications for a real-time system through analysis, design, and prototyping of components. Phase I will conclude with the delivery of the necessary artifacts and demonstration of specific auditory functions (see Table 1) that are not now, and probably never will be, achievable with Fourier-based technology. All demonstrations are intended to run in real time, synchronously with the input signal.
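The binaural direction-of-arrival capability described above can be illustrated with a standard interaural-time-difference (ITD) estimator. The proposal does not publish the WIVEX algorithms, so the following is a generic cross-correlation sketch, not the proposed method; the function names, the 0.21 m ear spacing, and the free-field angle model are illustrative assumptions.

```python
import numpy as np

def estimate_itd(left, right, fs):
    """Estimate the interaural time difference (seconds) between two
    channels by locating the peak of their cross-correlation.
    A positive value means the left channel lags the right (the source
    is nearer the right ear). Illustrative only, not the WIVEX method."""
    corr = np.correlate(left, right, mode="full")
    # index (len(right) - 1) corresponds to zero lag
    lag = np.argmax(corr) - (len(right) - 1)
    return lag / fs

def doa_angle(itd, ear_spacing=0.21, c=343.0):
    """Convert an ITD to an approximate azimuth (radians) using the
    simple free-field model sin(theta) = c * itd / ear_spacing."""
    return np.arcsin(np.clip(c * itd / ear_spacing, -1.0, 1.0))
```

A broadband source delayed by a few samples in one channel produces a sharp correlation peak at that lag; the WIVEX approach claims to obtain this kind of cue instantaneously from the waveform rather than from block correlation.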
Table 1: Phase I Auditory Functions
1) Pitch detection in speech, music, or any other tonal acoustic source
2) Instantaneous direction of arrival used to separate sound sources via binaural perception
3) Instantaneous monaural separation of sources by recognizing patterns in waveshape zero intervals
4) Phonetic segmentation of speech and environmental sources by pattern similarities
5) Meaningful labeling of waveshape components
6) Replication of psychoacoustic experiments in two-tone interference not explainable by current auditory theory
7) Demonstration of autonomic source selection, according to attention priority, from a background of mixed signal sources

Collectively, successful completion of these tasks should go a long way toward confirming the technical feasibility of the WIVEX as the basis of a more relevant auditory model. As such, it could yield a compelling theory for understanding the biophysical functions of the auditory pathways not just of humans but of the entire animal kingdom. It will be shown that the operational functions and processing components of this model are analogous to neurological capabilities and require only fundamental algorithms and mathematical computations. The proposed model resembles the auditory system of the animal kingdom in that it is built as an evolutionary hierarchy of processing levels: it begins at a low level by extracting primitive meaning such as direction of arrival, amplitude, and simple encoded waveshape features, then progresses upward through five stages of cognitive perception, culminating in complex aspects of human linguistic and emotional communication. These functions will be carried out in real time, synchronized with the incoming signal waveform. Thus it is possible to isolate and understand the basic auditory functions while at the same time peeling off highly useful applications.
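Functions 1 and 3 in Table 1 both rest on the idea that the spacing of waveform zero crossings carries pitch and source-identity information. As a hedged illustration of that idea only (the actual WIVEX zero-interval pattern recognition is not published), a minimal zero-crossing pitch estimator for a clean tonal signal might look like:

```python
import numpy as np

def pitch_from_zero_intervals(x, fs):
    """Rough pitch estimate (Hz) from the spacing of positive-going zero
    crossings. Works only for clean, roughly periodic signals; a sketch
    of the general idea, not the proposal's method."""
    neg = np.signbit(x)
    # samples where the waveform crosses from negative to non-negative
    crossings = np.flatnonzero(neg[:-1] & ~neg[1:])
    if len(crossings) < 2:
        return None
    # average zero-interval length in samples gives the fundamental period
    period = np.diff(crossings).mean()
    return fs / period
```

On a pure tone this recovers the pitch closely; separating mixed sources by their zero-interval patterns, as function 3 proposes, would require the additional pattern-recognition machinery the abstract describes.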

* Information listed above is at the time of submission. *
