A Novel Unsupervised Audio Clustering Approach in Noisy Environments

Award Information
Agency:
Department of Defense
Branch:
Navy
Amount:
$80,000.00
Award Year:
2012
Program:
SBIR
Phase:
Phase I
Contract:
N00014-12-M-0037
Agency Tracking Number:
N112-163-0451
Solicitation Year:
2011
Solicitation Topic Code:
N112-163
Solicitation Number:
2011.2
Small Business Information
SIGNAL PROCESSING, INC.
Rockville, MD 20850-3563
Hubzone Owned:
N
Socially and Economically Disadvantaged:
Y
Woman Owned:
Y
Duns:
620282256
Principal Investigator:
Chiman Kwan
Chief Technology Officer
(240) 505-2641
chiman.kwan@signalpro.net
Business Contact:
Chihwa Yung
Chief Operations Officer
(301) 315-2322
chihwa.yung@signalpro.net
Research Institution:
Stub




Abstract
Detecting conversations in a noisy environment is challenging. We propose a novel three-stage framework for audio clustering. First, we apply computational auditory scene analysis (CASA) as a front end to separate speech from non-speech background noise. Inspired by auditory perception, CASA typically segregates speech from noise by producing a binary time-frequency (T-F) mask, which is then used to reconstruct clean speech. Second, because the reconstructed speech may contain more than one speaker's voice, we propose an unsupervised audio clustering approach to separate the speakers. Unreliable T-F units in simultaneous streams are reconstructed using a speech prior, and cepstral features are then derived for clustering. We search for the two clusters that exhibit the largest speaker difference, measured by the trace of the ratio of the between-cluster scatter matrix to the within-cluster scatter matrix. A genetic algorithm (GA) is employed to speed up the search. Third, after extracting the audio stream of each speaker, we apply the latest speaker identification algorithm developed by our team to each separated voice stream. A robust algorithm is needed at this stage because residual noise may remain in the separated voice streams.
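
The clustering criterion in the abstract can be illustrated with a small sketch. It scores one candidate two-cluster labelling by tr(S_w^-1 S_b), a common way to express the ratio of between-cluster to within-cluster scatter; whether the award uses exactly this form is an assumption. The function names (`trace_criterion`, `solve`) and the toy 2-D vectors standing in for cepstral features are hypothetical and are not taken from the award.

```python
from typing import List, Sequence

Vector = Sequence[float]
Matrix = List[List[float]]


def mean(rows: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def add_outer(acc: Matrix, v: Vector, weight: float = 1.0) -> None:
    """Accumulate weight * v v^T into acc in place."""
    for i, a in enumerate(v):
        for j, b in enumerate(v):
            acc[i][j] += weight * a * b


def solve(a: Matrix, b: Matrix) -> Matrix:
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(a[i]) + list(b[i]) for i in range(n)]  # augmented [A | B]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("within-cluster scatter matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]


def trace_criterion(features: Sequence[Vector], labels: Sequence[int]) -> float:
    """Return tr(S_w^-1 S_b) for a labelling with cluster ids 0 and 1."""
    d = len(features[0])
    overall = mean(features)
    s_w = [[0.0] * d for _ in range(d)]  # within-cluster scatter
    s_b = [[0.0] * d for _ in range(d)]  # between-cluster scatter
    for k in (0, 1):
        members = [x for x, lab in zip(features, labels) if lab == k]
        if not members:
            raise ValueError("each cluster needs at least one member")
        mu = mean(members)
        for x in members:
            add_outer(s_w, [xi - mi for xi, mi in zip(x, mu)])
        add_outer(s_b, [mi - oi for mi, oi in zip(mu, overall)], len(members))
    x = solve(s_w, s_b)
    return sum(x[i][i] for i in range(d))


# Toy data: two well-separated groups of three frames each (assumed values).
features = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
            (5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]
separated = trace_criterion(features, [0, 0, 0, 1, 1, 1])
mixed = trace_criterion(features, [0, 1, 0, 1, 0, 1])
print(separated > mixed)
```

A labelling that matches the true grouping scores higher than one that mixes the groups. The number of possible labellings grows exponentially with the number of frames, which is consistent with the abstract's choice of a genetic algorithm over exhaustive search.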

* information listed above is at the time of submission.
