A Novel Unsupervised Audio Clustering Approach in Noisy Environments

Award Information

Agency:
Department of Defense
Branch:
N/A
Award ID:
Program Year/Program:
2012 / SBIR
Agency Tracking Number:
N112-163-0451
Solicitation Year:
2011
Solicitation Topic Code:
N112-163
Solicitation Number:
2011.2
Small Business Information
SIGNAL PROCESSING, INC.
9700 Great Seneca Highway Rockville, MD -
Woman-Owned: No
Minority-Owned: No
HUBZone-Owned: No
 
Phase 1
Fiscal Year: 2012
Title: A Novel Unsupervised Audio Clustering Approach in Noisy Environments
Agency: DOD
Contract: N00014-12-M-0037
Award Amount: $80,000.00
 

Abstract:

Detecting conversations in a noisy environment is challenging. We propose a novel three-stage framework for audio clustering. First, we propose to apply computational auditory scene analysis (CASA) as a front end to separate speech signals from non-speech background noise. Inspired by auditory perception, CASA typically segregates speech from noise by producing a binary time-frequency (T-F) mask, which is then used to reconstruct clean speech. Second, since the reconstructed speech may contain more than one speaker's voice, we propose an unsupervised audio clustering approach to perform speech separation. Unreliable T-F units in simultaneous streams are reconstructed using a speech prior, and cepstral features are subsequently derived for clustering. We search for the two clusters exhibiting the largest speaker difference, measured by the trace of the ratio of the between-cluster scatter matrix to the within-cluster scatter matrix. To speed up the search, a genetic algorithm (GA) is employed. Third, after extracting the audio stream of each speaker, we propose to apply the latest robust speaker identification algorithm developed by our team to each separated voice stream. A robust algorithm is needed because residual noise may remain in the separated voice streams.
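
The clustering criterion mentioned in the abstract can be illustrated with a minimal sketch. This is not the awardee's implementation: the function name `scatter_ratio`, the use of NumPy, and the choice of trace(inv(S_w) S_b) as the reading of "trace of the between- and within-cluster scatter matrix ratio" are all assumptions made for illustration. The genetic-algorithm search itself is omitted; the function only scores one candidate assignment of feature vectors (for example, cepstral features) to clusters.

```python
import numpy as np

def scatter_ratio(features, labels):
    """Score a cluster assignment as trace(pinv(S_w) @ S_b).

    Hypothetical helper: features is an (n_samples, n_dims) array,
    labels is an (n_samples,) array of cluster ids.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    overall_mean = features.mean(axis=0)
    dim = features.shape[1]
    s_within = np.zeros((dim, dim))
    s_between = np.zeros((dim, dim))
    for k in np.unique(labels):
        members = features[labels == k]
        mean_k = members.mean(axis=0)
        centered = members - mean_k
        # Within-cluster scatter: spread of members around their own mean.
        s_within += centered.T @ centered
        # Between-cluster scatter: cluster mean offset, weighted by size.
        diff = (mean_k - overall_mean)[:, None]
        s_between += len(members) * (diff @ diff.T)
    # The pseudo-inverse guards against a singular within-cluster scatter.
    return float(np.trace(np.linalg.pinv(s_within) @ s_between))
```

Under these assumptions, a search procedure such as a genetic algorithm would propose candidate label vectors and keep those with the highest score; a labelling that separates two speakers well should score higher than one that mixes them.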

Principal Investigator:

Chiman Kwan
Chief Technology Officer
(240) 505-2641
chiman.kwan@signalpro.net

Business Contact:

Chihwa Yung
Chief Operations Officer
(301) 315-2322
chihwa.yung@signalpro.net
Small Business Information at Submission:

SIGNAL PROCESSING, INC.
13619 Valley Oak Circle ROCKVILLE, MD -

EIN/Tax ID: 134320631
DUNS: N/A
Number of Employees:
Woman-Owned: Yes
Minority-Owned: Yes
HUBZone-Owned: No