A Novel Speech Separation Approach for Enhanced Speaker Identification and Speech Recognition
Agency / Branch:
DOD / NAVY
Improving the performance of speaker identification, voiceprint matching, and speech recognition in noisy, cluttered environments (the multiple-speaker "cocktail party" setting) requires an integrated approach. In this project, we propose a novel approach that addresses this challenging problem in a unified framework. First, we propose to acquire speech signals with one or more microphones. The single-microphone case is the more challenging one under noisy conditions; with multiple microphones, much better Direction of Arrival (DOA) estimation and background noise suppression become possible, so the collected speech will have high SNR. Second, we propose to apply state-of-the-art speech separation techniques to separate voices in both the single-microphone and multiple-microphone cases. Third, we propose to apply the latest speech enhancement algorithms, including Minimum Mean Square Error (MMSE), Modified Phase Opponency (MPO), and possibly other methods, to remove any residual noise in the separated voice streams. Fourth, robust features based on Mel-Frequency Cepstral Coefficients (MFCC) will be extracted from the enhanced speech. Finally, Gaussian Mixture Models (GMM) and Hidden Markov Models (HMM) will be used to identify the speaker and recognize the speech, and the Dynamic Time Warping (DTW) technique will be used for voiceprint verification.
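As an illustration of the final verification step only, the following is a minimal sketch of the classic DTW alignment cost between two sequences of feature frames (for example, MFCC vectors). It is not the project's implementation: the function name `dtw_distance`, the Euclidean frame distance, and the unconstrained step pattern are all assumptions made for this example.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature frames.

    `a` and `b` are sequences of equal-dimension frames (tuples or lists
    of floats). Illustrative assumptions: Euclidean distance between
    frames and the basic step pattern (insertion, deletion, match).
    """
    n, m = len(a), len(b)
    # acc[i][j] = cheapest cost of aligning a[:i] with b[:j]
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = math.dist(a[i - 1], b[j - 1])  # local frame distance
            acc[i][j] = cost + min(
                acc[i - 1][j],      # frame of `a` repeated against b[j-1]
                acc[i][j - 1],      # frame of `b` repeated against a[i-1]
                acc[i - 1][j - 1],  # one-to-one match
            )
    return acc[n][m]


# Toy one-dimensional "frames": the second utterance stretches the
# middle frame in time, so the warped alignment has zero cost.
enrolled = [(1.0,), (2.0,), (3.0,)]
probe = [(1.0,), (2.0,), (2.0,), (3.0,)]
print(dtw_distance(enrolled, probe))
```

In a verification setting, a cost like this would typically be normalized by path length and compared against a decision threshold; both the normalization and the threshold would have to be tuned on enrollment data and are not specified in the abstract.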
Small Business Information at Submission:
SIGNAL PROCESSING, INC.
13619 Valley Oak Circle, Rockville, MD 20850
Number of Employees:
Research Institution Information:
Department of Electrical & Com
2405 A.V. Williams Bldg.
College Park, MD 20742
Nonprofit college or university