A Novel Speech Separation Approach for Enhanced Speaker Identification and Speech Recognition
Agency / Branch:
DOD / NAVY
Improving the performance of speaker identification and speech recognition requires an integrated approach. We propose a novel approach that addresses the challenges involved within a unified framework. First, we propose to use microphone arrays (1-D, 2-D or 3-D) to acquire the speech signals. The arrays provide better Direction of Arrival (DOA) estimation and improve background noise suppression, so the collected speech will have a high signal-to-noise ratio (SNR). Second, we propose to apply the latest speech enhancement algorithm developed by our subcontractor at the University of Maryland (UM). The algorithm is based on Modified Phase Opponency (MPO) and does not require noise estimation. Third, each separated speech stream may still contain regions with poor SNR, so we propose to apply a spectrogram reconstruction algorithm to repair those regions. Fourth, robust features based on Mel-frequency Cepstral Coefficients (MFCC) will be extracted from the repaired spectrogram. Finally, Gaussian Mixture Models (GMM) and Hidden Markov Models (HMM) will be used to identify the speaker and recognize the speech.
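As a rough illustration of the first step only, the sketch below shows how a microphone array can suppress background noise once the direction of arrival is known. It uses a textbook delay-and-sum beamformer on a uniform linear array, which is an assumption made for illustration and not necessarily the array processing used in the proposed work. The function names, the array geometry, the speed of sound, and the synthetic test signal are all hypothetical.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 C (assumed)


def steering_delays(num_mics, spacing_m, angle_deg, sample_rate):
    """Per-microphone arrival delay, in whole samples, for a far-field source.

    angle_deg is measured from broadside of a uniform linear array;
    microphone 0 is the reference and always gets delay 0.
    """
    step = spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND
    return [round(m * step * sample_rate) for m in range(num_mics)]


def simulate_array(source, delays, noise_std, rng):
    """Hypothetical test data: delayed source plus independent Gaussian noise."""
    n = len(source)
    channels = []
    for d in delays:
        channel = []
        for t in range(n):
            clean = source[t - d] if 0 <= t - d < n else 0.0
            channel.append(clean + rng.gauss(0.0, noise_std))
        channels.append(channel)
    return channels


def delay_and_sum(channels, delays):
    """Undo each channel's delay, then average the aligned samples."""
    n = len(channels[0])
    output = []
    for t in range(n):
        total, count = 0.0, 0
        for channel, d in zip(channels, delays):
            idx = t + d  # sample of this channel that carries source[t]
            if 0 <= idx < n:
                total += channel[idx]
                count += 1
        output.append(total / count if count else 0.0)
    return output


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    rng = random.Random(0)
    rate = 16000
    source = [math.sin(2 * math.pi * 440 * t / rate) for t in range(2000)]
    delays = steering_delays(8, 0.05, 30.0, rate)
    channels = simulate_array(source, delays, 0.5, rng)
    enhanced = delay_and_sum(channels, delays)
    print("single-mic MSE:", mean_squared_error(channels[0], source))
    print("beamformed MSE:", mean_squared_error(enhanced, source))
```

Because the noise in this simulation is independent across microphones, averaging the aligned channels lowers the noise power while leaving the speech component intact, which is the sense in which an array raises the SNR. Integer-sample delays keep the sketch short; a practical system would likely use fractional delays or frequency-domain steering.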
Small Business Information at Submission:
SIGNAL PROCESSING, INC.
13619 Valley Oak Circle ROCKVILLE, MD 20850
Number of Employees:
Research Institution Information:
UNIV. OF MARYLAND
Department of Electrical and
College Park, MD 20742
Nonprofit college or university