Journal Article
Speech Enhancement Using Phase-Dependent A Priori SNR Estimator in Log-Mel Spectral Domain
Lee Yun Kyung, Park Jeon Gue, Lee Yunkeun, Kwon Oh-Wook (권오욱)
Issue Date
ETRI Journal, v.36 no.5, pp.721-729
Electronics and Telecommunications Research Institute (ETRI)
Project Code
14MS1500, Development of dialog-based spontaneous speech interface technology on mobile platform, Lee Yunkeun
We propose a novel phase-based method for single-channel speech enhancement to extract and enhance the desired signals in noisy environments by utilizing the phase information. In the method, a phase-dependent a priori signal-to-noise ratio (SNR) is estimated in the log-mel spectral domain to utilize both the magnitude and phase information of input speech signals. The phase-dependent estimator is incorporated into the conventional magnitude-based decision-directed approach that recursively computes the a priori SNR from noisy speech. Additionally, we reduce the performance degradation caused by the one-frame delay of the estimated phase-dependent a priori SNR by using minimum mean square error (MMSE)-based and maximum a posteriori (MAP)-based estimators. In our speech enhancement experiments, the proposed phase-dependent a priori SNR estimator is shown to improve the output SNR by 2.6 dB for both the MMSE-based and MAP-based estimator cases as compared to a conventional magnitude-based estimator.
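For readers unfamiliar with the conventional magnitude-based decision-directed approach that the abstract builds on, the following is a minimal sketch of that baseline only. It does not reproduce the proposed phase-dependent log-mel estimator or the MMSE/MAP delay compensation. The function name `decision_directed_snr`, the smoothing factor `alpha=0.98`, the floor `xi_min`, the Wiener gain used to form the previous-frame clean-speech estimate, and the first-frame initialization are all illustrative assumptions, not details taken from the article.

```python
def decision_directed_snr(noisy_power, noise_power, alpha=0.98, xi_min=1e-3):
    """Sketch of a magnitude-based decision-directed a priori SNR estimate.

    noisy_power: list of frames, each a list of per-bin noisy power |Y|^2.
    noise_power: list of per-bin noise power estimates (assumed stationary).
    Returns a list of frames of a priori SNR estimates (linear scale).
    """
    prev_clean_power = None  # clean-speech power estimate from the previous frame
    xi_frames = []
    for frame in noisy_power:
        xi_frame, clean_power = [], []
        for k, (y2, d2) in enumerate(zip(frame, noise_power)):
            gamma = y2 / d2                  # a posteriori SNR
            ml_term = max(gamma - 1.0, 0.0)  # instantaneous (maximum-likelihood) term
            if prev_clean_power is None:
                # Assumption: the first frame has no history, so use the ML term alone.
                xi = ml_term
            else:
                # Recursive average of the previous-frame estimate and the ML term.
                xi = alpha * prev_clean_power[k] / d2 + (1.0 - alpha) * ml_term
            xi = max(xi, xi_min)             # floor to limit musical noise
            gain = xi / (1.0 + xi)           # Wiener gain (illustrative choice)
            xi_frame.append(xi)
            clean_power.append(gain * gain * y2)
        prev_clean_power = clean_power
        xi_frames.append(xi_frame)
    return xi_frames


if __name__ == "__main__":
    frames = [[4.0, 1.0], [4.0, 1.0]]  # two frames, two frequency bins
    print(decision_directed_snr(frames, [1.0, 1.0]))
```

Because each frame's estimate reuses the clean-speech estimate from the previous frame, the result lags the true SNR by about one frame, which is the delay the abstract refers to.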
KSP Keywords
Frame delay, Magnitude and phase, Minimum Mean Square Error (MMSE), Output SNR, Phase information, Phase-based, Signal noise ratio (SNR), Speech Signals, decision-directed approach, map-based, maximum a posteriori