ETRI-Knowledge Sharing Plaform

ENGLISH

성과물

논문 검색
구분 SCI
연도 ~ 키워드

상세정보

학술지 Enhancement of Coded Speech Using Neural Network-Based Side Information
Cited 2 time in scopus Download 182 time Share share facebook twitter linkedin kakaostory
저자
황수중, 천영주, 한상욱, 장인선, 신종원
발행일
202109
출처
IEEE Access, v.9, pp.121532-121540
ISSN
2169-3536
출판사
IEEE
DOI
https://dx.doi.org/10.1109/ACCESS.2021.3108784
협약과제
21ZH1200, 초실감 입체공간 미디어·콘텐츠 원천기술연구, 이태진
초록
Audio codecs generate notable artifacts when operating at low bitrates, which degrade the quality of the coded audio significantly. There have been several approaches to enhance the quality of decoded signals with and without side information. While pre- or post-processing approaches without side information can be applied directly to existing systems without modifying codecs, approaches utilizing side information can further enhance the performance while maintaining backward-compatibility with existing codecs. In this paper, we propose a method to improve decoded signals using neural network-based side information. A neural network in the transmitter side that generates the side information and another neural network in the receiver side that estimates the log power spectra (LPS) of the original signal from the decoded signal and the side information are jointly trained to accurately reconstruct the original signal. In the same line with the analysis-by-synthesis, the neural network that generates the side information in the transmitter side takes not only the LPS of the original signal but also the LPS of the decoded signal as the input by decoding the encoded bitstream at the transmitter side. Experimental results show that the proposed audio codec enhancement scheme using neural network-based side information outperformed the audio codec enhancement without side information for the same codec operating at higher bitrates.
본 저작물은 크리에이티브 커먼즈 저작자 표시 - 비영리 - 변경금지 (CC BY NC ND) 조건에 따라 이용할 수 있습니다.
저작자 표시 - 비영리 - 변경금지 (CC BY NC ND)