Authors
Jianfeng Ren, Xudong Jiang, Junsong Yuan, Nadia Magnenat-Thalmann
Publication date
2016/10/18
Journal
IEEE Transactions on Multimedia
Volume
19
Issue
3
Pages
447-458
Publisher
IEEE
Description
Sound-event classification often utilizes time-frequency analysis, which produces an image-like spectrogram. Recent approaches such as spectrogram image features and subband power distribution image features extract the image local statistics such as mean and variance from the spectrogram. They have demonstrated good performance. However, we argue that such simple image statistics cannot well capture the complex texture details of the spectrogram. Thus, we propose to extract the local binary pattern (LBP) from the logarithm of the Gammatone-like spectrogram. However, the LBP feature is sensitive to noise. After analyzing the spectrograms of sound events and the audio noise, we find that the magnitude of pixel differences, which is discarded by the LBP feature, carries important information for sound-event classification. We thus propose a multichannel LBP feature via pixel difference quantization to …
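As a rough illustration of the starting point described above, the following is a minimal sketch of the basic 8-neighbour LBP computed on a 2-D log-spectrogram array. It is not the authors' implementation: the function names `lbp_codes` and `lbp_histogram`, the 3x3 neighbourhood, the bit ordering, and the `>=` comparison are all assumptions made for illustration, and the input is any 2-D array rather than a Gammatone-like spectrogram specifically. The multichannel variant based on pixel difference quantization is not sketched, since the description is truncated before its details.

```python
import numpy as np


def lbp_codes(image):
    """Return the 8-neighbour LBP code of every interior pixel of a 2-D array.

    Assumption: a plain 3x3 neighbourhood with bits ordered clockwise from
    the top-left neighbour; the ordering is only a convention.
    """
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    center = image[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dr, dc) in enumerate(offsets):
        # Shifted view of the image aligned with the interior pixels.
        neighbour = image[1 + dr:rows - 1 + dr, 1 + dc:cols - 1 + dc]
        # Only the sign of the difference is kept; its magnitude is discarded.
        codes |= (neighbour >= center).astype(np.int64) << bit
    return codes


def lbp_histogram(image):
    """Normalised 256-bin histogram of LBP codes, usable as a feature vector."""
    codes = lbp_codes(image)
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()


if __name__ == "__main__":
    # Hypothetical input: a random power spectrogram (frequency x time).
    rng = np.random.default_rng(0)
    power = rng.random((64, 100))
    log_spec = np.log(power + 1e-10)  # small offset avoids log(0)
    feature = lbp_histogram(log_spec)
    print(feature.shape)
```

Because each bit records only whether a neighbour is at least as large as the centre pixel, the size of the difference is lost, which is the information the description says is discarded by the LBP feature.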
Total citations
[Bar chart: citations per year, 2016 to 2024]
Scholar articles
J Ren, X Jiang, J Yuan, N Magnenat-Thalmann - IEEE Transactions on Multimedia, 2016