Robust sound event classification using deep neural networks

I McLoughlin, H Zhang, Z Xie, Y Song… - IEEE/ACM Transactions …, 2015 - ieeexplore.ieee.org
The automatic recognition of sound events by computers is an important aspect of emerging
applications such as automated surveillance, machine hearing and auditory scene …

Robust audio event recognition with 1-max pooling convolutional neural networks

H Phan, L Hertel, M Maass, A Mertins - arXiv preprint arXiv:1604.06338, 2016 - arxiv.org
We present in this paper a simple yet efficient convolutional neural network (CNN)
architecture for robust audio event recognition. In contrast to deep CNN architectures with …

Image feature representation of the subband power distribution for robust sound event classification

J Dennis, HD Tran, ES Chng - IEEE Transactions on Audio …, 2012 - ieeexplore.ieee.org
The ability to automatically recognize a wide range of sound events in real-world conditions
is an important part of applications such as acoustic surveillance and machine hearing. Our …

Sound event recognition in unstructured environments using spectrogram image processing

JW Dennis - 2014 - dr.ntu.edu.sg
The objective of this research is to develop feature extraction and classification techniques
for the task of sound event recognition (SER) in unstructured environments. Although this …

Using audio-visual features for robust voice activity detection in clean and noisy speech

I Almajai, B Milner - 2008 16th European Signal Processing …, 2008 - ieeexplore.ieee.org
The aim of this work is to utilize both audio and visual speech information to create a robust
voice activity detector (VAD) that operates in both clean and noisy speech. A statistical …

Single and multi-channel approaches for distant speech recognition under noisy reverberant conditions: I2R'S system description for the ASpIRE challenge

J Dennis, TH Dat - 2015 IEEE Workshop on Automatic Speech …, 2015 - ieeexplore.ieee.org
In this paper, we introduce the system developed at the Institute for Infocomm Research
(I2R) for the ASpIRE (Automatic Speech recognition In Reverberant Environments) challenge …

Analysis of correlation between audio and visual speech features for clean audio feature prediction in noise

I Almajai, B Milner, J Darch - Interspeech, 2006 - Citeseer
The aim of this work is to examine the correlation between audio and visual speech features.
The motivation is to find visual features that can provide clean audio feature estimates which …

Predicting formant frequencies from MFCC vectors [speech recognition applications]

J Darch, B Milner, X Shao, S Vaseghi… - … .(ICASSP'05). IEEE …, 2005 - ieeexplore.ieee.org
This work proposes a novel method of predicting formant frequencies from a stream of mel-
frequency cepstral coefficients (MFCC) feature vectors. Prediction is based on modelling the …

Maximising audio-visual speech correlation

I Almajai, B Milner - AVSP, 2007 - academia.edu
The aim of this work is to investigate a selection of audio and visual speech features in
order to find pairs that maximise audio-visual correlation. Two audio speech features have …

Visually-derived Wiener filters for speech enhancement

I Almajai, B Milner, J Darch… - 2007 IEEE International …, 2007 - ieeexplore.ieee.org
This work begins by examining the correlation between audio and visual speech features
and reveals higher correlation to exist within individual phoneme sounds rather than …