During the last decade, Convolutional Neural Networks (CNNs) have become the de facto standard for various Computer Vision and Machine Learning operations. CNNs are feed …
RT Schirrmeister, JT Springenberg… - Human brain …, 2017 - Wiley Online Library
Deep learning with convolutional neural networks (deep ConvNets) has revolutionized computer vision through end‐to‐end learning, that is, learning from the raw data. There is …
We describe the latest version of Microsoft's conversational speech recognition system for the Switchboard and CallHome domains. The system adds a CNN-BLSTM acoustic model to …
WD Jin, J Xu, Q Han, Y Zhang… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Current RGB-D salient object detection (SOD) methods utilize the depth stream as complementary information to the RGB stream. However, the depth maps are usually of low …
S Tokui, K Oono, S Hido, J Clayton - Proceedings of workshop on …, 2015 - learningsys.org
Software frameworks for neural networks play key roles in the development and application of deep learning methods. However, as new types of deep learning models are developed …
Conversational speech recognition has served as a flagship speech recognition task since the release of the Switchboard corpus in the 1990s. In this paper, we measure the human …
Sequence-to-sequence models have shown success in end-to-end speech recognition. However these models have only used shallow acoustic encoder networks. In our work, we …
Layer-sequential unit-variance (LSUV) initialization-a simple method for weight initialization for deep net learning-is proposed. The method consists of the two steps. First, pre-initialize …
Learning acoustic models directly from the raw waveform data with minimal processing is challenging. Current waveform-based models have generally used very few (~ 2) …