This paper introduces wav2letter++, a fast open-source deep learning speech recognition framework. wav2letter++ is written entirely in C++, and uses the ArrayFire tensor library for …
T Besard, C Foket, B De Sutter - IEEE Transactions on Parallel …, 2018 - ieeexplore.ieee.org
GPUs and other accelerators are popular devices for accelerating compute-intensive, parallelizable applications. However, programming these devices is a difficult task. Writing …
Deep learning has had remarkable success in robotic perception, but its data-centric nature hampers generalization to ever-changing environments. By contrast, physics …
In-memory databases require careful tuning and many engineering tricks to achieve good performance. Such database performance engineering is hard: a plethora of data and …
As the computational requirements for machine learning systems and the size and complexity of machine learning frameworks increase, essential framework innovation has …
A common operation in many data analytics workloads is to find the top-k items, i.e., the largest or smallest items according to some sort order (implemented via LIMIT or …
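The top-k operation described in this snippet can be sketched in a few lines; this is an illustrative stand-in for the database operator (the function name and signature are assumptions, not from the paper), using a heap so only k items are kept in order:

```python
import heapq

def top_k(items, k, key=None, largest=True):
    """Return the k largest (or smallest) items under a sort order,
    as LIMIT k ... ORDER BY would in SQL. Uses a bounded heap, so it
    avoids fully sorting the input."""
    if largest:
        return heapq.nlargest(k, items, key=key)
    return heapq.nsmallest(k, items, key=key)

rows = [("a", 5), ("b", 9), ("c", 1), ("d", 7)]
best = top_k(rows, 2, key=lambda r: r[1])  # [("b", 9), ("d", 7)]
```

`heapq.nlargest`/`nsmallest` return results already sorted by the key, which matches the ORDER BY semantics the snippet alludes to.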
We demonstrate a high-performance vendor-agnostic method for massively parallel solving of ensembles of ordinary differential equations (ODEs) and stochastic differential equations …
G Kim, M Lee, J Jeong, J Kim - 2014 47th Annual IEEE/ACM …, 2014 - ieeexplore.ieee.org
GPUs are being widely used to accelerate different workloads, and multi-GPU systems can provide higher performance with multiple discrete GPUs interconnected. However …
Training large neural networks on big datasets requires significant computational resources and time. Transfer learning reduces training time by pre-training a base model on one …
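The pre-train-then-fine-tune idea in this snippet can be illustrated with a toy model (all names and the linear model here are illustrative assumptions, not from the paper): fit y = w*x + b on a source task, then adapt to a target task while the base weight w stays frozen.

```python
def mse_grads(params, xs, ys):
    """Gradients of mean squared error for the toy model y = w*x + b."""
    w, b = params
    n = len(xs)
    gw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    gb = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return [gw, gb]

def train(params, xs, ys, frozen=(), lr=0.05, steps=500):
    """Plain gradient descent that skips updates for frozen parameter
    indices -- the mechanism behind fine-tuning only part of a model."""
    for _ in range(steps):
        grads = mse_grads(params, xs, ys)
        params = [p if i in frozen else p - lr * g
                  for i, (p, g) in enumerate(zip(params, grads))]
    return params

xs = [0.0, 1.0, 2.0, 3.0]
base = train([0.0, 0.0], xs, [2 * x for x in xs])             # source task: y = 2x
tuned = train(base, xs, [2 * x + 1 for x in xs], frozen={0})  # target task: y = 2x + 1
```

Fine-tuning touches only the bias, so it converges with far fewer effective parameters than training from scratch, which is the time saving the snippet refers to.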