Spoken content retrieval—beyond cascading speech recognition with text retrieval

L Lee, J Glass, H Lee, C Chan - IEEE/ACM Transactions on …, 2015 - ieeexplore.ieee.org
Spoken content retrieval refers to directly indexing and retrieving spoken content based on
the audio rather than text descriptions. This potentially eliminates the requirement of …

Unsupervised feature learning based on deep models for environmental audio tagging

Y Xu, Q Huang, W Wang, P Foster… - … on Audio, Speech …, 2017 - ieeexplore.ieee.org
Environmental audio tagging aims to predict only the presence or absence of certain
acoustic events in the interested acoustic scene. In this paper, we make contributions to …

Spoken content retrieval: A survey of techniques and technologies

M Larson, GJF Jones - Foundations and Trends® in …, 2012 - nowpublishers.com
Speech media, that is, digital audio and video containing spoken content, has blossomed in
recent years. Large collections are accruing on the Internet as well as in private and …

Factorized hidden layer adaptation for deep neural network based acoustic modeling

L Samarakoon, KC Sim - IEEE/ACM Transactions on Audio …, 2016 - ieeexplore.ieee.org
In this paper, we propose the factorized hidden layer (FHL) approach to adapt the deep
neural network (DNN) acoustic models for automatic speech recognition (ASR). FHL aims at …

Speech-augmented cone-of-vision for exploratory data analysis

R Bovo, D Giunchi, L Sidenmark, J Newn… - Proceedings of the …, 2023 - dl.acm.org
Mutual awareness of visual attention is crucial for successful collaboration. Previous
research has explored various ways to represent visual attention, such as field-of-view …

Large scale language modeling in automatic speech recognition

C Chelba, D Bikel, M Shugrina, P Nguyen… - arXiv preprint arXiv …, 2012 - arxiv.org
Large language models have been proven quite beneficial for a variety of automatic speech
recognition tasks in Google. We summarize results on Voice Search and a few YouTube …

Methods, systems, and media for searching for video content

YY Liu - US Patent 9,672,280, 2017 - Google Patents
US9672280B2 - Methods, systems, and media for searching for video content - Google Patents

Faceted search and browsing of audio content on spoken web

M Diao, S Mukherjea, N Rajput… - Proceedings of the 19th …, 2010 - dl.acm.org
Spoken Web is a web of VoiceSites that can be accessed by a phone. The content in a
VoiceSite is audio. Therefore Spoken Web provides an alternate to the World Wide Web …

Podcast search: User goals and retrieval technologies

J Besser, M Larson, K Hofmann - Online information review, 2010 - emerald.com
Purpose–This research aims to identify users' goals and strategies when searching for
podcasts and their impact on the design of podcast retrieval technology. In particular, the …

Transcription of multi-genre media archives using out-of-domain data

PJ Bell, MJF Gales, P Lanchantin, X Liu… - 2012 IEEE Spoken …, 2012 - ieeexplore.ieee.org
We describe our work on developing a speech recognition system for multi-genre media
archives. The high diversity of the data makes this a challenging recognition task, which may …