An audio indexing system for election video material

L Lee, J Glass, H Lee, C Chan - IEEE/ACM Transactions on …, 2015 - ieeexplore.ieee.org

Spoken content retrieval refers to directly indexing and retrieving spoken content based on
the audio rather than text descriptions. This potentially eliminates the requirement of …

Cited by 129 - Related articles - All 12 versions

[PDF] ieee.org

Unsupervised feature learning based on deep models for environmental audio tagging

Y Xu, Q Huang, W Wang, P Foster… - … on Audio, Speech …, 2017 - ieeexplore.ieee.org

Environmental audio tagging aims to predict only the presence or absence of certain
acoustic events in the interested acoustic scene. In this paper, we make contributions to …

Cited by 96 - Related articles - All 8 versions

[PDF] nowpublishers.com

Spoken content retrieval: A survey of techniques and technologies

M Larson, GJF Jones - Foundations and Trends® in …, 2012 - nowpublishers.com

Speech media, that is, digital audio and video containing spoken content, has blossomed in
recent years. Large collections are accruing on the Internet as well as in private and …

Cited by 107 - Related articles - All 15 versions

Factorized hidden layer adaptation for deep neural network based acoustic modeling

L Samarakoon, KC Sim - IEEE/ACM Transactions on Audio …, 2016 - ieeexplore.ieee.org

In this paper, we propose the factorized hidden layer (FHL) approach to adapt the deep
neural network (DNN) acoustic models for automatic speech recognition (ASR). FHL aims at …

Cited by 74 - Related articles - All 4 versions

[PDF] acm.org

Speech-augmented cone-of-vision for exploratory data analysis

R Bovo, D Giunchi, L Sidenmark, J Newn… - Proceedings of the …, 2023 - dl.acm.org

Mutual awareness of visual attention is crucial for successful collaboration. Previous
research has explored various ways to represent visual attention, such as field-of-view …

Cited by 3 - Related articles - All 6 versions

[PDF] arxiv.org

Large scale language modeling in automatic speech recognition

C Chelba, D Bikel, M Shugrina, P Nguyen… - arXiv preprint arXiv …, 2012 - arxiv.org

Large language models have been proven quite beneficial for a variety of automatic speech
recognition tasks in Google. We summarize results on Voice Search and a few YouTube …

Cited by 65 - Related articles - All 11 versions

[PDF] googleapis.com

Methods, systems, and media for searching for video content

YY Liu - US Patent 9,672,280, 2017 - Google Patents

US9672280B2 - Methods, systems, and media for searching for video content - Google Patents

Cited by 40 - Related articles - All 4 versions

Faceted search and browsing of audio content on spoken web

M Diao, S Mukherjea, N Rajput… - Proceedings of the 19th …, 2010 - dl.acm.org

Spoken Web is a web of VoiceSites that can be accessed by a phone. The content in a
VoiceSite is audio. Therefore Spoken Web provides an alternate to the World Wide Web …

Cited by 50 - Related articles - All 2 versions

Podcast search: User goals and retrieval technologies

J Besser, M Larson, K Hofmann - Online information review, 2010 - emerald.com

Purpose–This research aims to identify users' goals and strategies when searching for
podcasts and their impact on the design of podcast retrieval technology. In particular, the …

Cited by 41 - Related articles - All 8 versions

[PDF] ed.ac.uk

Transcription of multi-genre media archives using out-of-domain data

PJ Bell, MJF Gales, P Lanchantin, X Liu… - 2012 IEEE Spoken …, 2012 - ieeexplore.ieee.org

We describe our work on developing a speech recognition system for multi-genre media
archives. The high diversity of the data makes this a challenging recognition task, which may …

Cited by 41 - Related articles - All 12 versions
