An overview on language models: Recent developments and outlook

C Wei, YC Wang, B Wang, CCJ Kuo - arXiv preprint arXiv:2303.05759, 2023 - arxiv.org
Language modeling studies the probability distribution over strings of text. It is one of the
most fundamental tasks in natural language processing (NLP) and has been widely used in …
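
As a brief aside (a minimal sketch in our own notation, not quoted from the survey), the standard autoregressive formulation assigns a probability to a token string by the chain rule, conditioning each token on its preceding context:

  P(w_1, \dots, w_T) = \prod_{t=1}^{T} P(w_t \mid w_1, \dots, w_{t-1})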

On the dangers of stochastic parrots: Can language models be too big?🦜

EM Bender, T Gebru, A McMillan-Major… - Proceedings of the 2021 …, 2021 - dl.acm.org
The past 3 years of work in NLP have been characterized by the development and
deployment of ever larger language models, especially for English. BERT, its variants, GPT …

Masked language model scoring

J Salazar, D Liang, TQ Nguyen, K Kirchhoff - arXiv preprint arXiv …, 2019 - arxiv.org
Pretrained masked language models (MLMs) require finetuning for most NLP tasks. Instead,
we evaluate MLMs out of the box via their pseudo-log-likelihood scores (PLLs), which are …
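
For context (a sketch in our own notation, not copied from the paper), the pseudo-log-likelihood of a sentence W = (w_1, \dots, w_T) under a masked language model is typically obtained by masking one token at a time and summing the resulting conditional log-probabilities:

  \mathrm{PLL}(W) = \sum_{t=1}^{T} \log P_{\mathrm{MLM}}(w_t \mid W_{\setminus t})

where W_{\setminus t} denotes the sentence with token w_t replaced by the mask token.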

Hyporadise: An open baseline for generative speech recognition with large language models

C Chen, Y Hu, CHH Yang… - Advances in …, 2024 - proceedings.neurips.cc
Advancements in deep neural networks have allowed automatic speech recognition (ASR)
systems to attain human parity on several publicly available clean speech datasets …

Pre-training transformers as energy-based cloze models

K Clark, MT Luong, QV Le, CD Manning - arXiv preprint arXiv:2012.08561, 2020 - arxiv.org
We introduce Electric, an energy-based cloze model for representation learning over text.
Like BERT, it is a conditional generative model of tokens given their contexts. However …

With a little help from my temporal context: Multimodal egocentric action recognition

E Kazakos, J Huh, A Nagrani, A Zisserman… - arXiv preprint arXiv …, 2021 - arxiv.org
In egocentric videos, actions occur in quick succession. We capitalise on the action's
temporal context and propose a method that learns to attend to surrounding actions in order …

Adapting GPT, GPT-2 and BERT language models for speech recognition

X Zheng, C Zhang, PC Woodland - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org
Language models (LMs) pre-trained on massive amounts of text, in particular bidirectional
encoder representations from Transformers (BERT), generative pre-training (GPT), and GPT …

Whispering LLaMA: A cross-modal generative error correction framework for speech recognition

S Radhakrishnan, CHH Yang, SA Khan… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce a new cross-modal fusion technique designed for generative error correction
in automatic speech recognition (ASR). Our methodology leverages both acoustic …

Causal mediation analysis for interpreting neural nlp: The case of gender bias

J Vig, S Gehrmann, Y Belinkov, S Qian, D Nevo… - arXiv preprint arXiv …, 2020 - arxiv.org
Common methods for interpreting neural models in natural language processing typically
examine either their structure or their behavior, but not both. We propose a methodology …

Fast end-to-end speech recognition via non-autoregressive models and cross-modal knowledge transferring from BERT

Y Bai, J Yi, J Tao, Z Tian, Z Wen… - IEEE/ACM Transactions …, 2021 - ieeexplore.ieee.org
Attention-based encoder-decoder (AED) models have achieved promising performance in
speech recognition. However, because the decoder predicts text tokens (such as characters …