Local byte fusion for neural machine translation

A Thawani, S Ghanekar, X Zhu, J Pujara - arXiv preprint arXiv:2310.11628, 2023 - arxiv.org

Language models typically tokenize text into subwords, using a deterministic, hand-engineered
heuristic of combining characters into longer surface-level strings such as 'ing' or …

Cited by 2 - Related articles - All 6 versions

Neural machine translation of electrical engineering based on vector fusion

H Chen, Y Chen, J Zhang - Applied Sciences, 2023 - mdpi.com

The development of neural machine translation has achieved a good translation effect on
large-scale general corpora, but there are still many problems in the translation of low …

Cited by 3 - Related articles - All 4 versions

SpaceByte: Towards Deleting Tokenization from Large Language Modeling

K Slagle - arXiv preprint arXiv:2404.14408, 2024 - arxiv.org

Tokenization is widely used in large language models because it significantly improves
performance. However, tokenization imposes several disadvantages, such as performance …

Integrating Multi-scale Contextualized Information for Byte-based Neural Machine Translation

L Huang, Y Feng - arXiv preprint arXiv:2405.19290, 2024 - arxiv.org

Subword tokenization is a common method for vocabulary building in Neural Machine
Translation (NMT) models. However, increasingly complex tasks have revealed its …

[PDF] Manipulating Data Representations for Neural Machine Translation

C Amrhein - 2023 - zora.uzh.ch

In natural language processing, much current research focuses on training larger and larger
models on more and more data. In this thesis, we argue that how data is represented can …

Privacy in Federated Learning

M Zhang - 2024 - search.proquest.com

The rise of Artificial Intelligence technology has raised concerns about the potential
compromise of privacy due to the handling of personal data. Private AI prevents cybercrimes …

Subword embedding from bytes against embedding-based attacks

M Zhang, J Xu - openreview.net

NLP models have grown as a powerful technology and impact our social life like never
before, along with rising concerns in practical applications including privacy invasion and …
