Lessons on parameter sharing across layers in transformers

S Takase, S Kiyono - arXiv preprint arXiv:2104.06022, 2021 - arxiv.org
We propose a parameter sharing method for Transformers (Vaswani et al., 2017). The
proposed approach relaxes a widely used technique that shares the parameters of one layer …
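
As a rough illustration of the idea, here is a minimal PyTorch sketch of cyclic cross-layer sharing, where a few distinct layers are reused across many layer positions; the class name and sharing strategy are illustrative assumptions, not the authors' code.

import torch.nn as nn

class SharedLayerEncoder(nn.Module):
    def __init__(self, d_model=512, nhead=8, num_layers=6, num_unique=3):
        super().__init__()
        # Instantiate only num_unique distinct Transformer layers ...
        self.unique_layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            for _ in range(num_unique)
        )
        # ... and reuse them cyclically across num_layers positions.
        # Setting num_unique=1 recovers all-layer sharing (ALBERT-style).
        self.assignment = [i % num_unique for i in range(num_layers)]

    def forward(self, x):  # x: (batch, seq, d_model)
        for i in self.assignment:
            x = self.unique_layers[i](x)
        return x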

Rethinking perturbations in encoder-decoders for fast training

S Takase, S Kiyono - arXiv preprint arXiv:2104.01853, 2021 - arxiv.org
We often use perturbations to regularize neural models. For neural encoder-decoders,
previous studies applied scheduled sampling (Bengio et al., 2015) and adversarial …
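
For context, scheduled sampling mixes gold tokens with the model's own predictions during training. A minimal PyTorch sketch, assuming a two-pass decoder that first produces logits over the sequence (the interface is hypothetical):

import torch

def mix_decoder_inputs(gold, logits, p):
    # gold: (batch, len) gold input tokens; logits: (batch, len, vocab)
    # from a first decoding pass. With probability p per position, feed
    # the model's own prediction instead of the gold token.
    preds = logits.argmax(dim=-1)
    use_pred = torch.rand(gold.shape, device=gold.device) < p
    return torch.where(use_pred, preds, gold)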

CipherDAug: Ciphertext based data augmentation for neural machine translation

N Kambhatla, L Born, A Sarkar - arXiv preprint arXiv:2204.00665, 2022 - arxiv.org
We propose a novel data-augmentation technique for neural machine translation based on
ROT-$k$ ciphertexts. ROT-$k$ is a simple letter substitution cipher that replaces a letter in …
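
Since ROT-$k$ itself is easy to state precisely, a minimal Python implementation of the cipher (independent of the paper's code) looks like this: each letter is replaced by the letter k positions later in the alphabet, wrapping around.

def rot_k(text: str, k: int) -> str:
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord('a') if ch.islower() else ord('A')
            out.append(chr((ord(ch) - base + k) % 26 + base))
        else:
            out.append(ch)  # leave digits, punctuation, spaces untouched
    return "".join(out)

assert rot_k("hello", 13) == "uryyb"  # ROT-13: applying it twice round-trips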

Embedding compression with hashing for efficient representation learning in large-scale graph

CCM Yeh, M Gu, Y Zheng, H Chen, J Ebrahimi… - Proceedings of the 28th …, 2022 - dl.acm.org
Graph neural networks (GNNs) are deep learning models designed specifically for graph
data, and they typically rely on node features as the input to the first layer. When applying …
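
The general hashing trick for compressing a large embedding table can be sketched in PyTorch as follows; the bucket count, number of hash functions, and multiplicative hashing scheme are generic placeholders, not the paper's exact method.

import torch
import torch.nn as nn

class HashEmbedding(nn.Module):
    def __init__(self, num_buckets=10000, dim=64, num_hashes=2):
        super().__init__()
        # One small shared table replaces a huge per-ID embedding matrix.
        self.table = nn.Embedding(num_buckets, dim)
        self.num_buckets = num_buckets
        # Random odd multipliers act as cheap hash functions.
        self.register_buffer("mults", torch.randint(1, 2**31 - 1, (num_hashes,)) | 1)

    def forward(self, ids):  # ids: LongTensor of arbitrary node IDs
        hashed = (ids.unsqueeze(-1) * self.mults) % self.num_buckets
        # Look up one row per hash function and sum them.
        return self.table(hashed).sum(dim=-2)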

Efficient estimation of influence of a training instance

S Kobayashi, S Yokoi, J Suzuki, K Inui - arXiv preprint arXiv:2012.04207, 2020 - arxiv.org
Understanding the influence of a training instance on a neural network model helps
improve interpretability. However, it is difficult and inefficient to evaluate the influence …

Large Vocabulary Size Improves Large Language Models

S Takase, R Ri, S Kiyono, T Kato - arXiv preprint arXiv:2406.16508, 2024 - arxiv.org
This paper empirically investigates the relationship between subword vocabulary size and
the performance of large language models (LLMs) to provide insights on how to define the …
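
A simple way to run such a comparison is to train subword tokenizers at several vocabulary sizes and hold everything else fixed; a sketch with the HuggingFace tokenizers library, where the corpus path and the size grid are placeholder assumptions:

from tokenizers import Tokenizer, models, trainers, pre_tokenizers

for vocab_size in (8_000, 32_000, 128_000):
    tok = Tokenizer(models.BPE(unk_token="[UNK]"))
    tok.pre_tokenizer = pre_tokenizers.Whitespace()
    trainer = trainers.BpeTrainer(vocab_size=vocab_size, special_tokens=["[UNK]"])
    tok.train(files=["corpus.txt"], trainer=trainer)  # corpus.txt is hypothetical
    tok.save(f"bpe_{vocab_size}.json")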

TokenRec: Learning to Tokenize ID for LLM-based Generative Recommendation

H Qu, W Fan, Z Zhao, Q Li - arXiv preprint arXiv:2406.10450, 2024 - arxiv.org
There is growing interest in utilizing large language models (LLMs) to advance next-
generation recommender systems (RecSys), driven by their outstanding language …

Extract, Select and Rewrite: A Modular Sentence Summarization Method

S Guan, V Padmakumar - Proceedings of the 4th New Frontiers in …, 2023 - aclanthology.org
A modular approach has the advantage of being compositional and controllable compared
to most end-to-end models. In this paper we propose Extract-Select-Rewrite (ESR), a three …
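
A toy, runnable skeleton of a three-stage extract-select-rewrite pipeline is sketched below; the stage implementations are trivial placeholders standing in for the paper's trained components.

def extract(sentence: str) -> list[str]:
    # Placeholder "extract" stage: split into clause-like chunks on commas.
    return [c.strip() for c in sentence.split(",") if c.strip()]

def select(chunks: list[str], k: int = 1) -> list[str]:
    # Placeholder "select" stage: keep the k longest chunks as most contentful.
    return sorted(chunks, key=len, reverse=True)[:k]

def rewrite(chunks: list[str]) -> str:
    # Placeholder "rewrite" stage: verbalize the selected content.
    return " ".join(chunks).rstrip(".") + "."

def summarize(sentence: str) -> str:
    return rewrite(select(extract(sentence)))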

Automatic text summarization using transformer-based language models

R Rao, S Sharma, N Malik - International Journal of System Assurance …, 2024 - Springer
Automatic text summarization is a thriving field in natural language processing (NLP). The
amount of data in circulation has multiplied with the switch to digital media. These massive datasets hold a …
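
A minimal example of transformer-based abstractive summarization with the HuggingFace transformers pipeline; the model choice here is illustrative, not the one the paper evaluates.

from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
result = summarizer("Long article text goes here ...", max_length=60, min_length=10)
print(result[0]["summary_text"])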

Faculty of Mechanical Engineering and Informatics Extended Sentence Parsing Method for Text-to-Semantic Application

WT Sewunetie - 2024 - geik.uni-miskolc.hu
The field of Natural Language Processing (NLP) has advanced significantly in recent years,
resulting in the creation of sophisticated systems that demonstrate a deep comprehension of human …