Masked particle modeling on sets: Towards self-supervised high energy physics foundation models

L Heinrich, T Golling, M Kagan, S Klein… - Machine Learning …, 2024 - iopscience.iop.org
We propose masked particle modeling (MPM) as a self-supervised method for learning
generic, transferable, and reusable representations on unordered sets of inputs for use in …
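
The MPM entry above centres on masking particles in an unordered set and recovering them; the following is a minimal sketch of that corruption step only, assuming particles are rows of continuous features. The function name, feature count, and 40% mask fraction are illustrative, not taken from the paper.

```python
# Hedged sketch of the masking step behind masked-particle-style pre-training:
# hide a random subset of a jet's particles and keep them as regression targets.
import numpy as np

rng = np.random.default_rng(0)

def mask_particle_set(particles, mask_frac=0.4, mask_token=0.0):
    """Replace a random subset of particles with a mask token.

    particles: (n_particles, n_features) array for one jet.
    Returns the corrupted set, the boolean mask, and the hidden targets.
    """
    n = particles.shape[0]
    n_mask = max(1, int(mask_frac * n))
    idx = rng.choice(n, size=n_mask, replace=False)
    mask = np.zeros(n, dtype=bool)
    mask[idx] = True
    corrupted = particles.copy()
    corrupted[mask] = mask_token  # a set-based encoder is then trained to predict the hidden rows
    return corrupted, mask, particles[mask]

# Toy jet: 6 particles, 3 features each (e.g. pt, eta, phi)
jet = rng.normal(size=(6, 3))
corrupted, mask, targets = mask_particle_set(jet)
print(mask, targets.shape)
```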

GIVT: Generative infinite-vocabulary transformers

M Tschannen, C Eastwood, F Mentzer - arXiv preprint arXiv:2312.02116, 2023 - arxiv.org
We introduce generative infinite-vocabulary transformers (GIVT) which generate vector
sequences with real-valued entries, instead of discrete tokens from a finite vocabulary. To …
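
The snippet above contrasts real-valued vector outputs with discrete tokens from a finite vocabulary. As a rough illustration of that contrast (the single-Gaussian head below is a stand-in; it is not GIVT's exact output parameterisation, and all dimensions are made up), compare the two output heads:

```python
# Hedged sketch: finite-vocabulary softmax head vs. a real-valued ("infinite
# vocabulary") head that predicts a continuous distribution over token vectors.
import torch
import torch.nn as nn

d_model, vocab_size, token_dim = 256, 32000, 16

# Finite vocabulary: project to logits, softmax over discrete tokens.
discrete_head = nn.Linear(d_model, vocab_size)

# Real-valued tokens: predict mean and log-variance of a continuous vector
# instead of logits over a fixed codebook.
continuous_head = nn.Linear(d_model, 2 * token_dim)

h = torch.randn(1, d_model)                      # hidden state for one position
probs = torch.softmax(discrete_head(h), dim=-1)  # (1, vocab_size)
mean, log_var = continuous_head(h).chunk(2, dim=-1)
sample = mean + torch.randn_like(mean) * (0.5 * log_var).exp()  # (1, token_dim)
```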

Vector Quantization for Recommender Systems: A Review and Outlook

Q Liu, X Dong, J Xiao, N Chen, H Hu, J Zhu… - arXiv preprint arXiv …, 2024 - arxiv.org
Vector quantization, renowned for its unparalleled feature compression capabilities, has
been a prominent topic in signal processing and machine learning research for several …

EdVAE: Mitigating codebook collapse with evidential discrete variational autoencoders

G Baykal, M Kandemir, G Unal - Pattern Recognition, 2024 - Elsevier
Codebook collapse is a common problem in training deep generative models with discrete
representation spaces like Vector Quantized Variational Autoencoders (VQ-VAEs). We …
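
Since codebook collapse is the central issue in this entry, a minimal sketch of the VQ-VAE quantization step and of the usage statistic where collapse becomes visible may help; codebook size, dimensions, and batch size below are illustrative, and the straight-through estimator and commitment losses are omitted.

```python
# Hedged sketch of vector quantization: map encoder outputs to their nearest
# codebook entries, then check how many codes are actually used.
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.normal(size=(512, 64))      # 512 code vectors of dimension 64

def quantize(z):
    """Assign each encoder output to its nearest codebook entry (L2 distance)."""
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)  # (n, 512)
    idx = d.argmin(axis=1)
    return codebook[idx], idx

z = rng.normal(size=(256, 64))             # a batch of encoder outputs
z_q, idx = quantize(z)

# Codebook collapse shows up as only a handful of entries ever being selected,
# i.e. low active-code count and low codebook perplexity.
usage = np.bincount(idx, minlength=len(codebook)) / len(idx)
p = usage[usage > 0]
perplexity = np.exp(-(p * np.log(p)).sum())
print(f"active codes: {(usage > 0).sum()} / {len(codebook)}, perplexity: {perplexity:.1f}")
```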

Learning to act without actions

D Schmidt, M Jiang - arXiv preprint arXiv:2312.10812, 2023 - arxiv.org
Pre-training large models on vast amounts of web data has proven to be an effective
approach for obtaining powerful, general models in several domains, including language …

VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling

S Li, Z Wang, Z Liu, D Wu, C Tan, J Zheng… - arXiv preprint arXiv …, 2024 - arxiv.org
Similar to natural language models, pre-trained genome language models are proposed to
capture the underlying intricacies within genomes with unsupervised sequence modeling …

VQ-NeRV: A Vector Quantized Neural Representation for Videos

Y Xu, X Feng, F Qin, R Ge, Y Peng, C Wang - arXiv preprint arXiv …, 2024 - arxiv.org
Implicit neural representations (INR) excel in encoding videos within neural networks,
showcasing promise in computer vision tasks like video compression and denoising. INR …

OmniJet-α: The first cross-task foundation model for particle physics

J Birk, A Hallin, G Kasieczka - arXiv preprint arXiv:2403.05618, 2024 - arxiv.org
Foundation models are multi-dataset and multi-task machine learning methods that, once pre-trained, can be fine-tuned for a large variety of downstream applications. The successful …

Learning the Language of Protein Structure

B Gaujac, J Donà, L Copoiu, T Atkinson… - arXiv preprint arXiv …, 2024 - arxiv.org
Representation learning and de novo generation of proteins are pivotal
computational biology tasks. Whilst natural language processing (NLP) techniques have …

UniAudio 1.5: Large Language Model-driven Audio Codec is A Few-shot Audio Task Learner

D Yang, H Guo, Y Wang, R Huang, X Li, X Tan… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have demonstrated supreme capabilities in text
understanding and generation, but cannot be directly applied to cross-modal tasks without …