Large language models (LLMs) have proven remarkably effective, both across a wide range of natural language processing tasks and well beyond them. However, a …
H Li, M Wang, S Lu, X Cui, PY Chen - High-dimensional Learning …, 2024 - openreview.net
Chain-of-Thought (CoT) is an effective prompting method that elicits the reasoning abilities of large language models by augmenting the query using multiple examples with …
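The snippet above describes CoT prompting as augmenting the query with worked examples. A minimal sketch of that construction, with purely illustrative example problems (the specific questions and format are assumptions, not taken from the cited paper):

```python
# Illustrative worked examples whose answers include intermediate reasoning
# steps; in CoT prompting these are prepended to the actual query.
COT_EXAMPLES = [
    ("Tom has 3 apples and buys 2 more. How many apples does he have?",
     "Tom starts with 3 apples. Buying 2 more gives 3 + 2 = 5. The answer is 5."),
    ("A train travels 60 km in 1 hour. How far does it go in 3 hours?",
     "The train covers 60 km per hour. Over 3 hours that is 60 * 3 = 180 km. The answer is 180."),
]

def build_cot_prompt(query: str) -> str:
    """Concatenate worked examples (with reasoning) before the new query."""
    parts = []
    for question, reasoning in COT_EXAMPLES:
        parts.append(f"Q: {question}\nA: {reasoning}")
    parts.append(f"Q: {query}\nA:")  # the model continues from here
    return "\n\n".join(parts)

prompt = build_cot_prompt("Sara has 5 pens and gives away 2. How many remain?")
```

The exemplars' reasoning traces are what distinguishes CoT from plain few-shot prompting: the model is nudged to produce intermediate steps before its final answer.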
The transformer architecture has catalyzed revolutionary advances in language modeling. However, recent architectural recipes, such as state-space models, have bridged the …
AAK Julistiono, DA Tarzanagh, N Azizan - arXiv preprint arXiv:2410.14581, 2024 - arxiv.org
Attention mechanisms have revolutionized several domains of artificial intelligence, such as natural language processing and computer vision, by enabling models to selectively focus …
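The "selective focus" the snippet above attributes to attention mechanisms can be sketched as scaled dot-product attention; this pure-Python version with toy 2-dimensional inputs is a simplification for illustration, not any specific paper's implementation:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    """Each query attends over all keys; output is a weighted average of values."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        w = softmax(scores)  # attention weights over key positions
        outputs.append([sum(wi * v[j] for wi, v in zip(w, values))
                        for j in range(len(values[0]))])
    return outputs

# A query aligned with the first key puts most weight on the first value.
out = attention([[1.0, 0.0]],
                [[1.0, 0.0], [0.0, 1.0]],
                [[10.0, 0.0], [0.0, 10.0]])
```

Because the weights form a probability distribution over positions, the output is a convex combination of the values, which is the "selective focus" behavior the snippet refers to.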
In recent years, transformer-based models have revolutionized deep learning, particularly in sequence modeling. To better understand this phenomenon, there is a growing interest in …
S Chevalier, D Starkenburg, K Dvijotham - arXiv preprint arXiv:2408.10491, 2024 - arxiv.org
In the field of formal verification, neural networks (NNs) are typically reformulated into equivalent mathematical programs, which are then optimized over. To overcome the inherent non …
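The snippet above describes verifying NNs by optimizing over an equivalent mathematical program, whose nonconvexity is typically handled via relaxations. One standard relaxation, interval bound propagation, can be sketched as follows; the weights and input box are illustrative assumptions, not drawn from the cited work:

```python
def ibp_linear(lo, hi, W, b):
    """Elementwise bounds for y = W x + b given lo <= x <= hi (elementwise)."""
    new_lo, new_hi = [], []
    for row, bias in zip(W, b):
        # Positive weights attain the lower bound at lo, negative at hi.
        l = bias + sum(w * (lo[j] if w >= 0 else hi[j]) for j, w in enumerate(row))
        u = bias + sum(w * (hi[j] if w >= 0 else lo[j]) for j, w in enumerate(row))
        new_lo.append(l)
        new_hi.append(u)
    return new_lo, new_hi

def ibp_relu(lo, hi):
    """ReLU is monotone, so it maps bounds through directly."""
    return [max(0.0, l) for l in lo], [max(0.0, u) for u in hi]

# Propagate the input box x in [-1, 1]^2 through one linear + ReLU layer.
W, b = [[1.0, -1.0], [0.5, 0.5]], [0.0, 1.0]
lo, hi = ibp_linear([-1.0, -1.0], [1.0, 1.0], W, b)
lo, hi = ibp_relu(lo, hi)
```

The resulting box is sound but loose; tighter convex relaxations (or exact mixed-integer encodings) trade more computation for less over-approximation, which is the tension such verification work addresses.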