When Can Transformers Count to n?

G Yehudai, H Kaplan, A Ghandeharioun… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models based on the transformer architecture can solve highly complex
tasks. But are there simple tasks that such models cannot solve? Here we focus on very …
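
The snippet cuts off before the task is defined, but the title points to a token-counting task. As a minimal illustration, assumed here rather than taken from the paper, the ground truth such a model would have to reproduce in a single forward pass looks like:

    # Hypothetical counting task of the kind the title suggests; the paper's
    # exact task definition is truncated in the snippet above.
    from collections import Counter

    def counting_query(tokens: list[str], query: str) -> int:
        """Ground truth: how many times does `query` occur in the context?"""
        return Counter(tokens)[query]

    assert counting_query(["a", "b", "a", "c", "a"], "a") == 3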

Decomposition Polyhedra of Piecewise Linear Functions

MC Brandenburg, M Grillo, C Hertrich - arXiv preprint arXiv:2410.04907, 2024 - arxiv.org
In this paper we contribute to the frequently studied question of how to decompose a
continuous piecewise linear (CPWL) function into a difference of two convex CPWL …
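
A concrete one-dimensional illustration of such a decomposition (our example, not one from the paper): the hat function $f(x) = \min(x,\, 2 - x)$ can be written as $f = g - h$ with $g(x) = 2$ and $h(x) = \max(x,\, 2 - x)$, both convex CPWL, via the identity $\min(a, b) = a + b - \max(a, b)$.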

Combining Dielectric and Hyperspectral Data for Apple Core Browning Detection

H Liu, J He, Y Shi, Y Bi - Applied Sciences, 2024 - mdpi.com
Apple core browning not only affects the nutritional quality of apples, but also poses a health
risk to consumers. Therefore, there is an urgent need to develop a fast and reliable non …
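
The snippet is cut off before the method, so the following is only one plausible reading of "combining" the two modalities named in the title, namely feature-level fusion over hypothetical feature arrays; the paper's actual pipeline may differ:

    # Hypothetical feature-level fusion of the two modalities named in the
    # title; the paper's actual method is not visible in the truncated snippet.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def fuse(dielectric: np.ndarray, hyperspectral: np.ndarray) -> np.ndarray:
        """Concatenate per-sample dielectric and hyperspectral features."""
        return np.concatenate([dielectric, hyperspectral], axis=1)

    rng = np.random.default_rng(0)
    X = fuse(rng.normal(size=(40, 4)), rng.normal(size=(40, 16)))  # toy data
    y = rng.integers(0, 2, size=40)  # browning yes/no labels (synthetic)
    clf = LogisticRegression(max_iter=1000).fit(X, y)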

Efficient Learning Using Spiking Neural Networks Equipped With Affine Encoders and Decoders

AM Neuman, PC Petersen - arXiv preprint arXiv:2404.04549, 2024 - arxiv.org
We study the learning problem associated with spiking neural networks. Specifically, we
consider hypothesis sets of spiking neural networks with affine temporal encoders and …
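
The abstract names affine temporal encoders and decoders; below is a minimal sketch of such a pair, with the constants and the pairing with a spiking-network body assumed rather than taken from the paper:

    # Minimal affine temporal encoder/decoder pair; the constants below are
    # illustrative assumptions, not values from the paper.
    import numpy as np

    def affine_encode(x: np.ndarray, a: float = 1.0, b: float = 2.0) -> np.ndarray:
        """Map real-valued inputs to spike times (larger input, later spike)."""
        return a * x + b

    def affine_decode(t: np.ndarray, c: float = -1.0, d: float = 3.0) -> np.ndarray:
        """Map output spike times back to real values."""
        return c * t + d

    x = np.array([0.0, 0.5, 1.0])
    t = affine_encode(x)  # spike times fed to the spiking network
    y = affine_decode(t)  # affine readout of the output spike times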

Depth Separations in Neural Networks: Separating the Dimension from the Accuracy

I Safran, D Reichman, P Valiant - arXiv preprint arXiv:2402.07248, 2024 - arxiv.org
We prove an exponential separation between depth 2 and depth 3 neural networks, when
approximating an $\mathcal{O}(1)$-Lipschitz target function to constant accuracy, with …
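
Schematically (our paraphrase of the standard form of such results, not bounds confirmed by the truncated snippet): there is an $\mathcal{O}(1)$-Lipschitz target $f$ that depth-3 networks of size $\mathrm{poly}(d)$ approximate to constant accuracy, whereas depth-2 networks need width $2^{\Omega(d)}$ to match it.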