Data-distribution-informed Nyström approximation for structured data using vector quantization-based landmark determination

M Münch, KS Bohnsack, FM Schleif, T Villmann - Neurocomputing, 2024 - Elsevier
We present an effective method for supervised landmark selection in sparse Nyström
approximations of kernel matrices for structured data. Our approach transforms structured …
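
For context, here is a minimal sketch of the plain Nyström approximation this entry builds on, using uniformly sampled landmarks; the paper's supervised, vector-quantization-based landmark selection is not reproduced here, and all names are illustrative assumptions:

    import numpy as np

    def rbf_kernel(A, B, gamma=1.0):
        # Pairwise RBF kernel between the rows of A and the rows of B.
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    def nystrom(X, m, gamma=1.0, seed=0):
        # Approximate K = k(X, X) by C @ pinv(W) @ C.T with m landmarks.
        rng = np.random.default_rng(seed)
        # Uniform landmark sampling; the paper instead determines landmarks
        # via vector quantization (not shown here).
        idx = rng.choice(len(X), size=m, replace=False)
        L = X[idx]
        C = rbf_kernel(X, L, gamma)   # n x m cross-kernel block
        W = rbf_kernel(L, L, gamma)   # m x m landmark kernel block
        return C @ np.linalg.pinv(W) @ C.T

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 5))
    K = rbf_kernel(X, X)
    K_hat = nystrom(X, m=20)
    print(np.linalg.norm(K - K_hat) / np.linalg.norm(K))  # relative error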

Elliptical Attention

SK Nielsen, LU Abdullaev, R Teo… - arXiv preprint arXiv …, 2024 - arxiv.org
Pairwise dot-product self-attention is key to the success of transformers that achieve state-of-
the-art performance across a variety of applications in language and vision. This dot-product …
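
For reference, a minimal NumPy sketch of the standard scaled dot-product attention the snippet refers to; the paper's elliptical modification of this mechanism is not shown, and the shapes and names here are illustrative:

    import numpy as np

    def softmax(z, axis=-1):
        # Numerically stable softmax.
        z = z - z.max(axis=axis, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=axis, keepdims=True)

    def dot_product_attention(Q, K, V):
        # Standard transformer attention: softmax(Q K^T / sqrt(d)) V.
        d = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d)
        return softmax(scores, axis=-1) @ V

    rng = np.random.default_rng(0)
    Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
    print(dot_product_attention(Q, K, V).shape)  # (4, 8)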

Self-Attention through Kernel-Eigen Pair Sparse Variational Gaussian Processes

Y Chen, Q Tao, F Tonin, JAK Suykens - arXiv preprint arXiv:2402.01476, 2024 - arxiv.org
While the great capability of Transformers significantly boosts prediction accuracy, it can
also yield overconfident predictions that require calibrated uncertainty estimation, which can …

An SVD-like Decomposition of Bounded-Input Bounded-Output Functions

BC Brown, M King, S Warnick, E Yeung… - arXiv preprint arXiv …, 2024 - arxiv.org
The Singular Value Decomposition (SVD) of linear functions facilitates the calculation of
their 2-induced norm and row and null spaces, hallmarks of linear control theory. In this …
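
As a quick illustration of the standard linear-algebra facts this entry cites (not the paper's extension to bounded-input bounded-output functions), the 2-induced norm and the row and null spaces fall directly out of the SVD of a matrix:

    import numpy as np

    A = np.array([[3.0, 0.0, 0.0],
                  [0.0, 2.0, 0.0]])        # rank-2 linear map from R^3 to R^2
    U, s, Vt = np.linalg.svd(A)

    # 2-induced (spectral) norm equals the largest singular value.
    print(np.isclose(s[0], np.linalg.norm(A, 2)))
    r = int(np.sum(s > 1e-12))              # numerical rank
    row_space = Vt[:r]                      # orthonormal basis of the row space
    null_space = Vt[r:]                     # orthonormal basis of the null space
    print(np.allclose(A @ null_space.T, 0))  # null-space directions are annihilated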

Fairness-Aware Attention for Contrastive Learning

S Nielsen, TM Nguyen - openreview.net
Contrastive learning has proven instrumental in learning unbiased representations of data,
especially in complex environments characterized by high-cardinality and high-dimensional …