Computational optimal transport: With applications to data science

G Peyré, M Cuturi - Foundations and Trends® in Machine …, 2019 - nowpublishers.com
Optimal transport (OT) theory can be informally described using the words of the French
mathematician Gaspard Monge (1746–1818): A worker with a shovel in hand has to move a …

A survey of recent advances in optimization methods for wireless communications

YF Liu, TH Chang, M Hong, Z Wu… - IEEE Journal on …, 2024 - ieeexplore.ieee.org
Mathematical optimization is now widely regarded as an indispensable modeling and
solution tool for the design of wireless communications systems. While optimization has …

Efficient and modular implicit differentiation

M Blondel, Q Berthet, M Cuturi… - Advances in neural …, 2022 - proceedings.neurips.cc
Automatic differentiation (autodiff) has revolutionized machine learning. It allows expressing
complex computations by composing elementary ones in creative ways and removes the …
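This paper's topic, implicit differentiation, rests on the implicit function theorem: if x*(θ) solves F(x, θ) = 0, then dx*/dθ = −(∂F/∂x)⁻¹ ∂F/∂θ, so a solution found by any black-box solver can still be differentiated. A minimal one-dimensional sketch (the fixed-point equation x = θ·cos(x) is an illustrative example, not one from the paper):

```python
import numpy as np

def fixed_point(theta, x0=0.0, tol=1e-12):
    # Solve x = theta * cos(x) by plain fixed-point iteration.
    x = x0
    for _ in range(10000):
        x_new = theta * np.cos(x)
        if abs(x_new - x) < tol:
            break
        x = x_new
    return x_new

def implicit_grad(theta):
    # Implicit function theorem with F(x, theta) = x - theta*cos(x) = 0:
    # dx/dtheta = -(dF/dtheta) / (dF/dx) = cos(x) / (1 + theta*sin(x)).
    x = fixed_point(theta)
    return np.cos(x) / (1 + theta * np.sin(x))
```

The gradient requires no backpropagation through the solver's iterations, only the solution x* itself, which is the efficiency argument the abstract alludes to.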

Zero-shot hyperspectral sharpening

R Dian, A Guo, S Li - IEEE Transactions on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Fusing hyperspectral images (HSIs) with multispectral images (MSIs) of higher spatial
resolution has become an effective way to sharpen HSIs. Recently, deep convolutional …

Multiview consensus graph clustering

K Zhan, F Nie, J Wang, Y Yang - IEEE Transactions on Image …, 2018 - ieeexplore.ieee.org
A graph is usually formed to reveal the relationship between data points, and the graph structure
is encoded by the affinity matrix. Most graph-based multiview clustering methods use …
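The affinity matrix mentioned here stores pairwise similarities between data points. One common construction, sketched below, is the Gaussian kernel; the kernel choice and the `sigma` bandwidth are illustrative assumptions, not details taken from this paper:

```python
import numpy as np

def affinity_matrix(X, sigma=1.0):
    # Gaussian-kernel affinity: A_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)).
    # X has shape (n_points, n_features); sigma is an assumed bandwidth.
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    A = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(A, 0.0)  # convention: no self-loops
    return A
```

The resulting symmetric nonnegative matrix is what graph-based clustering methods operate on, one such matrix per view in the multiview setting.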

Parseval networks: Improving robustness to adversarial examples

M Cisse, P Bojanowski, E Grave… - International …, 2017 - proceedings.mlr.press
We introduce Parseval networks, a form of deep neural networks in which the Lipschitz
constant of linear, convolutional and aggregation layers is constrained to be smaller than $1 …
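Constraining each layer's Lipschitz constant to at most 1 amounts to keeping every singular value of the weight matrix at most 1; Parseval networks approach this by pushing weight matrices toward orthonormal rows with a cheap retraction step after each gradient update. A minimal sketch of that retraction (the `beta` value here is illustrative, not the paper's setting):

```python
import numpy as np

def parseval_step(W, beta=0.1):
    # One retraction step toward W @ W.T = I (rows orthonormal):
    # W <- (1 + beta) * W - beta * W @ W.T @ W.
    # Iterating this drives every singular value of W toward 1,
    # so the layer's Lipschitz constant approaches 1.
    return (1 + beta) * W - beta * W @ W.T @ W
```

In training, a single step is interleaved with each SGD update so the constraint is maintained approximately at negligible extra cost.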

Sorting out Lipschitz function approximation

C Anil, J Lucas, R Grosse - International Conference on …, 2019 - proceedings.mlr.press
Training neural networks under a strict Lipschitz constraint is useful for provable adversarial
robustness, generalization bounds, interpretable gradients, and Wasserstein distance …

From softmax to sparsemax: A sparse model of attention and multi-label classification

A Martins, R Astudillo - International conference on machine …, 2016 - proceedings.mlr.press
We propose sparsemax, a new activation function similar to the traditional softmax, but able
to output sparse probabilities. After deriving its properties, we show how its Jacobian can be …
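Sparsemax is the Euclidean projection of the score vector onto the probability simplex, which, unlike softmax, can assign exactly zero probability to low-scoring entries. The closed-form projection from the paper can be sketched in a few lines:

```python
import numpy as np

def sparsemax(z):
    # Projection of z onto the probability simplex (Martins & Astudillo, 2016):
    # sort scores, find the support size k(z), compute the threshold tau,
    # then shift-and-clip. Entries below tau get probability exactly 0.
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, len(z) + 1)
    support = 1 + k * z_sorted > cumsum          # k(z) condition
    k_z = k[support][-1]                          # support size
    tau = (cumsum[k_z - 1] - 1) / k_z             # threshold
    return np.maximum(z - tau, 0.0)
```

When the scores are well separated, the output concentrates on a few entries (e.g. `sparsemax([3., 0., 0.])` puts all mass on the first coordinate), while near-uniform scores recover a near-uniform distribution, exactly the behavior that makes it attractive for attention and multi-label outputs.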

Sparse sequence-to-sequence models

B Peters, V Niculae, AFT Martins - arXiv preprint arXiv:1905.05702, 2019 - arxiv.org
Sequence-to-sequence models are a powerful workhorse of NLP. Most variants employ a
softmax transformation in both their attention mechanism and output layer, leading to dense …

Hyperspectral super-resolution by coupled spectral unmixing

C Lanaras, E Baltsavias… - Proceedings of the IEEE …, 2015 - openaccess.thecvf.com
Hyperspectral cameras capture images with many narrow spectral channels, which densely
sample the electromagnetic spectrum. The detailed spectral resolution is useful for many …