Convergence rates of parameter estimation for some weakly identifiable finite mixtures

H Nguyen, TT Nguyen, N Ho - Advances in Neural …, 2023 - proceedings.neurips.cc

Understanding the parameter estimation of softmax gating Gaussian mixture of experts has
remained a long-standing open problem in the literature. It is mainly due to three …

Cited by 21 · Related articles · All 11 versions

[PDF] arxiv.org

Estimating the number of components in finite mixture models via the Group-Sort-Fuse procedure

T Manole, A Khalili - The Annals of Statistics, 2021 - projecteuclid.org

The Annals of Statistics 2021, Vol. 49, No. 6, 3043–3069. https://doi.org/10.1214/21-AOS2072 …

Cited by 26 · Related articles · All 5 versions

[PDF] neurips.cc

Projection robust Wasserstein distance and Riemannian optimization

T Lin, C Fan, N Ho, M Cuturi… - Advances in neural …, 2020 - proceedings.neurips.cc

Projection robust Wasserstein (PRW) distance, or Wasserstein projection pursuit (WPP), is a
robust variant of the Wasserstein distance. Recent work suggests that this quantity is more …

Cited by 70 · Related articles · All 10 versions

[PDF] mlr.press

A riemannian block coordinate descent method for computing the projection robust wasserstein distance

M Huang, S Ma, L Lai - International Conference on …, 2021 - proceedings.mlr.press

The Wasserstein distance has become increasingly important in machine learning and deep
learning. Despite its popularity, the Wasserstein distance is hard to approximate because of …

Cited by 52 · Related articles · All 8 versions

[PDF] arxiv.org

Fusemoe: Mixture-of-experts transformers for fleximodal fusion

X Han, H Nguyen, C Harris, N Ho, S Saria - arXiv preprint arXiv …, 2024 - arxiv.org

As machine learning models in critical fields increasingly grapple with multimodal data, they
face the dual challenges of handling a wide array of modalities, often incomplete due to …

Cited by 12 · Related articles · All 3 versions

[PDF] projecteuclid.org

Strong identifiability and optimal minimax rates for finite mixture estimation

P Heinrich, J Kahn - 2018 - projecteuclid.org

The Annals of Statistics 2018, Vol. 46, No. 6A, 2844–2870. https://doi.org/10.1214/17-AOS1641 © Institute …

Cited by 111 · Related articles · All 8 versions

[PDF] mlr.press

Towards convergence rates for parameter estimation in Gaussian-gated mixture of experts

H Nguyen, TT Nguyen, K Nguyen… - … Conference on Artificial …, 2024 - proceedings.mlr.press

Originally introduced as a neural network for ensemble learning, mixture of experts (MoE)
has recently become a fundamental building block of highly successful modern deep neural …

Cited by 11 · Related articles · All 7 versions

[PDF] arxiv.org

Randomly initialized EM algorithm for two-component Gaussian mixture achieves near optimality in iterations

Y Wu, HH Zhou - Mathematical Statistics and Learning, 2021 - ems.press

We analyze the classical EM algorithm for parameter estimation in the symmetric
two-component Gaussian mixtures in d dimensions. We show that, even in the absence of any …

Cited by 58 · Related articles · All 5 versions

[PDF] mlr.press

On the minimax optimality of the EM algorithm for learning two-component mixed linear regression

J Kwon, N Ho, C Caramanis - International Conference on …, 2021 - proceedings.mlr.press

We study the convergence rates of the EM algorithm for learning two-component mixed
linear regression under all regimes of signal-to-noise ratio (SNR). We resolve a long …

Cited by 46 · Related articles · All 5 versions

[PDF] arxiv.org

Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts?

H Nguyen, P Akbarian, N Ho - arXiv preprint arXiv:2401.13875, 2024 - arxiv.org

Dense-to-sparse gating mixture of experts (MoE) has recently become an effective
alternative to a well-known sparse MoE. Rather than fixing the number of activated experts …

Cited by 8 · Related articles · All 3 versions
