From Google Gemini to OpenAI Q* (Q-Star): A survey of reshaping the generative Artificial Intelligence (AI) research landscape

TR McIntosh, T Susnjak, T Liu, P Watters… - arXiv preprint arXiv …, 2023 - arxiv.org
This comprehensive survey explored the evolving landscape of generative Artificial
Intelligence (AI), with a specific focus on the transformative impacts of Mixture of Experts …

Leveraging large language models in conversational recommender systems

L Friedman, S Ahuja, D Allen, Z Tan… - arXiv preprint arXiv …, 2023 - arxiv.org
A Conversational Recommender System (CRS) offers increased transparency and control to
users by enabling them to engage with the system through a real-time multi-turn dialogue …

Statistical perspective of top-k sparse softmax gating mixture of experts

H Nguyen, P Akbarian, F Yan, N Ho - arXiv preprint arXiv:2309.13850, 2023 - arxiv.org
Top-K sparse softmax gating mixture of experts has been widely used for scaling up massive
deep-learning architectures without increasing the computational cost. Despite its popularity …
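The top-K sparse softmax gating this entry studies can be illustrated with a minimal sketch (not the paper's implementation — the function name and the use of plain numpy are assumptions): only the K largest gate logits receive softmax weight, so only K experts need to be evaluated per input.

```python
import numpy as np

def topk_softmax_gating(logits, k):
    """Top-K sparse softmax gating (illustrative sketch): keep the K
    largest gate logits, softmax over just those, zero out the rest."""
    idx = np.argsort(logits)[-k:]            # indices of the K largest logits
    gates = np.zeros_like(logits, dtype=float)
    top = logits[idx] - logits[idx].max()    # numerically stable softmax
    w = np.exp(top)
    gates[idx] = w / w.sum()                 # nonzero weights sum to 1
    return gates

gates = topk_softmax_gating(np.array([2.0, 0.5, 1.5, -1.0]), k=2)
# exactly two experts get nonzero weight; the other two are skipped entirely
```

Because the zeroed experts are never evaluated, compute per token stays constant as the total expert count grows — the scaling property the abstract refers to.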

On least squares estimation in softmax gating mixture of experts

H Nguyen, N Ho, A Rinaldo - arXiv preprint arXiv:2402.02952, 2024 - arxiv.org
A mixture of experts (MoE) model is a statistical machine learning design that aggregates
multiple expert networks using a softmax gating function in order to form a more intricate and …
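The aggregation this entry describes — expert networks mixed by a softmax gating function — can be sketched as a dense forward pass (an illustrative assumption: experts and gate are plain linear maps in numpy, not the paper's setup):

```python
import numpy as np

def moe_forward(x, experts, gate_W):
    """Softmax-gated MoE forward pass (sketch): a gate network scores
    each expert, softmax turns scores into mixture weights, and the
    output is the weighted sum of the expert outputs."""
    logits = gate_W @ x                           # one score per expert
    z = np.exp(logits - logits.max())             # stable softmax
    gates = z / z.sum()
    outputs = np.stack([W @ x for W in experts])  # (n_experts, d_out)
    return gates @ outputs                        # mixture of expert outputs

rng = np.random.default_rng(0)
x = rng.standard_normal(4)
experts = [rng.standard_normal((3, 4)) for _ in range(2)]
gate_W = rng.standard_normal((2, 4))
y = moe_forward(x, experts, gate_W)               # shape (3,)
```

Because the gate weights depend on the input x, different inputs are routed toward different experts, which is what lets the mixture represent a more intricate function than any single expert.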

A general theory for softmax gating multinomial logistic mixture of experts

H Nguyen, P Akbarian, TT Nguyen, N Ho - arXiv preprint arXiv:2310.14188, 2023 - arxiv.org
The mixture-of-experts (MoE) model incorporates the power of multiple submodels via gating
functions to achieve greater performance in numerous regression and classification …

AI Text-to-Behavior: A Study In Steerability

D Noever, S Hyams - arXiv preprint arXiv:2308.07326, 2023 - arxiv.org
The research explores the steerability of Large Language Models (LLMs), particularly
OpenAI's ChatGPT iterations. By employing a behavioral psychology framework called …

DGPO: discovering multiple strategies with diversity-guided policy optimization

W Chen, S Huang, Y Chiang, T Pearce… - Proceedings of the …, 2024 - ojs.aaai.org
Most reinforcement learning algorithms seek a single optimal strategy that solves a given
task. However, it can often be valuable to learn a diverse set of solutions, for instance, to …

Be Helpful but Don't Talk Too Much - Enhancing Helpfulness in Conversations through Relevance in Multi-Turn Emotional Support

J Li, B Peng, YY Hsu, CR Huang - Proceedings of the 2024 …, 2024 - aclanthology.org
For a conversation to help and support, speakers should maintain an “effect-effort” tradeoff.
As outlined in the gist of “Cognitive Relevance Principle”, helpful speakers should optimize …

Offline reinforcement learning for mixture-of-expert dialogue management

D Gupta, Y Chow, A Tulepbergenov… - Advances in …, 2024 - proceedings.neurips.cc
Reinforcement learning (RL) has shown great promise for developing agents for dialogue
management (DM) that are non-myopic, conduct rich conversations, and maximize overall …

Sigmoid Gating is More Sample Efficient than Softmax Gating in Mixture of Experts

H Nguyen, N Ho, A Rinaldo - arXiv preprint arXiv:2405.13997, 2024 - arxiv.org
The softmax gating function is arguably the most popular choice in mixture of experts
modeling. Despite its widespread use in practice, softmax gating may lead to unnecessary …