Langevin Thompson sampling with logarithmic communication: bandits and reinforcement learning

A Karbasi, NL Kuang, Y Ma… - … Conference on Machine …, 2023 - proceedings.mlr.press
Thompson sampling (TS) is widely used in sequential decision making due to its ease of use
and appealing empirical performance. However, many existing analytical and empirical …

Langevin Thompson sampling with logarithmic communication: bandits and reinforcement learning

A Karbasi, NL Kuang, YA Ma, S Mitra - Proceedings of the 40th …, 2023 - dl.acm.org

Langevin Thompson Sampling with Logarithmic Communication: Bandits and Reinforcement Learning

A Karbasi, NL Kuang, Y Ma, S Mitra - openreview.net

Langevin Thompson Sampling with Logarithmic Communication: Bandits and Reinforcement Learning

A Karbasi, NL Kuang, YA Ma, S Mitra - proceedings.mlr.press

Langevin Thompson Sampling with Logarithmic Communication: Bandits and Reinforcement Learning

A Karbasi, NL Kuang, YA Ma, S Mitra - arXiv preprint arXiv:2306.08803, 2023 - arxiv.org

Langevin Thompson Sampling with Logarithmic Communication: Bandits and Reinforcement Learning

A Karbasi, N Lijing Kuang, YA Ma, S Mitra - arXiv e-prints, 2023 - ui.adsabs.harvard.edu