A Survey of RWKV

Z Li, T Xia, Y Chang, Y Wu - arXiv preprint arXiv:2412.14847, 2024 - arxiv.org
The Receptance Weighted Key Value (RWKV) model offers a novel alternative to the
Transformer architecture, merging the benefits of recurrent and attention-based systems …

Magnetic Preference Optimization: Achieving Last-iterate Convergence for Language Models Alignment

M Wang, C Ma, Q Chen, L Meng, Y Han, J Xiao… - arXiv preprint arXiv …, 2024 - arxiv.org
Self-play methods have demonstrated remarkable success in enhancing model capabilities
across various domains. In the context of Reinforcement Learning from Human Feedback …

Enhancing AI-Bot Strength and Strategy Diversity in Adversarial Games: A Novel Deep Reinforcement Learning Framework

C Sun, S Shen, D Xue, W Tao… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Deep reinforcement learning (DRL) has emerged as a leading technique for designing
AI-bots in the gaming industry. However, practical implementation of DRL-trained bots often …

Artificial Intelligence Algorithms for Large Economic and Computer Games

Z Li - 2024 - deepblue.lib.umich.edu
Contemporary artificial intelligence algorithms (search, graphical models, machine learning,
etc.) have achieved great success in a variety of practical domains. This thesis particularly …