Shangdong Yang
Title
Cited by
Year
Efficient Average Reward Reinforcement Learning Using Constant Shifting Values
S Yang, Y Gao, B An, H Wang, X Chen
Proceedings of the AAAI Conference on Artificial Intelligence 30 (1), 2016
Cited by 31, 2016
An Optimal Algorithm for the Stochastic Bandits While Knowing the Near-Optimal Mean Reward
S Yang, Y Gao
IEEE Transactions on Neural Networks and Learning Systems 32 (5), 2285-2291, 2021
Cited by 12, 2021
A Contextual Bandit Approach to Personalized Online Recommendation via Sparse Interactions
C Zhang, H Wang, S Yang, Y Gao
Advances in Knowledge Discovery and Data Mining: 23rd Pacific-Asia …, 2019
Cited by 11, 2019
Contextual Bandits With Hidden Features to Online Recommendation via Sparse Interactions
S Yang, H Wang, C Zhang, Y Gao
IEEE Intelligent Systems 35 (5), 62-72, 2020
Cited by 10, 2020
New Galois Hulls Of Generalized Reed-Solomon Codes
Y Wu, C Li, S Yang
Finite Fields and Their Applications 83, 102084, 2022
Cited by 8, 2022
Effective Interpretable Policy Distillation via Critical Experiences Identification
X Liu, S Liu, B An, Y Gao, S Yang, W Li
IEEE Intelligent Systems, 2023
Cited by 5, 2023
Incremental Nonnegative Matrix Factorization Based on Matrix Sketching and k-means Clustering
C Zhang, H Wang, S Yang, Y Gao
Intelligent Data Engineering and Automated Learning–IDEAL 2016: 17th …, 2016
Cited by 5, 2016
Learning Explicit Credit Assignment for Cooperative Multi-Agent Reinforcement Learning via Polarization Policy Gradient
W Chen, W Li, X Liu, S Yang, Y Gao
Proceedings of the AAAI Conference on Artificial Intelligence, 2023
Cited by 4, 2023
WToE: Learning When to Explore in Multi-Agent Reinforcement Learning
S Dong, H Mao, S Yang, Z Shengyu, L Wenbin, H Jianye, Y Gao
IEEE Transactions on Cybernetics, 2023
Cited by 3, 2023
Modified Retrace for Off-Policy Temporal Difference Learning
X Chen, X Ma, Y Li, G Yang, S Yang, Y Gao
39th Conference On Uncertainty in Artificial Intelligence, 2023
Cited by 3, 2023
Keeping Minimal Experience to Achieve Efficient Interpretable Policy Distillation
X Liu, S Liu, W Li, S Yang, Y Gao
arXiv preprint arXiv:2203.00822, 2022
Cited by 2, 2022
An Optimal Algorithm for the Stochastic Bandits with Knowing Near-optimal Mean Reward
S Yang, H Wang, Y Gao, X Chen
Proceedings of the 17th International Conference on Autonomous Agents and …, 2018
Cited by 2, 2018
Online Attentive Kernel-based Temporal Difference Learning
X Chen, G Yang, S Yang, H Wang, S Dong, Y Gao
Knowledge-Based Systems 278, 110902, 2023
Cited by 1, 2023
Modeling rationality: Toward better performance against unknown agents in sequential games
Z Ge, S Yang, P Tian, Z Chen, Y Gao
IEEE Transactions on Cybernetics 54 (5), 2966-2977, 2022
Cited by 1, 2022
Online attentive kernel-based temporal difference learning
G Yang, X Chen, S Yang, H Wang, S Dong, Y Gao
arXiv preprint arXiv:2201.09065, 2022
Cited by 1, 2022
Learning Credit Assignment for Cooperative Reinforcement Learning
W Chen, W Li, X Liu, S Yang
arXiv preprint arXiv:2210.05367, 2022
Cited by 1, 2022
Decentralized Counterfactual Value with Threat Detection for Multi-Agent Reinforcement Learning in Mixed Cooperative and Competitive Environments
S Dong, C Li, S Yang, W Li, Y Gao
Expert Systems with Applications 257, 125116, 2024
2024
Egoism, Utilitarianism and Egalitarianism in Multi-Agent Reinforcement Learning
S Dong, C Li, S Yang, B An, W Li, Y Gao
Neural Networks 178, 106544, 2024
2024
Selective Policy Transfer in Multi-Agent Systems with Sparse Interactions
Y Zhuang, Y Liu, S Yang, Y Gao
Knowledge-Based Systems 300, 112031, 2024
2024
A Survey of Solutions and Applications for Mixed Game Problems
S Dong, C Li, G Yang, Z Ge, H Cao, W Chen, S Yang, X Chen, W Li, ...
Journal of Software (软件学报), 1-47, 2024
2024