Shalabh Bhatnagar
Professor in the Department of Computer Science and Automation, Indian Institute of Science
Verified email at iisc.ac.in
Title
Cited by
Year
Incremental natural actor-critic algorithms
S Bhatnagar, M Ghavamzadeh, M Lee, RS Sutton
Advances in Neural Information Processing Systems 20, 2007
Cited by: 1068, Year: 2007
Fast gradient-descent methods for temporal-difference learning with linear function approximation
RS Sutton, HR Maei, D Precup, S Bhatnagar, D Silver, C Szepesvári, ...
Proceedings of the 26th Annual International Conference on Machine Learning …, 2009
Cited by: 713, Year: 2009
Stochastic Recursive Algorithms for Optimization: Simultaneous Perturbation Methods
S Bhatnagar, HL Prasad, LA Prashanth
Stochastic Recursive Algorithms for Optimization: Simultaneous Perturbation …, 2013
Cited by: 415*, Year: 2013
Reinforcement learning with function approximation for traffic signal control
LA Prashanth, S Bhatnagar
IEEE Transactions on Intelligent Transportation Systems 12 (2), 412-421, 2010
Cited by: 381, Year: 2010
Convergent temporal-difference learning with arbitrary smooth function approximation
H Maei, C Szepesvari, S Bhatnagar, D Precup, D Silver, RS Sutton
Advances in Neural Information Processing Systems 22, 2009
Cited by: 340, Year: 2009
Toward off-policy learning control with function approximation
HR Maei, C Szepesvári, S Bhatnagar, RS Sutton
ICML 10, 719-726, 2010
Cited by: 333, Year: 2010
An online actor–critic algorithm with function approximation for constrained Markov decision processes
S Bhatnagar, K Lakshmanan
Journal of Optimization Theory and Applications 153, 688-708, 2012
Cited by: 284, Year: 2012
An actor–critic algorithm with function approximation for discounted cost constrained Markov decision processes
S Bhatnagar
Systems & Control Letters 59 (12), 760-766, 2010
Cited by: 226, Year: 2010
Memory-based deep reinforcement learning for obstacle avoidance in UAV with limited environment knowledge
A Singla, S Padakandla, S Bhatnagar
IEEE Transactions on Intelligent Transportation Systems 22 (1), 107-118, 2019
Cited by: 217, Year: 2019
Reinforcement learning algorithm for non-stationary environments
S Padakandla, P KJ, S Bhatnagar
Applied Intelligence 50 (11), 3590-3606, 2020
Cited by: 136, Year: 2020
Two-timescale simultaneous perturbation stochastic approximation using deterministic perturbation sequences
S Bhatnagar, MC Fu, SI Marcus, IJ Wang
ACM Transactions on Modeling and Computer Simulation (TOMACS) 13 (2), 180-209, 2003
Cited by: 116, Year: 2003
Multi-agent reinforcement learning for traffic signal control
KJ Prabuchandran, HK AN, S Bhatnagar
17th International IEEE Conference on Intelligent Transportation Systems …, 2014
Cited by: 114, Year: 2014
Reinforcement learning with average cost for adaptive control of traffic lights at intersections
LA Prashanth, S Bhatnagar
2011 14th International IEEE Conference on Intelligent Transportation …, 2011
Cited by: 89, Year: 2011
A time aggregation approach to Markov decision processes
XR Cao, Z Ren, S Bhatnagar, M Fu, S Marcus
Automatica 38 (6), 929-943, 2002
Cited by: 89, Year: 2002
Adaptive multivariate three-timescale stochastic approximation algorithms for simulation based optimization
S Bhatnagar
ACM Transactions on Modeling and Computer Simulation (TOMACS) 15 (1), 74-107, 2005
Cited by: 79, Year: 2005
Two-timescale algorithms for learning Nash equilibria in general-sum stochastic games
HL Prasad, P LA, S Bhatnagar
Proceedings of the 2015 International Conference on Autonomous Agents and …, 2015
Cited by: 70, Year: 2015
Two time-scale stochastic approximation with controlled Markov noise and off-policy temporal-difference learning
P Karmakar, S Bhatnagar
Mathematics of Operations Research 43 (1), 130-151, 2018
Cited by: 69, Year: 2018
Adaptive Newton-based multivariate smoothed functional algorithms for simulation optimization
S Bhatnagar
ACM Transactions on Modeling and Computer Simulation (TOMACS) 18 (1), 1-35, 2007
Cited by: 68, Year: 2007
Two-timescale algorithms for simulation optimization of hidden Markov models
S Bhatnagar, MC Fu, SI Marcus, S Bhatnagar
IIE Transactions 33 (3), 245-258, 2001
Cited by: 59, Year: 2001
A Simultaneous Deterministic Perturbation Actor-Critic Algorithm with an Application to Optimal Mortgage Refinancing
VLR Chinthalapati, S Bhatnagar
Proceedings of the 45th IEEE Conference on Decision and Control, 4151-4156, 2006
Cited by: 58, Year: 2006