A review of tracking and trajectory prediction methods for autonomous driving

F Leon, M Gavrilescu - Mathematics, 2021 - mdpi.com
This paper provides a literature review of some of the most important concepts, techniques,
and methodologies used within autonomous car systems. Specifically, we focus on two …

A two-timescale stochastic algorithm framework for bilevel optimization: Complexity analysis and application to actor-critic

M Hong, HT Wai, Z Wang, Z Yang - SIAM Journal on Optimization, 2023 - SIAM
This paper analyzes a two-timescale stochastic algorithm framework for bilevel optimization.
Bilevel optimization is a class of problems which exhibits a two-level structure, and its goal is …

On finite-time convergence of actor-critic algorithm

S Qiu, Z Yang, J Ye, Z Wang - IEEE Journal on Selected Areas …, 2021 - ieeexplore.ieee.org
Actor-critic algorithms and their extensions have made great achievements in real-world
decision-making problems. In contrast to its empirical success, the theoretical understanding …

A review of tracking, prediction and decision making methods for autonomous driving

F Leon, M Gavrilescu - arXiv preprint arXiv:1909.07707, 2019 - arxiv.org
This literature review focuses on three important aspects of an autonomous car system:
tracking (assessing the identity of the actors such as cars, pedestrians or obstacles in a …

Finite-time performance bounds and adaptive learning rate selection for two time-scale reinforcement learning

H Gupta, R Srikant, L Ying - Advances in Neural …, 2019 - proceedings.neurips.cc
We study two time-scale linear stochastic approximation algorithms, which can be used to
model well-known reinforcement learning algorithms such as GTD, GTD2, and TDC. We …

Taming communication and sample complexities in decentralized policy evaluation for cooperative multi-agent reinforcement learning

X Zhang, Z Liu, J Liu, Z Zhu… - Advances in Neural …, 2021 - proceedings.neurips.cc
Cooperative multi-agent reinforcement learning (MARL) has received increasing attention in
recent years and has found many scientific and engineering applications. However, a key …

A block coordinate ascent algorithm for mean-variance optimization

T Xie, B Liu, Y Xu, M Ghavamzadeh… - Advances in …, 2018 - proceedings.neurips.cc
Risk management in dynamic decision problems is a primary concern in many fields,
including financial investment, autonomous driving, and healthcare. The mean-variance …

Modified retrace for off-policy temporal difference learning

X Chen, X Ma, Y Li, G Yang… - Uncertainty in Artificial …, 2023 - proceedings.mlr.press
Off-policy learning is key to extending reinforcement learning, as it allows learning a target
policy from a different behavior policy that generates the data. However, it is well known as …

Continual auxiliary task learning

M McLeod, C Lo, M Schlegel… - Advances in …, 2021 - proceedings.neurips.cc
Learning auxiliary tasks, such as multiple predictions about the world, can provide many
benefits to reinforcement learning systems. A variety of off-policy learning algorithms have …

Exploring reinforcement learning techniques in the realm of mobile robotics

Z Haider, MZ Sardar, AT Azar… - … of Automation and …, 2024 - inderscienceonline.com
Mobile robots are intelligent machines that can move and perform tasks in different
environments. The key factor enabling the autonomy of mobile robots lies in the reliability …