Towards Tackling MaxSAT by Combining Nested Monte Carlo with Local Search

A Saffidine, T Cazenave - … , LION 17, Nice, France, June 4–8 …, 2023 - books.google.com
… e = 0.1 means there is 10% probability to take a random initialization for the literal, and so
on… Probabilistic MDP-behavior planning for cars. In: 2011 14th International IEEE Conference …

Q-value regularized transformer for offline reinforcement learning

S Hu, Z Fan, C Huang, L Shen, Y Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
… Incorporating probabilistic statistics from multiple trajectories offers a promising solution
for sub-optimal data, which guides policy behaviors with learned estimated returns from the …

[BOOK][B] Online Maintenance Prioritization Via Monte Carlo Tree Search and Case-Based Reasoning in Complex Manufacturing Systems

ML Hoffman - 2021 - search.proquest.com
… Throughout this work, we specify the machine degradation transition probabilities, though
several methods exist for estimating a Markov degradation transition matrix from observed data…

[HTML][HTML] A Survey of Game-Theoretic Interactive Decision-Making for Multi-Vehicle Intelligent Driving

衣鹏, 潘越, 王文远, 刘政钦, 洪奕光 - 2023 - kzyjc.alljournals.cn
… -vehicle driving to multi-vehicle driving in hybrid traffic scenarios. The main forthcoming
challenge is to generate high-quality trajectories that conform to vehicle … Multi-vehicle driving can …