Towards Assessing and Benchmarking Risk-Return Tradeoff of Off-Policy Evaluation H Kiyohara, R Kishimoto, K Kawakami, K Kobayashi, K Nakata, Y Saito The Twelfth International Conference on Learning Representations, 2024 | 6 | 2024 |
SCOPE-RL: A Python Library for Offline Reinforcement Learning and Off-Policy Evaluation H Kiyohara, R Kishimoto, K Kawakami, K Kobayashi, K Nakata, Y Saito arXiv preprint arXiv:2311.18206, 2023 | 2 | 2023 |
Effective Off-Policy Evaluation and Learning in Contextual Combinatorial Bandits T Shimizu, K Tanaka, R Kishimoto, H Kiyohara, M Nomura, Y Saito arXiv preprint arXiv:2408.11202, 2024 | | 2024 |
Efficient Offline Learning of Ranking Policies via Top-k Policy Decomposition R Kishimoto, K Tanaka, H Kiyohara, Y Narita, N Shimizu, Y Yamamoto, ... ICML 2024 Workshop: Aligning Reinforcement Learning Experimentalists and …, 0 | | |