DemoDICE: Offline imitation learning with supplementary imperfect demonstrations GH Kim, S Seo, J Lee, W Jeon, HJ Hwang, H Yang, KE Kim International Conference on Learning Representations, 2022 | 72 | 2022 |
Monte-Carlo tree search for constrained POMDPs J Lee, GH Kim, P Poupart, KE Kim Advances in Neural Information Processing Systems 31, 2018 | 69 | 2018 |
Variational interaction information maximization for cross-domain disentanglement HJ Hwang, GH Kim, S Hong, KE Kim Advances in Neural Information Processing Systems 33, 22479-22491, 2020 | 43 | 2020 |
Multi-view representation learning via total correlation objective HJ Hwang, GH Kim, S Hong, KE Kim Advances in Neural Information Processing Systems 34, 12194-12207, 2021 | 34 | 2021 |
Monte-Carlo tree search in continuous action spaces with value gradients J Lee, W Jeon, GH Kim, KE Kim Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 4561-4568, 2020 | 23 | 2020 |
LobsDICE: Offline learning from observation via stationary distribution correction estimation GH Kim, J Lee, Y Jang, H Yang, KE Kim Advances in Neural Information Processing Systems 35, 8252-8264, 2022 | 18 | 2022 |
Variational inference for sequential data with future likelihood estimates GH Kim, Y Jang, H Yang, KE Kim International Conference on Machine Learning, 5296-5305, 2020 | 4 | 2020 |
Prospector: Improving LLM agents with self-asking and trajectory ranking B Kim, Y Jang, L Logeswaran, GH Kim, YJ Kim, H Lee, M Lee | 2 | 2023 |
Trust region sequential variational inference GH Kim, Y Jang, J Lee, W Jeon, H Yang, KE Kim Asian Conference on Machine Learning, 1033-1048, 2019 | 2 | 2019 |
Bayesian optimistic Kullback–Leibler exploration K Lee, GH Kim, P Ortega, DD Lee, KE Kim Machine Learning 108, 765-783, 2019 | 2 | 2019 |
SafeDICE: offline safe imitation learning with non-preferred demonstrations Y Jang, GH Kim, J Lee, S Sohn, B Kim, H Lee, M Lee Advances in Neural Information Processing Systems 36, 2024 | | 2024 |
Information-theoretic state space model for multi-view reinforcement learning HJ Hwang, S Seo, Y Jang, S Kim, GH Kim, S Hong, KE Kim | | 2023 |
Degeneration-free Policy Optimization: RL Fine-Tuning for Language Models without Degeneration Y Jang, GH Kim, B Kim, YJ Kim, H Lee, M Lee Forty-first International Conference on Machine Learning | | |
DfPO: Degeneration-free Policy Optimization via Action Masking in Natural Language Action Spaces Y Jang, GH Kim, B Kim, H Lee, M Lee | | |