Fusing pre-trained language models with multimodal prompts through reinforcement learning | Y Yu, J Chung, H Yun, J Hessel, JS Park, X Lu, R Zellers, P Ammanabrolu, ... | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 44* | 2023 |
ACAV100M: Automatic curation of large-scale datasets for audio-visual video representation learning | S Lee, J Chung, Y Yu, G Kim, T Breuel, G Chechik, Y Song | Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021 | 39 | 2021 |
Transitional adaptation of pretrained models for visual storytelling | Y Yu, J Chung, H Yun, J Kim, G Kim | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021 | 32 | 2021 |
Character grounding and re-identification in story of videos and text descriptions | Y Yu, J Kim, H Yun, J Chung, G Kim | Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23 …, 2020 | 7 | 2020 |
Language models as compilers: Simulating pseudocode execution improves algorithmic reasoning in language models | H Chae, Y Kim, S Kim, KT Ong, B Kwak, M Kim, S Kim, T Kwon, J Chung, ... | arXiv preprint arXiv:2404.02575, 2024 | 4 | 2024 |
Long Story Short: a Summarize-then-Search Method for Long Video Question Answering | J Chung, Y Yu | arXiv preprint arXiv:2311.01233, 2023 | 2 | 2023 |
Do LLMs Have Distinct and Consistent Personality? TRAIT: Personality Testset designed for LLMs with Psychometrics | S Lee, S Lim, S Han, G Oh, H Chae, J Chung, M Kim, B Kwak, Y Lee, ... | arXiv preprint arXiv:2406.14703, 2024 | 1 | 2024 |
HyperCLOVA X Technical Report | KM Yoo, J Han, S In, H Jeon, J Jeong, J Kang, H Kim, KM Kim, M Kim, ... | arXiv preprint arXiv:2404.01954, 2024 | 1 | 2024 |
Selective Vision is the Challenge for Visual Reasoning: A Benchmark for Visual Argument Understanding | J Chung, S Lee, M Kim, S Han, A Yousefpour, J Hessel, Y Yu | arXiv preprint arXiv:2406.18925, 2024 | | 2024 |
VLIS: Unimodal Language Models Guide Multimodal Language Generation | J Chung, Y Yu | | | 2023 |
Reading Books is Great, But Not if You Are Driving! Visually Grounded Reasoning about Defeasible Commonsense Norms | S Han, J Kim, J Hessel, L Jiang, J Chung, Y Son, Y Choi, Y Yu | arXiv preprint arXiv:2310.10418, 2023 | | 2023 |
Supplementary Material for ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning | S Lee, J Chung, Y Yu, G Kim, T Breuel, G Chechik, Y Song | | | |
Supplementary Material for: Transitional Adaptation of Pretrained Models for Visual Storytelling | Y Yu, J Chung, H Yun, J Kim, G Kim | | | |