On the Reliability of Watermarks for Large Language Models J Kirchenbauer, J Geiping, Y Wen, M Shu, K Saifullah, K Kong, ... arXiv preprint arXiv:2306.04634, 2023 | 103 | 2023 |
DALL·E Mini B Dayma, S Patil, P Cuenca, K Saifullah, T Abraham, P Le Khac, L Melas, ... July, 2021 | 88* | 2021 |
Bring Your Own Data! Self-Sensitivity Evaluation for Large Language Models N Jain, K Saifullah, Y Wen, J Kirchenbauer, M Shu, A Saha, M Goldblum, ... First Conference on Language Modeling | 18* | |
Coercing LLMs to do and reveal (almost) anything J Geiping, A Stein, M Shu, K Saifullah, Y Wen, T Goldstein arXiv preprint arXiv:2402.14020, 2024 | 15 | 2024 |
CinePile: A Long Video Question Answering Dataset and Benchmark R Rawal, K Saifullah, R Basri, D Jacobs, G Somepalli, T Goldstein arXiv preprint arXiv:2405.08813, 2024 | 7 | 2024 |
LiveBench: A Challenging, Contamination-Free LLM Benchmark C White, S Dooley, M Roberts, A Pal, B Feuer, S Jain, R Shwartz-Ziv, ... arXiv preprint arXiv:2406.19314, 2024 | 1 | 2024 |
Seeing in Words: Learning to Classify through Language Bottlenecks K Saifullah, Y Wen, J Geiping, M Goldblum, T Goldstein arXiv preprint arXiv:2307.00028, 2023 | 1 | 2023 |
Learning UI-to-Code Reverse Generator Using Visual Critic Without Rendering D Soselia, K Saifullah, T Zhou arXiv preprint arXiv:2305.14637, 2023 | 1 | 2023 |