Phi-3 technical report: A highly capable language model locally on your phone M Abdin, J Aneja, H Awadalla, A Awadallah, AA Awan, N Bach, A Bahree, ... arXiv preprint arXiv:2404.14219, 2024 | 474 | 2024 |
Model tells you what to discard: Adaptive KV cache compression for LLMs S Ge*, Y Zhang*, L Liu, M Zhang, J Han, J Gao arXiv preprint arXiv:2310.01801, 2023 | 101 | 2023 |
Towards disentangling relevance and bias in unbiased learning to rank Y Zhang, L Yan, Z Qin, H Zhuang, J Shen, X Wang, M Bendersky, ... Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and …, 2023 | 12 | 2023 |
Darkjargon.net: A platform for understanding underground conversation with latent meaning D Seyler, W Liu, Y Zhang, XF Wang, CX Zhai Proceedings of the 44th International ACM SIGIR Conference on Research and …, 2021 | 5 | 2021 |
Cooperative reasoning on knowledge graph and corpus: A multi-agent reinforcement learning approach Y Zhang, X Cheng, H Gao, C Zhai arXiv preprint arXiv:1912.02206, 2019 | 5 | 2019 |
Learning to order sub-questions for complex question answering Y Zhang, X Cheng, Y Zhang, Z Wang, Z Fang, X Wang, Z Huang, C Zhai arXiv preprint arXiv:1911.04065, 2019 | 4 | 2019 |
A little goes a long way: Efficient long context training and inference with partial contexts S Ge, X Lin, Y Zhang, J Han, H Peng arXiv preprint arXiv:2410.01485, 2024 | 3 | 2024 |
S2-Attention: Hardware-Aware Context Sharding Among Attention Heads X Lin*, Y Zhang*, S Ge, L Ren, B Patra, V Chaudhary, X Peng, Hao Song arXiv preprint arXiv:2407.17678, 2024 | 1* | 2024 |