SparseGPT: Massive language models can be accurately pruned in one-shot

E Frantar, D Alistarh - International Conference on Machine …, 2023 - proceedings.mlr.press
We show for the first time that large-scale generative pretrained transformer (GPT) family
models can be pruned to at least 50% sparsity in one-shot, without any retraining, at minimal …
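
What "one-shot, no retraining" means mechanically can be seen in the much simpler magnitude-pruning baseline sketched below. This is not the SparseGPT algorithm itself (the paper reconstructs each layer using approximate second-order information); it is a minimal PyTorch illustration, and all function names here are assumptions of mine.

```python
import torch

def magnitude_prune_(weight: torch.Tensor, sparsity: float = 0.5) -> torch.Tensor:
    """Zero out (approximately) the smallest-magnitude fraction of entries, in place."""
    k = int(weight.numel() * sparsity)                     # entries to remove
    if k == 0:
        return weight
    threshold = weight.abs().flatten().kthvalue(k).values  # k-th smallest |w|
    mask = weight.abs() > threshold                        # keep only large weights
    return weight.mul_(mask)                               # one shot: no retraining step

# Toy usage on a single linear layer:
layer = torch.nn.Linear(512, 512)
with torch.no_grad():
    magnitude_prune_(layer.weight, sparsity=0.5)
print(f"sparsity: {(layer.weight == 0).float().mean().item():.2f}")
```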

GPTQ: Accurate post-training quantization for generative pre-trained transformers

E Frantar, S Ashkboos, T Hoefler, D Alistarh - arXiv preprint arXiv …, 2022 - arxiv.org
Generative Pre-trained Transformer models, known as GPT or OPT, set themselves apart
through breakthrough performance across complex language modelling tasks, but also by …
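
For context, the baseline that post-training quantization methods are usually measured against is plain round-to-nearest (RTN) quantization of the trained weights. The sketch below shows that baseline only; it is not GPTQ itself, which quantizes weights sequentially with second-order error compensation computed from a small calibration set. The int4 width and the symmetric per-row scale are illustrative assumptions.

```python
import torch

def rtn_quantize(weight: torch.Tensor, bits: int = 4):
    """Symmetric per-row round-to-nearest quantization; returns codes and scales."""
    qmax = 2 ** (bits - 1) - 1                             # e.g. 7 for int4
    scale = weight.abs().amax(dim=1, keepdim=True) / qmax  # one scale per row
    codes = torch.round(weight / scale).clamp(-qmax - 1, qmax)
    return codes.to(torch.int8), scale                     # int8 container for int4 codes

def rtn_dequantize(codes: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return codes.float() * scale

# Toy usage: quantize a random weight matrix and measure the error.
W = torch.randn(256, 256)
codes, scale = rtn_quantize(W, bits=4)
err = (W - rtn_dequantize(codes, scale)).abs().mean().item()
print(f"mean abs quantization error: {err:.4f}")
```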

ZeroQuant: Efficient and affordable post-training quantization for large-scale transformers

Z Yao, R Yazdani Aminabadi… - Advances in …, 2022 - proceedings.neurips.cc
How to efficiently serve ever-larger trained natural language models in practice has become
exceptionally challenging even for powerful cloud servers due to their prohibitive …

QuIP: 2-bit quantization of large language models with guarantees

J Chee, Y Cai, V Kuleshov… - Advances in Neural …, 2024 - proceedings.neurips.cc
This work studies post-training parameter quantization in large language models (LLMs).
We introduce quantization with incoherence processing (QuIP), a new method based on the …
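
As far as the snippet indicates, the idea behind incoherence processing is to rotate the weight matrix with random orthogonal matrices so that no individual entry is an outlier before rounding. The sketch below is a loose rotate-quantize-rotate-back illustration of that idea under my own assumptions; it is not QuIP's actual construction (which also processes the Hessian proxy and comes with guarantees), and all names are hypothetical.

```python
import torch

def random_orthogonal(n: int) -> torch.Tensor:
    # QR decomposition of a Gaussian matrix yields a random orthogonal matrix.
    q, _ = torch.linalg.qr(torch.randn(n, n))
    return q

def quantize_with_rotation(W: torch.Tensor, bits: int = 2) -> torch.Tensor:
    m, n = W.shape
    U, V = random_orthogonal(m), random_orthogonal(n)
    W_rot = U @ W @ V.T                        # entries are spread out ("incoherent")
    qmax = 2 ** (bits - 1) - 1
    scale = W_rot.abs().max() / qmax
    W_q = torch.round(W_rot / scale).clamp(-qmax - 1, qmax) * scale
    return U.T @ W_q @ V                       # rotate back to the original basis

# Toy usage: relative reconstruction error at 2 bits.
W = torch.randn(128, 128)
err = ((W - quantize_with_rotation(W, bits=2)).norm() / W.norm()).item()
print(f"relative error at 2 bits: {err:.3f}")
```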

OPTQ: Accurate quantization for generative pre-trained transformers

E Frantar, S Ashkboos, T Hoefler… - … Conference on Learning …, 2022 - openreview.net
Generative Pre-trained Transformer models, known as GPT or OPT, set themselves apart
through breakthrough performance across complex language modelling tasks, but also by …

QASC: A dataset for question answering via sentence composition

T Khot, P Clark, M Guerquin, P Jansen… - Proceedings of the AAAI …, 2020 - ojs.aaai.org
Composing knowledge from multiple pieces of text is a key challenge in multi-hop question
answering. We present a multi-hop reasoning dataset, Question Answering via Sentence …

MobileVLM: A fast, reproducible and strong vision language assistant for mobile devices

X Chu, L Qiao, X Lin, S Xu, Y Yang, Y Hu, F Wei… - arXiv preprint arXiv …, 2023 - arxiv.org
We present MobileVLM, a competent multimodal vision language model (MMVLM) targeted
to run on mobile devices. It is an amalgamation of a myriad of architectural designs and …

ScienceWorld: Is your agent smarter than a 5th grader?

R Wang, P Jansen, MA Côté… - arXiv preprint arXiv …, 2022 - arxiv.org
We present ScienceWorld, a benchmark to test agents' scientific reasoning abilities in a new
interactive text environment at the level of a standard elementary school science curriculum …

Improving natural language inference using external knowledge in the science questions domain

X Wang, P Kapanipathi, R Musa, M Yu… - Proceedings of the …, 2019 - ojs.aaai.org
Natural Language Inference (NLI) is fundamental to many Natural Language
Processing (NLP) applications including semantic search and question answering. The NLI …

What makes reading comprehension questions easier?

S Sugawara, K Inui, S Sekine, A Aizawa - arXiv preprint arXiv:1808.09384, 2018 - arxiv.org
A challenge in creating a dataset for machine reading comprehension (MRC) is to collect
questions that require a sophisticated understanding of language to answer beyond using …