ABQ-LLM: Arbitrary-bit quantized inference acceleration for large language models

C Zeng, S Liu, Y Xie, H Liu, X Wang, M Wei… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have revolutionized natural language processing tasks.
However, their practical application is constrained by substantial memory and computational …

Data augmentation and preparation process of PerInfEx: a Persian chatbot with the ability of information extraction

P Safari, M Shamsfard - IEEE Access, 2024 - ieeexplore.ieee.org
In this paper, we describe data preparation for our proposed chatbot PerInfEx (Persian
Information Extraction chatbot). It aims to interactively chit-chat with users in Persian and by …

GQSA: Group Quantization and Sparsity for Accelerating Large Language Model Inference

C Zeng, S Liu, S Yang, F Chen, X Mei, L Fu - arXiv preprint arXiv …, 2024 - arxiv.org
With the rapid growth in the scale and complexity of large language models (LLMs), the
costs of training and inference have risen substantially. Model compression has emerged as …

From Words and Exercises to Wellness: Farsi Chatbot for Self-Attachment Technique

S Elahimanesh, S Salehi, SZ Movahed… - arXiv preprint arXiv …, 2023 - arxiv.org
In the wake of the post-pandemic era, marked by social isolation and surging rates of
depression and anxiety, conversational agents based on digital psychotherapy can play an …