We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and …
This report introduces xGen-MM (also known as BLIP-3), a framework for developing Large Multimodal Models (LMMs). The framework comprises meticulously curated datasets, a …
Multimodal large language models (MLLMs) have shown impressive reasoning abilities. However, they are also more vulnerable to jailbreak attacks than their LLM predecessors …
The rapid evolution of artificial intelligence (AI) through developments in Large Language Models (LLMs) and Vision-Language Models (VLMs) has brought significant advancements …
Current efficient approaches to building Multimodal Large Language Models (MLLMs) mainly incorporate visual information into LLMs with a simple visual mapping network such …
Recent research demonstrates that the nascent fine-tuning-as-a-service business model exposes serious safety concerns--fine-tuning on a small amount of harmful data uploaded by users …
This paper makes the first attempt towards unsupervised preference alignment in Vision-Language Models (VLMs). We generate chosen and rejected responses with regard to the …
Large Language Models (LLMs) have transformed artificial intelligence by advancing natural language understanding and generation, enabling applications across fields beyond …
T Huang, S Hu, F Ilhan, SF Tekin… - arXiv preprint arXiv …, 2024 - openreview.net
Recent studies show that Large Language Models (LLMs) with safety alignment can be jailbroken by fine-tuning on a dataset mixed with harmful data. For the first time in the literature, we …