Conditional image-to-video generation with latent flow diffusion models

H Ni, C Shi, K Li, SX Huang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Conditional image-to-video (cI2V) generation aims to synthesize a new plausible video
starting from an image (e.g., a person's face) and a condition (e.g., an action class label like …

Active teacher for semi-supervised object detection

P Mi, J Lin, Y Zhou, Y Shen, G Luo… - Proceedings of the …, 2022 - openaccess.thecvf.com
In this paper, we study teacher-student learning from the perspective of data initialization
and propose a novel algorithm called Active Teacher for semi-supervised object detection …

Cantor: Inspiring multimodal chain-of-thought of MLLM

T Gao, P Chen, M Zhang, C Fu, Y Shen… - Proceedings of the …, 2024 - dl.acm.org
With the advent of large language models (LLMs) enhanced by the chain-of-thought (CoT)
methodology, the visual reasoning problem is usually decomposed into manageable sub …

OMPQ: Orthogonal mixed precision quantization

Y Ma, T Jin, X Zheng, Y Wang, H Li, Y Wu… - Proceedings of the …, 2023 - ojs.aaai.org
To bridge the ever-increasing gap between deep neural networks' complexity and hardware
capability, network quantization has attracted more and more research attention. The latest …

Solving oscillation problem in post-training quantization through a theoretical perspective

Y Ma, H Li, X Zheng, X Xiao, R Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Post-training quantization (PTQ) is widely regarded as one of the most efficient compression
methods practically, benefitting from its data privacy and low computation costs. We argue …

Refined coreset selection: Towards minimal coreset size under model performance constraints

X Xia, J Liu, S Zhang, Q Wu, H Wei… - Forty-first International …, 2024 - openreview.net
Coreset selection is powerful in reducing computational costs and accelerating data
processing for deep learning algorithms. It strives to identify a small subset from large-scale …

Training language model agents without modifying language models

S Zhang, J Zhang, J Liu, L Song, C Wang… - arXiv e …, 2024 - ui.adsabs.harvard.edu
Researchers and practitioners have recently reframed powerful Large Language Models
(LLMs) as agents, enabling them to automate complex tasks largely via the use of …

Coreset selection with prioritized multiple objectives

X Xia, J Liu, S Zhang, Q Wu, T Liu - arXiv preprint arXiv:2311.08675, 2023 - arxiv.org
Coreset selection is powerful in reducing computational costs and accelerating data
processing for deep learning algorithms. It strives to identify a small subset from large-scale …

HyperTime: Hyperparameter optimization for combating temporal distribution shifts

S Zhang, Y Wu, Z Zheng, Q Wu, C Wang - Proceedings of the 32nd ACM …, 2024 - dl.acm.org
In this work, we propose a hyperparameter optimization method named HyperTime to find
hyperparameters robust to potential temporal distribution shifts in the unseen test data. Our …

AutoGen: Enabling next-gen LLM applications via multi-agent conversations

Q Wu, G Bansal, J Zhang, Y Wu, B Li, E Zhu… - First Conference on …, 2024 - openreview.net
We present AutoGen, an open-source framework that allows developers to build LLM
applications by composing multiple agents to converse with each other to accomplish tasks …