Tuning computer vision models with task rewards

J Qiu, L Li, J Sun, J Peng, P Shi… - IEEE Journal of …, 2023 - ieeexplore.ieee.org

Large AI models, or foundation models, are models recently emerging with massive scales
both parameter-wise and data-wise, the magnitudes of which can reach beyond billions …

Cited by 73 Related articles All 6 versions

[PDF] thecvf.com

HIVE: Harnessing human feedback for instructional visual editing

S Zhang, X Yang, Y Feng, C Qin… - Proceedings of the …, 2024 - openaccess.thecvf.com

Incorporating human feedback has been shown to be crucial to align text generated by large
language models to human preferences. We hypothesize that state-of-the-art instructional …

Cited by 49 Related articles All 4 versions

[PDF] neurips.cc

4M: Massively multimodal masked modeling

D Mizrahi, R Bachmann, O Kar, T Yeo… - Advances in …, 2024 - proceedings.neurips.cc

Current machine learning models for vision are often highly specialized and limited to a
single modality and task. In contrast, recent large language models exhibit a wide range of …

Cited by 15 Related articles All 5 versions

HiVeGPT: Human-machine-augmented intelligent vehicles with generative pre-trained transformer

J Zhang, J Pu, J Xue, M Yang, X Xu… - IEEE Transactions …, 2023 - ieeexplore.ieee.org

Recently, the chat generative pre-trained transformer (ChatGPT) has attracted widespread attention
in academia and industry because of its powerful conversational ability with human …

Cited by 49 Related articles All 2 versions

[PDF] neurips.cc

LLMScore: Unveiling the power of large language models in text-to-image synthesis evaluation

Y Lu, X Yang, X Li, XE Wang… - Advances in Neural …, 2024 - proceedings.neurips.cc

Existing automatic evaluation on text-to-image synthesis can only provide an image-text
matching score, without considering the object-level compositionality, which results in poor …

Cited by 27 Related articles All 6 versions

[PDF] arxiv.org

A survey of reinforcement learning from human feedback

T Kaufmann, P Weng, V Bengs… - arXiv preprint arXiv …, 2023 - arxiv.org

Reinforcement learning from human feedback (RLHF) is a variant of reinforcement learning
(RL) that learns from human feedback instead of relying on an engineered reward function …

Cited by 37 Related articles All 4 versions

[PDF] neurips.cc

Reward finetuning for faster and more accurate unsupervised object discovery

K Luo, Z Liu, X Chen, Y You… - Advances in …, 2023 - proceedings.neurips.cc

Recent advances in machine learning have shown that Reinforcement Learning from
Human Feedback (RLHF) can improve machine learning models and align them with …

Cited by 3 Related articles All 5 versions

[PDF] neurips.cc

TaskMet: Task-driven metric learning for model learning

D Bansal, RTQ Chen, M Mukadam… - Advances in Neural …, 2024 - proceedings.neurips.cc

Deep learning models are often used with some downstream task. Models solely trained to
achieve accurate predictions may struggle to perform well on the desired downstream tasks …

Cited by 3 Related articles All 6 versions

[PDF] thecvf.com

Cross-domain image captioning with discriminative finetuning

R Dessì, M Bevilacqua, E Gualdoni… - Proceedings of the …, 2023 - openaccess.thecvf.com

Neural captioners are typically trained to mimic human-generated references without
optimizing for any specific communication goal, leading to problems such as the generation …

Cited by 8 Related articles All 6 versions

[PDF] acm.org

UnifiedGesture: A unified gesture synthesis model for multiple skeletons

S Yang, Z Wang, Z Wu, M Li, Z Zhang… - Proceedings of the 31st …, 2023 - dl.acm.org

The automatic co-speech gesture generation draws much attention in computer animation.
Previous works designed network structures on individual datasets, which resulted in a lack …

Cited by 5 Related articles All 3 versions
