Segment Anything Model for medical image segmentation: Current applications and future directions

Y Zhang, Z Shen, R Jiao - Computers in Biology and Medicine, 2024 - Elsevier
Due to the inherent flexibility of prompting, foundation models have emerged as the
predominant force in the fields of natural language processing and computer vision. The …

A survey on open-vocabulary detection and segmentation: Past, present, and future

C Zhu, L Chen - IEEE Transactions on Pattern Analysis and …, 2024 - ieeexplore.ieee.org
As the most fundamental scene understanding tasks, object detection and segmentation
have made tremendous progress in the deep learning era. Due to the expensive manual …

AlignSAM: Aligning Segment Anything Model to open context via reinforcement learning

D Huang, X Xiong, J Ma, J Li, Z Jie… - Proceedings of the …, 2024 - openaccess.thecvf.com
Powered by massive curated training data, the Segment Anything Model (SAM) has
demonstrated its impressive generalization capabilities in open-world scenarios with the …

CLIP as RNN: Segment countless visual concepts without training endeavor

S Sun, R Li, P Torr, X Gu, S Li - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Existing open-vocabulary image segmentation methods require a fine-tuning step on mask
labels and/or image-text datasets. Mask labels are labor-intensive, which limits the number of …

MaskClustering: View consensus based mask graph clustering for open-vocabulary 3D instance segmentation

M Yan, J Zhang, Y Zhu, H Wang - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Open-vocabulary 3D instance segmentation is cutting-edge for its ability to segment 3D
instances without predefined categories. However, progress in 3D lags behind its 2D …

Generative AI for visualization: State of the art and future directions

Y Ye, J Hao, Y Hou, Z Wang, S Xiao, Y Luo, W Zeng - Visual Informatics, 2024 - Elsevier
Generative AI (GenAI) has witnessed remarkable progress in recent years and
demonstrated impressive performance in various generation tasks in different domains such …

Open-vocabulary SAM: Segment and recognize twenty-thousand classes interactively

H Yuan, X Li, C Zhou, Y Li, K Chen, CC Loy - arXiv preprint arXiv …, 2024 - arxiv.org
The CLIP and Segment Anything Model (SAM) are remarkable vision foundation models
(VFMs). SAM excels in segmentation tasks across diverse domains, while CLIP is renowned …

Foundation models in smart agriculture: Basics, opportunities, and challenges

J Li, M Xu, L Xiang, D Chen, W Zhuang, X Yin… - … and Electronics in …, 2024 - Elsevier
The past decade has witnessed the rapid development and adoption of machine and deep
learning (ML & DL) methodologies in agricultural systems, showcased by great successes in …

OpenMEDLab: An open-source platform for multi-modality foundation models in medicine

X Wang, X Zhang, G Wang, J He, Z Li, W Zhu… - arXiv preprint arXiv …, 2024 - arxiv.org
The emerging trend of advancing generalist artificial intelligence, such as GPT-4 and
Gemini, has reshaped the landscape of research (academia and industry) in machine …

Building Vision-Language Models on Solid Foundations with Masked Distillation

S Sameni, K Kafle, H Tan… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Recent advancements in Vision-Language Models (VLMs) have marked a
significant leap in bridging the gap between computer vision and natural language …