U-KAN makes strong backbone for medical image segmentation and generation

C Li, X Liu, W Li, C Wang, H Liu, Y Liu, Z Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
U-Net has become a cornerstone in various visual applications such as image segmentation
and diffusion probability models. While numerous innovative designs and improvements …

Endora: Video Generation Models as Endoscopy Simulators

C Li, H Liu, Y Liu, BY Feng, W Li, X Liu, Z Chen… - … Conference on Medical …, 2024 - Springer
Generative models hold promise for revolutionizing medical education, robot-assisted
surgery, and data augmentation for machine learning. Despite progress in generating 2D …

GTP-4o: Modality-prompted heterogeneous graph learning for omni-modal biomedical representation

C Li, X Liu, C Wang, Y Liu, W Yu, J Shao… - European Conference on …, 2025 - Springer
Recent advances in learning multi-modal representation have witnessed the success in
biomedical domains. While established techniques enable handling multi-modal …

A review of 3D reconstruction techniques for deformable tissues in robotic surgery

M Xu, Z Guo, A Wang, L Bai, H Ren - arXiv preprint arXiv:2408.04426, 2024 - arxiv.org
As a crucial and intricate task in robotic minimally invasive surgery, reconstructing surgical
scenes using stereo or monocular endoscopic video holds immense potential for clinical …

Deform3DGS: Flexible deformation for fast surgical scene reconstruction with Gaussian splatting

S Yang, Q Li, D Shen, B Gong, Q Dou, Y Jin - International Conference on …, 2024 - Springer
Tissue deformation poses a key challenge for accurate surgical scene reconstruction.
Despite yielding high reconstruction quality, existing methods suffer from slow rendering …

P2SAM: Probabilistically Prompted SAMs Are Efficient Segmentator for Ambiguous Medical Images

Y Huang, C Li, Z Lin, H Liu, H Xu, Y Liu… - Proceedings of the …, 2024 - dl.acm.org
Generating diverse plausible outputs from a single input is crucial for addressing visual
ambiguities, exemplified in medical imaging where experts may provide varying semantic …

Large Spatial Model: End-to-end Unposed Images to Semantic 3D

Z Fan, J Zhang, W Cong, P Wang, R Li, K Wen… - arXiv preprint arXiv …, 2024 - arxiv.org
Reconstructing and understanding 3D structures from a limited number of images is a
well-established problem in computer vision. Traditional methods usually break this task into …

When 3D Partial Points Meets SAM: Tooth Point Cloud Segmentation with Sparse Labels

Y Liu, W Li, C Wang, H Chen, Y Yuan - International Conference on …, 2024 - Springer
Tooth point cloud segmentation is a fundamental task in many orthodontic applications.
Current research mainly focuses on fully supervised learning which demands expensive …

Advancing precise diagnosis of nasopharyngeal carcinoma through endoscopy-based radiomics analysis

Y Xu, J Wang, C Li, Y Su, H Peng, L Guo, S Lin, J Li… - iScience, 2024 - cell.com
Nasopharyngeal carcinoma (NPC) has high metastatic potential and is hard to detect early.
This study aims to develop a deep learning model for NPC diagnosis using optical imagery …

Flaws can be Applause: Unleashing Potential of Segmenting Ambiguous Objects in SAM

C Li, W Li, H Liu, X Liu, Q Xu, Z Chen, Y Huang… - The Thirty-eighth Annual … - openreview.net
As the vision foundation models like the Segment Anything Model (SAM) demonstrate potent
universality, they also present challenges in giving ambiguous and uncertain predictions …