An overview on visual SLAM: From tradition to semantic

W Chen, G Shang, A Ji, C Zhou, X Wang, C Xu, Z Li… - Remote Sensing, 2022 - mdpi.com
Visual SLAM (VSLAM) has been developing rapidly due to its advantages of low-cost
sensors, the easy fusion of other sensors, and richer environmental information. Traditional …

A survey of convolutional neural networks: analysis, applications, and prospects

Z Li, F Liu, W Yang, S Peng… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
A convolutional neural network (CNN) is one of the most significant networks in the deep
learning field. Since CNN made impressive achievements in many areas, including but not …

YOLOv6: A single-stage object detection framework for industrial applications

C Li, L Li, H Jiang, K Weng, Y Geng, L Li, Z Ke… - arXiv preprint arXiv …, 2022 - arxiv.org
For years, the YOLO series has been the de facto industry-level standard for efficient object
detection. The YOLO community has prospered overwhelmingly to enrich its use in a …

Segment everything everywhere all at once

X Zou, J Yang, H Zhang, F Li, L Li… - Advances in …, 2024 - proceedings.neurips.cc
In this work, we present SEEM, a promptable and interactive model for segmenting
everything everywhere all at once in an image. In SEEM, we propose a novel and versatile …

Convolutions die hard: Open-vocabulary segmentation with single frozen convolutional CLIP

Q Yu, J He, X Deng, X Shen… - Advances in Neural …, 2024 - proceedings.neurips.cc
Open-vocabulary segmentation is a challenging task requiring segmenting and recognizing
objects from an open set of categories in diverse environments. One way to address this …

Generalized decoding for pixel, image, and language

X Zou, ZY Dou, J Yang, Z Gan, L Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present X-Decoder, a generalized decoding model that can predict pixel-level
segmentation and language tokens seamlessly. X-Decoder takes as input two types of …

Universal instance perception as object discovery and retrieval

B Yan, Y Jiang, J Wu, D Wang, P Luo… - Proceedings of the …, 2023 - openaccess.thecvf.com
All instance perception tasks aim at finding certain objects specified by some queries such
as category names, language expressions, and target annotations, but this complete field …

DC-YOLOv8: small-size object detection algorithm based on camera sensor

H Lou, X Duan, J Guo, H Liu, J Gu, L Bi, H Chen - Electronics, 2023 - mdpi.com
Traditional camera sensors rely on human eyes for observation. However, human eyes are
prone to fatigue when observing objects of different sizes for a long time in complex scenes …

Multimodal foundation models: From specialists to general-purpose assistants

C Li, Z Gan, Z Yang, J Yang, L Li… - … and Trends® in …, 2024 - nowpublishers.com
This paper presents a comprehensive survey of the taxonomy and evolution of multimodal
foundation models that demonstrate vision and vision-language capabilities, focusing on the …

SegGPT: Segmenting everything in context

X Wang, X Zhang, Y Cao, W Wang, C Shen… - arXiv preprint arXiv …, 2023 - arxiv.org
We present SegGPT, a generalist model for segmenting everything in context. We unify
various segmentation tasks into a generalist in-context learning framework that …