NetVLAD: CNN architecture for weakly supervised place recognition

MC Schiappa, YS Rawat, M Shah - ACM Computing Surveys, 2023 - dl.acm.org

The remarkable success of deep learning in various domains relies on the availability of
large-scale annotated datasets. However, obtaining annotations is expensive and requires …

Cited by 105 · Related articles · All 4 versions

Speaker recognition based on deep learning: An overview

Z Bai, XL Zhang - Neural Networks, 2021 - Elsevier

Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is lack of …

Cited by 347 · Related articles · All 9 versions

LightGlue: Local feature matching at light speed

P Lindenberger, PE Sarlin… - Proceedings of the …, 2023 - openaccess.thecvf.com

We introduce LightGlue, a deep neural network that learns to match local features across
images. We revisit multiple design decisions of SuperGlue, the state of the art in sparse …

Cited by 160 · Related articles · All 7 versions

Patch-NetVLAD: Multi-scale fusion of locally-global descriptors for place recognition

S Hausler, S Garg, M Xu, M Milford… - Proceedings of the …, 2021 - openaccess.thecvf.com

Visual Place Recognition is a challenging task for robotics and autonomous
systems, which must deal with the twin problems of appearance and viewpoint change in an …

Cited by 329 · Related articles · All 12 versions

R2Former: Unified retrieval and reranking transformer for place recognition

S Zhu, L Yang, C Chen, M Shah… - Proceedings of the …, 2023 - openaccess.thecvf.com

Visual Place Recognition (VPR) estimates the location of query images by matching
them with images in a reference database. Conventional methods generally adopt …

Cited by 52 · Related articles · All 7 versions

Object-centric learning with slot attention

F Locatello, D Weissenborn… - Advances in neural …, 2020 - proceedings.neurips.cc

Learning object-centric representations of complex scenes is a promising step towards
enabling efficient abstract reasoning from low-level perceptual features. Yet, most deep …

Cited by 699 · Related articles · All 10 versions

Rethinking visual geo-localization for large-scale applications

G Berton, C Masone, B Caputo - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com

Visual Geo-localization (VG) is the task of estimating the position where a given photo was
taken by comparing it with a large database of images of known locations. To investigate …

Cited by 121 · Related articles · All 7 versions

CLIP2Video: Mastering video-text retrieval via image CLIP

H Fang, P Xiong, L Xu, Y Chen - arXiv preprint arXiv:2106.11097, 2021 - arxiv.org

We present CLIP2Video network to transfer the image-language pre-training model to video-
text retrieval in an end-to-end manner. Leading approaches in the domain of video-and …

Cited by 255 · Related articles · All 2 versions

Deep learning for 3D point clouds: A survey

Y Guo, H Wang, Q Hu, H Liu, L Liu… - IEEE transactions on …, 2020 - ieeexplore.ieee.org

Point cloud learning has lately attracted increasing attention due to its wide applications in
many areas, such as computer vision, autonomous driving, and robotics. As a dominating …

Cited by 1844 · Related articles · All 15 versions

Back to the feature: Learning robust camera localization from pixels to pose

PE Sarlin, A Unagar, M Larsson… - Proceedings of the …, 2021 - openaccess.thecvf.com

Camera pose estimation in known scenes is a 3D geometry task recently tackled by multiple
learning algorithms. Many regress precise geometric quantities, like poses or 3D points …

Cited by 235 · Related articles · All 13 versions
