Billion-scale pretraining with vision transformers for multi-task visual representations

T Chen, X Chen, X Du, A Rashwan… - Proceedings of the …, 2023 - openaccess.thecvf.com

Abstract Sparsely activated Mixture-of-Experts (MoE) is becoming a promising paradigm for
multi-task learning (MTL). Instead of compressing multiple tasks' knowledge into a single …

Cited by 34 · Related articles · All 4 versions

[PDF] thecvf.com

Revisiting weakly supervised pre-training of visual perception models

M Singh, L Gustafson, A Adcock… - Proceedings of the …, 2022 - openaccess.thecvf.com

Abstract Model pre-training is a cornerstone of modern visual recognition systems. Although
fully supervised pre-training on datasets like ImageNet is still the de-facto standard, recent …

Cited by 80 · Related articles · All 6 versions

[PDF] arxiv.org

Skin deep: Investigating subjectivity in skin tone annotations for computer vision benchmark datasets

T Barrett, Q Chen, A Zhang - Proceedings of the 2023 ACM Conference …, 2023 - dl.acm.org

To investigate the well-observed racial disparities in computer vision systems that analyze
images of humans, researchers have turned to skin tone as a more objective annotation …

Cited by 17 · Related articles · All 5 versions

[PDF] acm.org

Itemsage: Learning product embeddings for shopping recommendations at pinterest

P Baltescu, H Chen, N Pancha, A Zhai… - Proceedings of the 28th …, 2022 - dl.acm.org

Learned embeddings for products are an important building block for web-scale
e-commerce recommendation systems. At Pinterest, we build a single set of product …

Cited by 34 · Related articles · All 6 versions

[PDF] ssrn.com

Integrating ChatGPT, Bard, and leading-edge generative artificial intelligence in building and construction industry: applications, framework, challenges, and future …

N Rane, S Choudhary, J Rane - 2023 - papers.ssrn.com

The infusion of generative artificial intelligence (AI), as exemplified by models such as
ChatGPT and Bard, is proving to be a revolutionary catalyst within the building and …

Cited by 39 · Related articles · All 3 versions

[PDF] mlr.press

Surface vision transformers: Attention-based modelling applied to cortical analysis

S Dahan, A Fawaz, LZJ Williams… - … on Medical Imaging …, 2022 - proceedings.mlr.press

The extension of convolutional neural networks (CNNs) to non-Euclidean geometries has
led to multiple frameworks for studying manifolds. Many of those methods have shown …

Cited by 23 · Related articles · All 5 versions

[PDF] mdpi.com

Learning gait representations with noisy multi-task learning

A Cosma, E Radoi - Sensors, 2022 - mdpi.com

Gait analysis is proven to be a reliable way to perform person identification without relying
on subject cooperation. Walking is a biometric that does not significantly change in short …

Cited by 13 · Related articles · All 7 versions

[PDF] arxiv.org

Bamboo: Building mega-scale vision dataset continually with human-machine synergy

Y Zhang, Q Sun, Y Zhou, Z He, Z Yin, K Wang… - arXiv preprint arXiv …, 2022 - arxiv.org

Large-scale datasets play a vital role in computer vision. But current datasets are annotated
blindly without differentiation to samples, making the data collection inefficient and …

Cited by 19 · Related articles · All 2 versions

[PDF] neurips.cc

Understanding robust learning through the lens of representation similarities

C Cianfarani, AN Bhagoji, V Sehwag… - Advances in …, 2022 - proceedings.neurips.cc

Abstract Representation learning, i.e., the generation of representations useful for
downstream applications, is a task of fundamental importance that underlies much of the …

Cited by 12 · Related articles · All 10 versions

[PDF] acm.org

CLIP-Branches: Interactive Fine-Tuning for Text-Image Retrieval

C Lülf, DM Lima Martins, MA Vaz Salles… - Proceedings of the 47th …, 2024 - dl.acm.org

The advent of text-image models, most notably CLIP, has significantly transformed the
landscape of information retrieval. These models enable the fusion of various modalities …

Cited by 3 · Related articles
