Transferring the knowledge learned from large scale datasets (eg, ImageNet) via fine-tuning offers an effective solution for domain-specific fine-grained visual categorization (FGVC) …
While neural networks have been successfully applied to many natural language processing tasks, they come at the cost of interpretability. In this paper, we propose a general …
We propose bilinear models, a recognition architecture that consists of two feature extractors whose outputs are multiplied using outer product at each location of the image and pooled …
In the context of fine-grained visual categorization, the ability to interpret models as human- understandable visual manuals is sometimes as important as achieving high classification …
Scaling up fine-grained recognition to all domains of fine-grained objects is a challenge the computer vision community will need to face in order to realize its goal of recognizing all …
We address the problem of cross-domain image retrieval, considering the following practical application: given a user photo depicting a clothing image, our goal is to retrieve the same or …
Current approaches for fine-grained recognition do the following: First, recruit experts to annotate a dataset of images, optionally also collecting more structured data in the form of …
This paper proposes a joint multi-task learning algorithm to better predict attributes in images using deep convolutional neural networks (CNN). We consider learning binary …
G Van Horn, P Perona - arXiv preprint arXiv:1709.01450, 2017 - arxiv.org
The world is long-tailed. What does this mean for computer vision and visual recognition? The main two implications are (1) the number of categories we need to consider in …