Authors
Amaia Salvador, Nicholas Hynes, Yusuf Aytar, Javier Marin, Ferda Ofli, Ingmar Weber, Antonio Torralba
Publication date
2017
Conference
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Pages
3020-3028
Description
In this paper, we introduce Recipe1M, a new large-scale, structured corpus of over 1m cooking recipes and 800k food images. As the largest publicly available collection of recipe data, Recipe1M affords the ability to train high-capacity models on aligned, multi-modal data. Accordingly, we train a neural network to find a joint embedding of recipes and images that yields impressive results on an image-recipe retrieval task. Additionally, we demonstrate that regularization via the addition of a high-level, semantic classification objective improves performance to rival that of humans and enables semantic vector arithmetic. We postulate that these embeddings will provide a basis for further exploration of the Recipe1M dataset and food and cooking in general.
Total citations
Citations per year, 2017 to 2024 (chart not reproduced)