Authors
Charlie Hewitt, Hatice Gunes
Publication date
2018/7/23
Journal
arXiv preprint arXiv:1807.08775
Description
This paper focuses on the design, deployment and evaluation of Convolutional Neural Network (CNN) architectures for facial affect analysis on mobile devices. Unlike traditional CNN approaches, models deployed to mobile devices must minimise storage requirements while retaining high performance. We therefore propose three variants of established CNN architectures and comparatively evaluate them on a large, in-the-wild benchmark dataset of facial images. Our results show that the proposed architectures retain similar performance to the dataset baseline while minimising storage requirements: achieving 58% accuracy for eight-class emotion classification and average RMSE of 0.39 for valence/arousal prediction. To demonstrate the feasibility of deploying these models for real-world applications, we implement a music recommendation interface based on predicted user affect. Although the CNN models were not trained in the context of music recommendation, our case study shows that: (i) the trained models achieve similar prediction performance to the benchmark dataset, and (ii) users tend to positively rate the song recommendations provided by the interface. Average runtime of the deployed models on an iPhone 6S equates to ~45 fps, suggesting that the proposed architectures are also well suited for real-time deployment on video streams.
Total citations
[Citations-per-year chart, 2017 to 2024; per-year counts not recoverable from the extracted text]