Semantic-Guided Feature Distillation for Multimodal Recommendation

F Liu, H Chen, Z Cheng, L Nie… - Proceedings of the 31st …, 2023 - dl.acm.org
Proceedings of the 31st ACM International Conference on Multimedia, 2023 (dl.acm.org)
Multimodal recommendation exploits the rich multimodal information associated with users or items to enhance representation learning and improve performance. In these methods, end-to-end feature extractors (e.g., shallow or deep neural networks) are often adopted to tailor, for recommendation, the generic multimodal features that pre-trained models extract from raw data. However, compact extractors, such as shallow neural networks, may find it challenging to extract effective information from complex, high-dimensional generic modality features. Conversely, DNN-based extractors may encounter the data sparsity problem in recommendation. To address both limitations, we propose a novel model-agnostic approach called Semantic-guided Feature Distillation (SGFD), which employs a teacher-student framework to extract features for multimodal recommendation. The teacher model first extracts rich modality features from the generic modality features by considering both the semantic information of items and the complementary information of multiple modalities. SGFD then uses response-based and feature-based distillation losses to transfer the knowledge encoded in the teacher model to the student model. To evaluate the effectiveness of SGFD, we integrate it into three backbone multimodal recommendation models. Extensive experiments on three public real-world datasets demonstrate that SGFD-enhanced models achieve substantial improvements over their counterparts.
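The abstract names two distillation terms but does not give their exact form. As a rough illustration only, the sketch below uses a common generic formulation: a KL divergence between temperature-softened teacher and student outputs for the response-based term, and a mean squared error between teacher and student feature vectors for the feature-based term. The function names, the temperature, and the weights `alpha` and `beta` are assumptions made for this example and are not taken from the paper; the semantic guidance and the teacher architecture are not modeled here.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Temperature-scaled log-softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_norm for x in scaled]


def response_distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened outputs.

    Assumption: a generic response-based loss, not necessarily the
    formulation used by SGFD.
    """
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def feature_distillation_loss(teacher_feat, student_feat):
    """Mean squared error between teacher and student feature vectors.

    Assumption: both vectors already share the same dimensionality.
    """
    if len(teacher_feat) != len(student_feat):
        raise ValueError("teacher and student features must have equal length")
    n = len(teacher_feat)
    return sum((t - s) ** 2 for t, s in zip(teacher_feat, student_feat)) / n


def total_loss(rec_loss, teacher_logits, student_logits,
               teacher_feat, student_feat, alpha=0.5, beta=0.5):
    """Recommendation loss plus the two weighted distillation terms.

    alpha and beta are hypothetical trade-off weights.
    """
    return (rec_loss
            + alpha * response_distillation_loss(teacher_logits, student_logits)
            + beta * feature_distillation_loss(teacher_feat, student_feat))
```

In a setup like this, the student would typically be trained on `total_loss` while the teacher's outputs are treated as fixed targets. Whether SGFD trains the two models jointly or in stages is not stated in the abstract.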