Enhanced video analytics for sentiment analysis based on fusing textual, auditory and visual information

S Al-Azani, ESM El-Alfy - IEEE Access, 2020 - ieeexplore.ieee.org

With the wide availability of online videos and ongoing digital transformation, video informatics and analytics have recently gained substantial importance, with impressive success in a variety of tasks such as digital marketing, video surveillance and security systems, healthcare systems, talk show analysis, analysis of influencing groups in social media, and target tracking. This paper evaluates the potential contribution of various video modalities, and how they are correlated, to video analytics for sentiment analysis in the morphologically rich Arabic language. Moreover, an enhanced approach is presented for predicting the speaker's sentiment in multi-dialect Arabic videos through the integration of textual, auditory and visual modalities. Different features are extracted to represent each modality: prosodic and spectral acoustic features for audio, neural word embeddings for the text transcript of the audio, and dense optical-flow descriptors for the visual modality. The extracted features are first used individually to train two machine learning classifiers to provide a baseline. Then, the effectiveness of various combinations of modalities is verified using multi-level fusion (feature, score and decision levels). The experimental results demonstrate that combining different modalities can lead to more accurate prediction of the speaker's sentiment, with accuracy above 94%.
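The three fusion levels named in the abstract (feature, score and decision) can be illustrated with a minimal NumPy sketch. The abstract does not state which fusion rules the authors used, so the rules below (column-wise concatenation, averaged class probabilities, and majority voting) are common choices assumed here for illustration only. All function names, array shapes and toy values are hypothetical.

```python
import numpy as np


def feature_level_fusion(*modal_features):
    """Concatenate per-modality feature matrices of shape (n_samples, d_m)."""
    return np.hstack(modal_features)


def score_level_fusion(*modal_scores, weights=None):
    """Average per-modality class-probability matrices (n_samples, n_classes).

    `weights` is an optional sequence with one weight per modality.
    """
    stacked = np.stack(modal_scores)  # (n_modalities, n_samples, n_classes)
    return np.average(stacked, axis=0, weights=weights)


def decision_level_fusion(*modal_labels):
    """Majority vote over per-modality predicted labels (non-negative ints).

    Ties are resolved in favour of the smallest label.
    """
    votes = np.stack(modal_labels, axis=1)  # (n_samples, n_modalities)
    return np.array([np.bincount(row).argmax() for row in votes])


# Toy inputs for two video segments (values are made up, not from the paper).
text_feat = np.array([[0.1, 0.2], [0.3, 0.4]])
audio_feat = np.array([[1.0], [2.0]])
visual_feat = np.array([[5.0, 6.0, 7.0], [8.0, 9.0, 10.0]])

# Feature level: one joint vector per segment, fed to a single classifier.
fused = feature_level_fusion(text_feat, audio_feat, visual_feat)

# Hypothetical class probabilities (negative, positive) from three
# classifiers, each trained on a single modality.
text_scores = np.array([[0.8, 0.2], [0.4, 0.6]])
audio_scores = np.array([[0.6, 0.4], [0.3, 0.7]])
visual_scores = np.array([[0.3, 0.7], [0.2, 0.8]])

# Score level: combine the probabilities, then pick the most likely class.
fused_scores = score_level_fusion(text_scores, audio_scores, visual_scores)
score_pred = fused_scores.argmax(axis=1)

# Decision level: each classifier commits to a label first, then they vote.
vote_pred = decision_level_fusion(
    text_scores.argmax(axis=1),
    audio_scores.argmax(axis=1),
    visual_scores.argmax(axis=1),
)
```

The levels differ in when information is merged: feature-level fusion lets one classifier see all modalities at once, whereas score-level and decision-level fusion keep a separate classifier per modality and combine only their outputs.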

Cited by: 26 (4 versions)
