AKVSR: Audio knowledge empowered visual speech recognition by compressing audio knowledge of a pretrained model

JH Yeo, M Kim, J Choi, DH Kim… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Visual Speech Recognition (VSR) is the task of predicting spoken words from silent lip
movements. VSR is regarded as a challenging task because of the insufficient information …

Spatio-temporal attention mechanism and knowledge distillation for lip reading

S Elashmawy, M Ramsis, HM Eraqi… - arXiv preprint arXiv …, 2021 - arxiv.org
Despite the advancement in the domain of audio and audio-visual speech recognition,
visual speech recognition systems are still quite under-explored due to the visual ambiguity …