Deep Learning based Visually Rich Document Content Understanding: A Survey

Y Ding, J Lee, SC Han - arXiv preprint arXiv:2408.01287, 2024 - arxiv.org
Visually Rich Documents (VRDs) are essential in academia, finance, medical fields, and
marketing due to their multimodal information content. Traditional methods for extracting …

On Disentanglement of Asymmetrical Knowledge Transfer for Modality-Task Agnostic Federated Learning

J Chen, A Zhang - Proceedings of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
There has been growing concern regarding data privacy during the development and
deployment of Multimodal Foundation Models for Artificial General Intelligence (AGI), while …