The increasing prevalence of Visually-rich Documents (VRDs) in diverse domains has led to a growing interest in Visually-rich Document Understanding (VrDU). Researchers have been developing sophisticated systems to extract explicit and implicit information from such documents. However, a comprehensive literature review of this field is still missing. In this paper, we present a comprehensive overview of VrDU systems, focusing on their applications in Key Information Extraction (KIE), Visual Question Answering (VQA), and document classification tasks. We examine the fundamental concepts and techniques used in VrDU and provide a high-level abstraction of these approaches. To facilitate understanding and comparison, we propose a taxonomy that categorizes VrDU systems along four key aspects: modality, geometric approach, OCR integration type, and task. Additionally, we present the benchmarks commonly used in the field and discuss the current challenges faced by VrDU systems. We believe that this paper advances research on VrDU by providing a comprehensive resource for researchers and practitioners.