resourced in terms of the data scale and diversity compared to other major languages. This
limitation has excluded it from the current “pre-training and fine-tuning” paradigm that is
dominated by Transformer architectures. In this paper, we provide a comprehensive review
of the existing resources and methodologies for Cantonese Natural Language Processing,
covering recent progress in language understanding, text generation and development …