Leveraging on cross linguistic similarities to reduce grammar development effort for the under-resourced languages: a case of Kenyan Bantu languages

B Kituku, W Nganga, L Muchemi - … International Conference on …, 2021 - ieeexplore.ieee.org
2021 International Conference on Information and Communication …, 2021 - ieeexplore.ieee.org
Rule-based grammar development is labor-intensive in both time and expertise, especially for morphologically complex and under-resourced languages. Nevertheless, such grammars are needed for deep natural language processing, for generating well-formed output, or both. To address this challenge, the paper develops a shared, multilingual, wide-coverage grammar for a subset of Kenyan Bantu languages in Grammatical Framework (GF), exploiting cross-linguistic similarities through two grammar engineering strategies: grammar porting and grammar sharing. The shared grammar was developed using a morphology-driven approach, in which the lexicons are defined first, followed by the inflection regular expressions and finally the syntax production rules. In the resulting congruent, parameterized Bantu grammar, 100% of category linearizations, 68.75% of parameters, 65.3% of paradigms, and 89.57% of syntax rules were shareable, while portability (reuse with modification) was observed for 14.29% of paradigms, 18.75% of parameters, and 10.43% of syntax rules. The research concludes that exploiting cross-linguistic similarities in principles and parameters significantly reduces the effort of developing multilingual grammars. Its contribution is the parameterized Bantu grammar, which demonstrates how the effort of building the rule base can be significantly reduced for languages where data is scarce.
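The idea of grammar sharing can be illustrated with a minimal sketch. This is not the paper's GF code: it is written in Python, and the names `Noun`, `LanguageParams`, and `noun_phrase` are hypothetical. One syntax rule is written once and shared, while each language supplies only its own parameter table (here, adjective concord prefixes per noun class and number). Swahili is used purely as a familiar example; the abstract does not state which languages the grammar covers.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Number = str  # "sg" or "pl"


@dataclass(frozen=True)
class Noun:
    """Lexicon entry: surface forms per number plus a noun-class label."""
    forms: Dict[Number, str]
    noun_class: str


@dataclass(frozen=True)
class LanguageParams:
    """Language-specific parameters (hypothetical structure for illustration)."""
    name: str
    # Adjective concord prefix keyed by (noun class, number).
    adj_concord: Dict[Tuple[str, Number], str]


def noun_phrase(noun: Noun, adj_stem: str, number: Number,
                lang: LanguageParams) -> str:
    """Shared syntax rule: noun followed by an adjective agreeing in class and number."""
    prefix = lang.adj_concord[(noun.noun_class, number)]
    return f"{noun.forms[number]} {prefix}{adj_stem}"


# Language-specific part: only this table changes from language to language.
SWAHILI = LanguageParams(
    name="Swahili",
    adj_concord={
        ("m-wa", "sg"): "m",
        ("m-wa", "pl"): "wa",
        ("ki-vi", "sg"): "ki",
        ("ki-vi", "pl"): "vi",
    },
)

MTU = Noun(forms={"sg": "mtu", "pl": "watu"}, noun_class="m-wa")          # person
KITABU = Noun(forms={"sg": "kitabu", "pl": "vitabu"}, noun_class="ki-vi")  # book

if __name__ == "__main__":
    print(noun_phrase(MTU, "zuri", "sg", SWAHILI))     # mtu mzuri
    print(noun_phrase(KITABU, "zuri", "pl", SWAHILI))  # vitabu vizuri
```

In this sketch, adding a related language would mean supplying a new `LanguageParams` table and lexicon while `noun_phrase` stays untouched, which is the kind of reuse the shareability figures above quantify.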