[Book] Machine Translation with Minimal Reliance on Parallel Resources

G Tambouratzis, M Vassiliou, S Sofianopoulos - 2017 - Springer
This chapter presents in detail the main translation process of PRESEMT, delving deeper into the core of the system and its inner workings. All SMT systems (Koehn 2010), which represent the dominant approach to automatic translation, employ two statistical models in order to find the best translation in the target language: the translation model and the language model. The translation model is derived from parallel corpora and can take many forms and employ various language features. Its task is to produce alternative TL translations based on n-grams found in the SL input sentence and to rank them according to their probabilities. The language model is derived from the monolingual TL corpus; given a TL sentence, its task is to provide a metric of how well-formed that sentence is. By combining both models during decoding, an SMT system generates translation alternatives with the translation model, then reorders the words and makes translation choices within each alternative, finally selecting the translation with the highest score.
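For context, the decoding objective alluded to in the previous paragraph is the standard noisy-channel formulation of SMT (Koehn 2010); it is paraphrased here rather than quoted from the chapter:

\hat{e} = \arg\max_{e} P(e \mid f) = \arg\max_{e} P(f \mid e)\, P(e)

where f is the SL input sentence, P(f | e) is the translation model estimated from parallel corpora, and P(e) is the language model estimated from monolingual TL text; the decoder searches for the TL sentence e with the highest combined score.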
Local and long-distance reordering is one of the most challenging aspects of any Machine Translation system. In modern SMT, numerous approaches have used pre-processing techniques that perform word reordering on the source side based on the syntactic properties of the target side (Rottmann and Vogel 2007; Popovic and Ney 2006; Collins et al. 2005), in order to overcome the long-distance word reordering problem. Short-range reorderings, of course, are easily captured by the language model if they are missed by the translation model.

PRESEMT differentiates itself from all modern SMT systems by using a bilingual dictionary and breaking the translation process down into two steps, preceded by a pre-processing step (cf. Chap. 2). During pre-processing, the input SL text is tagged, lemmatised and chunked, using the phrasing model produced by the PMG. In the first step of the translation process (Structure Selection), the dictionary produces translation alternatives (single-word and multi-word ones) for all words in the …
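To make the two-step organisation concrete, the following minimal sketch mirrors the pre-processing and Structure Selection steps described above. All names, data structures and the toy lexicon are illustrative assumptions, not the PRESEMT implementation.

from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Token:
    surface: str
    lemma: str
    pos: str

@dataclass
class Chunk:
    label: str                 # phrase label assigned by the PMG phrasing model
    tokens: List[Token]

def preprocess(sl_tokens: List[Token], phrasing_model) -> List[Chunk]:
    """Pre-processing: the SL input is assumed already tagged and lemmatised;
    the phrasing model (produced by the PMG) groups its tokens into chunks."""
    return phrasing_model(sl_tokens)

def structure_selection(chunks: List[Chunk],
                        lexicon: Dict[str, List[str]]) -> List[List[str]]:
    """Step 1 (Structure Selection): the bilingual dictionary supplies TL
    alternatives (single- or multi-word) for every SL word."""
    return [lexicon.get(tok.lemma, [tok.surface])
            for chunk in chunks
            for tok in chunk.tokens]

# Toy usage with illustrative Greek-to-English data (not taken from the book).
toy_phraser = lambda toks: [Chunk("VP", toks)]
sl = [Token("διαβάζει", "διαβάζω", "VB"), Token("βιβλία", "βιβλίο", "NN")]
lexicon = {"διαβάζω": ["read", "reads"], "βιβλίο": ["book", "books"]}
print(structure_selection(preprocess(sl, toy_phraser), lexicon))
# -> [['read', 'reads'], ['book', 'books']]

In this sketch the reordering and disambiguation work that the chapter assigns to the second translation step (and, in conventional SMT, to the decoder and language model) is deliberately left out; only the dictionary-based generation of TL alternatives is shown.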