Parallel sentence generation from comparable corpora for improved SMT

CK Reddy, CC Aggarwal - 2015 - books.google.com

Supplying a comprehensive overview of healthcare analytics research, Healthcare Data
Analytics provides an understanding of the analytical techniques currently available to solve …

Cited by 173 - Related articles - All 12 versions

[PDF] ssrn.com

Arabic-English parallel corpus: A new resource for translation training and language teaching

HM Alotaibi - Arab World English Journal (AWEJ) Volume, 2017 - papers.ssrn.com

Parallel corpora can be defined as collections of aligned, translated texts of two or more
languages. They play a major role in translation and contrastive studies, and are also …

Cited by 60 - Related articles - All 13 versions

[PDF] aclanthology.org

[PDF] Hybrid parallel sentence mining from comparable corpora

D Stefanescu, R Ion, S Hunsicker - Proceedings of the 16th annual …, 2012 - aclanthology.org

This paper presents a fast and accurate parallel sentence mining algorithm for comparable
corpora called LEXACC based on the Cross-Language Information Retrieval framework …

Cited by 54 - Related articles - All 9 versions

[PDF] springer.com

Making the most of comparable corpora in Neural Machine Translation: a case study

H Gete, T Etchegoyhen - Language Resources and Evaluation, 2022 - Springer

Comparable corpora can benefit the development of Neural Machine Translation models, in
particular for under-resourced languages. We present a case study centred on the …

Cited by 7 - Related articles - All 6 versions

[PDF] arxiv.org

Extracting an English-Persian parallel corpus from comparable corpora

A Karimi, E Ansari, BS Bigham - arXiv preprint arXiv:1711.00681, 2017 - arxiv.org

Parallel data are an important part of a reliable Statistical Machine Translation (SMT)
system. The more of these data are available, the better the quality of the SMT system …

Cited by 25 - Related articles - All 8 versions

[PDF] aclanthology.org

Efficient extraction of pseudo-parallel sentences from raw monolingual data using word embeddings

B Marie, A Fujita - Proceedings of the 55th Annual Meeting of the …, 2017 - aclanthology.org

We propose a new method for extracting pseudo-parallel sentences from a pair of large
monolingual corpora, without relying on any document-level information. Our method first …

Cited by 25 - Related articles - All 3 versions

[PDF] academia.edu

[PDF] Collecting and using comparable corpora for statistical machine translation

I Skadiņa, A Aker, N Mastropavlos, F Su… - Proceedings of the 8th …, 2012 - academia.edu

Lack of sufficient parallel data for many languages and domains is currently one of the major
obstacles to further advancement of automated translation. The ACCURAT project is …

Cited by 43 - Related articles - All 13 versions

[PDF] kyoto-u.ac.jp

Integrated parallel sentence and fragment extraction from comparable corpora: A case study on Chinese-Japanese Wikipedia

C Chu, T Nakazawa, S Kurohashi - ACM Transactions on Asian and …, 2015 - dl.acm.org

Parallel corpora are crucial for statistical machine translation (SMT); however, they are quite
scarce for most language pairs and domains. As comparable corpora are far more available …

Cited by 28 - Related articles - All 4 versions

[PDF] sciencedirect.com

Extracting comparable articles from Wikipedia and measuring their comparabilities

M Saad, D Langlois, K Smaïli - Procedia-Social and Behavioral Sciences, 2013 - Elsevier

Parallel corpora are not available for all domains and languages, but statistical methods in
multilingual research domains require huge parallel/comparable corpora. Comparable …

Cited by 33 - Related articles - All 9 versions

[PDF] hal.science

Parallel corpora preparation for English-Amharic machine translation

Y Biadgligne, K Smaïli - … : 16th International Work-Conference on Artificial …, 2021 - Springer

In this paper, we describe the development of an English-Amharic parallel corpus and
Machine Translation (MT) experiments conducted on it. Two different tests have been …

Cited by 14 - Related articles - All 6 versions
