Active learning and crowdsourcing for machine translation in low resource scenarios

V Ambati - 2012 - search.proquest.com
Corpus based approaches to automatic translation such as Example Based and Statistical
Machine Translation systems use large amounts of parallel data created by humans to train …

A comprehensive analysis of bilingual lexicon induction

A Irvine, C Callison-Burch - Computational Linguistics, 2017 - direct.mit.edu
Bilingual lexicon induction is the task of inducing word translations from monolingual
corpora in two languages. In this article we present the most comprehensive analysis of …

Parallel sentence generation from comparable corpora for improved SMT

S Abdul Rauf, H Schwenk - Machine translation, 2011 - Springer
A parallel corpus is an essential resource for statistical machine translation (SMT) but is
often not available in the required amounts for all domains and languages. An approach is …

[BOOK] Learning machine translation

C Goutte - 2009 - books.google.com
The Internet gives us access to a wealth of information in languages we don't understand.
The investigation of automated or semi-automated approaches to translation has become a …

[PDF] Twitter translation using translation-based cross-lingual retrieval

L Jehl, F Hieber, S Riezler - … of the seventh workshop on statistical …, 2012 - aclanthology.org
Microblogging services such as Twitter have become popular media for real-time
user-created news reporting. Such communication often happens in parallel in different …

[PDF] The peculiarities of translations of official business plans from English into Russian

LR Sakaeva, MA Yahin, D Mensah… - Opción: Revista de …, 2019 - dialnet.unirioja.es
The subject of the present article is the general specificities in the texts of business
documentation, specifically business plans and their translations from English into Russian …

Extracting parallel phrases from comparable data

S Hewavitharana, S Vogel - Building and using comparable corpora, 2013 - Springer
Mining parallel data from comparable corpora is a promising approach for overcoming the
data sparseness in statistical machine translation and other NLP applications. Even if two …

Integrated parallel sentence and fragment extraction from comparable corpora: A case study on Chinese–Japanese Wikipedia

C Chu, T Nakazawa, S Kurohashi - ACM Transactions on Asian and …, 2015 - dl.acm.org
Parallel corpora are crucial for statistical machine translation (SMT); however, they are quite
scarce for most language pairs and domains. As comparable corpora are far more available …

Mining parallel fragments from comparable texts

M Cettolo, M Federico, N Bertoldi - Proceedings of the 7th …, 2010 - aclanthology.org
This paper proposes a novel method for exploiting comparable documents to generate
parallel data for machine translation. First, each source document is paired to each sentence …

“They Are Out There, If You Know Where to Look”: Mining Transliterations of OOV Query Terms for Cross-Language Information Retrieval

R Udupa, SK, A Bakalov, A Bhole - … on IR Research, ECIR 2009, Toulouse …, 2009 - Springer
It is well known that the use of a good Machine Transliteration system improves the retrieval
performance of Cross-Language Information Retrieval (CLIR) systems when the query and …