[BOOK][B] Handbook of natural language processing

N Indurkhya, FJ Damerau - 2010 - taylorfrancis.com
The Handbook of Natural Language Processing, Second Edition presents practical tools
and techniques for implementing natural language processing in computer systems. Along …

POCLib: A high-performance framework for enabling near orthogonal processing on compression

F Zhang, J Zhai, X Shen, O Mutlu… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
Parallel technology boosts data processing in recent years, and parallel direct data
processing on hierarchically compressed documents exhibits great promise. The high …

Improving machine translation performance by exploiting non-parallel corpora

DS Munteanu, D Marcu - Computational Linguistics, 2005 - direct.mit.edu
We present a novel method for discovering parallel sentences in comparable, non-parallel
corpora. We train a maximum entropy classifier that, given a pair of sentences, can reliably …

[PDF][PDF] Extracting parallel sentences from comparable corpora using document level alignment

J Smith, C Quirk, K Toutanova - … of the North American chapter of …, 2010 - aclanthology.org
The quality of a statistical machine translation (SMT) system is heavily dependent upon the
amount of parallel sentences used in training. In recent years, there have been several …

[PDF][PDF] Extracting parallel sub-sentential fragments from non-parallel corpora

DS Munteanu, D Marcu - … of the 21st international conference on …, 2006 - aclanthology.org
We present a novel method for extracting parallel sub-sentential fragments from
comparable, non-parallel bilingual corpora. By analyzing potentially similar sentence pairs …

[BOOK][B] High-performance parallel database processing and grid databases

D Taniar, CHC Leung, W Rahayu, S Goel - 2008 - books.google.com
The latest techniques and principles of parallel and grid database processing. The growth in
grid databases, coupled with the utility of parallel query processing, presents an important …

[PDF][PDF] Mining very-non-parallel corpora: Parallel sentence and lexicon extraction via bootstrapping and e

P Fung, P Cheung - Proceedings of the 2004 conference on …, 2004 - aclanthology.org
We present a method capable of extracting parallel sentences from far more disparate “very-
non-parallel corpora” than previous “comparable corpora” methods, by exploiting …

Overviewing important aspects of the last twenty years of research in comparable corpora

S Sharoff, R Rapp, P Zweigenbaum - Building and using comparable …, 2013 - Springer

Detecting cross-lingual semantic divergence for neural machine translation

M Carpuat, Y Vyas, X Niu - Proceedings of the First Workshop on …, 2017 - aclanthology.org
Parallel corpora are often not as parallel as one might assume: non-literal translations and
noisy translations abound, even in curated corpora routinely used for training and …

A DOM tree alignment model for mining parallel data from the web

L Shi, C Niu, M Zhou, J Gao - COLING-ACL, 2005 - microsoft.com
This paper presents a new web mining scheme for parallel data acquisition. Based on the
Document Object Model (DOM), a web page is represented as a DOM tree. Then a DOM …