ParaCrawl: Web-Scale Acquisition of Parallel Corpora

M Bañón, P Chen, B Haddow, K Heafield, H Hoang… - 2020 - strathprints.strath.ac.uk
We report on methods to create the largest publicly available parallel corpora by crawling
the web, using open source software. We empirically compare alternative methods and …

[PDF] ParaCrawl: Web-Scale Acquisition of Parallel Corpora

M Bañón, P Chen, B Haddow, K Heafield, H Hoang… - academia.edu

ParaCrawl: Web-Scale Acquisition of Parallel Corpora

M Bañón, P Chen, B Haddow… - … Annual Meeting of …, 2020 - pureportal.strath.ac.uk

[PDF] ParaCrawl: Web-Scale Acquisition of Parallel Corpora

M Bañón, P Chen, B Haddow, K Heafield, H Hoang… - pinzhenchen.github.io

[PDF] ParaCrawl: Web-Scale Acquisition of Parallel Corpora

M Bañón, P Chen, B Haddow, K Heafield, H Hoang… - scholar.archive.org

ParaCrawl: Web-Scale Acquisition of Parallel Corpora

M Bañón, P Chen, B Haddow, K Heafield… - Proceedings of the …, 2020 - aclanthology.org

[PDF] ParaCrawl: Web-Scale Acquisition of Parallel Corpora

M Bañón, P Chen, B Haddow, K Heafield, H Hoang… - kheafield.com

ParaCrawl: Web-Scale Acquisition of Parallel Corpora

M Bañón, P Chen, B Haddow, K Heafield… - … Conference of the …, 2020 - research.ed.ac.uk

[PDF] ParaCrawl: Web-Scale Acquisition of Parallel Corpora

M Bañón, P Chen, B Haddow, K Heafield, H Hoang… - researchgate.net

[PDF] ParaCrawl: Web-Scale Acquisition of Parallel Corpora

M Bañón, P Chen, B Haddow, K Heafield, H Hoang… - core.ac.uk