Sampling-based multilingual alignment

A Lardilleux, Y Lepage - Proceedings of the International …, 2009 - aclanthology.org
We present a sub-sentential alignment method that extracts high quality multi-word
alignments from sentence-aligned multilingual parallel corpora. Unlike other methods, it …

A web-based Bengali news corpus for named entity recognition

A Ekbal, S Bandyopadhyay - Language Resources and Evaluation, 2008 - Springer
The rapid development of language resources and tools using machine learning techniques
for less computerized languages requires appropriately tagged corpus. A tagged Bengali …

Terminological and ontological analysis of European directives: multilinguism in law

G Ajani, L Lesmo, G Boella, A Mazzei… - Proceedings of the 11th …, 2007 - dl.acm.org
This paper describes the philosophy behind our tool called "Legal Taxonomy Syllabus", the
analytical instruments it provides and some case studies. The Legal Taxonomy Syllabus is …

Leveraging parallel corpora and existing wordnets for automatic construction of the Slovene wordnet

D Fišer - Human Language Technology. Challenges of the …, 2009 - Springer
The paper reports on a series of experiments conducted in order to test the feasibility of
automatically generating synsets for Slovene wordnet. The resources used were the …

The contribution of low frequencies to multilingual sub-sentential alignment: a differential associative approach

A Lardilleux, Y Lepage, F Yvon - International Journal of Advanced …, 2011 - 133.9.48.109
The goal of this paper is to show that, contrary to preconceived ideas, one can efficiently
take advantage of low frequency words in natural language processing. We put them to use …

Development of Bengali named entity tagged corpus and its use in NER systems

A Ekbal, S Bandyopadhyay - Proceedings of the 6th Workshop on …, 2008 - aclanthology.org
The rapid development of language tools using machine learning techniques for less
computerized languages requires appropriately tagged corpus. A Bengali news corpus has …

[BOOK] Multilevel legal ontologies

G Ajani, G Boella, L Lesmo, M Martin, A Mazzei… - 2010 - Springer
In order to manage the conceptual representation of European law we have proposed the
Legal Taxonomy Syllabus (LTS) and the related methodology. In this paper we consider …

Rapid detection of similar peer-reviewed scientific papers via constant number of randomized fingerprints

Y HaCohen-Kerner, A Tayeb - Information Processing & Management, 2017 - Elsevier
This research is concerned with the detection of similar academic papers. Given a tested
paper from a given corpus of 10,099 peer-reviewed scientific papers, a two-stage process …

The contribution of the notion of hapax legomena to word alignment

A Lardilleux, Y Lepage - Proceedings of the 4th Language and …, 2007 - hal.science
Current techniques in word alignment disregard words with a low frequency because they
would not be useful. Against this belief, this paper shows that, in particular, the notion of …

A truly multilingual, high coverage, accurate, yet simple, sub-sentential alignment method

A Lardilleux, Y Lepage - The 8th conference of the Association for …, 2008 - hal.science
This paper describes a new alignment method that extracts high quality multi-word
alignments from sentence-aligned multilingual parallel corpora. The method can handle …