This paper presents an approach for Multilingual News Document Clustering in comparable corpora. We have implemented two algorithms of heuristic nature that follow the approach …
B Mathieu, R Besançon, C Fluhr - RIAO, 2004 - Citeseer
Cross Language Information Retrieval community has brought up search engines over multilingual corpora, and multilingual text categorization systems. In this paper, we …
K Kishida - Journal of Information Science, 2011 - journals.sagepub.com
It is often necessary to categorize automatically multilingual document sets, in which documents written in a variety of languages are included, into topically homogeneous …
This paper is focused on discovering bilingual news clusters in a comparable corpus. Particularly, we deal with the news representation and with the calculation of the similarity …
In this paper we evaluate the influence of different document representations in the results of multilingual news clustering. We aim at proving whether or not the use of only named …
This paper focuses on the task of bilingual clustering, which involves dividing a set of documents from two different languages into a set of thematically homogeneous groups. It …
BW Bader, PA Chew - Text Mining: Applications and Theory, 2010 - Wiley Online Library
In a series of articles published largely in the computational linguistics literature, this chapter outlines a number of computational techniques for clustering documents in a multilingual …
JF Silva, GP Lopes, JT Mexia - Pliska Studia Mathematica Bulgarica, 2004 - academia.edu
… that might influence the behaviour of our methodology and might bias the unsupervised clustering method proposed. Since we want …
J Silva, J Mexia, CA Coelho, G Lopes - Progress in Artificial Intelligence …, 2001 - Springer
This paper describes a statistics-based approach for clustering documents and for extracting cluster topics. Relevant Expressions (REs) are extracted from corpora and used as …