Probabilistic topic modeling in multilingual settings: An overview of its methodology and applications

I Vulić, W De Smet, J Tang, MF Moens - Information Processing & …, 2015 - Elsevier
Probabilistic topic models are unsupervised generative models which model document
content as a two-step generation process, that is, documents are observed as mixtures of …

Named entity recognition

IM Konkol - University of West Bohemia, 2015 - core.ac.uk
The idea of automatic extraction of important information from text documents comes from
the time of first steps in the natural language processing. Its importance rapidly grows with …

Exploiting named entities for bilingual news clustering

S Montalvo, R Martínez, V Fresno… - Journal of the …, 2015 - Wiley Online Library
In this article, we present a new algorithm for clustering a bilingual collection of comparable
news items in groups of specific topics. Our hypothesis is that named entities (NEs) are …

Clustering Bilingual Documents Using Various Clustering Linkages Coupled with Different Proximity Measurement Techniques

R Alfred, LC Leong, MH Ahmad Hijazi… - Advanced Science …, 2015 - ingentaconnect.com
With the rich data on the web, a documents clustering task for monolingual documents is
insufficient in order to produce an efficient information retrieval system. A Multilingual …