Online polylingual topic models for fast document translation detection

C-BiLDA extracting cross-lingual topics from non-parallel texts by distinguishing shared from unshared content

G Heyman, I Vulić, MF Moens - Data Mining and Knowledge Discovery, 2016 - Springer

We study the problem of extracting cross-lingual topics from non-parallel multilingual text
datasets with partially overlapping thematic content (e.g., aligned Wikipedia articles in two …

Cited by 25 Related articles All 7 versions

[PDF] reminder-project.eu

[PDF] A bridge over the language gap: Topic modelling for text analyses across languages for country comparative research

F Lind, JM Eberl, S Galyga… - University of Vienna …, 2019 - reminder-project.eu

Working Paper. A Bridge Over the Language Gap: Topic Modelling for Text Analyses Across Languages for
Country Comparative Research …

Cited by 11 Related articles

[PDF] psu.edu

Efficient nearest-neighbor search in the probability simplex

K Krstovski, DA Smith, HM Wallach… - Proceedings of the 2013 …, 2013 - dl.acm.org

Document similarity tasks arise in many areas of information retrieval and natural language
processing. A fundamental question when comparing documents is which representation to …

Cited by 21 Related articles All 8 versions

[PDF] recognition.su

[PDF] BigARTM: an open-source library for topic modeling of large text collections

К Воронцов, А Фрей, П Ромов, АО Янина… - … data in domains …, 2015 - recognition.su

Abstract: Topic modeling is one of the modern directions in statistical text analysis,
and it has been under active development for the past 10–15 years …

Cited by 14 Related articles All 4 versions

[PDF] aaai.org

[PDF] Temporal and object relations in unsupervised plan and activity recognition

RG Freedman, HT Jung, S Zilberstein - 2015 AAAI Fall Symposium …, 2015 - cdn.aaai.org

We consider ways to improve the performance of unsupervised plan and activity recognition
techniques by considering temporal and object relations in addition to postural data …

Cited by 12 Related articles All 3 versions

[PDF] aclanthology.org

[PDF] Bootstrapping translation detection and sentence extraction from comparable corpora

K Krstovski, DA Smith - Proceedings of the 2016 Conference of …, 2016 - aclanthology.org

Most work on extracting parallel text from comparable corpora depends on linguistic
resources such as seed parallel documents or translation dictionaries. This paper presents a …

Cited by 11 Related articles All 3 versions

[PDF] aclanthology.org

[PDF] Online multilingual topic models with multi-level hyperpriors

K Krstovski, DA Smith, MJ Kurtz - … of the 2016 Conference of the …, 2016 - aclanthology.org

For topic models, such as LDA, that use a bag-of-words assumption, it becomes especially
important to break the corpus into appropriately-sized “documents”. Since the models are …

Cited by 10 Related articles All 3 versions

[PDF] arxiv.org

Bilingual Topic Models for Comparable Corpora

G Balikas, MR Amini, M Clausel - arXiv preprint arXiv:2111.15278, 2021 - arxiv.org

Probabilistic topic models like Latent Dirichlet Allocation (LDA) have been previously
extended to the bilingual setting. A fundamental modeling assumption in several of these …

Mining and learning from multilingual text collections using topic models and word embeddings

G Balikas - 2017 - hal.science

Text is one of the most pervasive and persistent sources of information. Content analysis of
text in its broad sense refers to methods for studying and retrieving information from …

Cited by 3 Related articles All 5 versions

[PDF] arxiv.org

Multilingual Topic Models

K Krstovski, MJ Kurtz, DA Smith… - arXiv preprint arXiv …, 2017 - arxiv.org

Scientific publications have evolved several features for mitigating vocabulary mismatch
when indexing, retrieving, and computing similarity between articles. These mitigation …

Cited by 3 Related articles All 2 versions
