Probabilistic topic models are unsupervised generative models that describe document content as a two-step generation process; that is, documents are observed as mixtures of …
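The two-step generation process the snippet alludes to (per-document topic mixture, then per-word topic and word draws, as in LDA-style models) can be sketched with toy, hand-made topics. The vocabulary, topic distributions, and parameter values below are illustrative assumptions, not from any of the cited papers.

```python
import random

# Toy vocabulary and two hand-made topic-word distributions (assumptions).
VOCAB = ["gene", "dna", "cell", "ball", "game", "team"]
TOPICS = [
    [0.4, 0.4, 0.2, 0.0, 0.0, 0.0],  # a "biology"-like topic
    [0.0, 0.0, 0.0, 0.3, 0.4, 0.3],  # a "sports"-like topic
]

def sample_dirichlet(alpha, k, rng):
    """Sample topic proportions from a symmetric Dirichlet(alpha)."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(n_words, alpha=0.5, seed=0):
    """Step 1: draw a topic mixture theta for the document.
    Step 2: for each word, draw a topic z from theta, then a word from topic z."""
    rng = random.Random(seed)
    theta = sample_dirichlet(alpha, len(TOPICS), rng)
    words = []
    for _ in range(n_words):
        z = rng.choices(range(len(TOPICS)), weights=theta)[0]
        w = rng.choices(VOCAB, weights=TOPICS[z])[0]
        words.append(w)
    return theta, words

theta, doc = generate_document(10)
```

Inference in a real topic model runs this process in reverse, recovering `theta` and the topic-word distributions from observed documents.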
How can a single person understand what's going on in a collection of millions of documents? This is an increasingly common problem: sifting through an organization's e …
M Faruqui, C Dyer - Proceedings of the 14th Conference of the …, 2014 - aclanthology.org
The distributional hypothesis of Harris (1954), according to which the meaning of words is evidenced by the contexts they occur in, has motivated several effective techniques for …
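The distributional hypothesis can be made concrete with a minimal sketch: represent each word by the counts of words appearing near it, and compare words by cosine similarity. The tiny corpus and window size below are illustrative assumptions.

```python
from collections import Counter
from math import sqrt

# A tiny corpus (assumption): words sharing contexts should come out similar.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
    "stocks rose on the market",
    "stocks fell on the market",
]

def cooccurrence_vectors(sentences, window=2):
    """Count context words within +/- window positions of each target word."""
    vectors = {}
    for sent in sentences:
        toks = sent.split()
        for i, w in enumerate(toks):
            ctx = vectors.setdefault(w, Counter())
            for j in range(max(0, i - window), min(len(toks), i + window + 1)):
                if j != i:
                    ctx[toks[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in set(a) | set(b))
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

vecs = cooccurrence_vectors(corpus)
# "cat" and "dog" occur in near-identical contexts, so their similarity
# should exceed that of "cat" and "stocks".
```

Dense word embeddings refine this idea by compressing such co-occurrence statistics into low-dimensional vectors.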
Recent advances in research tools for the systematic analysis of textual data are enabling exciting new research throughout the social sciences. For comparative politics, scholars who …
WY Zou, R Socher, D Cer… - Proceedings of the 2013 …, 2013 - aclanthology.org
We introduce bilingual word embeddings: semantic embeddings associated across two languages in the context of neural language models. We propose a method to learn …
Word embeddings have been found useful for many NLP tasks, including part-of-speech tagging, named entity recognition, and parsing. Adding multilingual context when learning …
We develop the multilingual topic model for unaligned text (MuTo), a probabilistic model of text that is designed to analyze corpora composed of documents in two languages. From …
This paper studies the problem of question retrieval in community question answering (CQA). To bridge lexical gaps in questions, which is regarded as the biggest challenge in …
J Boyd-Graber, P Resnik - … of the 2010 Conference on Empirical …, 2010 - aclanthology.org
In this paper, we develop multilingual supervised latent Dirichlet allocation (MLSLDA), a probabilistic generative model that allows insights gleaned from one language's data to …