Source identification for mixtures of product distributions

S Gordon, BH Mazaheri, Y Rabani… - … on Learning Theory, 2021 - proceedings.mlr.press
We give an algorithm for source identification of a mixture of k product distributions on n bits.
This is a fundamental problem in machine learning with many applications. Our algorithm …

Topological differential testing

K Ambrose, S Huntsman, M Robinson… - arXiv preprint arXiv …, 2020 - arxiv.org
We introduce topological differential testing (TDT), an approach to extracting the consensus
behavior of a set of programs on a corpus of inputs. TDT uses the topological notion of a …

Generalized feature embedding for supervised, unsupervised, and online learning tasks

E Golinko, X Zhu - Information Systems Frontiers, 2019 - Springer
Feature embedding is an emerging research area which intends to transform features from
the original space into a new space to support effective learning. Many feature embedding …

Provably efficient exploration for reinforcement learning using unsupervised learning

F Feng, R Wang, W Yin, SS Du… - Advances in Neural …, 2020 - proceedings.neurips.cc
Motivated by the prevailing paradigm of using unsupervised learning for efficient exploration
in reinforcement learning (RL) problems (Tang et al., 2017; Bellemare et al., 2016), we …

False clustering rate control in mixture models

A Marandon, T Rebafka, E Roquain, N Sokolovska - 2022 - hal.science
The clustering task consists in delivering labels to the members of a sample. For most data
sets, some individuals are ambiguous and intrinsically difficult to attribute to one or another …

On the identifiability of finite mixtures of finite product measures

B Tahmasebi, SA Motahari, MA Maddah-Ali - arXiv preprint arXiv …, 2018 - arxiv.org
The problem of identifiability of finite mixtures of finite product measures is studied. A mixture
model with $K$ mixture components and $L$ observed variables is considered, where …

Bayesian modeling of mutual exclusivity in cancer mutations

P Czyż, N Beerenwinkel - bioRxiv, 2024 - biorxiv.org
When cancer develops, gene mutations do not occur independently, prompting researchers
to pose scientific hypotheses about their interactions. Synthetic lethal interactions, which …

Distributed Fact Checking

A Verma, A Sharbafchi, B Touri… - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
We formulate the problem of fake news detection using distributed inexpert agents. We
consider the source for news/statements as a binary source (to model true vs. false …

False membership rate control in mixture models

A Marandon, T Rebafka, E Roquain… - arXiv preprint arXiv …, 2022 - arxiv.org
The clustering task consists in partitioning elements of a sample into homogeneous groups.
Most datasets contain individuals that are ambiguous and intrinsically difficult to attribute to …

Contributions to reliable machine learning via false discovery rate control

A Marandon-Carlhian - 2023 - theses.hal.science
The reliability of ML methods is critical in applications that involve decision making. The goal
of this thesis is to propose new methods for risk control in several learning tasks: novelty …