Clustering large graphs via the singular value decomposition

P Drineas, A Frieze, R Kannan, S Vempala, V Vinay - Machine learning, 2004 - Springer
We consider the problem of partitioning a set of m points in the n-dimensional Euclidean
space into k clusters (usually m and n are variable, while k is fixed), so as to minimize the …

Fast Monte Carlo algorithms for matrices II: Computing a low-rank approximation to a matrix

P Drineas, R Kannan, MW Mahoney - SIAM Journal on computing, 2006 - SIAM
In many applications, the data consist of (or may be naturally formulated as) an m × n matrix A.
It is often of interest to find a low-rank approximation to A, i.e., an approximation D to the …

A survey on distribution testing: Your data is big. But is it blue?

CL Canonne - Theory of Computing, 2020 - theoryofcomputing.org
The field of property testing originated in work on program checking, and has evolved into
an established and very active research area. In this work, we survey the developments of …

An automatic inequality prover and instance optimal identity testing

G Valiant, P Valiant - SIAM Journal on Computing, 2017 - SIAM
We consider the problem of verifying the identity of a distribution: Given the description of a
distribution over a discrete finite or countably infinite support, p=(p_1,p_2,...), how many …

Fast Monte Carlo algorithms for matrices III: Computing a compressed approximate matrix decomposition

P Drineas, R Kannan, MW Mahoney - SIAM Journal on Computing, 2006 - SIAM
In many applications, the data consist of (or may be naturally formulated as) an m × n matrix A
which may be stored on disk but which is too large to be read into random access memory …

Energy minimization via graph cuts: Settling what is possible

D Freedman, P Drineas - 2005 IEEE Computer Society …, 2005 - ieeexplore.ieee.org
The recent explosion of interest in graph cut methods in computer vision naturally spawns
the question: what energy functions can be minimized via graph cuts? This question was first …

Selfish behavior and stability of the Internet: A game-theoretic analysis of TCP

A Akella, S Seshan, R Karp, S Shenker… - ACM SIGCOMM …, 2002 - dl.acm.org
For years, the conventional wisdom [7, 22] has been that the continued stability of the
Internet depends on the widespread deployment of "socially responsible" congestion …

The structure of optimal private tests for simple hypotheses

CL Canonne, G Kamath, A McMillan, A Smith… - Proceedings of the 51st …, 2019 - dl.acm.org
Hypothesis testing plays a central role in statistical inference, and is used in many settings
where privacy concerns are paramount. This work answers a basic question about privately …

Strong lower bounds for approximating distribution support size and the distinct elements problem

S Raskhodnikova, D Ron, A Shpilka, A Smith - SIAM Journal on Computing, 2009 - SIAM
We consider the problem of approximating the support size of a distribution from a small
number of samples, when each element in the distribution appears with probability at least …

Turnstile streaming algorithms might as well be linear sketches

Y Li, HL Nguyen, DP Woodruff - Proceedings of the forty-sixth annual …, 2014 - dl.acm.org
In the turnstile model of data streams, an underlying vector x ∈ {−m, −m+1, ..., m−1, m}^n is
presented as a long sequence of positive and negative integer updates to its coordinates. A …