P Paschou, J Lewis, A Javed, P Drineas - Journal of Medical Genetics, 2010 - jmg.bmj.com
Background and aims: The analysis of large-scale genetic data from thousands of individuals has revealed the fact that subtle population genetic structure can be detected at levels that …
In today's information systems, the availability of massive amounts of data necessitates the development of fast and accurate algorithms to summarize these data and represent them in …
B Savas, IS Dhillon - Proceedings of the 2011 SIAM International …, 2011 - SIAM
In this paper we present a fast and accurate procedure called clustered low rank matrix approximation for massive graphs. The procedure involves a fast clustering of the graph and …
A Sood, T Hastie - arXiv preprint arXiv:2307.12892, 2023 - arxiv.org
We consider the problem of selecting a small subset of representative variables from a large dataset. In the computer science literature, this dimensionality reduction problem is typically …
Given a very large data set distributed over a cluster of several nodes, this paper addresses the problem of selecting a few data instances that best represent the entire data set. The …
Subset selection is an important component in evolutionary multiobjective optimization (EMO) algorithms. Clustering, as a classic method to group similar data points together, has …
A Amato, V Di Lecce - Open Computer Science, 2023 - degruyter.com
The popularity of artificial intelligence applications is on the rise, and they are producing better outcomes in numerous fields of research. However, the effectiveness of these …
M Rahmani, GK Atia - IEEE Signal Processing Letters, 2017 - ieeexplore.ieee.org
Random column sampling is not guaranteed to yield data sketches that preserve the underlying structures of the data and may not sample sufficiently from less-populated data …
C Boutsidis - arXiv preprint arXiv:1105.0709, 2011 - arxiv.org
We study three fundamental problems of Linear Algebra, lying in the heart of various Machine Learning applications, namely: 1) "Low-rank Column-based Matrix Approximation" …