Improved coresets and sublinear algorithms for power means in Euclidean spaces

V Cohen-Addad, D Saulpic… - Advances in Neural …, 2021 - proceedings.neurips.cc
In this paper, we consider the problem of finding high dimensional power means: given a set
$A$ of $n$ points in $\mathbb{R}^d$, find the point $m$ that minimizes the sum of Euclidean …

A framework and benchmark for deep batch active learning for regression

D Holzmüller, V Zaverkin, J Kästner… - Journal of Machine …, 2023 - jmlr.org
The acquisition of labels for supervised learning can be expensive. To improve the sample
efficiency of neural network regression, we study active learning methods that adaptively …

Data acquisition for improving machine learning models

Y Li, X Yu, N Koudas - arXiv preprint arXiv:2105.14107, 2021 - arxiv.org
The vast advances in Machine Learning over the last ten years have been powered by the
availability of suitably prepared data for training purposes. The future of ML-enabled …

Parallel batch k-means for Big data clustering

RM Alguliyev, RM Aliguliyev, LV Sukhostat - Computers & Industrial …, 2021 - Elsevier
The application of clustering algorithms is expanding due to the rapid growth of data
volumes. Nevertheless, existing algorithms are not always effective because of high …

Distributed K-Means clustering guaranteeing local differential privacy

C Xia, J Hua, W Tong, S Zhong - Computers & Security, 2020 - Elsevier
In many cases, a service provider might require to aggregate data from end-users to perform
mining tasks such as K-means clustering. Nevertheless, since such data often contain …

Fast and accurate least-mean-squares solvers

A Maalouf, I Jubran, D Feldman - Advances in Neural …, 2019 - proceedings.neurips.cc
Least-mean squares (LMS) solvers such as Linear/Ridge/Lasso-Regression, SVD and
Elastic-Net not only solve fundamental machine learning problems, but are also the building …

Identifying insufficient data coverage for ordinal continuous-valued attributes

A Asudeh, N Shahbazi, Z Jin, HV Jagadish - Proceedings of the 2021 …, 2021 - dl.acm.org
Appropriate training data is a requirement for building good machine-learned models. In this
paper, we study the notion of coverage for ordinal and continuous-valued attributes, by …

PACE: a PAth-CEntric paradigm for stochastic path finding

B Yang, J Dai, C Guo, CS Jensen, J Hu - The VLDB Journal, 2018 - Springer
With the growing volumes of vehicle trajectory data, it becomes increasingly possible to
capture time-varying and uncertain travel costs, e.g., travel time, in a road network. The …

Positional encoder graph neural networks for geographic data

K Klemmer, NS Safir, DB Neill - International Conference on …, 2023 - proceedings.mlr.press
Graph neural networks (GNNs) provide a powerful and scalable solution for modeling
continuous spatial data. However, they often rely on Euclidean distances to construct the …

Positively weighted kernel quadrature via subsampling

S Hayakawa, H Oberhauser… - Advances in Neural …, 2022 - proceedings.neurips.cc
We study kernel quadrature rules with convex weights. Our approach combines the spectral
properties of the kernel with recombination results about point measures. This results in …