MedCLIP: Contrastive learning from unpaired medical images and text

Z Wang, Z Wu, D Agarwal, J Sun - arXiv preprint arXiv:2210.10163, 2022 - arxiv.org
Existing vision-text contrastive learning like CLIP aims to match the paired image and
caption embeddings while pushing others apart, which improves representation …

A review on design inspired subsampling for big data

J Yu, M Ai, Z Ye - Statistical Papers, 2024 - Springer
Subsampling focuses on selecting a subsample that can efficiently sketch the information of
the original data in terms of statistical inference. It provides a powerful tool in big data …

A survey on data pricing: from economics to data science

J Pei - IEEE Transactions on Knowledge and Data …, 2020 - ieeexplore.ieee.org
Data are invaluable. How can we assess the value of data objectively, systematically and
quantitatively? Pricing data, or information goods in general, has been studied and practiced …

Data valuation using reinforcement learning

J Yoon, S Arik, T Pfister - International Conference on …, 2020 - proceedings.mlr.press
Quantifying the value of data is a fundamental problem in machine learning and has multiple
important use cases: (1) building insights about the dataset and task, (2) domain …

Training data influence analysis and estimation: A survey

Z Hammoudeh, D Lowd - Machine Learning, 2024 - Springer
Good models require good training data. For overparameterized deep models, the causal
relationship between training data and model predictions is increasingly opaque and poorly …

Resolving training biases via influence-based data relabeling

S Kong, Y Shen, L Huang - International Conference on Learning …, 2021 - openreview.net
The performance of supervised learning methods easily suffers from the training bias issue
caused by train-test distribution mismatch or label noise. Influence function is a technique …

Regularizing second-order influences for continual learning

Z Sun, Y Mu, G Hua - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Continual learning aims to learn on non-stationary data streams without catastrophically
forgetting previous knowledge. Prevalent replay-based methods address this challenge by …

A survey of dataset refinement for problems in computer vision datasets

Z Wan, Z Wang, CT Chung, Z Wang - ACM Computing Surveys, 2024 - dl.acm.org
Large-scale datasets have played a crucial role in the advancement of computer vision.
However, they often suffer from problems such as class imbalance, noisy labels, dataset …

GEX: A flexible method for approximating influence via Geometric Ensemble

SY Kim, K Kim, E Yang - Advances in Neural Information …, 2024 - proceedings.neurips.cc
Through a deeper understanding of predictions of neural networks, Influence Function (IF)
has been applied to various tasks such as detecting and relabeling mislabeled samples …

Characterizing the influence of graph elements

Z Chen, P Li, H Liu, P Hong - arXiv preprint arXiv:2210.07441, 2022 - arxiv.org
Influence function, a method from robust statistics, measures the changes of model
parameters or some functions about model parameters concerning the removal or …