Does graph distillation see like vision dataset counterpart?

B Yang, K Wang, Q Sun, C Ji, X Fu… - Advances in …, 2024 - proceedings.neurips.cc
Training on large-scale graphs has achieved remarkable results in graph representation
learning, but its cost and storage have attracted increasing concerns. Existing graph …

Expanding small-scale datasets with guided imagination

Y Zhang, D Zhou, B Hooi, K Wang… - Advances in neural …, 2023 - proceedings.neurips.cc
The power of DNNs relies heavily on the quantity and quality of training data. However,
collecting and annotating data on a large scale is often expensive and time-consuming. To …

Towards lossless dataset distillation via difficulty-aligned trajectory matching

Z Guo, K Wang, G Cazenavette, H Li, K Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
The ultimate goal of Dataset Distillation is to synthesize a small synthetic dataset such that a
model trained on this synthetic set will perform equally well as a model trained on the full …
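
The snippet above states the dataset distillation objective; for orientation, the trajectory-matching loss that this line of work (including the difficulty-aligned variant above) builds on compares parameters reached by training on the synthetic set against an expert trajectory trained on the full set. The notation below (student parameters \(\hat{\theta}\), expert checkpoints \(\theta^{*}\), step counts N and M) is a hedged sketch of that standard formulation, not this paper's exact objective.

\[
\mathcal{L}_{\text{match}}
= \frac{\lVert \hat{\theta}_{t+N} - \theta^{*}_{t+M} \rVert_2^2}
       {\lVert \theta^{*}_{t} - \theta^{*}_{t+M} \rVert_2^2},
\]

where \(\hat{\theta}_{t+N}\) results from N student updates on the synthetic data starting at the expert checkpoint \(\theta^{*}_{t}\), and \(\theta^{*}_{t+M}\) is the expert checkpoint M updates later on the full data.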

Unleashing the power of data tsunami: A comprehensive survey on data assessment and selection for instruction tuning of language models

Y Qin, Y Yang, P Guo, G Li, H Shao, Y Shi, Z Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
Instruction tuning plays a critical role in aligning large language models (LLMs) with human
preference. Despite the vast amount of open instruction datasets, naively training an LLM on …

Dataset regeneration for sequential recommendation

M Yin, H Wang, W Guo, Y Liu, S Zhang… - Proceedings of the 30th …, 2024 - dl.acm.org
The sequential recommender (SR) system is a crucial component of modern recommender
systems, as it aims to capture the evolving preferences of users. Significant efforts have …

Spanning training progress: Temporal dual-depth scoring (TDDS) for enhanced dataset pruning

X Zhang, J Du, Y Li, W Xie… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Dataset pruning aims to construct a coreset capable of achieving performance comparable
to the original full dataset. Most existing dataset pruning methods rely on snapshot-based …
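
As context for the snippet above, dataset pruning methods typically assign each sample a score and keep the top fraction as the coreset; this paper's contribution is scoring over the whole training trajectory rather than a single snapshot. The sketch below is a generic, hypothetical illustration of trajectory-averaged scoring (the function name and mean-loss score are assumptions), not the TDDS algorithm itself.

```python
# Generic sketch of score-based dataset pruning that aggregates a per-sample
# signal across training epochs rather than a single snapshot.
# Illustration only; not the TDDS scoring rule from the paper.
import numpy as np

def prune_by_temporal_score(per_epoch_losses, keep_fraction=0.3):
    """per_epoch_losses: array of shape (epochs, num_samples) logged during training."""
    scores = per_epoch_losses.mean(axis=0)       # average difficulty over the trajectory
    keep = int(scores.shape[0] * keep_fraction)  # coreset budget
    return np.argsort(scores)[-keep:]            # keep the hardest samples

# Usage with placeholder loss traces: 1,000 samples tracked over 50 epochs.
traces = np.abs(np.random.randn(50, 1000))
print(prune_by_temporal_score(traces).shape)     # (300,)
```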

M3D: Dataset condensation by minimizing maximum mean discrepancy

H Zhang, S Li, P Wang, D Zeng, S Ge - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Training state-of-the-art (SOTA) deep models often requires extensive data, resulting in
substantial training and storage costs. To address these challenges, dataset condensation …
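
The title names distribution matching by minimizing maximum mean discrepancy (MMD); below is a minimal sketch of the standard empirical (biased) MMD² estimator with an RBF kernel between real and synthetic feature sets, given only as background for that objective, not the paper's implementation.

```python
# Minimal sketch: empirical (biased) squared MMD with an RBF kernel.
# Background for the distribution-matching objective, not the paper's code.
import numpy as np

def rbf_kernel(a, b, sigma=1.0):
    # Pairwise Gaussian kernel between the rows of a and b.
    sq_dists = (a**2).sum(1)[:, None] + (b**2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-sq_dists / (2.0 * sigma**2))

def mmd2(real_feats, synth_feats, sigma=1.0):
    # MMD^2(P, Q) ~ mean k(x, x') + mean k(y, y') - 2 * mean k(x, y).
    return (rbf_kernel(real_feats, real_feats, sigma).mean()
            + rbf_kernel(synth_feats, synth_feats, sigma).mean()
            - 2.0 * rbf_kernel(real_feats, synth_feats, sigma).mean())

# Usage with placeholder features: minimizing this quantity over the synthetic
# set pulls its feature distribution toward the real data's.
real, synth = np.random.randn(128, 64), np.random.randn(16, 64)
print(mmd2(real, synth))
```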

FlashEval: Towards Fast and Accurate Evaluation of Text-to-image Diffusion Generative Models

L Zhao, T Zhao, Z Lin, X Ning, G Dai… - Proceedings of the …, 2024 - openaccess.thecvf.com
In recent years there has been significant progress in the development of text-to-image
generative models. Evaluating the quality of the generative models is one essential step in …

Navigating complexity: Toward lossless graph condensation via expanding window matching

Y Zhang, T Zhang, K Wang, Z Guo, Y Liang… - arXiv preprint arXiv …, 2024 - arxiv.org
Graph condensation aims to reduce the size of a large-scale graph dataset by synthesizing
a compact counterpart without sacrificing the performance of Graph Neural Networks …
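
For orientation, graph condensation as described in the snippet is commonly posed as the bilevel problem below, where a small synthetic graph \(\mathcal{G}' = (\mathbf{A}', \mathbf{X}', \mathbf{Y}')\) is optimized so that a GNN trained on it performs well on the original graph \((\mathbf{A}, \mathbf{X}, \mathbf{Y})\); this is the standard formulation from prior graph condensation work, not the expanding-window matching loss introduced here.

\[
\min_{\mathcal{G}'} \; \mathcal{L}\!\left(\mathrm{GNN}_{\theta_{\mathcal{G}'}}(\mathbf{A}, \mathbf{X}),\, \mathbf{Y}\right)
\quad \text{s.t.} \quad
\theta_{\mathcal{G}'} = \arg\min_{\theta} \; \mathcal{L}\!\left(\mathrm{GNN}_{\theta}(\mathbf{A}', \mathbf{X}'),\, \mathbf{Y}'\right).
\]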

Can pre-trained models assist in dataset distillation?

Y Lu, X Chen, Y Zhang, J Gu, T Zhang, Y Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Dataset Distillation (DD) is a prominent technique that encapsulates knowledge from a large-
scale original dataset into a small synthetic dataset for efficient training. Meanwhile, Pre …