Re-thinking data strategy and integration for artificial intelligence: concepts, opportunities, and challenges

A Aldoseri, KN Al-Khalifa, AM Hamouda - Applied Sciences, 2023 - mdpi.com
The use of artificial intelligence (AI) is becoming more prevalent across industries such as
healthcare, finance, and transportation. Artificial intelligence is based on the analysis of …

Harnessing the power of LLMs in practice: A survey on ChatGPT and beyond

J Yang, H Jin, R Tang, X Han, Q Feng, H Jiang… - ACM Transactions on …, 2024 - dl.acm.org
This article presents a comprehensive and practical guide for practitioners and end-users
working with Large Language Models (LLMs) in their downstream Natural Language …

Towards data-centric graph machine learning: Review and outlook

X Zheng, Y Liu, Z Bao, M Fang, X Hu, AWC Liew… - arXiv preprint arXiv …, 2023 - arxiv.org
Data-centric AI, with its primary focus on the collection, management, and utilization of data
to drive AI models and applications, has attracted increasing attention in recent years. In this …

DataPerf: Benchmarks for data-centric AI development

M Mazumder, C Banbury, X Yao… - Advances in …, 2024 - proceedings.neurips.cc
Machine learning research has long focused on models rather than datasets, and
prominent datasets are used for common ML tasks without regard to the breadth, difficulty …

FinGPT: Democratizing internet-scale data for financial large language models

XY Liu, G Wang, H Yang, D Zha - arXiv preprint arXiv:2307.10485, 2023 - arxiv.org
Large language models (LLMs) have demonstrated remarkable proficiency in
understanding and generating human-like texts, which may potentially revolutionize the …

Skeleton-of-thought: Large language models can do parallel decoding

X Ning, Z Lin, Z Zhou, Z Wang, H Yang… - Proceedings ENLSP …, 2023 - lirias.kuleuven.be
This work aims at decreasing the end-to-end generation latency of large language models
(LLMs). One of the major causes of the high generation latency is the sequential decoding …

Dataset regeneration for sequential recommendation

M Yin, H Wang, W Guo, Y Liu, S Zhang… - Proceedings of the 30th …, 2024 - dl.acm.org
The sequential recommender (SR) system is a crucial component of modern recommender
systems, as it aims to capture the evolving preferences of users. Significant efforts have …

OpenGSL: A comprehensive benchmark for graph structure learning

Z Zhiyao, S Zhou, B Mao, X Zhou… - Advances in …, 2024 - proceedings.neurips.cc
Graph Neural Networks (GNNs) have emerged as the de facto standard for
representation learning on graphs, owing to their ability to effectively integrate graph …

Imitation learning from imperfection: Theoretical justifications and algorithms

Z Li, T Xu, Z Qin, Y Yu, ZQ Luo - Advances in Neural …, 2024 - proceedings.neurips.cc
Imitation learning (IL) algorithms excel in acquiring high-quality policies from expert data for
sequential decision-making tasks. But their effectiveness is hampered when faced with …

FuseCap: Leveraging large language models for enriched fused image captions

N Rotstein, D Bensaïd, S Brody… - Proceedings of the …, 2024 - openaccess.thecvf.com
The advent of vision-language pre-training techniques enhanced substantial progress in the
development of models for image captioning. However, these models frequently produce …