Data collection and quality challenges in deep learning: A data-centric AI perspective

SE Whang, Y Roh, H Song, JG Lee - The VLDB Journal, 2023 - Springer
Data-centric AI is at the center of a fundamental shift in software engineering where machine
learning becomes the new software, powered by big data and computing infrastructure …

The pipeline for the continuous development of artificial intelligence models—Current state of research and practice

M Steidl, M Felderer, R Ramler - Journal of Systems and Software, 2023 - Elsevier
Companies struggle to continuously develop and deploy Artificial Intelligence (AI) models to
complex production systems due to AI characteristics while assuring quality. To ease the …

Operationalizing machine learning: An interview study

S Shankar, R Garcia, JM Hellerstein… - arXiv preprint arXiv …, 2022 - arxiv.org
Organizations rely on machine learning engineers (MLEs) to operationalize ML, i.e., deploy
and maintain ML pipelines in production. The process of operationalizing ML, or MLOps …

Opportunities and Challenges in Data-Centric AI

S Kumar, S Datta, V Singh, SK Singh, R Sharma - IEEE Access, 2024 - ieeexplore.ieee.org
Artificial intelligence (AI) systems are trained to solve complex problems and learn to
perform specific tasks by using large volumes of data, such as prediction, classification …

A data quality-driven view of MLOps

C Renggli, L Rimanic, NM Gürel, B Karlaš… - arXiv preprint arXiv …, 2021 - arxiv.org
Developing machine learning models can be seen as a process similar to the one
established for traditional software development. A key difference between the two lies in the …

Management of machine learning lifecycle artifacts: A survey

M Schlegel, KU Sattler - ACM SIGMOD Record, 2023 - dl.acm.org
The explorative and iterative nature of developing and operating ML applications leads to a
variety of artifacts, such as datasets, features, models, hyperparameters, metrics, software …

GGFAST: Automating generation of flexible network traffic classifiers

J Piet, D Nwoji, V Paxson - Proceedings of the ACM SIGCOMM 2023 …, 2023 - dl.acm.org
When employing supervised machine learning to analyze network traffic, the heart of the
task often lies in developing effective features for the ML to leverage. We develop GGFAST …

Riding a bicycle while building its wheels: the process of machine learning-based capability development and IT-business alignment practices

T Mucha, S Ma, K Abhari - Internet Research, 2023 - emerald.com
Purpose: Recent advancements in Artificial Intelligence (AI) and, at its core, Machine
Learning (ML) offer opportunities for organizations to develop new or enhance existing …

Hyper-Tune: Towards efficient hyper-parameter tuning at scale

Y Li, Y Shen, H Jiang, W Zhang, J Li, J Liu… - arXiv preprint arXiv …, 2022 - arxiv.org
The ever-growing demand and complexity of machine learning are putting pressure on
hyper-parameter tuning systems: while the evaluation cost of models continues to increase …

OpenBox: A Python toolkit for generalized black-box optimization

H Jiang, Y Shen, Y Li, B Xu, S Du, W Zhang… - Journal of Machine …, 2024 - jmlr.org
Black-box optimization (BBO) has a broad range of applications, including automatic
machine learning, experimental design, and database knob tuning. However, users still face …