Data pricing in machine learning pipelines

Z Cong, X Luo, J Pei, F Zhu, Y Zhang - Knowledge and Information …, 2022 - Springer
Machine learning is disruptive. At the same time, machine learning can only
succeed by collaboration among many parties in multiple steps naturally as pipelines in an …

FrugalGPT: How to use large language models while reducing cost and improving performance

L Chen, M Zaharia, J Zou - arXiv preprint arXiv:2305.05176, 2023 - arxiv.org
There is a rapidly growing number of large language models (LLMs) that users can query for
a fee. We review the cost associated with querying popular LLM APIs, e.g., GPT-4, ChatGPT …

Towards unbiased and accurate deferral to multiple experts

V Keswani, M Lease, K Kenthapadi - Proceedings of the 2021 AAAI/ACM …, 2021 - dl.acm.org
Machine learning models are often implemented in cohort with humans in the pipeline, with
the model having an option to defer to a domain expert in cases where it has low confidence …

Cocktail: A multidimensional optimization for model serving in cloud

JR Gunasekaran, CS Mishra, P Thinakaran… - … USENIX Symposium on …, 2022 - usenix.org
With a growing demand for adopting ML models for a variety of application services, it is vital
that the frameworks serving these models are capable of delivering highly accurate …

Large language model routing with benchmark datasets

T Shnitzer, A Ou, M Silva, K Soule, Y Sun… - arXiv preprint arXiv …, 2023 - arxiv.org
There is a rapidly growing number of open-source Large Language Models (LLMs) and
benchmark datasets to compare them. While some models dominate these benchmarks, no …

ALICE: Active learning with contrastive natural language explanations

W Liang, J Zou, Z Yu - arXiv preprint arXiv:2009.10259, 2020 - arxiv.org
Training a supervised neural network classifier typically requires many annotated training
samples. Collecting and annotating a large number of data points are costly and sometimes …

HAPI: A large-scale longitudinal dataset of commercial ML API predictions

L Chen, Z Jin, ES Eyuboglu, C Ré… - Advances in Neural …, 2022 - proceedings.neurips.cc
Commercial ML APIs offered by providers such as Google, Amazon and Microsoft have
dramatically simplified ML adoptions in many applications. Numerous companies and …

MLDemon: Deployment monitoring for machine learning systems

T Ginart, MJ Zhang, J Zou - International conference on …, 2022 - proceedings.mlr.press
Post-deployment monitoring of ML systems is critical for ensuring reliability, especially as
new user inputs can differ from the training distribution. Here we propose a novel approach …

Mondrian: Prompt abstraction attack against large language models for cheaper API pricing

WM Si, M Backes, Y Zhang - arXiv preprint arXiv:2308.03558, 2023 - arxiv.org
The Machine Learning as a Service (MLaaS) market is rapidly expanding and becoming
more mature. For example, OpenAI's ChatGPT is an advanced large language model (LLM) …

Did the model change? Efficiently assessing machine learning API shifts

L Chen, T Cai, M Zaharia, J Zou - arXiv preprint arXiv:2107.14203, 2021 - arxiv.org
Machine learning (ML) prediction APIs are increasingly widely used. An ML API can change
over time due to model updates or retraining. This presents a key challenge in the usage of …