Did the model change? Efficiently assessing machine learning API shifts

L Chen, M Zaharia, J Zou - arXiv preprint arXiv:2307.09009, 2023 - arxiv.org

GPT-3.5 and GPT-4 are the two most widely used large language model (LLM) services.
However, when and how these models are updated over time is opaque. Here, we evaluate …

Cited by 363 · Related articles · All 11 versions

[PDF] arxiv.org

FrugalGPT: How to use large language models while reducing cost and improving performance

L Chen, M Zaharia, J Zou - arXiv preprint arXiv:2305.05176, 2023 - arxiv.org

There is a rapidly growing number of large language models (LLMs) that users can query for
a fee. We review the cost associated with querying popular LLM APIs, e.g., GPT-4, ChatGPT …

Cited by 116 · Related articles · All 3 versions

[PDF] acm.org

Zeno: An interactive framework for behavioral evaluation of machine learning

ÁA Cabrera, E Fu, D Bertucci, K Holstein… - Proceedings of the …, 2023 - dl.acm.org

Machine learning models with high accuracy on test data can still produce systematic
failures, such as harmful biases and safety issues, when deployed in the real world. To …

Cited by 38 · Related articles · All 4 versions

[PDF] neurips.cc

Ecosystem-level analysis of deployed machine learning reveals homogeneous outcomes

C Toups, R Bommasani, K Creel… - Advances in …, 2024 - proceedings.neurips.cc

Machine learning is traditionally studied at the model level: researchers measure
and improve the accuracy, robustness, bias, efficiency, and other dimensions of specific …

Cited by 7 · Related articles · All 5 versions

[PDF] neurips.cc

Estimating and explaining model performance when both covariates and labels shift

L Chen, M Zaharia, JY Zou - Advances in Neural …, 2022 - proceedings.neurips.cc

Deployed machine learning (ML) models often encounter new user data that differs from
their training data. Therefore, estimating how well a given model might perform on the new …

Cited by 16 · Related articles · All 6 versions

[PDF] neurips.cc

HAPI: A large-scale longitudinal dataset of commercial ML API predictions

L Chen, Z Jin, ES Eyuboglu, C Ré… - Advances in Neural …, 2022 - proceedings.neurips.cc

Commercial ML APIs offered by providers such as Google, Amazon and Microsoft have
dramatically simplified ML adoptions in many applications. Numerous companies and …

Cited by 8 · Related articles · All 8 versions

Judging an Airbnb booking by its cover: How profile photos affect guest ratings

H Jang - Journal of Consumer Marketing, 2022 - emerald.com

Purpose: This research aims to examine whether the facial appearances and expressions of
Airbnb host photos influence guest star ratings. Design/methodology/approach: This …

Cited by 11 · Related articles · All 2 versions

[PDF] arxiv.org

The rise of open science: Tracking the evolution and perceived value of data and methods link-sharing practices

H Cao, J Dodge, K Lo, DA McFarland… - arXiv preprint arXiv …, 2023 - arxiv.org

In recent years, funding agencies and journals increasingly advocate for open science
practices (e.g., data and method sharing) to improve the transparency, access, and …

Cited by 3 · Related articles · All 7 versions

[PDF] usenix.org

ChameleonAPI: Automatic and Efficient Customization of Neural Networks for ML Applications

Y Liu, C Wan, K Du, H Hoffmann, J Jiang, S Lu… - … USENIX Symposium on …, 2024 - usenix.org

ML APIs have greatly relieved application developers of the burden to design and train their
own neural network models—classifying objects in an image can now be as simple as one …

Efficient online ML API selection for multi-label classification tasks

L Chen, M Zaharia, J Zou - International conference on …, 2022 - proceedings.mlr.press

Multi-label classification tasks such as OCR and multi-object recognition are a major focus of
the growing machine learning as a service industry. While many multi-label APIs are …

Cited by 13 · Related articles · All 7 versions
