The state of documentation practices of third-party machine learning models and datasets

EL Oreamuno, RF Khan, AA Bangash, C Stinson, B Adams
IEEE Software, 2024 - ieeexplore.ieee.org
Model stores offer third-party ML models and datasets for easy project integration, minimizing coding effort. One might hope to find detailed specifications of these models and datasets in the documentation, leveraging documentation standards such as model and dataset cards. In this study, we use statistical analysis and hybrid card sorting to assess the state of the practice of documenting model cards and dataset cards in one of the largest model stores in use today, Hugging Face (HF). Our findings show that only 21,902 models (39.62%) and 1,925 datasets (28.48%) have documentation. Furthermore, we observe inconsistency in ethics- and transparency-related documentation for ML models and datasets.