What's wrong with computational notebooks? Pain points, needs, and design opportunities

S Chattopadhyay, I Prasad, AZ Henley… - Proceedings of the …, 2020 - dl.acm.org
Computational notebooks, such as Azure, Databricks, and Jupyter, are a popular, interactive
paradigm for data scientists to author code, analyze data, and interleave visualizations, all …

Managing messes in computational notebooks

A Head, F Hohman, T Barik, SM Drucker… - Proceedings of the 2019 …, 2019 - dl.acm.org
Data analysts use computational notebooks to write code for analyzing and visualizing data.
Notebooks help analysts iteratively write analysis code by letting them interleave code with …

How data scientists use computational notebooks for real-time collaboration

AY Wang, A Mittal, C Brooks, S Oney - … of the ACM on Human-Computer …, 2019 - dl.acm.org
Effective collaboration in data science can leverage domain expertise from each team
member and thus improve the quality and efficiency of the work. Computational notebooks …

Assessing and restoring reproducibility of Jupyter notebooks

J Wang, T Kuo, L Li, A Zeller - Proceedings of the 35th IEEE/ACM …, 2020 - dl.acm.org
Jupyter notebooks (documents that contain live code, equations, visualizations, and
narrative text) now are among the most popular means to compute, present, discuss and …

Towards effective foraging by data scientists to find past analysis choices

MB Kery, BE John, P O'Flaherty, A Horvath… - Proceedings of the 2019 …, 2019 - dl.acm.org
Data scientists are responsible for the analysis decisions they make, but it is hard for them to
track the process by which they achieved a result. Even when data scientists keep logs, it is …

The design space of computational notebooks: An analysis of 60 systems in academia and industry

S Lau, I Drosos, JM Markel… - 2020 IEEE Symposium on …, 2020 - ieeexplore.ieee.org
Computational notebooks such as Jupyter are now used by millions of data scientists,
machine learning engineers, and computational researchers to do exploratory and end-user …

Boba: Authoring and visualizing multiverse analyses

Y Liu, A Kale, T Althoff, J Heer - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Multiverse analysis is an approach to data analysis in which all “reasonable” analytic
decisions are evaluated in parallel and interpreted collectively, in order to foster robustness …

Restoring execution environments of Jupyter notebooks

J Wang, L Li, A Zeller - 2021 IEEE/ACM 43rd International …, 2021 - ieeexplore.ieee.org
More than ninety percent of published Jupyter notebooks do not state dependencies on
external packages. This makes them non-executable and thus hinders reproducibility of …

Better code, better sharing: on the need of analyzing Jupyter notebooks

J Wang, L Li, A Zeller - Proceedings of the ACM/IEEE 42nd international …, 2020 - dl.acm.org
By bringing together code, text, and examples, Jupyter notebooks have become one of the
most popular means to produce scientific results in a productive and reproducible way. As …

Paths explored, paths omitted, paths obscured: Decision points & selective reporting in end-to-end data analysis

Y Liu, T Althoff, J Heer - Proceedings of the 2020 CHI conference on …, 2020 - dl.acm.org
Drawing reliable inferences from data involves many, sometimes arbitrary, decisions across
phases of data collection, wrangling, and modeling. As different choices can lead to …