Indiana University Bloomington

Luddy School of Informatics, Computing, and Engineering

Technical Report TR744:
Reproducibility in Scientific Computing

Jonathan Klinginsmith
(Feb 2021), 16 pages
[Part requirement of a Ph.D. candidacy reinstatement plan]
Abstract:
The Oxford English Dictionary defines the scientific method as “a method of procedure that has characterized natural science since the 17th century, consisting in systematic observation, measurement, and experiment, and the formulation, testing, and modification of hypotheses” [1].

Theory and experimentation, the first two pillars of the scientific method, have stood for centuries. Scientists have formulated theories and hypotheses and used experimentation to validate or refute them. In recent years, however, computing has come to be widely considered the third pillar of science [2]. Advances in sensors, imaging, and scientific instrumentation have led to the generation of large amounts of scientific data. Data generation has become ubiquitous in science as well [3], to the point that some researchers consider data-intensive science to be the fourth pillar [4].

Ivie and Thain define scientific computing “as computing applied to the physical sciences (biology, chemistry, physics, and so on) for the purposes of simulation or data analysis” [5]. Within the computational science research community, Stodden states “the digitization of science combined with the Internet create a new transparency in scientific knowledge, potentially moving scientific progress from building with black boxes, to one where the boxes themselves remain wholly transparent” [6].

As the scientific process further leverages both technology and data, the need to reproduce computational experiments has become imperative in the scientific discovery process. However, a computational science researcher reading a scientific paper can find it challenging to reproduce the authors’ experiments. To fully reproduce a computational experiment, one must have the same versions of the software installed and configured, have access to the original data, and use the same parameters as the original experiment.

In many cases, having access to all these items is not possible [7]. Even if the original data are not available, it should be reasonable to expect the experimental setup to be reproducible. Specifically, if the infrastructure setup and the software installation and configuration can be performed in a reproducible manner, then scientists are much better equipped to replicate or extend the experiment in question.
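The three ingredients named above (software versions, original data, and parameters) can be recorded alongside an experiment so that a later reader knows what to recreate. The following is a minimal sketch, not a method from this report: the function names `sha256_of_bytes` and `build_manifest`, the manifest layout, and the example package and parameter names are all hypothetical, and only the Python standard library is assumed.

```python
import hashlib
import json
import platform
from importlib import metadata


def sha256_of_bytes(data: bytes) -> str:
    """Return a hex digest that identifies the exact input data."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(packages, data: bytes, parameters: dict) -> dict:
    """Collect software versions, a data checksum, and run parameters.

    The manifest layout is an illustrative assumption, not a standard.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the absence explicitly instead of failing.
            versions[name] = None
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": versions,
        "data_sha256": sha256_of_bytes(data),
        "parameters": parameters,
    }


# Hypothetical example values, for illustration only.
manifest = build_manifest(
    packages=["pip"],
    data=b"example input data",
    parameters={"seed": 42, "iterations": 100},
)
print(json.dumps(manifest, indent=2, sort_keys=True))
```

Even when the original data cannot be shared, a manifest of this kind still describes the setup, and the checksum lets anyone who does hold the data confirm that it matches.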

Figure 1 models the progression of a computational science experiment. The three phases (configuration, execution, and publication) represent logical constructs in which experimental activities are performed and can be replicated. During the configuration phase, software must be installed and configured and, when necessary, infrastructure must be provisioned. This phase also includes any data preparation or downloads. Input parameters may be modified so that the experiment can be executed multiple times. Within the execution phase, the actual experiment is performed. Data and metrics are generated from the experiment for use in the publication phase. In the publication phase, data tables, figures, and charts are produced for information sharing and presentation of experimental results.

Fig. 1. Experimental progression
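The progression in Figure 1 can be illustrated with a toy pipeline in which each phase is a separate function. This is only a sketch under stated assumptions: the function names `configure`, `execute`, and `publish`, the seeded random sampling that stands in for "the experiment", and the text table output are all hypothetical and do not come from the report.

```python
import random
import statistics


def configure(seed: int, sample_size: int) -> dict:
    """Configuration phase: fix the input parameters for one run."""
    return {"seed": seed, "sample_size": sample_size}


def execute(config: dict) -> dict:
    """Execution phase: run the (toy) experiment and return metrics."""
    # A dedicated, seeded generator makes repeated runs give the same result.
    rng = random.Random(config["seed"])
    samples = [rng.gauss(0.0, 1.0) for _ in range(config["sample_size"])]
    return {
        "mean": statistics.mean(samples),
        "stdev": statistics.stdev(samples),
    }


def publish(config: dict, metrics: dict) -> str:
    """Publication phase: format parameters and metrics as a text table."""
    rows = [("seed", config["seed"]), ("sample_size", config["sample_size"])]
    rows += [(name, f"{value:.4f}") for name, value in sorted(metrics.items())]
    return "\n".join(f"{name:<12} {value}" for name, value in rows)


# Changing only the configuration re-runs the experiment with new inputs.
for seed in (1, 2):
    config = configure(seed=seed, sample_size=50)
    print(publish(config, execute(config)))
    print()
```

Keeping the phases separate means the execution phase depends only on its configuration, so rerunning with an identical configuration reproduces the same metrics.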

Many computational science disciplines are leveraging machine learning, or artificial intelligence in general, to make scientific inferences from trained datasets. Hutson [8] discussed the reproducibility crisis in artificial intelligence research. One of the most basic contributors to the crisis is researchers’ failure to share and publish their software. Heaven [9] provides reasons why AI is dealing with issues in reproducibility. He mentions that one of the most basic reasons for the crisis is the “lack of access to three things: code, data, and hardware.” He continues that the divide between the “haves” and the “have-nots” of AI data and hardware is also contributing to the crisis.

In many cases, as part of the experimental progression highlighted in Figure 1, a scientific computing analysis requires multiple sequential or para
