Check My Work: Reproducible Science
By Victoria Stodden and Brian Wandell
Flywheel tools help researchers do reproducible research. We listen carefully to ideas from our colleagues in the reproducible research communities, and we build tools to support their goals. In this post we discuss some of the ideas we are hearing about reproducible research and how those ideas are shaping Flywheel tools.
Reproducibility principles
Flywheel is built on the fundamental reproducibility principle: others must be able to ‘check my work.’ This includes checking the quality of the data and the computational steps that lead from the data to the findings. Flywheel software provides customers with tools that are designed to be checked and to build trust.
But what does checking mean? Some argue that checking means that every line of code up and down the computational system must be open. Some go further and say that the code must also be free because cost is an impediment to checking. Others disagree, saying that open and zero cost are too limiting to the scientific enterprise. We see the discussion as being about the scope of the work that must be made available for checking. Consider these examples.
Check my data
Suppose a clinician uses an MR scanner to perform a study. Image reconstruction on most MR scanners relies on proprietary algorithms. The same is true for nearly all advanced instrumentation, from genomics to astronomy, where proprietary algorithms account for device details to produce the images. Data from simpler instrumentation has the same limitation: machine learning often relies on camera images that vendors process in multiple ways, using proprietary algorithms to change color, enhance dynamic range and remove noise. Examined closely, every field committed to reproducible science includes some integral aspects derived from closed sources.
Check my algorithms
A similar point can be made about computation. Modern data analysis often relies on complex algorithms that run on proprietary infrastructure, whether in the cloud or on the desktop. It is nearly impossible to compute using entirely open-source or free software. (We hear that computers are also not free.)
Reality check
These examples point the way to a realistic approach to reproducibility. It is impossible to have a perfectly open set of tools, but it is possible to specify the scope over which the tools are open. In the case of MRI, we might say that the reproducibility starts from the DICOM file provided by the vendor and continues to the published result; for machine learning with a convolutional neural network (CNN) the scope might be from the images to the published result. The algorithms may be open, even if the operating system and cloud infrastructure are not.
In this world, researchers can build trust with reviewers and colleagues by describing as much as they can about the steps that created the MR or camera file, and the reader can be made aware of which parts of the data and computations are completely open and which are not.
Making the scope of the open data and code clear is the reality of modern science. Acknowledging that methods are never completely open, we can still identify the range of the work that is open. The inaccessible stages will be considered ‘infrastructure’ and the accessible stages ‘research structure’.
Template and Analysis systems
Flywheel systems help scientists and clinicians check each other’s work over a defined scope. The scope includes critical parts of both the data and the analytical work.
The Flywheel tools include methods to check the validity of the research data files. This takes the form of a Template System that verifies the data. The system checks for the presence and format of the input files; in some cases, we have also found ‘quality assurance’ methods that verify the signal-to-noise ratio or other quality measures of the data.
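To make the idea of a presence-and-format check concrete, here is a minimal sketch in Python. It is not Flywheel's Template System or its API. The names FileRule and check_template, and the example filename patterns, are hypothetical and chosen only for illustration. The sketch assumes a template is simply a list of filename patterns that a session must satisfy.

```python
import fnmatch
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRule:
    """One hypothetical template rule: files matching `pattern` must be present."""
    label: str
    pattern: str        # shell-style pattern, matched case-sensitively
    min_count: int = 1  # how many matching files the session needs


def check_template(filenames, rules):
    """Return a list of problems; an empty list means every rule is satisfied."""
    problems = []
    for rule in rules:
        matches = [n for n in filenames if fnmatch.fnmatchcase(n, rule.pattern)]
        if len(matches) < rule.min_count:
            problems.append(
                f"{rule.label}: expected at least {rule.min_count} file(s) "
                f"matching {rule.pattern!r}, found {len(matches)}"
            )
    return problems


# Illustrative template and session; the filenames are made up.
rules = [
    FileRule("anatomical", "*_T1w.nii.gz"),
    FileRule("diffusion", "*_dwi.nii.gz"),
    FileRule("gradients", "*.bvec"),
]
session = ["sub01_T1w.nii.gz", "sub01_dwi.nii.gz"]

for problem in check_template(session, rules):
    print(problem)
```

Here the session is missing its gradient file, so the check reports one problem. A real system would presumably also inspect file contents and metadata, which this sketch leaves out.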
The Analysis System checks the computational parts of the work. The system tracks the input data, algorithm versions, implementations with their dependencies, parameters, and outputs. An added advantage of having the Analysis System in Flywheel is that the computations are searchable and shareable. Flywheel manages both data and computations.
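One way to picture this kind of tracking is a provenance record that captures everything needed to recognize a repeated computation. The following Python sketch is an assumption-laden illustration, not the Analysis System itself. The function make_run_record, the record fields, and the example algorithm name are all hypothetical. The sketch assumes that inputs and outputs can be summarized by SHA-256 digests of their bytes.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Content digest used to identify a file without storing it in the record."""
    return hashlib.sha256(data).hexdigest()


def make_run_record(algorithm, version, dependencies, parameters, inputs, outputs):
    """Build a hypothetical provenance record for one algorithm run.

    `inputs` and `outputs` map a name to the raw bytes of that file.
    `dependencies` maps a package name to its version string.
    """
    record = {
        "algorithm": algorithm,
        "version": version,
        "dependencies": dependencies,
        "parameters": parameters,
        "inputs": {name: sha256_hex(data) for name, data in inputs.items()},
        "outputs": {name: sha256_hex(data) for name, data in outputs.items()},
    }
    # Serialize with sorted keys so identical runs always yield the same ID.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    record["record_id"] = sha256_hex(canonical.encode("utf-8"))
    return record


# Illustrative run; the algorithm name, versions and data are made up.
run = make_run_record(
    algorithm="skull_strip",
    version="1.2.0",
    dependencies={"numpy": "1.26.4"},
    parameters={"threshold": 0.5},
    inputs={"t1": b"example input bytes"},
    outputs={"mask": b"example output bytes"},
)
print(run["record_id"])
```

Because the record ID is derived from the inputs, versions, dependencies, parameters and outputs together, two runs share an ID only when all of them agree. That property is what would make such records useful for search and for checking whether a result was reproduced.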
The Template and Analysis Systems are part of the Flywheel infrastructure. Combined with search, data reuse, visualization and computational tools, these systems have been used to analyze millions of images and algorithm runs. These infrastructure tools guide investigators to follow reproducible research practice in both academic and commercial settings. Tools for searching and visualizing the information in these systems continue to improve, and we continue to add features to both the Template and Analysis Systems with the goal of simplifying their use and making it easier to build trust.
What’s Next
Our experience with customers has taught us further lessons that will guide new Flywheel tools. One of the most important things we have learned over the years surprises new customers. Many think that the purpose of reproducible science is to let other people check your work. In fact, most of our customers soon find that the most important use of these tools is to let you check your own work. More on this and related matters in forthcoming posts.