Wednesday, October 26, 2022

Levels of reproducibility, repeatability, and replication

Knowledge progresses when one person can leverage the insights of another. There are levels of reproducibility, and each requires a different level of investment from the person looking to build on the initial knowledge.

The levels described below are ranked from "requires lots of work to build upon" to "very easy to leverage."

Level 0: discoverable claim

A claim has to be recorded somewhere that other people can find it. Undocumented claims that no one else is aware of are irrelevant to the advancement of science.

Level 1: claim without evidence

example: "My design for this car tire supports operational speeds of 50 miles per hour and will be usable for 50,000 miles."
No software or analytical calculations are provided. No explanation of how the claim was arrived at.


Level 2: verbal hints of process used

software-based example: My design for the car tire is based on calculations from code written in Python.
analytical example: In my calculation of operational speed I used the Fourier transform.

What distinguishes 2 from 1: Advertising that code was written, or math was done. 
Most peer-reviewed scientific papers are written at level 1 or, if you're lucky, level 2.
Most presentations of experiments (e.g., at conferences and lectures) are also made at level 1 or level 2.


Level 3: software without documentation or dependencies; or, for analytical, a few of the key equations

software-based example: Python script provided by the author to back up the claim. No configuration file or random seed value is provided. Library dependencies and versions need to be determined through trial and error (aka digital archeology).
analytical example: a few key equations (from a much more complex derivation) are mentioned, as are a few (not all) of the assumptions.

consequence: If you're smart and diligent, you may be able to recover statistically similar (though not exact) behavior for stochastic models, assuming neither you nor the original author had any bugs in the implementation.
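
To illustrate the consequence above, here is a minimal sketch of what "statistically similar (though not exact)" can look like. The model, the function name simulate_tire_life, and every number in it are hypothetical and chosen only for illustration; they are not taken from any real tire design.

```python
import random
import statistics


def simulate_tire_life(n_trials, rng):
    """Hypothetical stochastic model: mean of noisy tire-lifetime samples, in miles."""
    return statistics.fmean(rng.gauss(50_000, 2_000) for _ in range(n_trials))


# No seed is recorded, so each Random() is seeded from operating-system entropy.
author_result = simulate_tire_life(10_000, random.Random())
reader_result = simulate_tire_life(10_000, random.Random())

# The two runs are expected to differ in their exact value...
print("exact match:", author_result == reader_result)
# ...while agreeing to within the sampling noise of the model.
print("difference in miles:", abs(author_result - reader_result))
```

Without the original seed, the reader can only compare results statistically, and has no way to tell a small bug apart from ordinary run-to-run noise.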


Level 4:

software-based example: Python script with random seed value specified and configuration parameters documented. No other documentation is included, and dependencies are not made explicit.
analytical example: complete derivation provided, with explanation of assumptions
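
As a rough sketch of the software-based example at this level, the script below keeps its seed and parameters in one documented place. CONFIG, run, and all of the values are hypothetical names and numbers invented for this illustration.

```python
import random
import statistics

# Documented configuration (hypothetical values, for illustration only).
CONFIG = {
    "seed": 20221026,        # random seed, recorded so a run can be repeated
    "n_trials": 10_000,      # number of simulated tires
    "mean_miles": 50_000.0,  # assumed mean lifetime
    "stdev_miles": 2_000.0,  # assumed spread of lifetimes
}


def run(config):
    """Run the hypothetical stochastic model using only values from config."""
    rng = random.Random(config["seed"])
    samples = [
        rng.gauss(config["mean_miles"], config["stdev_miles"])
        for _ in range(config["n_trials"])
    ]
    return statistics.fmean(samples)


first = run(CONFIG)
second = run(CONFIG)
# Same seed, same parameters, same interpreter: the two runs should agree exactly.
print("exact match:", first == second)
```

Exact agreement here relies on both runs using the same Python version and libraries. Because dependencies are not made explicit at this level, a reader on a different setup may still see different numbers.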

Level 5: repeatable

software-based example: Python script containerized, software versions pinned. Build process is executable. No digital archeology needed.
analytical example: portions of the complete derivation are checked by a computer algebra system (CAS) for correctness.
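
Containerization itself is handled by build tooling rather than by the script, but the "software versions pinned" part of the software-based example can be sketched in Python. The following is one possible approach, not a standard one: the script compares installed package versions against recorded pins before doing any work. The PINS table, its package names and version strings, and the find_mismatches helper are assumptions made up for this example.

```python
from importlib import metadata

# Hypothetical pins; a real project would record the versions it was run with.
PINS = {"numpy": "1.23.4", "scipy": "1.9.3"}


def find_mismatches(pins, installed_version=metadata.version):
    """Return {package: (pinned, found)} for each package that differs from its pin.

    found is None when the package is not installed at all.
    """
    mismatches = {}
    for package, pinned in pins.items():
        try:
            found = installed_version(package)
        except metadata.PackageNotFoundError:
            found = None
        if found != pinned:
            mismatches[package] = (pinned, found)
    return mismatches


for package, (pinned, found) in find_mismatches(PINS).items():
    print(f"{package}: pinned {pinned}, found {found}")
```

A check like this only reports drift. The container with pinned versions is what removes the need for digital archeology in the first place.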


Level 6:

software-based example: Python script containerized with documentation of assumptions and examples of how to use. Configuration file with parameters and random seed values provided.
analytical example: complete derivation checked by a computer algebra system (CAS) for correctness. Proofs provided.  Citations where appropriate.


Caveat: the levels described above are not actually linear. There are a few meandering paths that get from 0 to 6. 


References

"Reproducibility vs. Replicability: A Brief History of a Confused Terminology"
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5778115/
