Reproducing computational research

Reproducing results is a crucial part of the scientific process. Given uncertainties in measurements, the inherent variability and often randomness of the systems under investigation, and the likelihood of human error, the only way to establish the truth of a reported result is to repeat the experiment and see whether the results match. In the ironically named “soft” sciences (many of the -ologies) in particular, the systems under investigation are highly variable and almost always involve elements of randomness, so even the most careful experiments can only produce tentative results (though you’d believe otherwise from reading the news). Many repetitions are therefore needed before a result can be accepted as reliable.

Computing has helped to eliminate some human errors from research, and has made increasingly large and complex experiments possible (the Human Genome Project and the Large Hadron Collider, for example, would not have been possible without advances in computer science). Computers have become essential for instrument control, data collection and storage, and automated data analysis. Additionally, computers allow very detailed and complex systems to be simulated, helping to generate or refine hypotheses that can then be tested experimentally. This is what I try to do: simulate the electrical activity in brain tissue in order to investigate hypotheses about the causes of diseases like epilepsy.

Unlike experiments in the soft sciences, computer simulations are, in theory, easily reproducible*: a computer runs calculations reliably, so the same code run many times should give the same results. Unfortunately, the reality is far removed from this ideal. Complex systems require complex software to simulate them, and the more complex a piece of software, the more likely it is to contain errors. Different scientists using different operating systems with different software versions installed may not be able to run each other’s code reliably. Simulations often contain so many parameters that it is impossible to remember them all, especially when some are changed along the way in order to alter the simulation’s behaviour. Even something as seemingly simple as a change in the numerical method used to solve the equations can have drastic consequences for the simulation results.
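
To make that last point concrete, here is a minimal sketch rather than code from any real simulator. It integrates a single leaky membrane equation, dV/dt = -(V - E_rest) / tau, using two textbook methods and the same deliberately coarse time step. The function name `simulate` and all parameter values are assumptions chosen purely for illustration.

```python
def simulate(method, v0=-50.0, e_rest=-65.0, tau=1.0, dt=2.5, steps=20):
    """Integrate dV/dt = -(V - e_rest) / tau with a fixed time step.

    All values are illustrative (mV and ms); dt is deliberately larger
    than tau so that the choice of method matters.
    """
    v = v0
    for _ in range(steps):
        if method == "forward_euler":
            # Explicit update: the derivative is evaluated at the current value.
            v = v + dt * (-(v - e_rest) / tau)
        elif method == "backward_euler":
            # Implicit update: solve v_new = v + dt * (-(v_new - e_rest) / tau).
            v = (v + dt * e_rest / tau) / (1.0 + dt / tau)
        else:
            raise ValueError(f"unknown method: {method}")
    return v


forward = simulate("forward_euler")
backward = simulate("backward_euler")
print(f"forward Euler:  {forward:.3f} mV")
print(f"backward Euler: {backward:.3f} mV")
```

With these values the backward Euler run settles at the resting potential, while the forward Euler run oscillates with growing amplitude and ends up thousands of millivolts away. The equation and parameters are identical in both runs, which is why the integration method and time step need to be reported along with everything else.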

Conventional software engineering techniques exist to help prevent these kinds of problems and ensure software reliability across multiple computers, operating systems and so on, but many scientists have never learned anything about software engineering. ALL IS NOT LOST! Andrew Davison at the Centre National de la Recherche Scientifique in Paris has written a tutorial on best practices for writing code with reproducibility in mind. The examples are oriented towards computational neuroscience, but the observations and advice should apply to any area of scientific computing. He makes the important point that the lack of reproducibility of many results can seriously damage the field’s credibility (cf. “Climategate” in climate science) as well as hindering scientific progress.
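
As a small illustration of the kind of habit this involves (a sketch under stated assumptions, not taken from the tutorial), the snippet below stores every parameter, the random seed and some basic environment details alongside the results. The names `run_simulation` and `provenance` and the toy "simulation" are hypothetical placeholders for real model code.

```python
import json
import platform
import random


def run_simulation(params):
    """Toy stand-in for a real simulation: draws seeded Gaussian samples."""
    rng = random.Random(params["seed"])  # private generator, fixed seed
    return [rng.gauss(params["mean"], params["sd"])
            for _ in range(params["n_samples"])]


def provenance(params):
    """Collect what someone would need to know in order to repeat the run."""
    return {
        "parameters": params,
        "python_version": platform.python_version(),
        "platform": platform.platform(),
    }


# Hypothetical parameter set; in practice this might be loaded from a file.
params = {"seed": 42, "mean": -65.0, "sd": 2.0, "n_samples": 5}

results = run_simulation(params)
record = provenance(params)
record["results"] = results

# Serialise parameters, environment and results together so they cannot drift apart.
text = json.dumps(record, indent=2, sort_keys=True)
print(text)
```

Running `run_simulation(params)` again with the same dictionary returns the same list of numbers, and the saved record shows which parameter values and which Python version produced them.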

I am certainly guilty of ignoring many of the good practices Andrew wisely advocates, so it’s great to have a concise tutorial specific to scientific computing to refer to in future. I am currently in the process of restructuring a lot of my code, too, so it couldn’t have come at a better time from my perspective…