Runtime verification of scientific codes using statistics

MN Dinh and D Abramson and C Jin, INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE 2016 (ICCS 2016), 80, 1473-1484 (2016).

DOI: 10.1016/j.procs.2016.05.468

Runtime verification of large-scale scientific codes is difficult because they often involve thousands of processes and generate very large data structures. Further, the programs often embody complex algorithms, making them difficult for non-experts to follow. Notably, typical scientific codes implement mathematical models that often possess predictable statistical features. Therefore, incorporating statistical analysis techniques in the verification process allows the program's state to be used to reveal unusual details of the computation at runtime. In our earlier work, we proposed a statistical framework for debugging large-scale applications. In this paper, we argue that such a framework can be useful in the runtime verification process of scientific codes. We demonstrate how two production simulation programs are verified using statistics. The system is evaluated on a 20,000-core Cray XE6.
