Just in Time: Adding Value to the IO Pipelines of High Performance Applications with JITStaging

H Abbasi and G Eisenhauer and M Wolf and K Schwan and S Klasky, HPDC 11: Proceedings of the 20th International Symposium on High Performance Distributed Computing, 27-36 (2011).

Large-scale applications are generating a tsunami of data, with understanding driven by finding information hidden within this data. The ever-increasing sizes of output, however, are making it difficult for science users to inspect the data generated by their applications, understand its important properties, and/or organize it for subsequent analysis and visualization. This paper presents JITStager, a software infrastructure with which end users can dynamically customize, and thus add value to, the output pipelines of their HEC applications. JITStager is able to customize data at scale by leveraging the computational power of both compute nodes and additional 'data staging' nodes allocated by end users. Using existing, componentized I/O interfaces to decouple the compile-time specification of the program and the run-time customization of the data pipeline, JITStager employs efficient runtime methods for binary code generation and data movement to create custom pipelines for applications' output processes that provide end users with improved insights into the data being produced, without burdening the application's computational performance and without impeding output performance. This paper describes the JITStager architecture, evaluates its performance, and demonstrates the advantages derived from its use with representative HPC applications.
