Reproducible Petascale Deep Learning Workflows
Reproducibility is a feature of all good science

Date and Time: 
Tuesday, April 3, 2018
CG Auditorium
Benjamin Liebersohn

At Oak Ridge National Laboratory (ORNL), leadership computing has adopted containers in high performance computing (HPC) with the intent of developing reproducible HPC applications and workflows. The Geographic Information Sciences and Technology (GIST) group at ORNL has been using containers in its HPC deep learning workflows, allowing its researchers to quickly create and adapt computing environments as their needs evolve. By explaining our approach to using containers, including our successes and challenges, we hope to give insight into the benefits and limits of containers in an HPC environment. We seek to inform researchers who wish to run rapidly changing software workflows on HPC systems. Containerized projects can quickly adapt to different hardware, so future projects are not limited by legacy workflows. Conversely, stable workflows can be brought to cutting-edge systems and benefit from the latest hardware. Many of the HPC systems at ORNL now have Singularity available, and the GIST group studies ways to use Docker Hub, Singularity Hub, and Charliecloud for container management. Bridging the traditional identity-based security model (such as using virtual machines) with contemporary container-based workflows often means addressing container usage in a variety of ways. This talk will focus on security-oriented methods, as well as the maintenance and standardization of workflows. Our goal is to standardize container workflows so that they are interoperable, promoting collaboration with minimal compatibility overhead for computational researchers.

Speaker Description: 

Benjamin Liebersohn is a researcher at Oak Ridge National Laboratory.
