The TAU Performance System is a powerful and highly versatile profiling and tracing tool ecosystem for performance analysis of parallel programs at all scales. Developed for almost two decades, TAU has evolved with each new generation of HPC systems and presently scales efficiently to hundreds of thousands of cores on the largest machines in the world. TAU has helped many projects scale up successfully on systems at the Oak Ridge Leadership Computing Facility (OLCF), the National Energy Research Scientific Computing Center (NERSC), the Argonne Leadership Computing Facility (ALCF), and others. In one case, TAU helped reduce the runtime of the IRMHD INCITE code from 528 hours to 70 hours.
This tutorial will focus on performance data collection, analysis, and performance optimization of Python applications. The tutorial will introduce profiling and debugging support in TAU and cover performance evaluation of parallel programs written in pure Python or in Python mixed with Fortran, C++, and/or C. The tutorial will also cover parallel performance analysis of applications using MPI, OpenMP, and other parallel runtime environments via packages like mpi4py. The common case of Python as a high-level "glue" language for high-performance components will be covered extensively. We will demonstrate different techniques for program instrumentation and highlight TAU's support for memory debugging and I/O evaluation. The hands-on portion of the tutorial will guide developers through the instrumentation, measurement, and analysis steps in TAU. Performance data will include MPI timings, runtime bounds checking, I/O and memory measurements, and hardware performance counters from PAPI. The tutorial will demonstrate how TAU's instrumentation and analysis tools may be used with external tools such as Score-P, Scalasca, OTF2, PAPI, and Vampir. For the hands-on session, participants will be able to use NCAR systems or an optional HPC Linux LiveDVD that allows them to boot their laptops into a Linux distribution with the above tools installed. Participants are encouraged to bring a laptop and to install the VirtualBox virtualization software and the related OVA files from http://www.hpclinux.com/.
Slides, support material, etc.: http://www.paratools.com/sea15
Dr. John Linford is a Scientist at ParaTools, Inc. He received his Ph.D. from Virginia Tech, where his dissertation on accelerating atmospheric modeling through emerging multi-core technologies was selected as the outstanding doctoral dissertation of 2010. John has developed a meta-programmer for chemical kinetic simulation, airborne signal processing applications, rotorcraft engineering tools, and toolkits for porting parallel HPC applications to cloud computing platforms. John helps develop the TAU Performance System and has contributed to the Scalasca project and the MoinMoin project.