Parallel I/O - for Reading and Writing Large Files in Parallel

Date and Time: 
2015 April 16 - PM
FL2-1022 Large Auditorium
Ritu Arora and Si Liu

Understanding efficient parallel I/O and adapting your application accordingly can yield orders-of-magnitude performance gains without overloading the parallel file system. This half-day tutorial will provide an overview of practices and strategies for using parallel file systems efficiently through parallel I/O to achieve high performance. The target audience is analysts and application developers who have no prior experience with MPI I/O, HDF5, or T3PIO but who are familiar with C/C++/Fortran programming and basic MPI. A brief overview of the related basic concepts will be included in the tutorial where needed.

All the concepts covered in the tutorial will be explained with examples, and there will be a laboratory/hands-on session. In the hands-on session, the audience will be given a few exercises to complete in one hour. They will be provided with skeleton programs written in C/Fortran and instructions for modifying them so that the modified programs perform parallel I/O. The programs provided for the hands-on session will include comments/place-holders to guide the audience in modifying the code. The hands-on session will help the audience test the knowledge gained during the tutorial. By the end of the tutorial, the audience will have learned to do parallel I/O (through MPI I/O and the high-level libraries discussed in this tutorial) and will be motivated to apply that knowledge to get much higher I/O performance from their applications than before.

Because this tutorial will include a hands-on session, the audience will be given access to Stampede, a 10 PFLOPS Dell Linux cluster at TACC, to carry out the exercises. The audience will need personal laptops to log in remotely to Stampede via SSH, so an SSH client or terminal access should be available on the laptops used during the tutorial.

Speaker Description: 

Ritu Arora received her Ph.D. in Computer and Information Science from the University of Alabama at Birmingham. She works as an HPC researcher and consultant at the Texas Advanced Computing Center (TACC). She also teaches in the Department of Statistics and Data Sciences at the University of Texas at Austin. She has made significant contributions in the areas of developing abstractions for parallelizing legacy applications and application-level checkpointing. Currently, Ritu provides consultancy on automating Big Data workflows on national supercomputing resources. Her areas of interest and expertise are HPC, fault tolerance, domain-specific languages, workflow automation, and big data management.

Si Liu received his Ph.D. in applied mathematics from the University of Colorado at Boulder in 2009. He joined the High Performance Computing Group at the Texas Advanced Computing Center as a Research Associate in 2013. He has been collaborating with UT research groups, the XSEDE community, and many corporations on various projects, including HPC-aware tools development, Weather Research and Forecasting Model simulation and visualization, and CERN's "A Large Ion Collider Experiment" project. His current research interests include parallel computing, I/O performance, test management, benchmarking, and optimization. Previously, he worked as a software engineer in the Computational Information Systems Laboratory at the National Center for Atmospheric Research. He made important contributions to establishing the Yellowstone Supercomputing system at the NCAR-Wyoming Supercomputing Center. He received UCAR's special recognition award in 2011 for his contribution to the Intergovernmental Panel on Climate Change.

PIO-SEA2015.pdf (2.41 MB)
