The 800 lb Gorilla - Really Big Data

Date and Time: 
Thursday, February 23, 2012
Location: 
ML-132 Main Seminar
Speaker: 
Gary Strand

Abstract:

Teraflop-class supercomputers, like Bluefire, can generate petabytes (PB, 10^15 bytes) of climate model output; will petaflop-class machines, like Yellowstone, be used to generate exabytes (EB, 10^18 bytes) of data? The ability of these computers to generate truly massive amounts of data is a subject that has received some attention, but a number of outstanding issues have been only lightly addressed.

Funding agencies, such as NSF, now require data management policies in proposal submissions, but too often data management is a *post hoc* consideration - for example, deciding how to manage 1 PB of data after it has been generated and archived is too late. In this talk, I will address how the CESM project currently handles these volumes of data, and the implications for the future - both as the NWSC comes online, and beyond.

Speaker Description: 

Gary Strand is a software engineer in the Climate Change Prediction group of the Climate and Global Dynamics Division of NCAR. He began work at NCAR in 1986 as a student assistant, and has been involved in several generations of climate model development in CGD. He is the primary data manager and data scientist for the latest NCAR climate model, the Community Earth System Model (CESM). He has led the major data management activities and projects for the CESM since 2003, including CMIP3 and the current CMIP5. He is also one of the key personnel for the Earth System Grid (ESG) project, participating since its inception in 2001. Gary has also created a number of visualizations of CESM output that have been used in many scientific presentations as well as in major broadcast media.

Attachment: 
STRAND_SEA_Conference_02-2012.pdf (985.69 KB)