conference-talk

Polyglot, Event Driven Computational Science Using the Actor Model

Date and Time: 
Tuesday, April 5th, 2016
Location: 
Center Green
Speaker: 
Joe Stubbs

Data-intensive computational techniques have become indispensable in virtually every domain of science. The sheer quantity of data being generated by various instruments and devices presents a significant challenge for even the most advanced computing centers. Traditional off-line “batch” approaches to data analysis often cannot keep pace with real-time streaming data. At the same time, an explosion of new software tools has given computational scientists an unprecedented number of quality choices for analyzing their data.

Speaker Description: 

After completing a PhD in Mathematics from the University of Michigan, Joe moved to the University of Texas where he has been building distributed systems, web services and analytic tools for a variety of scientific applications. He is currently a research scientist at the Texas Advanced Computing Center where he works on the Agave “science as a service” project, a hosted platform for hybrid cloud, HPC and high-throughput scientific computing. Joe is co-creator of several open-source Python projects in use at TACC including abaco, a system that implements the actor model of concurrent programming using containers and HTTP.

Expanding users’ analysis capabilities with the CMIP Analysis Platform at NCAR

Date and Time: 
Tuesday, April 5th, 2016
Location: 
Center Green
Speaker: 
Dave Hart

Most academic researchers do not have the resources to download, store, and analyze large portions (often tens or hundreds of terabytes) of the 2 PB of data published worldwide from the Coupled Model Intercomparison Project Phase 5 (CMIP5). This limitation will be exacerbated in Phase 6, with data volumes expected to be 10 or 20 times larger. For CMIP6, NCAR alone is projecting the creation of 5 PB of data or more.

Speaker Description: 

David Hart is manager of CISL's User Services Section, where he handles allocations for CISL's high-performance computing systems and oversees the CISL Help Desk and Consulting Services Group. Prior to arriving at NCAR in 2010, David worked for 15 years at the San Diego Supercomputer Center (SDSC) at the University of California, San Diego, in a variety of roles and leadership positions, including allocations, user support, and communications. During his time at SDSC, he also held a number of leadership positions in the TeraGrid program and continues to be involved with the XSEDE program. His professional and research interests include metrics for measuring the performance and impact of cyberinfrastructure systems and activities.

Jupyter Ascending: a practical hand guide to galactic scale, reproducible data science

Date and Time: 
Tuesday, April 5th, 2016
Location: 
Center Green
Speaker: 
John Fonner

Scientific reproducibility must be as much about accessibility and clearly communicating ideas as it is about making calculations consistent. As computation plays an ever-increasing role in research, packaging computations in a way that supports simple, transparent recreation of the results is critical for many scientific domains. A number of software tools and container technologies such as Jupyter notebooks and Docker containers provide key elements toward this end, but they also have limitations both in capability and ease of use.

Speaker Description: 

John Fonner is a research associate in Life Sciences Computing at the Texas Advanced Computing Center (TACC). He earned a Ph.D. in Biomedical Engineering at the University of Texas at Austin, where he used a blend of experimental and computational techniques to study binding interactions between peptides and conducting polymers for implant applications in the nervous system. Since joining TACC in 2011, John has served on a number of projects that help life sciences researchers leverage advanced computing resources, both through training and through the development of better tools and cyberinfrastructure.

Data Thinking before Data Crunching

Date and Time: 
Tuesday, April 5th, 2016
Location: 
Center Green
Speaker: 
Grace Peng

Data scientist is the hot job title du jour. Many books and sites teach the mechanics of how to make data visualizations, but skip or gloss over the foundations of data science. This talk will help you think through fundamental considerations before you plunge into your data analysis.
Can this problem be answered with data? What type of data can help me answer this question? Where can I find the best data for the job? Does the data match the data documentation? How do I cite the data for reproducibility?

Speaker Description: 

Grace Peng works in the Data Support Section.

One approach to ensuring that data analysis projects and research reports are reproducible

Date and Time: 
Tuesday, April 5th, 2016
Location: 
Center Green
Speaker: 
Janine Aquino

Authors: Mike Daniels, William Cooper, Janine Aquino, Teresa Campos, William Brown (All from NCAR/EOL)

Speaker Description: 

Janine manages research data from the two NCAR/EOL research aircraft: HIAPER, a modified Gulfstream V jet, and a four-engine turboprop C-130. Data are made available online as part of comprehensive project websites that support cutting-edge atmospheric research.

Brown Dog: An Elastic Data Cyberinfrastructure for Autocuration and Digital Preservation

Date and Time: 
Tuesday, April 5th, 2016
Location: 
Center Green
Speaker: 
Jay Alameda

Authors: Smruti Padhy, Jay Alameda, Rui Liu, Edgar Black, Liana Diesendruck, Mike Dietze, Greg Jansen, Praveen Kumar, Rob Kooper, Jong Lee, Richard Marciano, Luigi Marini, Dave Mattson, Barbara Minsker, Chris Navarro, Marcus Slavenas, William Sullivan, Jason Votava, Inna Zharnitsky, Kenton McHenry
National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign

Speaker Description: 

Jay Alameda is the lead for Advanced Application Support at the National Center for Supercomputing Applications. In this role, he works with the Extreme Science and Engineering Discovery Environment (XSEDE), a collaboration of NSF-funded high-performance computing (HPC) resource providers working to provide a common set of services, including the provisioning of advanced user support, to the science and engineering community. In particular, Jay leads the Extended Support for Training, Education, and Outreach Service of XSEDE, which provides the technical expertise to support Training, Education, and Outreach activities organized by XSEDE. He was also the lead of the recently completed NSF-funded SI2 project, “A Productive and Accessible Development Workbench for HPC Applications Using the Eclipse Parallel Tools Platform”, which improved the Eclipse Parallel Tools Platform (PTP) to serve as a platform for development of HPC applications.

Function Follows Form: A Practical Guide to Research Data Curation

Date and Time: 
Tuesday, April 5th, 2016
Location: 
Center Green
Speaker: 
Julia Collins

Whether your data science adventures rely on data you produced or leverage data sources managed by others, the ability to explore, analyze, and visualize data depends on access to the data themselves. The ability of other data scientists to verify and build upon your findings also depends on their access to the same data sources. This talk will review the role of data curation in the data science workflow. We will review data management best practices applicable to all levels of data generation and use, from small exploratory studies to large satellite data sets.

Speaker Description: 

Julia Collins is a software developer at the National Snow and Ice Data Center in Boulder, Colorado. She currently provides software engineering support for tools used to manage and process large Earth science data sets, as well as supporting the development of user interfaces and data storage strategies for community-based monitoring activities and qualitative data sources.

ARTView: A Community Weather Radar Data GUI

Date and Time: 
Monday, April 4th, 2016
Location: 
Center Green
Speaker: 
Nick Guy

With file formats, naming conventions, and aging platforms, working with weather radar data has been a notoriously difficult endeavor until recently. Tools for plotting and analyzing radar data have existed generally at an institutional or proprietary level. Community tools, such as Solo II/3, have been indispensable. However, they do not necessarily take advantage of modern computing technologies and address the shift of academic and government institutions to open source platforms for cost savings, performance, and development flexibility.

Speaker Description: 
Nick Guy is Associate Research Scientist and Project Manager of the King Air Research Facility in the Department of Atmospheric Science at the University of Wyoming.

Whales On A Plane: Deploying Software To NSF / NCAR Research Aircraft w/ Docker

Date and Time: 
Monday, April 4th, 2016
Location: 
Center Green
Speaker: 
Erik Johnson

Docker is a maturing open-source, Linux-based containerization technology that provides a convenient means to package, distribute and execute software in a fast and isolated environment.

NCAR's Earth Observing Laboratory (EOL) is using Docker to deploy applications and associated services to NSF / NCAR research aircraft. In this talk, I will discuss the benefits Docker provides, along with the tools (such as Docker Compose and Docker Hub) and techniques used to facilitate Docker-based deployment of NCAR EOL applications and services, which include:

Speaker Description: 

Erik Johnson is a software engineer at NCAR's Earth Observing Laboratory, responsible for full-stack web development and devops for the Field Catalog and related Catalog tools using Free and Open-Source technologies. Erik has previously worked at start-ups and contracted to NOAA and NASA.

Extending the geographic extent of existing land cover data using active machine learning and covariate shift corrective sampling

Date and Time: 
Monday, April 4th, 2016
Location: 
Center Green
Speaker: 
Galen Maclaurin

Consistent land cover data provided at national and regional scales are increasingly relevant for a wide range of research topics from landscape ecology to population dynamics. As one example, the National Land Cover Database (NLCD) provides a valuable resource for research conducted at broad geographic scales across the U.S. where survey- or field-based land cover data are not available.

Speaker Description: 

I am a geospatial data scientist at the National Renewable Energy Laboratory (NREL) in Golden, CO, where I work on diverse problems in renewable energy involving spatiotemporal data. My recently completed PhD research in the Department of Geography at the University of Colorado-Boulder focused on image-based machine learning for spatial and temporal replication of land cover data.
