conference-talk

Data Thinking before Data Crunching

Date and Time: 
Tuesday, April 5th, 2016
Location: 
Center Green
Speaker: 
Grace Peng

Data scientist is the hot job title du jour. Many books and sites teach the mechanics of how to make data visualizations, but skip or gloss over the foundations of data science. This talk will help you think through fundamental considerations before you plunge into your data analysis.
Can this problem be answered with data? What type of data can help me answer this question? Where can I find the best data for the job? Does the data match the data documentation? How do I cite the data for reproducibility?

Speaker Description: 

Grace Peng works in the Data Support Section.

Event Category:

One approach to ensuring that data analysis projects and research reports are reproducible

Date and Time: 
Tuesday, April 5th, 2016
Location: 
Center Green
Speaker: 
Janine Aquino

Authors: Mike Daniels, William Cooper, Janine Aquino, Teresa Campos, William Brown (All from NCAR/EOL)

Speaker Description: 

Janine manages research data from the two NCAR/EOL research aircraft: HIAPER, a modified Gulfstream V jet, and a four-engine turboprop C-130. Data are made available online as part of comprehensive project websites that support cutting-edge atmospheric research.

Event Category:

Brown Dog: An Elastic Data Cyberinfrastructure for Autocuration and Digital Preservation

Date and Time: 
Tuesday, April 5th, 2016
Location: 
Center Green
Speaker: 
Jay Alameda

Smruti Padhy, Jay Alameda, Rui Liu, Edgar Black, Liana Diesendruck, Mike Dietze, Greg Jansen, Praveen Kumar, Rob Kooper, Jong Lee, Richard Marciano, Luigi Marini, Dave Mattson, Barbara Minsker, Chris Navarro, Marcus Slavenas, William Sullivan, Jason Votava, Inna Zharnitsky, Kenton McHenry
National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign

Abstract

Speaker Description: 

Jay Alameda is the lead for Advanced Application Support at the National Center for Supercomputing Applications. In this role, he works with the Extreme Science and Engineering Discovery Environment (XSEDE), a collaboration of NSF-funded high performance computing (HPC) resource providers working to provide a common set of services, including advanced user support, to the science and engineering community. In particular, Jay leads XSEDE's Extended Support for Training, Education, and Outreach Service, which provides the technical expertise to support Training, Education, and Outreach activities organized by XSEDE. He was also the lead of the recently completed NSF-funded SI2 project, “A Productive and Accessible Development Workbench for HPC Applications Using the Eclipse Parallel Tools Platform”, which improved the Eclipse Parallel Tools Platform (PTP) to serve as a platform for development of HPC applications.

Event Category:

Function Follows Form: A Practical Guide to Research Data Curation

Date and Time: 
Tuesday, April 5th, 2016
Location: 
Center Green
Speaker: 
Julia Collins

Whether your data science adventures rely on data you produced or leverage data sources managed by others, the ability to explore, analyze, and visualize data depends on access to the data themselves. The ability of other data scientists to verify and build upon your findings also depends on their access to the same data sources. This talk will review the role of data curation in the data science workflow. We will review data management best practices applicable to all levels of data generation and use, from small exploratory studies to large satellite data sets.

Speaker Description: 

Julia Collins is a software developer at the National Snow and Ice Data Center in Boulder, Colorado. She currently provides software engineering support for tools used to manage and process large Earth science data sets, as well as supporting the development of user interfaces and data storage strategies for community-based monitoring activities and qualitative data sources.

Event Category:

ARTView: A Community Weather Radar Data GUI

Date and Time: 
Monday, April 4th, 2016
Location: 
Center Green
Speaker: 
Nick Guy

With file formats, naming conventions, and aging platforms, working with weather radar data has been a notoriously difficult endeavor until recently. Tools for plotting and analyzing radar data have existed generally at an institutional or proprietary level. Community tools, such as Solo II/3, have been indispensable. However, they do not necessarily take advantage of modern computing technologies or address the shift of academic and government institutions to open source platforms for cost savings, performance, and development flexibility.

Speaker Description: 
Nick Guy is an Associate Research Scientist and Project Manager of the King Air Research Facility in the Department of Atmospheric Science at the University of Wyoming.

Event Category:

Whales On A Plane: Deploying Software To NSF / NCAR Research Aircraft w/ Docker

Date and Time: 
Monday, April 4th, 2016
Location: 
Center Green
Speaker: 
Erik Johnson

Docker is a maturing open-source, Linux-based containerization technology that provides a convenient means to package, distribute and execute software in a fast and isolated environment.

NCAR's Earth Observing Laboratory (EOL) is using Docker to deploy applications and associated services to NSF / NCAR research aircraft. In this talk, I will discuss the benefits Docker provides, along with the tools (such as Docker Compose and Docker Hub) and techniques used to facilitate Docker-based deployment of NCAR EOL applications and services, which include:

Speaker Description: 

Erik Johnson is a software engineer at NCAR's Earth Observing Laboratory, responsible for full-stack web development and DevOps for the Field Catalog and related Catalog tools using free and open-source technologies. Erik has previously worked at start-ups and contracted to NOAA and NASA.

Event Category:

Extending the geographic extent of existing land cover data using active machine learning and covariate shift corrective sampling

Date and Time: 
Monday, April 4th, 2016
Location: 
Center Green
Speaker: 
Galen Maclaurin

Consistent land cover data provided at national and regional scales are increasingly relevant for a wide range of research topics from landscape ecology to population dynamics. As one example, the National Land Cover Database (NLCD) provides a valuable resource for research conducted at broad geographic scales across the U.S. where survey- or field-based land cover data are not available.

Speaker Description: 

I am a geospatial data scientist at the National Renewable Energy Laboratory (NREL) in Golden, CO, where I work on diverse problems in renewable energy involving spatiotemporal data. My recently completed PhD research in the Department of Geography at the University of Colorado-Boulder focused on image-based machine learning for spatial and temporal replication of land cover data.

Event Category:

Building a Distributed Oceanography Match-up Service (DOMS) to pair field observation and satellite data

Date and Time: 
Monday, April 4th, 2016
Location: 
Center Green
Speaker: 
Zaihua Ji

Geoscience applications increasingly rely on the integration and collocation of data in the form of in-situ field observations with data in the form of satellite observations and global models. Both types of data reside in scattered repositories for both historical and economic (too large to replicate) reasons. Finding all possible data match-ups between distributed data repositories is a fundamental challenge for geoscience work such as satellite calibration and validation (Cal/Val).
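To make the idea of a match-up concrete, here is a minimal sketch in Python. It is not the DOMS implementation: the `Observation` record, the brute-force search, and the 25 km / 6 hour tolerances are all assumptions chosen for illustration. It pairs each in-situ observation with every satellite observation that falls within both a distance and a time window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Observation:
    """Hypothetical record type; real repositories carry far more metadata."""
    source: str      # illustrative label, e.g. "buoy" or "satellite"
    time: datetime
    lat: float       # degrees north
    lon: float       # degrees east


def great_circle_km(a, b):
    """Haversine distance between two observations, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (a.lat, a.lon, b.lat, b.lon))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def match_ups(in_situ, satellite, max_km=25.0, max_dt=timedelta(hours=6)):
    """Return (in_situ, satellite) pairs within both tolerances.

    Brute force over all pairs; the tolerances are assumed defaults.
    """
    return [
        (obs, sat)
        for obs in in_situ
        for sat in satellite
        if abs(obs.time - sat.time) <= max_dt and great_circle_km(obs, sat) <= max_km
    ]


# Made-up sample points, for illustration only.
buoy = Observation("buoy", datetime(2016, 4, 4, 12, 0), 30.0, -140.0)
near = Observation("satellite", datetime(2016, 4, 4, 14, 0), 30.1, -140.0)
far = Observation("satellite", datetime(2016, 4, 4, 14, 0), 35.0, -140.0)
late = Observation("satellite", datetime(2016, 4, 5, 14, 0), 30.0, -140.0)

pairs = match_ups([buoy], [near, far, late])
```

Only `near` pairs with the buoy here: `far` fails the distance test and `late` fails the time test. The all-pairs loop is what becomes impractical when the data sets are too large to replicate in one place, which is the situation the abstract describes.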

Speaker Description: 

Zaihua Ji is a Senior Software Engineer in the Data Support Section of CISL at NCAR.

Event Category:

Transmission Distribution Systems Hub

Date and Time: 
Monday, April 4th, 2016
Location: 
Center Green
Speaker: 
Monte Lunacek

The electric grid is a complex network that delivers power from many distributed sources to commercial and residential consumers. Researchers are able to understand the impact of changes to this system through modeling and simulation. This is important because several technologies that are growing in residential use, such as solar, electric vehicles, and smart home appliances, impact the load placed on the grid in different ways. New and different retail market structures also impact how electricity is consumed.

Speaker Description: 

Monte Lunacek has been a member of the Modeling & Simulation Group at NREL since 2014. Before that, he was an HPC Application Specialist in the Research Computing group at the University of Colorado. Monte received his PhD in Computer Science from Colorado State University.

Event Category:

Apache Spark for scientific data at scale

Date and Time: 
Monday, April 4th, 2016
Location: 
Center Green
Speaker: 
Neal McBurnett

Apache Spark is a modern open source cluster computing platform. It is helping data scientists analyze and explore large datasets more effectively than ever before, in terms of both software development productivity and efficient use of hardware, scaling from on-premises clusters to on-demand cloud computing.

Speaker Description: 

Neal McBurnett is a consultant in Boulder, Colorado. Since his career as a Distinguished Member of Technical Staff at Bell Labs, working on tools for software development, security, and open source web collaboration, he has taught Artificial Intelligence at CU and worked as a technical content developer at Databricks for courses on Apache Spark, including two massive online courses on Spark in 2015.

Event Category:
