conference-talk

A Python QGIS plugin for tweeter analysis during emergencies

Date and Time: 
2015 April 14 @ 9:00am
Location: 
FL2-1022 Large Auditorium
Speaker: 
Guido Cervone and Mark Coletti

During emergencies in urban areas it is paramount to assess damage to properties, people and the environment. Remote sensing has become the de facto standard in observing the Earth and its environment. Remote sensing generally refers to the use of space- or air-borne sensor technologies to detect and classify objects on the Earth (on its surface, in its atmosphere, and in its oceans) by means of emitted or reflected electromagnetic signals.

Speaker Description: 

Guido Cervone is Director of the GeoInformatics & Earth Observation Laboratory and Associate Professor in the Department of Geography, Institute for CyberScience, and GeoVISTA Center at the Pennsylvania State University. He is also affiliated faculty in the Research Application Laboratory (RAL) at the National Center for Atmospheric Research (NCAR).

His fields of expertise are geoinformatics, machine learning and remote sensing. His research focuses on the development and application of computational algorithms for the analysis of spatio-temporal remote sensing, numerical modeling and social media “Big Data” related to man-made, technological and environmental hazards. He operates a satellite receiving station for NOAA POES satellites. His research is funded by ONR, DOT, NASA, the Italian Ministry of Research and Education, Draper Labs, and Stormcenter Communication.

Guido Cervone is a member of the advisory committee of the United Nations Environment Programme, division of Disasters and Early Warning Assessment. In 2013 he received the “Medaglia di Rappresentanza” from the President of the Italian Republic for his work related to the Fukushima crisis. He received the 2013 ISNAAF award. He co-chaired the 2010 SIGSPATIAL Data Mining for Geoinformatics (DMG-10) workshop. He served as the program co-chair for the 2008 and 2009 IEEE International Conference on Data Mining (ICDM) Spatial and Spatio-Temporal Data Mining (SSTDM) workshop.

He has authored two edited books and over forty fully refereed articles related to data mining, remote sensing and environmental hazards. In 2010, he was awarded a US patent for an anomaly detection algorithm. His research on natural hazards has been featured on TV news, in newspapers, in general interest magazines such as National Geographic, and in international magazines.

As Assistant Director of the Pennsylvania State University’s Geoinformatics and Remote Sensing Laboratory, Dr. Mark Coletti is actively performing research in the areas of geoinformatics, machine learning, and evolutionary computation. His principal focus is big data analytics related to natural hazards, particularly volunteered geographic information, as well as discerning interesting patterns of Medicare use. His research has been funded by the ONR and NSF.

Dr. Coletti is the current Chair of the Penn State Postdoctoral Society, and as such is responsible for organizing career enhancement, personal improvement, and social activities for over 460 postdoctoral scholars. He previously worked at George Mason University, where he helped develop an evolutionary computation C++ toolkit; a biologically inspired cognitive model for a DARPA Grand Challenge; a Joint Improvised Explosive Device Defeat Organization-related multiagent simulation; an Office of Naval Research Multidisciplinary University Research Initiative Office-sponsored massive multiagent simulation of pastoral and farming behavior in eastern Africa; and a geospatial extension, GeoMason, for the multiagent simulation toolkit MASON.

Earlier in his career he also worked as a senior software engineer in the Washington, DC, area on projects for the National Oceanic and Atmospheric Administration, the Federal Highway Administration, the U.S. Army's Materiel Command, the U.S. Army Topographic Engineering Center, and the United States Geological Survey. These projects included an expert system to correct human-sourced sea surface meteorological data, an expert system for validating materiel purchases, a topographic visualization system, a road surface wear calculator, and a toolkit for spatial data format conversion.

He has published over a dozen papers related to evolutionary computation, machine learning, large-scale multiagent simulations, biologically inspired cognitive architectures, and geographic information systems. He has also written a book on GeoMason that is open source and freely available to the public.

Out-of-core Computations with Blaze

Date and Time: 
2015 April 14 @ 3:30pm
Location: 
FL2-1022 Large Auditorium
Speaker: 
Matthew Rocklin

NumPy and Pandas provide usable high-level abstractions over low-level efficient algorithms. Unfortunately, both NumPy and Pandas are largely limited to single-core, in-memory computing. When inconveniently large data forces users beyond this context, we re-enter the frontier of novel solutions.
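
As a rough illustration of what "out-of-core" means here, the sketch below computes a column mean while holding only one fixed-size chunk of rows in memory at a time. It is not Blaze's API; it uses only the Python standard library, and the names `iter_chunks`, `chunked_mean`, and the in-memory sample data are hypothetical stand-ins chosen for this example.

```python
import csv
import io
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield lists of at most chunk_size items from any iterable."""
    rows = iter(rows)
    while True:
        chunk = list(islice(rows, chunk_size))
        if not chunk:
            return
        yield chunk


def chunked_mean(lines, column, chunk_size=1000):
    """Mean of one CSV column, keeping only one chunk of rows in memory."""
    reader = csv.DictReader(lines)
    total = 0.0
    count = 0
    for chunk in iter_chunks(reader, chunk_size):
        total += sum(float(row[column]) for row in chunk)
        count += len(chunk)
    return total / count if count else float("nan")


# Hypothetical stand-in for a file that is too large to load at once.
sample = io.StringIO("value\n" + "\n".join(str(i) for i in range(10)))
result = chunked_mean(sample, "value", chunk_size=3)
print(result)
```

The same pattern works with an open file handle in place of the `io.StringIO` object, since `csv.DictReader` reads its input one line at a time.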

Speaker Description: 

Matthew Rocklin is a computational scientist at Continuum Analytics. He writes open source tools to help scientists interact with large volumes of data.

Enabling Multi-pipeline Data Transfer in HDFS for Big Data Applications

Date and Time: 
2015 April 14 @ 4:00pm
Location: 
FL2-1022 Large Auditorium
Speaker: 
Liqiang Wang

Authors: Liqiang Wang, Hong Zhang (University of Wyoming), Hai Huang (IBM TJ Watson Research Center)

Speaker Description: 

Dr. Liqiang Wang is currently an associate professor in the Department of Computer Science at the University of Wyoming. He is currently on sabbatical leave, working as a visiting research scientist at the IBM T.J. Watson Research Center. His research focuses on an interdisciplinary area between big-data computing and software analytics. His work applies program analysis techniques to improve the correctness and resilience of data-intensive computing as well as to optimize its performance and scalability, especially on Cloud, GPU, and multicore platforms. He received an NSF CAREER Award in 2011.

Parallel I/O - for Reading and Writing Large Files in Parallel

Date and Time: 
2015 April 16 - PM
Location: 
FL2-1022 Large Auditorium
Speaker: 
Ritu Arora and Si Liu

Developing an understanding of efficient parallel I/O and adapting your application accordingly can result in orders-of-magnitude performance gains without overloading the parallel file system. This half-day tutorial will provide an overview of the practices and strategies for the efficient utilization of parallel file systems through parallel I/O for achieving high performance. The target audience is analysts and application developers who do not have prior experience with MPI I/O, HDF5, and T3PIO. However, they should be familiar with C/C++/Fortran programming and basic MPI.

Speaker Description: 

Ritu Arora received her Ph.D. in Computer and Information Science from the University of Alabama at Birmingham. She works as an HPC researcher and consultant at the Texas Advanced Computing Center (TACC). She also teaches in the Department of Statistics and Data Sciences at the University of Texas at Austin. She has made significant contributions in the areas of developing abstractions for parallelizing legacy applications and application-level checkpointing. Currently, Ritu is providing consultancy on automating Big Data workflows on national supercomputing resources. Her areas of interest and expertise are HPC, fault-tolerance, domain-specific languages, workflow automation, and big data management.

Si Liu received his PhD in applied mathematics from the University of Colorado at Boulder in 2009. He joined the High Performance Computing Group at the Texas Advanced Computing Center as a Research Associate in 2013. He has been collaborating with UT research groups, the XSEDE community, and many corporations on various projects, including HPC-aware tools development, Weather Research and Forecast Model simulation and visualization, and CERN's "A Large Ion Collider Experiment" project. His current research interests include parallel computing, I/O performance, test management, benchmarking, and optimization. Previously, he worked as a software engineer in the Computational Information Systems Laboratory at the National Center for Atmospheric Research. He made important contributions to establishing the Yellowstone Supercomputing system at the NCAR-Wyoming Supercomputing Center. He received UCAR's special recognition award in 2011 for his contribution to the Intergovernmental Panel on Climate Change.

Profiling Python code to improve memory usage and execution time

Date and Time: 
2015 April 14 @ 3:00pm
Location: 
FL2-1022 Large Auditorium
Speaker: 
Jonathan Helmus

Python is an excellent language for rapid prototyping of algorithms and programs in order to determine the feasibility of a task, but it often fails to meet the run time and memory usage requirements needed for deployment to production. This presentation will discuss tools within the Python ecosystem for profiling Python code to identify memory and run time hot spots. Techniques will be presented which can be used to improve the performance of Python code by utilizing the information provided by these tools.
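
The abstract does not name the specific tools covered in the talk. As one possible illustration of the kind of profiling it describes, the sketch below uses two standard-library modules, `cProfile` for run time and `tracemalloc` for memory. The functions `slow_sum_of_squares` and `profile_call` are hypothetical names invented for this example.

```python
import cProfile
import io
import pstats
import tracemalloc


def slow_sum_of_squares(n):
    """Deliberately naive: builds a full list before summing it."""
    squares = [i * i for i in range(n)]
    return sum(squares)


def profile_call(func, *args):
    """Run func once; return its result, a timing report, and peak traced bytes."""
    tracemalloc.start()
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    # Format the five most expensive entries by cumulative time.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue(), peak_bytes


result, report, peak_bytes = profile_call(slow_sum_of_squares, 100_000)
print(report)
print(f"peak traced memory: {peak_bytes} bytes")
```

In this example the report points at the list comprehension as the hot spot, and replacing it with a generator expression passed directly to `sum` would be expected to lower the peak memory figure.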

Speaker Description: 

Jonathan Helmus is a scientist and advanced algorithms engineer at Argonne National Laboratory, where he develops software for the Atmospheric Radiation Measurement (ARM) climate research facility. He is the lead developer of Py-ART, an open source toolkit for analysis of weather radar data in Python, and has contributed to a number of other scientific Python modules. Jonathan completed a postdoc at the University of Connecticut Health Center after receiving his Ph.D. in Chemical Physics from The Ohio State University.

KGEN: Fortran Kernel Generator

Date and Time: 
2015 April 14 @ 2:00pm
Location: 
FL2-1022 Large Auditorium
Speaker: 
Youngsung Kim

There are cases in which we want to extract part of a Fortran code from a software application as a stand-alone executable. For example, when a programmer debugs a large software application such as NCAR's CESM, he or she needs to run the whole CESM program up to the source line being debugged. If we could extract only the part of the code that is relevant to the debugging, it would eliminate the time spent running CESM and waiting in the queuing system.

Speaker Description: 

After an undergraduate degree in Electronic Engineering at Dankook University in South Korea, Youngsung Kim worked in the mobile telecommunication industry for 13 years, mostly as a software developer. In 2010, he returned to school at the University of Utah and majored in Scientific Computing. During his studies, he participated in a WRF climate simulation project and a brain image matching project, along with taking core courses including numerical methods and parallel computing. After graduating with a master's degree from the University of Utah, he joined NCAR and has been working on accelerator technologies since.

Software Deployment in the Field (Technical Debt and Data-Ops)

Date and Time: 
2015 April 14 @ 11:30am
Location: 
FL2-1022 Large Auditorium
Speaker: 
Gary Granger

In the Earth Observing Laboratory at NCAR, we develop and deploy many different instruments, for platforms ranging from miniature sondes to radars and aircraft, to areas all over the world to observe all kinds of phenomena. Every field deployment is in fact a custom and very complicated configuration of systems, software, instruments, and data streams. Besides the stories about writing software in some interesting situations, field projects present some (exciting) challenges for software engineering.

Speaker Description: 

Gary Granger received a Bachelor of Science degree in Computer Engineering from Virginia Tech, then began working for the Atmospheric Technology Division at NCAR in 1992. Over the years he has worked in several software development areas related to field deployment and instrument development, including field operations, visualization, and wind profiling radars. Currently he works in the Software Systems Group of the Earth Observing Laboratory, developing software in C++ and Python for the Integrated Sounding System, and developing LabVIEW software for spectrometers. He also advocates good software engineering practices in EOL and supports related infrastructure, such as Subversion, build frameworks, and continuous integration servers.

Docker for Scientific Applications

Date and Time: 
2015 April 14 @ 10:00am
Location: 
FL2-1022 Large Auditorium
Speaker: 
Joe Stubbs

Container technology, and in particular Docker, has revolutionized distributed systems in a very short time. At the Texas Advanced Computing Center, we see the potential for Docker to have an enormous impact on scientific computing as well. Containers enable developers to distribute their applications with all necessary dependencies included in a single file. As a result, the software is immensely more portable and provides far greater reproducibility of results. Additionally, the barrier to entry is greatly reduced, as installation becomes as simple as downloading a file.

Speaker Description: 

Joe Stubbs earned a PhD in Mathematics from the University of Michigan. Since then he has been at the University of Texas where he has focused on building infrastructure software in various contexts. He is currently a research scientist at TACC where he primarily works on the Agave "science as a service" platform, enabling the next generation of science gateways to harness petascale HPC over the web.

Composing and deploying a cluster of Docker containers

Date and Time: 
2015 April 14 @ 9:30am
Location: 
FL2-1022 Large Auditorium
Speaker: 
Walter Moreira

Containers, and in particular Docker, are quickly transforming how we think about software architecture. Despite Docker's popularity, two big problems have not yet been fully solved: composability and multi-node deployment of containers. Many products from big companies are trying to address them, but there is no clear leader yet.

Speaker Description: 

Walter Moreira received his PhD in Mathematics from Texas A&M University. He previously worked on the HET Dark Energy Experiment at the McDonald Observatory, building the control system for a large telescope. He is currently working as a research engineer at TACC, concentrating on distributed systems. His main focus is building a federated data architecture for the Arabidopsis Information Portal.

Utilizing Scientific Python Tools for the Application of Data Science Techniques to High Impact Weather Prediction

Date and Time: 
2015 April 13 @ 3:00pm
Location: 
FL2-1022 Large Auditorium
Speaker: 
David Gagne

The developments and optimizations provided by Python’s scientific libraries have enabled the development of real-time high-resolution forecast post-processing systems primarily in Python. NumPy, SciPy, Matplotlib, and a set of newer scientific libraries have made this development possible. The Pandas library introduced efficient ways to load, analyze, manipulate, and merge large datasets. scikit-image provides a diverse array of image processing tools, which are useful for filtering and extracting information from gridded data.

Speaker Description: 

David John Gagne is a doctoral candidate in meteorology at the University of Oklahoma and a visiting graduate research assistant with the NCAR Research Applications Lab. His main research interests involve the application of machine learning techniques to numerical weather models and observations in order to improve the prediction of high impact weather. He has developed frameworks for improving the prediction of hail, solar energy, wind energy, heavy rain, aircraft turbulence, and tornadoes. He is an active Python developer and has contributed to packages for weather data visualization, forecast verification, and gridded forecast correction.
