Apache Spark for scientific data at scale

Date and Time:

Monday, April 4th, 2016

Location:

Center Green

Speaker:

Neal McBurnett

Apache Spark is a modern open source cluster computing platform. It is helping data scientists analyze and explore large datasets more effectively than ever before, in terms of both software development productivity and efficient use of hardware, scaling from on-premises clusters to on-demand cloud computing.

Come see examples of Spark at work on scientific datasets, and learn how the largest open source project in data processing can help unify a variety of tasks, including machine learning, streaming data, and SQL queries, using Python, Scala, Java, or R.

Slides: http://bcn.boulder.co.us/~neal/talks/spark-science-scale/

Speaker Description:

Neal McBurnett is a consultant in Boulder, Colorado. Since his career as a Distinguished Member of Technical Staff at Bell Labs, where he worked on tools for software development, security, and open source web collaboration, he has taught Artificial Intelligence at CU and worked as a technical content developer at Databricks for courses on Apache Spark, including two massive online courses on Spark in 2015.

Event Category:

conference-talk