The Influence of Model Architecture on Multi-GPU Training of Adversarial Networks

Date and Time:

Tuesday, April 9, 2019

Location:

CG Auditorium

Speaker:

Sarvesh Garimella

Graphics processing units (GPUs) provide significant computational power for artificial intelligence (AI) applications that use convolutional neural networks (CNNs). The TensorFlow Python API provides flexibility for testing various architectures across a variety of platforms, including GPUs. Despite the advantages of training CNNs on GPUs, large multidimensional environmental datasets are often still prohibitively costly to train models on. This study explores how model architecture influences the training cost of generative adversarial networks (GANs). It examines three design choices: how instructions are distributed across multiple GPUs, how multiple examples are generated from a single training example, and whether updates to model loss terms are synchronous or asynchronous.

Speaker Description:

BS in Planetary Science, Caltech, 2011; BS in Environmental Science and Engineering, Caltech, 2011; MS in Atmospheric Science, MIT, 2014; PhD in Climate Physics and Chemistry, MIT, 2016; Chief Scientist at ACME AtronOmatic, 2016-present

Event Category:

conference-talk