Anthill Inside 2017

On theory and concepts in Machine Learning, Deep Learning and Artificial Intelligence. Formerly Deep Learning Conf.


Apache MXNet, a highly memory efficient deep learning framework


Girish Patil

@bookworm

GPU memory is the most expensive deep learning resource. MXNet was designed to support complex deep learning workloads with minimal GPU memory requirements, which allows complex models to be trained on affordable, accessible hardware. This session will discuss how MXNet achieves its low memory footprint, as well as other useful features of this rapidly emerging framework.
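The core idea behind MXNet's low memory footprint is trading compute for memory: rather than caching every intermediate activation for the backward pass, only some are kept and the rest are recomputed when needed. The sketch below illustrates that tradeoff in plain Python with toy "layers"; it is a conceptual illustration, not MXNet code.

```python
# Sketch of the memory-for-compute tradeoff (illustrative only, not MXNet code):
# store only every k-th activation (a checkpoint) and recompute the rest
# on demand, cutting stored activations roughly by a factor of k.

def forward(layers, x, k=2):
    """Run layers on input x, storing only every k-th activation."""
    checkpoints = {0: x}
    for i, f in enumerate(layers):
        x = f(x)
        if (i + 1) % k == 0:
            checkpoints[i + 1] = x
    return x, checkpoints

def activation_after(layers, checkpoints, i):
    """Recompute the activation after layer i from the nearest checkpoint."""
    j = max(c for c in checkpoints if c <= i + 1)
    x = checkpoints[j]
    for f in layers[j:i + 1]:
        x = f(x)
    return x

# Toy "layers": layer n adds n+1 to its input.
layers = [lambda v, n=n: v + n for n in range(1, 5)]
out, ckpts = forward(layers, 0, k=2)
# Only 2 of the 4 intermediate activations were stored; the others are
# recovered by recomputing forward from the nearest checkpoint.
hidden = activation_after(layers, ckpts, 2)
```

The extra forward recomputation is usually far cheaper than the GPU memory it frees, which is why this tradeoff makes large models trainable on smaller chips.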

Outline

What MXNet is, where it came from, and who uses it
How it supports both imperative and symbolic (declarative) programming paradigms
How it minimizes its memory footprint by trading compute for memory, and the impact of this tradeoff
Other useful features of MXNet, such as ease of programming, ease of distributed training, and the ability to use Caffe and Torch layers
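The distinction between the two paradigms in the outline can be sketched in a few lines of plain Python. In the imperative style each operation runs as it is written; in the symbolic style an expression graph is built first and executed later, which is what lets a framework like MXNet optimize the graph and plan memory before any computation runs. The `Sym` class below is a hypothetical toy, not MXNet's actual Symbol API.

```python
# Toy contrast between imperative and symbolic execution, in the spirit of
# MXNet's NDArray vs Symbol APIs (illustrative sketch, not real MXNet code).

# Imperative: each operation executes immediately, like NumPy or mx.nd.
def imperative(x):
    y = x * 2
    z = y + 1
    return z

# Symbolic: build a deferred expression graph, bind inputs, run later.
# Deferring execution is what gives the framework room to fuse operations
# and free or reuse intermediate buffers before anything runs.
class Sym:
    def __init__(self, fn):
        self.fn = fn  # fn maps an input environment to a value

    def __mul__(self, c):
        return Sym(lambda env, f=self.fn: f(env) * c)

    def __add__(self, c):
        return Sym(lambda env, f=self.fn: f(env) + c)

    def bind(self, env):
        """Attach concrete inputs; returns a callable executor."""
        return lambda: self.fn(env)

x = Sym(lambda env: env["x"])
graph = x * 2 + 1            # nothing has been computed yet
run = graph.bind({"x": 5})   # bind inputs, then execute on demand
# imperative(5) and run() produce the same result
```

MXNet's selling point, which the talk covers, is supporting both styles in one framework rather than forcing a choice between them.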

Requirements

None, other than being able to control my slides

Speaker bio

Girish Patil works as a Senior Solutions Architect for AWS. He focuses on deep learning, traditional machine learning, and big data projects, and helps many of India’s most successful start-ups adopt these technologies. Girish is also a subject matter expert within Amazon on these technologies and regularly participates in global knowledge exchange programs.

Girish’s other interests include building internet-scale applications. He also leads Developer Evangelism initiatives for AWS, as well as Innovation Pavilion initiatives to promote young high-tech start-ups from India.

Links

  • https://www.youtube.com/watch?v=wuEKvvlcR-Y
  • I have presented on this topic at deep learning events, but I do not have a video of a past presentation. However, the link above is a video of an MXNet session by Amazon Principal Scientist Anima Anandkumar. It covers the topics I intend to cover around MXNet's high memory efficiency, so it will give you a good idea of the talk.

Slides

https://www.slideshare.net/AIFrontiers/scaling-deep-learning-with-mxnet (slides 73 to 97 of the full presentation, plus some additional slides for a deep dive)
