Anthill Inside Miniconf – Pune

Machine Learning, Deep Learning and Artificial Intelligence: concepts, applications and tools.

Inference in Deep Neural Networks

Submitted by saurabh agarwal (@saurabh-agl) on Wednesday, 1 November 2017

Technical level

Intermediate

Section

Full talk

Status

Confirmed & Scheduled

Total votes:  +4

Abstract

Much of the current focus in deep learning is on training neural networks and on better architectures, while inference gets little attention because, well, we are busy making our models work. Yet inference is expected to run millions of times more often than training, and a lot of the time it has to run on embedded devices. This talk will go into the details of how advancements in hardware have made Deep Learning possible. We will also look at optimizations that can speed up computation when deploying a model on a CPU, and demystify the terms GeMM, SIMD, BLAS and SIMT along the way.
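To give a flavour of why GeMM and BLAS matter so much for inference speed, here is a minimal sketch in Python/NumPy comparing a textbook triple-loop matrix multiply with the same product dispatched to the optimised BLAS GeMM that backs NumPy's `@` operator. The function name `naive_matmul` and the matrix sizes are illustrative choices, not from the talk:

```python
import time
import numpy as np

# GeMM (GEneral Matrix Multiply) is the workhorse of DNN inference.
# BLAS implementations exploit SIMD units and cache-friendly blocking;
# a naive triple loop in pure Python uses neither.

def naive_matmul(a, b):
    """Textbook O(n^3) matrix multiply, one scalar at a time."""
    n, k = a.shape
    k2, m = b.shape
    assert k == k2, "inner dimensions must agree"
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            s = 0.0
            for p in range(k):
                s += a[i, p] * b[p, j]
            out[i][j] = s
    return np.array(out)

rng = np.random.default_rng(0)
a = rng.standard_normal((64, 64))
b = rng.standard_normal((64, 64))

t0 = time.perf_counter()
c_naive = naive_matmul(a, b)
t_naive = time.perf_counter() - t0

t0 = time.perf_counter()
c_blas = a @ b  # dispatches to an optimised BLAS GeMM
t_blas = time.perf_counter() - t0

print(f"naive: {t_naive * 1e3:.1f} ms, BLAS: {t_blas * 1e3:.3f} ms")
```

Even at this small size the BLAS call is typically orders of magnitude faster; at the layer sizes found in real networks, the gap is what makes CPU inference viable at all.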

Outline

  • Intro to DL networks
    • What typical Deep Learning architectures look like
    • A short section, using one CNN and one LSTM as examples, on the mathematical operations they perform
  • Advancements in hardware
    • Intel Knights CPUs
    • Nervana
    • NVIDIA Volta GPUs
  • How the operations are actually done on garden-variety hardware
    • SIMD
    • SIMT
    • GeMM
  • Different types of architectures
    • CPUs and GPUs
      • How they work, and their bottlenecks
  • The role memory access plays in speed
    • How memory, rather than compute, is often the bottleneck
  • Changes made to algorithms to utilise these capabilities
    • Example of Google’s Inception V3 model
    • Two different types of RNNs
  • Advice
    • How to make your model more efficient at inference
    • Some practical examples
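As a rough sketch of how the CNN operations above reduce to GeMM, here is a minimal NumPy example of the classic im2col lowering, which turns a 2-D convolution into a single matrix multiply. The names `im2col` and `conv2d_gemm` are illustrative, and the code assumes stride 1, no padding, and a single input channel:

```python
import numpy as np

def im2col(x, kh, kw):
    """Unroll every kh x kw patch of a 2-D input into one column
    (stride 1, no padding)."""
    h, w = x.shape
    oh, ow = h - kh + 1, w - kw + 1
    cols = np.empty((kh * kw, oh * ow))
    idx = 0
    for i in range(oh):
        for j in range(ow):
            cols[:, idx] = x[i:i + kh, j:j + kw].ravel()
            idx += 1
    return cols

def conv2d_gemm(x, kernels):
    """Convolve x with kernels of shape (num_filters, kh, kw)
    using a single GeMM call."""
    nf, kh, kw = kernels.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    cols = im2col(x, kh, kw)          # (kh*kw, oh*ow)
    w = kernels.reshape(nf, kh * kw)  # (nf, kh*kw)
    return (w @ cols).reshape(nf, oh, ow)  # one big matrix multiply
```

The memory cost of duplicating overlapping patches is the classic trade-off here: im2col spends bandwidth to buy one large, BLAS-friendly multiply instead of many small ones, which ties directly into the memory-versus-compute bottleneck discussed above.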

Speaker bio

Saurabh has been working at MAD Street Den, Chennai as a Machine Learning Engineer for the past year and a half, specifically on Deep Learning based products. He loves to train Convolutional Neural Networks of all types and sizes for different applications. Apart from CNNs, he has a special interest in recurrent architectures and discovering their powers. When he is not working on DL, he loves to play around with microcontrollers.

Slides

https://docs.google.com/presentation/d/e/2PACX-1vS3npIWr-HCcKgpodNvp1-RI3gBUaXAXFS94FhYA6AtNuSbDkvPrSOYSRUni9vyYNzeIM5wBEk_kFqT/pub?start=false&loop=false&delayms=3000

Preview video

https://www.youtube.com/watch?v=qpgD8PJGao8&feature=youtu.be
