Synthetic Gradients – Decoupling Layers of a Neural Net
Once in a while, a (crazy!) idea comes along that can change the very fundamentals of a field. In this talk we will look at one such idea, one that can change how neural networks are trained.
Today, the backpropagation algorithm is at the heart of training neural nets. However, the algorithm has drawbacks that force the layers of a neural net to be trained in a strictly sequential manner: a layer cannot update its weights until the forward pass has reached the output and the error gradient has been propagated all the way back to it. In this talk we will see a powerful technique, synthetic gradients, for breaking free of this limitation.
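To make the idea concrete, here is a minimal sketch of the core trick: a layer updates itself immediately using a predicted gradient, and the true gradient, when it eventually arrives, is used only to improve the predictor. Everything in it is an assumption made for illustration and is not taken from the talk or its repo: the two-layer scalar network, the linear predictor, the learning rates, and the names `SyntheticGradient` and `train`.

```python
import random


class SyntheticGradient:
    """Hypothetical tiny linear model that guesses dL/dh from the activation h."""

    def __init__(self):
        self.a = 0.0
        self.b = 0.0

    def predict(self, h):
        return self.a * h + self.b

    def fit(self, h, true_grad, lr):
        # One SGD step on 0.5 * (predict(h) - true_grad) ** 2.
        err = self.predict(h) - true_grad
        self.a -= lr * err * h
        self.b -= lr * err


def train(samples, lr=0.01, sg_lr=0.05):
    """Train y = w2 * (w1 * x) on (x, target) pairs; return the per-sample losses."""
    w1, w2 = 0.5, 0.5  # illustrative starting weights
    sg = SyntheticGradient()
    losses = []
    for x, t in samples:
        # Layer 1: forward pass, then an immediate update from the *predicted*
        # gradient. It does not wait for layer 2 or for the loss.
        h = w1 * x
        w1 -= lr * sg.predict(h) * x

        # Layer 2: forward pass, loss, and an ordinary gradient update.
        y = w2 * h
        dy = y - t              # dL/dy for L = 0.5 * (y - t) ** 2
        true_dh = dy * w2       # true dL/dh, computed with the pre-update w2
        w2 -= lr * dy * h

        # The true gradient is used only to train the predictor.
        sg.fit(h, true_dh, sg_lr)
        losses.append(0.5 * dy * dy)
    return losses


if __name__ == "__main__":
    rng = random.Random(0)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
    losses = train([(x, 2.0 * x) for x in xs])  # toy target: t = 2 * x
    print("mean loss, first 100 samples:", sum(losses[:100]) / 100)
    print("mean loss, last 100 samples:", sum(losses[-100:]) / 100)
```

In this toy, layer 1 never uses the true gradient for its own update. That is the sense in which backward and update locking are broken. How well this works in general depends on how well the predictor tracks the true gradient, so treat the sketch as an intuition aid rather than a benchmark.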
Refresher on backpropagation [5 mins]
Problems with backpropagation [5 mins]
- Forward locking
- Backward locking
- Update locking
- Impact of locking
Why does it matter? [1 min]
Applications [3 mins]
Solution [12 mins]
- Synthetic Gradients
- Breaking backward & update locking
Results [5 mins]
- Backprop vs Synthetic Gradients
Complete unlock [2 mins]
- Breaking forward locking
Closing remarks [3 mins]
To help the audience understand the material better, I will share a GitHub repo as a takeaway, so that attendees can go back, download the code, and play with it.
Code associated with this talk: https://github.com/anujgupta82/Synthetic_Gradients
Prerequisite: a basic understanding of the backpropagation algorithm.
Anuj Gupta is a senior ML researcher at Freshdesk, working in the areas of NLP, machine learning, and deep learning. Earlier, he headed ML efforts at Airwoot (now acquired by Freshdesk). He dropped out of a PhD in ML to work with startups. He graduated from IIIT H with a specialization in theoretical computer science.
He has given tech talks at prestigious forums such as PyData DC, Fifth Elephant, ICDCN, PODC, IIT Delhi, and IIIT Hyderabad, and at special interest groups such as DLBLR. More about him: https://www.linkedin.com/in/anuj-gupta-15585792/
- Code associated with this talk: https://github.com/anujgupta82/Synthetic_Gradients
- My blog on computing gradients: https://anujgupta82.github.io/2016/08/24/gradients-0/
- Our upcoming workshop on text representation in Anthill - https://anthillinside.in/2017-nlp-workshop/
- Slides from my tutorials on “Word vector representation” in DLBLR - https://www.slideshare.net/anujgupta5095/dlblr-talk
- My talk on “Building Continuous Learning Systems” from PyData DC, 2017 - https://www.youtube.com/watch?v=VtBvmrmMJaI
- Our talk on “Building Continuous Learning Systems” from Fifth Elephant, 2016 - https://www.youtube.com/watch?v=Cv-J6GRg12A
- Work from my past life - http://dblp.uni-trier.de/pers/hd/g/Gupta:Anuj