Making Deep Neural Networks smaller and faster
Submitted by Suraj Srinivas (@surajsrinivas) on Tuesday, 31 May 2016
Deep neural networks with millions of parameters are at the heart of many state-of-the-art machine learning models today. However, it has been shown that models with a much smaller number of parameters can perform just as well. A smaller model has the advantage of being faster to evaluate and easier to store - both of which are crucial for real-time and embedded / mobile applications. In this talk, I intend to provide a brief overview of such model compression techniques. Using these techniques, it is possible to compress neural networks by as much as 10x and speed up inference by 3-4x.
First, I shall motivate the general problem of model compression and its relevance for real-world applications.
Then, I shall provide overviews of the following papers:
1) Learning both Weights and Connections for Efficient Neural Networks, NIPS 2015
2) Deep Compression, ICLR 2016
3) Learning the Architecture of Deep Neural Networks, arXiv 2016
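To give a flavour of what "learning connections" can mean in practice, here is a minimal sketch of magnitude-based pruning, the basic idea behind removing unimportant connections from a trained network. It is a simplified illustration and not the exact procedure from any of the papers above: the function name `prune_by_magnitude`, the `sparsity` parameter, and the randomly generated "layer" are all assumptions made for this example, and the retraining step that normally follows pruning is left out.

```python
import random


def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude.

    Hypothetical helper for illustration; `weights` is a flat list of floats.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:n_prune])
    return [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]


# Stand-in for one trained layer: 1000 random weights (an assumption, not real data).
random.seed(0)
layer = [random.gauss(0.0, 1.0) for _ in range(1000)]

pruned = prune_by_magnitude(layer, sparsity=0.9)
nonzero = sum(1 for w in pruned if w != 0.0)
print(f"kept {nonzero} of {len(layer)} weights")
```

With 90% of the weights set to zero, only the remaining non-zero values and their positions need to be stored, which is where a roughly 10x reduction in size can come from. Whether accuracy survives depends on the network and on retraining after pruning.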
Prerequisite: familiarity with Convolutional Neural Networks.
Suraj is a second-year Master’s student at the Video Analytics Lab, Indian Institute of Science. For the past one and a half years, he has been working on the problem of model compression. His work was previously presented at the British Machine Vision Conference (BMVC) 2015.