Is bigger always better?

An Overview of Model Compression Techniques in ML

Quite often, we get carried away and use models that are far too powerful for our use cases. Powerful in the sense that they are ‘big’ and ‘bulky’, and hence carry a much larger memory footprint and, more importantly, higher latency, which directly affects user experience. A bit like carrying a sledgehammer to a fist fight! So it’s important to ask questions such as: “Do we really need this model, or would a smaller one suffice?” And if a smaller alternative is not available, can we “compress” the bigger model into a smaller one? This is where techniques such as distillation, quantization, and pruning come in handy. This talk introduces these techniques and provides practical tips on applying them.
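
As a small preview (not material from the talk itself), here is a minimal PyTorch sketch of the three techniques named above; the toy model, layer sizes, pruning ratio, and temperature are all illustrative assumptions:

    import torch
    import torch.nn.functional as F
    import torch.nn.utils.prune as prune

    # A toy "big" model standing in for a bulky network (illustrative only).
    model = torch.nn.Sequential(
        torch.nn.Linear(512, 512),
        torch.nn.ReLU(),
        torch.nn.Linear(512, 10),
    )

    # 1. Quantization: post-training dynamic quantization stores Linear
    #    weights as int8, shrinking the memory footprint and often
    #    reducing CPU inference latency. Returns a quantized copy.
    quantized = torch.quantization.quantize_dynamic(
        model, {torch.nn.Linear}, dtype=torch.qint8
    )

    # 2. Pruning: mask out the 50% smallest-magnitude weights of a layer.
    prune.l1_unstructured(model[0], name="weight", amount=0.5)

    # 3. Distillation: train a small student to match a big teacher's
    #    softened output distribution (Hinton-style knowledge distillation).
    def distillation_loss(student_logits, teacher_logits, T=2.0):
        soft_teacher = F.softmax(teacher_logits / T, dim=-1)
        log_student = F.log_softmax(student_logits / T, dim=-1)
        return F.kl_div(log_student, soft_teacher,
                        reduction="batchmean") * (T * T)

In practice, each of these trades a little accuracy for a smaller memory footprint or lower latency.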

This talk is relevant to data scientists, ML engineers, and anyone who works in or has an interest in ML. Only a basic knowledge of ML is assumed, so ML practitioners of all levels should find the talk accessible.

By the end of the talk, the audience will have an understanding of these ideas from both a theoretical and a practical perspective.

Speaker bio

Swaroop N P is a Senior Data Scientist at PayPal.

This paper provides a good overview of these techniques: https://arxiv.org/pdf/2010.13382.pdf

RSVP to participate. The talk will be held online, via Zoom. Participants who RSVP will receive the Zoom links.

Purchase a subscription to participate in The Fifth Elephant conference on 11 August and to support activities such as online talks and in-person meetups.

Contact information:

For queries about this event, contact Hasgeek at info@hasgeek.com or call +91-7676332020.

Hosted by The Fifth Elephant: all about data science and machine learning.
