Activations, Objectives and Optimisers - Nuts & Bolts of a DeepNet
Submitted by Anuj Gupta (@anuj-gupta) on Tuesday, 24 May 2016
Section: Full talk Technical level: Intermediate
Building a good deep network is not an easy task. From the architecture of the network to its various parameters, each choice is crucial because it has a large bearing on the performance (both accuracy and efficiency) of the DeepNet. Of these, three important components that any practitioner must choose are the activation function, the loss/objective function, and the optimiser.
Given the large number of choices for each of these, the task doesn’t get any easier. In this talk we will take a deeper look into each of the choices available to us while choosing these components.
We will address the what, why, and how, along with the pros and cons, of each of these choices:
Activation Function - softmax, softplus, softsign, ReLU, tanh, sigmoid, hard sigmoid, linear
Loss/Objective Function - mean squared error, mean absolute error, mean absolute percentage error, mean squared logarithmic error, squared hinge, hinge, binary crossentropy (log loss), categorical crossentropy (multiclass log loss), sparse categorical crossentropy, Poisson, cosine proximity
Optimiser - SGD, RMSprop, Adagrad, Adadelta, Adam, Adamax
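As a rough illustration of how the three components interact (this is not material from the talk itself), the sketch below implements a few of the listed choices in plain Python: sigmoid, ReLU and softmax activations, a categorical crossentropy loss, and a single plain SGD update. All function names are hypothetical, and the toy example updates the logits directly rather than network weights, purely to keep the sketch short.

```python
import math

def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def relu(x):
    """Rectified linear unit: passes positive values, zeroes out the rest."""
    return max(0.0, x)

def softmax(logits):
    """Turns a list of scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(probs, target_index):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probs[target_index])

def sgd_step(params, grads, lr=0.1):
    """Plain SGD: move each parameter against its gradient."""
    return [p - lr * g for p, g in zip(params, grads)]

# Toy example (assumption: we optimise the logits directly).
logits = [2.0, 1.0, 0.1]
target = 0

probs = softmax(logits)
loss = categorical_crossentropy(probs, target)

# For softmax followed by crossentropy, d(loss)/d(logit_i) = p_i - 1[i == target].
grads = [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]

new_logits = sgd_step(logits, grads, lr=0.5)
new_loss = categorical_crossentropy(softmax(new_logits), target)

print("loss before:", loss)
print("loss after one SGD step:", new_loss)
```

Swapping any one piece (for example, a different activation, or an adaptive optimiser such as Adam in place of `sgd_step`) changes how quickly and how reliably the loss falls, which is the kind of trade-off the talk sets out to examine.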
This talk aims to bring out a better understanding of these Nuts & Bolts.
I currently work as a principal ML researcher at Airwoot (now acquired by Freshdesk), building intelligent applications using NLP + Deep Learning. Before joining industry, I was part of the Data Science group at IIT Delhi and a research scholar with the theory group at IIIT Hyderabad.
I have previously given talks at IIIT Hyderabad, ICDCN, IIT Delhi, and ISOC.
You can find more about me on