Anthill Inside 2017
On theory and concepts in Machine Learning, Deep Learning and Artificial Intelligence. Formerly Deep Learning Conf.
Sat, 29 Jul 2017, 09:00 AM – 05:40 PM IST
Sachin Kumar
Humans have been captioning images involuntarily for decades, and now, in the age of social media, nearly every image carries a caption across the various social platforms. Psychologically, these captions are shaped by the events and scenarios running through the writer's mind, or influenced by nearby activities and emotions; sometimes they are far removed from the image's real context. Describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing.
This talk will cover some of the common deep learning architectures, latest state-of-the-art pre-trained models for image captioning, describe advantages and concerns, and provide hands-on experience.
This talk will benefit those who are interested in advanced applications of Deep Neural Networks and in what can be achieved by combining different state-of-the-art models. We also aim to provide an open source implementation in Keras, a high-level library that uses Theano, TensorFlow, or CNTK as a backend for running Deep Neural Networks (DNNs) on CPU and GPU.
In this talk, I will present a generative model based on a deep recurrent architecture that combines recent advances in computer vision and machine translation, and that can be used to generate natural sentences describing an image. The model is trained to maximize the likelihood of the target description sentence given the training image. Experiments on the COCO dataset show the accuracy of the model and the fluency of the language it learns solely from image descriptions.
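As a toy sketch of the training objective described above: maximizing the likelihood of the target caption is equivalent to minimizing its negative log-likelihood under the decoder. The probabilities below are hypothetical numbers standing in for what a softmax decoder would produce, not outputs of any trained model.

```python
import math

# Hypothetical per-word probabilities p(word_t | image, word_1..t-1)
# as a softmax decoder might assign them at each time step.
caption = ["a", "dog", "on", "grass"]
word_probs = {"a": 0.40, "dog": 0.25, "on": 0.30, "grass": 0.20}

# Training minimizes the negative log-likelihood of the full sentence,
# i.e. the sum of per-word negative log-probabilities.
nll = -sum(math.log(word_probs[w]) for w in caption)
print(round(nll, 3))
```

The lower this sum, the more probable the decoder considers the caption given the image; gradient descent on this quantity is what "maximizing the likelihood" means in practice.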
Learning a model like this would be incredibly useful: it gives us a way to measure how relevant an image and a caption are to each other. For a batch of images and captions, we can use the model to map them all into this embedding space, compute a distance metric, and find the nearest neighbors of each image and of each caption.
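The nearest-neighbor step above can be sketched in a few lines. The embedding vectors here are made-up three-dimensional examples (a real model would produce much higher-dimensional embeddings), and cosine similarity stands in for the distance metric.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical embeddings of one image and two candidate captions,
# all mapped into the same shared space by the model.
image_emb = [0.9, 0.1, 0.0]
caption_embs = {
    "a dog on grass": [0.8, 0.2, 0.1],
    "city skyline at night": [0.1, 0.9, 0.3],
}

# The caption nearest to the image (highest cosine similarity).
best = max(caption_embs, key=lambda c: cosine(image_emb, caption_embs[c]))
print(best)
```

For a whole batch, the same comparison is done both ways: each image retrieves its nearest captions, and each caption retrieves its nearest images.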
I will cover the following:
Prerequisites: basic knowledge of Deep Learning, MLPs, backpropagation, CNNs, RNNs, and pre-trained models such as VGG, plus a lot of enthusiasm.
Sachin Kumar is currently a second-year undergraduate pursuing a Bachelor of Engineering in Information Technology at Netaji Subhash Institute of Technology (NSIT), New Delhi. He is also a Teaching Assistant for the Machine Learning course at Coding Blocks. His interests include Machine Learning, Artificial Intelligence, Deep Learning, and Evolutionary Computing.
https://slides.com/sachinkmr/decoding-neural-image-captioning