Anthill Inside 2019

A conference on AI and Deep Learning

GAN-inspired Innovations in Computer Vision

Submitted by Pushkar Pushp (@ppushp7) on Apr 30, 2019

Technical level: Intermediate
Session type: Lecture
Section: Tutorials
Status: Rejected

Abstract

“The most interesting idea in the last 10 years in ML.” - Yann LeCun, Facebook AI research director.

In this talk, we will focus on Generative Adversarial Networks (GANs), one of the most interesting concepts in deep learning. A GAN is a generative model: it captures the patterns in the data so that it can generate new data points from the estimated data distribution. In recent years, there has been tremendous research on GANs, with applications including text-to-image synthesis, photorealistic image generation from doodles, and more.

We will cover how GANs work, walk through an implementation, and look at some of these applications.

Keywords: StackGAN, DCGAN, Generators, Autoencoders, VAE

Outline

Generative vs Discriminative Models
Why GANs?
Introduction to GAN
How do GANs work?
Generators and Discriminators
Cost function and optimization
GANs vs Autoencoders and VAE
DCGAN
Recent applications/case studies of GANs
    Text-to-image synthesis
    StackGANs
    Pose Guided Person Image Generation
    SRGAN
    Nvidia’s GauGAN (doodles into photorealistic images)
    Deep Fakes

We will also walk through an implementation of DCGAN using a Jupyter notebook and Keras, to make both the concept and the implementation easier to follow.
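
As a small illustration of the "Cost function and optimization" item in the outline, here is a minimal sketch of the two losses commonly used to train a GAN: binary cross-entropy for the discriminator and the non-saturating loss for the generator. It is not taken from the talk's notebook; the function names and the sample probabilities are hypothetical, and it uses plain Python rather than Keras so that it runs without extra dependencies.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy loss for the discriminator.

    d_real: discriminator outputs D(x) on real samples, each in (0, 1)
    d_fake: discriminator outputs D(G(z)) on generated samples, each in (0, 1)
    """
    # Real samples should be scored close to 1.
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    # Generated samples should be scored close to 0.
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating generator loss: -mean(log D(G(z)))."""
    # The generator is rewarded when the discriminator scores fakes close to 1.
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


# Hypothetical discriminator outputs, chosen only for illustration.
confused = discriminator_loss([0.5, 0.5], [0.5, 0.5])   # cannot tell real from fake
confident = discriminator_loss([0.9, 0.8], [0.1, 0.2])  # separates them well
fooled = generator_loss([0.5, 0.5])                     # fakes look plausible
caught = generator_loss([0.1, 0.2])                     # fakes are easy to spot
```

When the discriminator outputs 0.5 for every sample, its loss equals 2 log 2 (about 1.386), the value reached when generated and real data are indistinguishable. In training, the two losses are minimized in alternation, one with respect to the discriminator's weights and the other with respect to the generator's.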

Requirements

A basic understanding of deep learning and how neural networks are trained. Beginner-level knowledge of Python and Keras will make the concepts easier to follow.

Speaker bio

Pushkar Pushp works as a Data Scientist at WalmartLabs. He completed his undergraduate and master’s degrees in statistics at ISI, Kolkata. His areas of interest range from pure mathematics and Python to computer vision and deep learning. He has worked extensively with Keras/TensorFlow to develop state-of-the-art models for face recognition, trigger word detection, machine translation, and other sequence tasks.

Co-Author
Shivani Naik
I have a Master’s degree in Information Technology with a Data Science major from IIIT Bangalore. Currently, I am working on Computer Vision as a Statistical Analyst at Walmart Labs India. I have worked with Machine Learning for the past 4 years, on projects that use ML techniques such as object detection, GANs, CNNs, and recommendation systems. I also have a provisionally filed patent titled ‘System and method for produce detection and classification’ for an image classification algorithm.

Slides

https://www.slideshare.net/secret/BKaoqAEZx0glH3

Preview video

https://photos.app.goo.gl/kc3XgKLDrhZBzPTG6
