Anthill Inside 2019

On infrastructure for AI and ML: from managing training data to data storage, cloud strategy and costs of developing ML models

GAN-inspired Innovations in Computer Vision

Submitted by Pushkar Pushp (@ppushp7) on Tuesday, 30 April 2019


Technical level: Intermediate
Session type: Lecture
Section: Tutorials

Abstract

“The most interesting idea in the last 10 years in ML.” - Yann LeCun, Facebook AI research director.

In this talk, we will focus on Generative Adversarial Networks (GANs), one of the most interesting concepts in deep learning. A GAN is a generative model: it captures the patterns in the training data so that it can generate new data points from the estimated data distribution. In recent years there has been a great deal of research on GANs, with applications that include text-to-image synthesis, photorealistic image generation from doodles, and many more.

We will cover how GANs work, walk through an implementation, and look at some of these applications.
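
For reference, the adversarial setup described above is usually written as a two-player minimax game. The block below gives the standard objective from the original GAN formulation (Goodfellow et al., 2014); the notation is an assumption here and the talk may use different symbols. G is the generator, D the discriminator, p_data the data distribution and p_z the noise prior.

```latex
\min_{G} \max_{D} \; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

The discriminator tries to maximize V by scoring real samples near 1 and generated samples near 0, while the generator tries to minimize it by producing samples that the discriminator scores as real.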

Keywords: StackGAN, DCGAN, Generators, Autoencoders, VAE

Outline

Generative vs Discriminative Models
Why GANs?
Introduction to GAN
How do GANs work?
Generators and Discriminators
Cost function and optimization
GANs vs Autoencoders and VAE
DCGAN
Recent applications/case studies of GANs
Text-to-image synthesis
StackGANs
Pose Guided Person Image Generation
SRGAN
NVIDIA’s GauGAN (doodles into photorealistic images)
Deep Fakes
We will also walk through an implementation of DCGAN in Keras, using a Jupyter notebook, to make both the concept and the implementation details concrete.
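
As a rough idea of what such a notebook might contain, here is a minimal sketch of a DCGAN-style generator and discriminator in Keras. It is not the speakers' code: the 28x28 grayscale image size, the 100-dimensional noise vector, the layer widths and the function names `build_generator` and `build_discriminator` are all assumptions chosen for illustration, and the adversarial training loop is left out.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

LATENT_DIM = 100  # assumed noise-vector size; the proposal does not specify one


def build_generator(latent_dim=LATENT_DIM):
    """Map a noise vector to a 28x28x1 image with pixel values in [-1, 1]."""
    return keras.Sequential([
        keras.Input(shape=(latent_dim,)),
        # Project the noise to a small feature map, then upsample it.
        layers.Dense(7 * 7 * 128),
        layers.Reshape((7, 7, 128)),
        layers.BatchNormalization(),
        layers.ReLU(),
        # Strided transposed convolutions upsample 7x7 -> 14x14 -> 28x28.
        layers.Conv2DTranspose(64, kernel_size=4, strides=2, padding="same"),
        layers.BatchNormalization(),
        layers.ReLU(),
        layers.Conv2DTranspose(1, kernel_size=4, strides=2, padding="same",
                               activation="tanh"),
    ], name="generator")


def build_discriminator(image_shape=(28, 28, 1)):
    """Map an image to a single probability that it is real."""
    return keras.Sequential([
        keras.Input(shape=image_shape),
        # Strided convolutions downsample 28x28 -> 14x14 -> 7x7 (no pooling).
        layers.Conv2D(64, kernel_size=4, strides=2, padding="same"),
        layers.LeakyReLU(0.2),
        layers.Conv2D(128, kernel_size=4, strides=2, padding="same"),
        layers.LeakyReLU(0.2),
        layers.Flatten(),
        layers.Dropout(0.3),
        layers.Dense(1, activation="sigmoid"),
    ], name="discriminator")


generator = build_generator()
discriminator = build_discriminator()

# Push a small batch of random noise through both (untrained) networks.
noise = tf.random.normal((4, LATENT_DIM))
fake_images = generator(noise, training=False)
scores = discriminator(fake_images, training=False)
print(fake_images.shape, scores.shape)
```

The sketch follows the usual DCGAN pattern of replacing pooling with strided convolutions and using batch normalization in the generator. Training would alternate between updating the discriminator on real and generated batches and updating the generator through the discriminator's output.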

Requirements

A basic understanding of deep learning and of how neural networks are trained. Beginner-level knowledge of Python and Keras will make the concepts easier to follow.

Speaker bio

Pushkar Pushp is a Data Scientist at WalmartLabs. He completed his undergraduate and master’s degrees in statistics at ISI, Kolkata. His interests range from pure mathematics and Python to computer vision and deep learning. He has worked extensively with Keras/TensorFlow to develop state-of-the-art models for face recognition, trigger word detection, machine translation and other sequence models.

Co-Author
Shivani Naik
I have a Master’s degree in Information Technology with a Data Science major from IIIT Bangalore. Currently, I am working on Computer Vision as a Statistical Analyst at Walmart Labs India. I have worked with Machine Learning for the past 4 years, on projects that use techniques such as object detection, GANs, CNNs and recommendation systems. I also have a provisionally filed patent titled ‘System and method for produce detection and classification’ for an image classification algorithm.

Slides

https://www.slideshare.net/secret/BKaoqAEZx0glH3

Preview video

https://photos.app.goo.gl/kc3XgKLDrhZBzPTG6

Comments

  • Abhishek Balaji (@booleanbalaji) Reviewer 2 months ago (edited 2 months ago)

    Hello Pushkar/Shivani,

    Thank you for submitting a proposal. As per the policy of Anthill Inside, we only allow one presenter on stage per session. Please make a decision on who among you would be presenting this proposal, if selected.

    To proceed with evaluation, we need to see detailed slides and a preview video to supplement your proposal. Your slides must cover the following:

    • Problem statement/context, which the audience can relate to and understand. The problem statement has to be a problem (based on this context) that can be generalized for all.
    • What were the tools/options available in the market to solve this problem? How did you evaluate alternatives, and what metrics did you use for the evaluation?
    • Why did you pick the option that you did?
    • Explain how the situation was before the solution you picked/built and how it changed after implementing the solution you picked and built? Show before-after scenario comparisons & metrics.
    • What compromises/trade-offs did you have to make in this process?
    • What is the one takeaway that you want participants to go back with at the end of this talk? What is it that participants should learn/be cautious about when solving similar problems?
    • Is the tool free/open-source? If not, what can the audience take away from the talk?

    We need to see the updated slides on or before 21 May in order to close the decision on your proposal. If we do not receive an update by 21 May, we’ll move the proposal for consideration at a future event.
