The Fifth Elephant

The Fifth Elephant: Paper Reading meet-up - November 2023

High-resolution image synthesis with latent diffusion models

Friday, 3 November 2023, 05:30 PM – 06:35 PM IST

About the paper

This paper introduces the Latent Diffusion Model (LDM), which led to Stable Diffusion, an open-source text-to-image model released in August 2022 that paved the way for a number of other innovations in text-to-image, image-to-image, and image editing applications.

The paper introduced the idea of applying the diffusion process in the latent space of an autoencoder rather than directly on image pixels. This approach significantly reduces computational demands while retaining image synthesis quality and flexibility.
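To give a rough sense of where the savings come from, the sketch below counts the values the diffusion model has to process per image in pixel space versus latent space. The downsampling factor of 8 and the 4 latent channels are assumptions taken from the commonly used Stable Diffusion v1 autoencoder; the paper explores several other settings, and the function names are illustrative only.

```python
def tensor_size(height, width, channels):
    """Number of values in a height x width x channels tensor."""
    return height * width * channels


def latent_shape(height, width, downsample_factor=8, latent_channels=4):
    """Shape of the autoencoder latent for an image of the given size.

    downsample_factor=8 and latent_channels=4 are assumptions matching the
    Stable Diffusion v1 autoencoder; other LDM configurations differ.
    """
    if height % downsample_factor or width % downsample_factor:
        raise ValueError("image sides must be divisible by the downsample factor")
    return (height // downsample_factor, width // downsample_factor, latent_channels)


pixel_values = tensor_size(512, 512, 3)               # RGB image
latent_values = tensor_size(*latent_shape(512, 512))  # encoded latent
reduction = pixel_values / latent_values

print(pixel_values, latent_values, reduction)
```

Under these assumptions, each denoising step operates on a tensor roughly 48 times smaller than the original image. This counts values only and is not a direct measure of runtime or memory.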

Given the model's open-source nature and simple building blocks, there has been tremendous community engagement, leading to multiple interesting contributions to the model architecture that enable applications like ControlNet (https://arxiv.org/abs/2302.05543), DreamBooth (https://dreambooth.github.io/), and even audio generation (https://github.com/riffusion/riffusion).

Key takeaways for the audience

If you’re interested in text-to-image generation, this session is for you. We will discuss how the image generation process works and how it can be applied to multiple use cases.

We will approach the paper in the following steps:

A high-level system design view of the various components in a Latent Diffusion Model
A deep dive into the diffusion process itself and how it works in conjunction with the CLIP Vision-Text transformer
A walkthrough of code implementations and other design choices
A look at extensions like ControlNet and DreamBooth and how they work with the model architecture
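As a small preview of the diffusion-process step in the agenda above, here is a minimal sketch of the forward (noising) process applied to a latent vector. It assumes the standard DDPM-style formulation with a linear noise schedule; the schedule values, the 1000-step count, and all function names are illustrative assumptions, not the paper's or Stable Diffusion's exact configuration.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances, one per diffusion step (assumed values)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(z0, noise, alpha_bar):
    """Forward process: z_t = sqrt(alpha_bar) * z0 + sqrt(1 - alpha_bar) * noise."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * z + b * n for z, n in zip(z0, noise)]


def recover_clean(zt, predicted_noise, alpha_bar):
    """Invert add_noise, given an estimate of the noise that was added."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(z - b * n) / a for z, n in zip(zt, predicted_noise)]


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
z0 = [rng.gauss(0.0, 1.0) for _ in range(16)]     # stand-in for a flattened latent
noise = [rng.gauss(0.0, 1.0) for _ in range(16)]  # Gaussian noise to mix in

t = 500
zt = add_noise(z0, noise, alpha_bars[t])
z0_hat = recover_clean(zt, noise, alpha_bars[t])  # uses the true noise as a perfect "prediction"
max_error = max(abs(a - b) for a, b in zip(z0, z0_hat))
```

With the true noise plugged in, the clean latent is recovered up to floating-point error. This is why a denoising network can be trained to predict the added noise: in a real model, the network's estimate, conditioned on the timestep and the text embedding, takes the place of `noise` in `recover_clean`.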

About the presenter and discussant

Sidharth Ramachandran works at a large European media company and has been applying text-to-image techniques as part of building data products for a streaming platform. He is also a part-time instructor and has co-authored a book published by O’Reilly. He is an enthusiastic learner who is fascinated to see where AI research is heading and what applications it can unlock for humanity.

Amiruddin Nagri, founder of Memex AI, who previously worked at Gojek and ThoughtWorks, will lead the discussion and share insights on diffusion models. Amir previously conducted a full-house workshop on Stable Diffusion.

About The Fifth Elephant monthly paper discussions

Bharat Shetty Barkur, a member of The Fifth Elephant, is the curator of the paper discussions.

Bharat has worked across different organizations such as IBM India Software Labs, Aruba Networks, Fybr, Concerto HealthAI, and Airtel Labs. He has worked on products and platforms across diverse verticals such as retail, IoT, chat and voice bots, edtech, and healthcare leveraging AI, Machine Learning, NLP, and software engineering. His interests lie in AI, NLP research, and accessibility.

The goal is for the community to understand popular papers in the Generative AI, DL, and ML domains. Bharat and other co-curators seek to put together papers that will benefit the community, and to organize reading and learning sessions driven by experts and curious folks in Generative AI, Deep Learning, and Machine Learning.

The paper discussions will be conducted every month, both online and in person.

How you can contribute

Suggest a paper to discuss. Post a comment here with the paper you’d like to discuss. A proposal should include slides and code samples that make parts of the paper simpler and more understandable.

Moderate/discuss a paper someone else is proposing.

Spread the word among colleagues and friends. Join The Fifth Elephant Telegram group or WhatsApp group.
