Language models are few shot learners

Building NLP applications by prompting LLMs

About the paper

"Language Models are Few-Shot Learners" is an important paper in the space of Generative AI and Natural Language Processing. It introduced GPT-3 and showed the capability of large language models to generalize as task-agnostic learners.

The paper sowed the seeds for building NLP applications by prompting large language models with zero-shot, one-shot, and few-shot prompts, in which the model receives zero, one, or a handful of worked examples in its input and no weight updates. This was a major advance over task-specific modeling, and it is closer to how humans learn: by applying past learning to new data.
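To make the distinction between zero-shot, one-shot, and few-shot prompts concrete, here is a minimal sketch in Python. The `build_prompt` helper, its arguments, and the `=>` separator are illustrative assumptions rather than an API from the paper; the translation task is loosely modeled on the kind of example the paper uses. The only thing that changes between the three settings is how many demonstrations are placed in the prompt.

```python
def build_prompt(task_description, examples, query):
    """Assemble a plain-text prompt (hypothetical helper, for illustration only).

    examples is a list of (input, output) demonstrations:
    empty -> zero-shot, one pair -> one-shot, several pairs -> few-shot.
    """
    lines = [task_description]
    for source, target in examples:
        lines.append(f"{source} => {target}")
    # The final line is left open for the model to complete.
    lines.append(f"{query} =>")
    return "\n".join(lines)


task = "Translate English to French:"
demos = [("sea otter", "loutre de mer"), ("cheese", "fromage")]

zero_shot = build_prompt(task, [], "peppermint")
one_shot = build_prompt(task, demos[:1], "peppermint")
few_shot = build_prompt(task, demos, "peppermint")

print(few_shot)
```

In all three cases the model's weights stay fixed; the demonstrations act only as context for the completion.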

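GPT-3 used the same model architecture as GPT-2, scaled up roughly 100x (175 billion parameters versus 1.5 billion), except that it alternated dense and locally banded sparse attention patterns across layers, as introduced in the Sparse Transformer paper.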
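As a rough illustration of what sparse attention changes, the sketch below compares a dense causal attention mask with a locally banded one. It is a simplified picture under stated assumptions, not the implementation used for GPT-3: the function names and the `window` parameter are hypothetical, and the Sparse Transformer paper describes several sparsity patterns beyond this one.

```python
def dense_causal_mask(n):
    """mask[i][j] is True when position i may attend to position j (any j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def banded_causal_mask(n, window):
    """Locally banded variant: position i attends only to the most recent
    `window` positions, itself included (window is an illustrative parameter)."""
    return [[0 <= i - j < window for j in range(n)] for i in range(n)]


def show(mask):
    """Render a mask as text: 'x' = attended, '.' = masked out."""
    return "\n".join("".join("x" if cell else "." for cell in row) for row in mask)


print(show(dense_causal_mask(5)))
print()
print(show(banded_causal_mask(5, window=2)))
```

The dense mask grows quadratically with sequence length, while the banded mask keeps a fixed number of attended positions per row, which is the motivation for mixing the two kinds of layers.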

Key takeaways for the audience

The discussion will cover the following aspects of GPT-3:

  • GPT architecture
  • Dataset used for training
  • Training process
  • Capabilities acquired by GPT-3 because of large scale (expected and unexpected)
  • Performance comparison on different benchmarks
  • Limitations and impacts

About the presenter

Simrat Hanspal has a career spanning over a decade in the AI and ML space, specializing in Natural Language Processing (NLP).
Simrat is currently spearheading AI product strategy at Hasura. She has previously led AI teams at renowned organizations such as VMware, FI Money, and Nirvana Insurance.

RSVP and venue

This is an in-person paper reading session. RSVP to be notified about the venue.

About The Fifth Elephant monthly paper discussions

Bharat Shetty Barkur, a member of The Fifth Elephant, curates the paper discussions.

Bharat has worked across different organizations such as IBM India Software Labs, Aruba Networks, Fybr, Concerto HealthAI, and Airtel Labs. He has worked on products and platforms across diverse verticals such as retail, IoT, chat and voice bots, edtech, and healthcare leveraging AI, Machine Learning, NLP, and software engineering. His interests lie in AI, NLP research, and accessibility.

The goal is for the community to understand popular papers in the Generative AI, Deep Learning, and Machine Learning domains. Bharat and other co-curators seek to put together papers that will benefit the community, and organize reading and learning sessions driven by experts and curious folks in these fields.

The paper discussions will be conducted every month, both online and in person.

How you can contribute

  1. Suggest a paper to discuss here - https://hasgeek.com/fifthelephant/call-for-papers/sub
    The proposal should include slides and code samples that make parts of the paper simpler and more understandable.
  2. Moderate/discuss a paper someone else is proposing.
  3. Pick up a membership to support the meet-ups and The Fifth Elephant’s activities.
  4. Spread the word among colleagues and friends. Join The Fifth Elephant Telegram group or WhatsApp group.

About The Fifth Elephant

The Fifth Elephant is a community-funded organization. If you like the work that The Fifth Elephant does and want to support its online and in-person meet-ups and activities, contribute by picking up a membership.

Contact

For inquiries, leave a comment or call The Fifth Elephant at +91-7676332020.

Hosted by

The Fifth Elephant, known as one of the best data science and Machine Learning conferences in Asia, has transitioned into a year-round forum for conversations about data and ML engineering, data science in production, and data security and privacy practices.
