Myths and Realities of Data Labeling for Deep Learning

23 Nov 2019 (Sat), 08:30 AM – 05:30 PM IST

Taj M G Road, Bangalore

This submission has been added to the schedule

Submitted Aug 15, 2019

Section: Birds Of Feather (BOF) session
Technical level: Intermediate
Session type: Discussion

In this BoF, we will explore data labeling tasks for NLP and CV problems. Specifically, we will discuss the nuances of defining, crowdsourcing, and executing data labeling tasks, along with quality assurance processes. We will also discuss machine-aided data tagging to save cost, time, and effort on different data labeling tasks. Finally, we will touch upon feedback loops, in which some of the unseen and real-time inputs are labeled to fine-tune the deep learning models.

Outline

setting the context : data labeling for NLP and CV
how to define a data labeling task : novice vs expert
does crowdsourcing of data labeling really work : advantages vs disadvantages
how to manage in-house data labeling teams : advantages vs disadvantages
what is the criticality of the correctness of data labels
what is the experience and expertise expectation of data labelers
how to ensure correctness of data labels : manual vs automated checks
how to resolve labeling conflicts
how does an engineer know if she has enough labeled data
what are the time, cost, correctness trade-offs
how to ensure and execute class balanced data labeling
how to plan and execute weakly supervised data labeling
how to train models on a small set of labeled data and generate ‘soft tags’ for the rest of the unlabeled data
how does one know if a model is performing well in practice on unseen and real-time inputs
how does the feedback loop work when some of the unseen and real-time inputs are labeled to fine-tune the models
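
As a starting point for the outline items on checking label correctness and resolving labeling conflicts, here is a minimal sketch of two common approaches: majority voting across annotators and Cohen's kappa as an agreement measure between two annotators. This is an illustration under stated assumptions, not the method the session prescribes. The function names `majority_label` and `cohen_kappa` and the `min_agreement` threshold are hypothetical choices made for this example.

```python
from collections import Counter


def majority_label(votes, min_agreement=0.5):
    """Resolve one item's conflicting labels by majority vote.

    Returns (label, agreement). The label is None when no single label
    has strictly more than `min_agreement` of the votes, which signals
    that the item should go to an expert reviewer (an assumed policy).
    """
    if not votes:
        raise ValueError("votes must not be empty")
    counts = Counter(votes)
    label, top = counts.most_common(1)[0]
    agreement = top / len(votes)
    if agreement > min_agreement:
        return label, agreement
    return None, agreement


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    # Chance agreement: probability both annotators pick the same class
    # if each labeled independently according to their own class frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical class throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    print(majority_label(["cat", "cat", "dog"]))
    print(majority_label(["cat", "dog"]))
    print(cohen_kappa(["cat", "cat", "dog", "dog"], ["cat", "cat", "dog", "cat"]))
```

Majority voting is cheap but treats all annotators as equally reliable. A low kappa on a shared subset of items is one signal that the task definition or labeler guidelines need revisiting before collecting more labels.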

Requirements

Familiarity with NLP, CV, Deep Learning

Speaker bio

Vijay is the co-founder and CTO of Infilect Technologies, a Computer Vision and Deep Learning start-up building B2B SaaS products for the global retail industry. Vijay has a PhD in CSE from IIT Bombay and has worked as a research scientist at IBM Research Labs.

Hosted by

Anthill Inside

Anthill Inside is a forum for conversations about risk mitigation and governance in Artificial Intelligence and Deep Learning. AI developers, researchers, startup founders, ethicists, and AI enthusiasts are encouraged to:

Anthill Inside 2019
