Anthill Inside 2017
On theory and concepts in Machine Learning, Deep Learning and Artificial Intelligence. Formerly Deep Learning Conf.
Sat, 29 Jul 2017, 09:00 AM – 05:40 PM IST
Anuj Gupta
Think of your favorite NLP application that you wish to build - sentiment analysis, named entity recognition, machine translation, information extraction, summarization, a recommender system, to name a few. A key step in building it is using the right technique to represent the text in a form that machines can understand. In this workshop, we will cover the key concepts, maths, and code behind state-of-the-art techniques for text representation.
This will be a very hands-on workshop: Jupyter notebooks for creating the various representations, coupled with the key concepts and maths that form the basis of their theory.
Deep learning for images has been a phenomenal success story. One of the key reasons is the rich representation of the data: a raw image arrives as a matrix of RGB values. For images, directly using the pixel values is a very natural representation; for text, there is no such natural representation. No matter how good your ML algorithm is, it can only do so much unless there is a richer way to represent the underlying text data. Thus, whatever NLP task or application you are building, it is imperative to find a good representation for your text. Motivated by this, the subfield of representation learning of text for NLP has attracted a lot of interest in the past few years.
Various representation learning techniques have been proposed in the literature, but there is still a dearth of comprehensive tutorials that cover both the mathematical explanation and the implementation details of these algorithms to a satisfactory depth. This workshop aims to bridge that gap: to demystify both the theory (key concepts, maths) and the practice (code) behind these representation schemes. By the end of the workshop, participants will have gained a fundamental understanding of these schemes and will be able to implement embeddings on their own datasets.
We will cover the following topics:
Old ways of representing text
Introduction to Embedding spaces
Word-Vectors
Sentence2vec/Paragraph2vec/Doc2Vec
Character2Vec
For each of the above representation schemes, we will understand and implement both evaluation and visualization techniques.
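As a taste of the contrast between the "old ways" and embedding spaces covered above, here is a minimal sketch using NumPy only. The vocabulary and the hand-made dense vectors are toy assumptions for illustration; real embeddings are learned from data (e.g. by word2vec), which the workshop notebooks walk through.

```python
import numpy as np

# "Old way": bag-of-words. Each word is a one-hot dimension and a document
# is its vector of word counts - sparse, and blind to word similarity.
vocab = ["king", "queen", "man", "woman"]
index = {w: i for i, w in enumerate(vocab)}

def bag_of_words(tokens):
    vec = np.zeros(len(vocab))
    for t in tokens:
        vec[index[t]] += 1
    return vec

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# One-hot vectors of distinct words are orthogonal: similarity is always 0.
print(cosine(bag_of_words(["king"]), bag_of_words(["queen"])))  # 0.0

# Dense embeddings (toy, hand-made here) place related words close together,
# so cosine similarity becomes meaningful: king is nearer queen than man.
emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.8, 0.9, 0.1]),
    "man":   np.array([0.9, 0.1, 0.8]),
}
print(round(cosine(emb["king"], emb["queen"]), 2))  # 0.99
print(round(cosine(emb["king"], emb["man"]), 2))
```

The same cosine-similarity evaluation extends directly to sentence-, document-, and character-level vectors, which is why it recurs across the schemes listed above.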
Target audience: This workshop is meant for NLP enthusiasts, ML practitioners, and data science teams who work with text data and wish to gain a deeper understanding of text representations for NLP.
A laptop and lots of enthusiasm.
We will provide a pre-installed virtual machine to help you get started without fuss.
He has given tech talks at prestigious forums like PyData DC, Fifth Elephant, ICDCN, PODC, IIT Delhi, IIIT Hyderabad, and special interest groups like DLBLR. More about him - https://www.linkedin.com/in/anuj-gupta-15585792/
https://www.slideshare.net/anujgupta5095/representation-learning-for-nlp