The Fifth Elephant 2018

The seventh edition of India's best data conference

Hitesh Mantrala

@hitman_hittudiv

NLP on भारतीय भाषाओं (Indian languages)

Submitted May 23, 2018

With millions of Indian users coming online recently as internet penetration grows, it has become crucial to support these users in Indian and local languages.
Most of these users are not comfortable with English and are more comfortable in Hindi or one of the South Indian languages. With current technology, tasks such as intent classification and entity extraction can be handled with relative ease.

We can rely on word vectors such as word2vec or fastText (which is based on character n-grams).
The talk goes in depth on how to do intent classification with fastText on Telugu and Hindi Unicode text.
The same technology can be leveraged for intent classification from voice using the Google Input Tools SDKs.
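
To make the character n-gram idea concrete, here is a minimal sketch in plain Python of how a word can be broken into subword units in the style of fastText. It is an illustration only, not the fastText library's own code: the function name `char_ngrams`, the 3 to 6 length range, and the sample words are assumptions chosen for this example. N-grams are taken over Unicode code points, so Hindi and Telugu text needs no special handling.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return character n-grams of `word`, in the style of fastText subwords.

    The word is wrapped in '<' and '>' boundary markers so that prefixes and
    suffixes get their own n-grams. Slicing works on Unicode code points,
    so vowel signs and viramas count as separate characters.
    """
    wrapped = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        for start in range(len(wrapped) - n + 1):
            grams.append(wrapped[start:start + n])
    return grams


# Sample words (assumed for illustration): "ticket" in Hindi and Telugu.
hindi = char_ngrams("टिकट", min_n=3, max_n=3)
telugu = char_ngrams("టికెట్", min_n=3, max_n=3)
print(hindi)
print(telugu)
```

Because related word forms share many of these n-grams, a model built on them can still produce a useful vector for a spelling variant or inflection it has never seen, which matters for morphologically rich languages.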

Outline

The talk goes in depth on how to do intent classification on Telugu and Hindi Unicode text using deep neural networks that leverage existing word vectors such as word2vec and fastText.
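
As a rough illustration of intent classification on Hindi text, the sketch below swaps the deep neural network and pretrained vectors described above for something much simpler: each intent is represented by its character trigram counts, and a query is assigned to the intent with the highest cosine similarity. This is not the speaker's method. The intent names, training utterances, and helper functions are all hypothetical and chosen only to keep the example self-contained.

```python
from collections import Counter
import math


def ngram_counts(text, n=3):
    """Count character n-grams of every whitespace-separated word in `text`."""
    counts = Counter()
    for word in text.split():
        wrapped = f"<{word}>"  # boundary markers, as in fastText-style subwords
        for i in range(len(wrapped) - n + 1):
            counts[wrapped[i:i + n]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[g] * b[g] for g in a if g in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical training utterances for two made-up intents.
TRAIN = {
    "book_ticket": ["टिकट बुक करो", "मुझे टिकट बुक करना है"],
    "cancel_ticket": ["टिकट रद्द करो", "मेरा टिकट रद्द कर दो"],
}

# One aggregated n-gram profile per intent.
PROFILES = {
    intent: sum((ngram_counts(u) for u in utterances), Counter())
    for intent, utterances in TRAIN.items()
}


def classify(text):
    """Return the intent whose n-gram profile is most similar to `text`."""
    query = ngram_counts(text)
    return max(PROFILES, key=lambda intent: cosine(query, PROFILES[intent]))


print(classify("टिकट रद्द कर दो"))
print(classify("टिकट बुक कर दो"))
```

A production system along the lines of the talk would replace the count profiles with learned word vectors and the cosine lookup with a trained neural classifier, but the input handling for Unicode text stays the same.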

Speaker bio

I, Hitesh, am a self-taught machine learning practitioner. I moved to the data science team at Ibibo Group after spending the last 2 years learning deep learning. Having taken the entire chat system of Goibibo and MakeMyTrip live in English and Hindi, I want to share the learnings with the audience.

Slides

https://www.mindmeister.com/1112805945?t=SH64Op4PT6#

Hosted by

Jump starting better data engineering and AI futures