Text made Understandable by Machines
Submitted by Ashish Kumar (@ashish122) on Monday, 30 May 2016
Understanding language is a trivial task for humans, but getting machines to mimic it is far from trivial. For humans, everything (image, text, speech, etc.) ultimately arrives as electrical impulses. In the same way, for machines everything is numbers, either in vector form (in the case of text or speech) or in matrix form (in the case of images or videos). Deep learning has recently shown a lot of promise for Natural Language Processing (NLP) applications. Traditionally, most NLP approaches represent documents or sentences with a sparse bag-of-words representation.
A lot of work goes beyond this by adopting a distributed representation: a so-called “neural embedding”, or vector space representation, is constructed for each word (word2vec), sentence (thought vectors) or document (doc2vec).
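To make the contrast between the two kinds of representation concrete, here is a minimal sketch in plain Python. The two example sentences, the helper names (`bag_of_words`, `cosine`) and the two-dimensional `toy_embedding` values are all made up for illustration; the dense vectors are hand-picked, not taken from a trained word2vec model.

```python
import math
from collections import Counter


def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical two-document corpus.
docs = ["the hotel was good", "the motel was great"]
vocabulary = sorted({word for doc in docs for word in doc.lower().split()})

# Sparse representation: one dimension per vocabulary word.
bow = [bag_of_words(doc, vocabulary) for doc in docs]

# In a bag-of-words space each word is its own axis (a one-hot vector),
# so two different words are always orthogonal, however related they are.
one_hot = {w: [1 if w == v else 0 for v in vocabulary] for w in vocabulary}
sparse_sim = cosine(one_hot["hotel"], one_hot["motel"])

# Dense representation: hand-picked toy values (an assumption, not real
# word2vec output) in which related words point in similar directions.
toy_embedding = {
    "hotel": [0.9, 0.1],
    "motel": [0.8, 0.2],
    "good": [0.1, 0.9],
}
dense_related = cosine(toy_embedding["hotel"], toy_embedding["motel"])
dense_unrelated = cosine(toy_embedding["hotel"], toy_embedding["good"])

print(vocabulary)
print(bow)
print(sparse_sim, dense_related, dense_unrelated)
```

In the sparse space "hotel" and "motel" have no similarity at all, while in the toy dense space they end up closer to each other than "hotel" is to "good". That is the property learned embeddings aim to provide.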
1) Introduction and the importance of Word Embedding
2) Old methods used for text representation
3) Word2Vec and its pros and cons
4) Thought Vectors and its pros and cons
5) Doc2Vec and its pros and cons
I’m a software engineer at Snapshopr. You can also view my profile: https://in.linkedin.com/in/ashish30