PyCon Pune 2017

A conference on the Python programming language

Akhil Gupta

@codeorbit

Pattern Recognition of Twitter Users Through Semantic Topic Modelling

Submitted Nov 29, 2016

In machine learning and NLP, a topic model is a type of statistical model for discovering the abstract “topics” that occur in a collection of documents. This project aims to compute a similarity percentage between two or more Twitter profiles (not necessarily those of individual people) across multiple domains, based on their tweets, the hashtags used in them, and the users they follow. Topic modelling is the main approach; in addition, we will use sentiment analysis of the tweets and the emojis used in them to get more accurate results. Discovering similar accounts for each Twitter user has a variety of applications, including user recommendation and advertisement targeting.
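As a rough illustration of the "similarity percentage" idea, the sketch below assumes each profile has already been reduced to a vector of topic weights by some topic model. It scores two profiles with cosine similarity scaled to a percentage, which is one common choice and not necessarily the measure used in the talk. The function names, the three example topics, and all the numbers are made up for illustration.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length topic-weight vectors."""
    if len(p) != len(q):
        raise ValueError("topic vectors must have the same length")
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        # A profile with no topic mass cannot be compared meaningfully.
        return 0.0
    return dot / (norm_p * norm_q)


def similarity_percentage(p, q):
    """Express cosine similarity as a percentage rounded to one decimal."""
    return round(100.0 * cosine_similarity(p, q), 1)


# Hypothetical topic weights over (sports, politics, technology).
user_a = [0.70, 0.10, 0.20]
user_b = [0.60, 0.15, 0.25]
user_c = [0.05, 0.85, 0.10]

print(similarity_percentage(user_a, user_b))
print(similarity_percentage(user_a, user_c))
```

With these made-up vectors, the first pair (both dominated by the same topic) scores higher than the second pair, which is the behaviour a recommendation step would rely on.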

Outline

  • Topic modelling: a brief overview.
  • Algorithms and frameworks for performing topic modelling.
  • Problems and solutions in handling small documents (tweets).
  • Various analyses performed on hashtags and tweets.
  • Handling abbreviations and emojis.
  • Detecting sentiment on a particular topic using tweets and emojis.
  • Generating a user personality matrix/graph.
  • Comparing a user’s personality graph with those of other users to generate recommendations.
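The proposal does not say how short documents are handled. One commonly used workaround, shown here purely as an assumption, is to pool each user's tweets into a single longer pseudo-document before topic modelling, and to count hashtags on the pooled text. The helper names, regular expressions, and sample tweets below are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Simplified patterns; real tweets may need more careful tokenisation.
HASHTAG_RE = re.compile(r"#(\w+)")
URL_RE = re.compile(r"https?://\S+")


def pool_tweets_by_user(tweets):
    """Join each user's tweets into one pseudo-document, dropping URLs."""
    pooled = defaultdict(list)
    for user, text in tweets:
        cleaned = URL_RE.sub("", text).strip()
        if cleaned:
            pooled[user].append(cleaned)
    return {user: " ".join(texts) for user, texts in pooled.items()}


def hashtag_counts(document):
    """Count hashtags in a document, ignoring case."""
    return Counter(tag.lower() for tag in HASHTAG_RE.findall(document))


# Made-up (user, tweet) pairs.
tweets = [
    ("alice", "Loving the new #Python release https://example.com/x"),
    ("alice", "Topic models on tweets are hard #NLP #python"),
    ("bob", "Match day! #cricket"),
]

docs = pool_tweets_by_user(tweets)
for user, doc in sorted(docs.items()):
    print(user, dict(hashtag_counts(doc)))
```

Pooling trades per-tweet detail for longer documents, which gives a topic model more word co-occurrence evidence per user.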

Requirements

Python 2
scikit-learn
SciPy
NumPy
word2vec

Speaker bio

Akhil Gupta


  • Currently interning at Amazon, Chennai.
  • He previously spoke at PyCon Delhi, 2016, on “Creating a recommendation engine based on NLP and contextual word embedding.”
  • He has also worked in the areas of:
      • Natural language processing.
      • Classification algorithms.
      • Topic modelling.
      • Clustering using probabilistic models.
      • Twitter mining.
  • He likes to build software that eases human effort in some way, including:
      • Content-based semantic image retrieval.
      • A language model with features such as autocomplete, entity tagging, spell check, and word segmentation.
      • An entity tagger built on the Wikipedia dataset for entity tagging and domain identification.
      • A restaurant recommendation engine based on food items.
  • His detailed LinkedIn profile.
  • Other profiles: GitHub, Twitter.

Varun Dey


  • Winter intern 2017 at Symantec, Chennai.
  • Backend software developer with significant experience using Python as his main stack.
  • Loves automating things for lazy humans.
  • Built a Google PageSpeed Insights extension that gets a humble thousand downloads per month.
  • Built a semantic search engine on Wikipedia from the open-source dumps provided by DBpedia.
  • Developed an improved implementation of DuckDuckGo’s Zero Click Info Goodies cheat sheet that is much more extensible and doesn’t break on simple queries.
  • Find all his online footprints and his portfolio at his website.
  • Find his open-source projects on GitHub, or find him on LinkedIn.

Links

  • The GitHub link and slides will be uploaded soon.
