PyCon Pune 2017

A conference on the Python programming language

Manas Ranjan Kar


Creating a multilingual resume ranking API engine based on NLP and contextual word embeddings

Submitted Nov 30, 2016

How can we create a resume ranking engine based purely on job descriptions from job boards? Can we create recommendations based purely on skillset & job role? How do we use natural language processing techniques to create valid recommendations of related skillsets? For example, how can we recommend “AngularJS” to an HTML developer who wants to strengthen their CV? A further challenge lies in dealing with multilingualism: how do we handle postings from both English & Dutch job boards?

This talk will showcase how a recommendation engine can be built from job descriptions using word2vec, a state-of-the-art word embedding technique. We will create something that not only matches the existing recommender systems deployed by job websites, but goes one step further: ranking & scoring a resume from its content. The beauty of such a framework is that it not only supports online learning, but is also not too sensitive to language differences.

How do we account for the proper skillsets and build them into our ranking systems? The talk will answer these questions and showcase the effectiveness of such a resume ranking engine.
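To make the skillset recommendation idea concrete, here is a minimal sketch of looking up the nearest neighbours of a skill in an embedding space. It is not the talk's implementation: the vectors in `TOY_VECTORS` are hand-made, hypothetical values, and the helper names `cosine` and `recommend_skills` are invented for illustration. In a real system the vectors would presumably come from a word2vec model trained on job descriptions.

```python
import math

# Hypothetical 3-dimensional vectors, made up for illustration only.
# A word2vec model trained on job posts would supply learned vectors instead.
TOY_VECTORS = {
    "html": [0.9, 0.1, 0.0],
    "css": [0.85, 0.15, 0.05],
    "angularjs": [0.8, 0.3, 0.1],
    "javascript": [0.75, 0.35, 0.1],
    "sql": [0.1, 0.9, 0.2],
    "accounting": [0.0, 0.1, 0.95],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_skills(known_skills, vectors, topn=3):
    """Return the topn skills closest to the centroid of the known skills."""
    known = [s.lower() for s in known_skills if s.lower() in vectors]
    if not known:
        return []
    dims = len(vectors[known[0]])
    # Average the vectors of the skills already on the CV.
    centroid = [sum(vectors[s][i] for s in known) / len(known) for i in range(dims)]
    # Score every skill that is not already known, then sort by similarity.
    candidates = [
        (skill, cosine(centroid, vec))
        for skill, vec in vectors.items()
        if skill not in known
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates[:topn]


if __name__ == "__main__":
    for skill, score in recommend_skills(["HTML"], TOY_VECTORS):
        print(f"{skill}: {score:.3f}")
```

With these toy values, an HTML-only CV is pointed towards the web skills (including "angularjs") rather than "sql" or "accounting", which is the behaviour the example in the abstract describes.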


  • Resumes & CVs on job boards
  • Introduction to word2vec
  • Data Collection from job posts
  • Handling multilingualism - Dutch & English
  • Preprocessing steps
  • Ranking - Logic & Algorithms
  • Deployment via API
  • Results & Discussions
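
The ranking step in the outline above can be sketched as follows, under stated assumptions. This is one simple approach (averaging word vectors and comparing by cosine similarity), not necessarily the logic presented in the talk. The vectors are hypothetical hand-made values, and the sketch assumes English and Dutch words already live in one shared vector space; the names `tokenize`, `embed`, and `rank_resumes` are invented for illustration.

```python
import math
import re

# Hypothetical 2-dimensional vectors, made up for illustration only.
# Assumption: English and Dutch tokens share a single embedding space.
TOY_VECTORS = {
    "developer": [0.9, 0.1],
    "ontwikkelaar": [0.88, 0.12],  # Dutch: developer
    "html": [0.95, 0.05],
    "javascript": [0.9, 0.15],
    "accountant": [0.1, 0.9],
    "boekhouder": [0.12, 0.88],  # Dutch: accountant
    "invoices": [0.05, 0.95],
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def embed(text, vectors):
    """Average the vectors of all known tokens; None if no token is known."""
    hits = [vectors[tok] for tok in tokenize(text) if tok in vectors]
    if not hits:
        return None
    dims = len(hits[0])
    return [sum(vec[i] for vec in hits) / len(hits) for i in range(dims)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_resumes(job_description, resumes, vectors):
    """Score each resume against the job description, best match first."""
    job_vec = embed(job_description, vectors)
    scored = []
    for name, text in resumes.items():
        vec = embed(text, vectors)
        # Resumes with no known tokens get a score of 0.0.
        if job_vec is None or vec is None:
            score = 0.0
        else:
            score = cosine(job_vec, vec)
        scored.append((name, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored


if __name__ == "__main__":
    resumes = {
        "a": "Ervaren ontwikkelaar, HTML",
        "b": "Boekhouder, invoices",
        "c": "Hobbies: gardening",
    }
    for name, score in rank_resumes("HTML and JavaScript developer", resumes, TOY_VECTORS):
        print(f"{name}: {score:.3f}")
```

Averaging word vectors keeps the sketch short, but it ignores word order and gives every token equal weight; a production ranker would likely weight skills and job roles differently.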

Speaker bio

Manas likes helping clients make sense of their data and build a powerful case for business change using analytics in their respective companies.

He has architected multiple commercial NLP solutions in the areas of healthcare, food & beverages, finance and retail. He is deeply involved in functionally architecting large-scale business process automation & deriving deep insights from structured & unstructured data using Natural Language Processing & Machine Learning. He has contributed to Gensim & ConceptNet.

To sum up his experience, he has worked on:

  • Applying machine learning to build text analytics solutions
  • Automating business processes for efficiency & productivity
  • Building algorithms for extracting multiple facets from text - gender of author, keywords, sentiment, taxonomies, concepts, entities
  • Combining and augmenting unstructured insights with structured data
  • Building a recommendation engine for automated medical coding services
  • Building models to predict taxonomies for textual content
  • Creating machine learning algorithms for topic detection & sentiment analysis
  • Developing competitive intelligence algorithms to monitor events & trends for startups & SMEs

His detailed LinkedIn profile is . His Github profile is


