Creating a multilingual resume ranking API engine based on NLP and contextual word embeddings

Feb 16 and 17, 2017 (Thu and Fri), 09:00 AM – 06:00 PM IST

Amanora The Fern Hotels and Club, Pune

Submitted Nov 30, 2016

Technical level: Intermediate

How can we create a resume ranking engine based purely on job descriptions from job boards? Can I create recommendations based purely on skillset & job role? How do I use natural language processing techniques to create valid recommendations of related skillsets? For example, how can I recommend “AngularJS” to an HTML developer who wants to strengthen their CV? The other challenge lies in dealing with multilingualism: how do we handle job posts from both English & Dutch job boards?
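As a rough illustration of the related-skill idea, the sketch below recommends skills by cosine similarity in an embedding space. The vectors in `SKILL_VECTORS` are invented toy values and `recommend_skills` is a hypothetical helper; in the approach the talk describes, the vectors would instead be learned from job descriptions with word2vec.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only.
# A real system would learn these from job descriptions with word2vec.
SKILL_VECTORS = {
    "html": [0.9, 0.1, 0.0],
    "css": [0.85, 0.15, 0.05],
    "angularjs": [0.8, 0.3, 0.1],
    "javascript": [0.75, 0.35, 0.1],
    "hadoop": [0.05, 0.2, 0.95],
    "spark": [0.1, 0.25, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend_skills(known_skills, vectors, top_n=3):
    """Return the top_n skills closest to the mean vector of known_skills."""
    known = [s.lower() for s in known_skills if s.lower() in vectors]
    if not known:
        return []
    dim = len(vectors[known[0]])
    # Represent the candidate's profile as the average of their skill vectors.
    mean = [sum(vectors[s][i] for s in known) / len(known) for i in range(dim)]
    candidates = [
        (skill, cosine(mean, vec))
        for skill, vec in vectors.items()
        if skill not in known
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [skill for skill, _ in candidates[:top_n]]


print(recommend_skills(["HTML"], SKILL_VECTORS, top_n=2))
```

With these toy values, front-end skills such as "angularjs" sit close to "html" while "hadoop" and "spark" sit far away, so only the former are suggested.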

This talk will showcase how a recommendation engine can be built from job descriptions using a state-of-the-art technique: word2vec. We will create something that not only matches the existing recommender systems deployed by job websites but goes one step further: ranking & scoring a resume from its content. The beauty of such a framework is that it not only supports online learning but is also not too sensitive to language differences.
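One common way to score a resume against a job description with word embeddings is to average the word vectors of each text and compare the averages by cosine similarity. The sketch below assumes that scheme; it is not necessarily the scoring logic presented in the talk. `TOY_VECTORS`, `embed`, and `rank_resumes` are hypothetical names, and the vectors are made-up values standing in for a trained word2vec model.

```python
import math
import re

# Toy vectors invented for illustration; real ones would come from a
# word2vec model trained on job descriptions.
TOY_VECTORS = {
    "python": [0.9, 0.1, 0.1],
    "django": [0.8, 0.2, 0.1],
    "flask": [0.85, 0.15, 0.1],
    "developer": [0.7, 0.2, 0.3],
    "sales": [0.1, 0.9, 0.1],
    "marketing": [0.1, 0.85, 0.2],
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text, vectors):
    """Average the vectors of known tokens; return None if none are known."""
    found = [vectors[t] for t in tokenize(text) if t in vectors]
    if not found:
        return None
    dim = len(found[0])
    return [sum(v[i] for v in found) / len(found) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_resumes(job_description, resumes, vectors):
    """Return (resume_id, score) pairs sorted from best to worst match."""
    job_vec = embed(job_description, vectors)
    scored = []
    for resume_id, text in resumes.items():
        resume_vec = embed(text, vectors)
        if job_vec is None or resume_vec is None:
            score = 0.0  # nothing to compare when no token is in the vocabulary
        else:
            score = cosine(job_vec, resume_vec)
        scored.append((resume_id, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored


resumes = {
    "a": "Python developer with Django and Flask",
    "b": "Sales and marketing lead",
}
for resume_id, score in rank_resumes("Looking for a Python developer", resumes, TOY_VECTORS):
    print(resume_id, round(score, 3))
```

Averaging ignores word order and treats every token equally, so a production ranker would likely weight skill terms more heavily; the sketch only shows the basic shape of embedding-based scoring.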

How do we account for the proper skillsets and build them into our ranking systems? The talk will answer these questions and showcase the effectiveness of such a resume ranking engine.

Outline

Resumes & CVs on job boards
Introduction to word2vec
Data Collection from job posts
Handling multilingualism - Dutch & English
Preprocessing steps
Ranking - Logic & Algorithms
Deployment via API
Results & Discussions
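
To give a flavour of the preprocessing and multilingual steps in the outline, here is a minimal sketch that tokenizes a job post, guesses whether it is English or Dutch, and drops stopwords for that language. Everything in it is an assumption made for illustration: the stopword samples are tiny, the stopword-overlap language guess is a naive heuristic, and `guess_language` and `preprocess` are hypothetical names rather than the pipeline used in the talk.

```python
import re

# Tiny illustrative stopword samples; a real pipeline would use fuller lists.
STOPWORDS = {
    "en": {"the", "a", "an", "and", "with", "for", "of", "in", "we", "are"},
    "nl": {"de", "het", "een", "en", "met", "voor", "van", "in", "wij", "zijn"},
}

# Keep skill names such as "c++", "c#" and "node.js" as single tokens,
# while leaving sentence-final periods out of the token.
TOKEN_PATTERN = re.compile(r"\w[\w+#.]*[\w+#]|\w")


def guess_language(tokens):
    """Pick the language whose stopword sample overlaps the tokens most.

    Ties fall back to the first language listed in STOPWORDS.
    """
    counts = {
        lang: sum(1 for t in tokens if t in words)
        for lang, words in STOPWORDS.items()
    }
    return max(counts, key=counts.get)


def preprocess(text):
    """Lowercase, tokenize, guess the language, and drop its stopwords."""
    tokens = TOKEN_PATTERN.findall(text.lower())
    lang = guess_language(tokens)
    return lang, [t for t in tokens if t not in STOPWORDS[lang]]


print(preprocess("We are looking for a developer with AngularJS and Node.js experience."))
print(preprocess("Wij zijn op zoek naar een ontwikkelaar met ervaring in AngularJS en C++."))
```

Skill names such as "angularjs" survive unchanged in both languages, which is one reason a shared embedding space over job posts can be fairly tolerant of language differences.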

Speaker bio

Manas likes helping clients make sense of their data and build a powerful case for business change using analytics in their respective companies.

He has architected multiple commercial NLP solutions in the areas of healthcare, food & beverages, finance and retail. He is deeply involved in architecting large-scale business process automation and in deriving deep insights from structured & unstructured data using Natural Language Processing & Machine Learning. He has contributed to Gensim & ConceptNet.

To sum up his experience, he has worked on:

Applying machine learning to build text analytics solutions
Automating business processes for efficiency & productivity
Building algorithms for extracting multiple facets from text - gender of author, keywords, sentiment, taxonomies, concepts, entities
Combining and augmenting unstructured insights with structured data
Building a recommendation engine for automated medical coding services
Building models to predict taxonomies for textual content
Creating machine learning algorithms for topic detection & sentiment analysis
Building competitive intelligence algorithms to monitor events & trends for startups & SMEs

His detailed LinkedIn profile is at https://in.linkedin.com/in/manasranjankar and his GitHub profile is at https://github.com/manasRK

PyCon Pune 2017