How we are building serverless architectures for Deep Learning & NLP at Episource

This submission has been added to the schedule

Submitted Apr 30, 2017

Section: Crisp talk for data engineering track
Technical level: Intermediate

Serverless is the new kid on the block, and an exciting one at that! As Anand Chitipothu puts it, it’s rapidly becoming the Uber of cloud computing resources.

At Episource, we have been building a scalable NLP pipeline for information extraction from medical discharge summaries. However, processing millions of charts can be expensive. In this talk, we will show how we created a serverless, event-driven architecture for deep learning. The platform requires no manual intervention and minimal upkeep. We will share the challenges in creating such an architecture, especially when requirements such as immutable configuration and heavy data workloads don’t go away easily. A detailed demo of the architecture will be showcased as well.
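
As a rough illustration of what "event driven" can mean for a pipeline like this, here is a minimal sketch in Python. It is not the architecture presented in the talk. It assumes a storage service that emits S3-style upload notifications, and the names handler, load_chart and extract_entities are hypothetical placeholders, with the last one standing in for the actual NLP model.

```python
from urllib.parse import unquote_plus


def extract_entities(text):
    """Hypothetical stand-in for the NLP model: returns capitalised tokens."""
    return [tok for tok in text.split() if tok[:1].isupper()]


def load_chart(bucket, key):
    """Hypothetical loader: a real one would fetch the object from storage."""
    return f"Patient chart stored in {bucket} under {key}"


def handler(event, context=None):
    """Entry point invoked once per upload notification (assumed S3-style event)."""
    results = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3-style notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        text = load_chart(bucket, key)
        results.append({"key": key, "entities": extract_entities(text)})
    return results
```

The point of the sketch is that no server waits for work: each uploaded chart triggers one short-lived invocation, so compute is consumed only while a chart is being processed.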

The audience can expect to learn from our experiences and maybe go serverless in their own roles as well!

Outline

Problems & Challenges
Why Serverless?
Components of this serverless NLP architecture
Towards an immutable configuration
Architecture Diagram & Details
Impact of the serverless architecture
Demo

The slides link has been shared, and will be updated regularly.

Speaker bio

I am currently leading the NLP & Data Science practice at Episource, a US healthcare company. My daily work revolves around working on semantic technologies and computational linguistics (NLP), building algorithms and machine learning models, researching data science journals and architecting secure product backends in the cloud.

The tech stack that my team and I typically work on includes:

Language: Python
Testing Frameworks: unittest, pytest
Automation & Configuration Management: Ansible, Docker, Vagrant
CI: Travis CI
Cloud Services: AWS, Google Cloud, MS Azure
APIs: Bottle, CherryPy, Flask
Databases: MySQL, SQLite, MSSQL, RDF stores, Neo4j, Elasticsearch, MongoDB, Redis
Editor: Sublime Text, PyCharm

I have architected multiple commercial NLP solutions in the areas of healthcare, food & beverages, finance and retail. I am deeply involved in architecting large-scale business process automation and deriving deep insights from structured & unstructured data using Natural Language Processing & Machine Learning. I have contributed to multiple NLP libraries like Gensim and ConceptNet5. I blog regularly on NLP on multiple forums like Data Science Central, LinkedIn and my blog NLP Wave.

I love teaching and mentoring students. I speak regularly on NLP and text analytics at conferences and meetups like PyCon India and PyData. I have also taught multiple hands-on sessions at IIM Lucknow and MDI Gurgaon. I have mentored students from schools like ISB Hyderabad, BITS Pilani and Madras School of Economics. When bored, I like to fall back on Asimov to lead me into an alternate reality.

Links

LinkedIn: https://in.linkedin.com/in/manasranjankar
Contribution to Gensim (PR #625): https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/scripts/glove2word2vec.py
Blog: http://unlocktext.com/
Related Blog Article: http://unlocktext.com/index.php/2015/12/14/using-glove-vectors-in-gensim/
Context oriented NLP: https://www.linkedin.com/pulse/context-extraction-better-sentiment-analysis-manas-ranjan-kar?trk=prof-post
Analysing product reviews for context cues: http://www.datasciencecentral.com/profiles/blogs/impactful-text-analytics-for-smarter-businesses

Slides

https://docs.google.com/presentation/d/1MhCp3Q0voxkbH1HviuEzjhRKzSIFJfTPNCihu83pHC8/edit?usp=sharing

The Fifth Elephant 2017