The Fifth Elephant 2017

On data engineering and application of ML in diverse domains

Manas Ranjan Kar

@manasrkar_episource

How we are building serverless architectures for Deep Learning & NLP at Episource

Submitted Apr 30, 2017

Serverless is the new kid on the block, and an exciting one at that! As Anand Chitipothu puts it, it’s rapidly becoming the Uber of cloud computing resources.

At Episource, we have been building a scalable NLP pipeline for information extraction from medical discharge summaries. However, processing millions of charts can be expensive. In this talk, we will show how we created a serverless, event-driven architecture for deep learning. The platform requires no manual intervention and minimal upkeep and maintenance. We will share the challenges in creating such an architecture, especially when requirements of immutable configuration and heavy data workloads don’t go away easily. We will also give a detailed demo of the architecture.
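To make "event driven" concrete, here is a minimal sketch of a function that reacts to a storage upload event and runs text extraction on the new file. It is not Episource's implementation. It assumes the AWS Lambda handler shape and the S3 event notification layout; `extract_entities` and the `read_object` parameter are hypothetical stand-ins for the real model and the real storage read, so the sketch runs without any cloud credentials.

```python
from urllib.parse import unquote_plus


def extract_entities(text):
    """Hypothetical stand-in for the NLP model: returns capitalised tokens."""
    return [tok.strip(".,") for tok in text.split() if tok[:1].isupper()]


def handler(event, context=None, read_object=None):
    """Handler in the (event, context) shape used by AWS Lambda.

    read_object is a hypothetical hook, (bucket, key) -> str, injected here
    so the sketch can run locally; a deployed function would read from S3.
    """
    if read_object is None:
        raise ValueError("read_object must be provided in this sketch")

    results = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        text = read_object(bucket, key)
        results.append({
            "bucket": bucket,
            "key": key,
            "entities": extract_entities(text),
        })
    return results
```

Each uploaded chart triggers one invocation, so nothing runs (or is billed) while no charts are arriving.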

The audience can expect to learn from our experiences, and maybe go serverless in their own roles as well!

Outline

  • Problems & Challenges
  • Why Serverless?
  • Components of this serverless NLP architecture
  • Towards an immutable configuration
  • Architecture Diagram & Details
  • Impact of the serverless architecture
  • Demo

The slides link has been shared, and will be updated regularly.

Speaker bio

I am currently leading the NLP & Data Science practice at Episource, a US healthcare company. My daily work revolves around working on semantic technologies and computational linguistics (NLP), building algorithms and machine learning models, researching data science journals and architecting secure product backends in the cloud.

The tech stack that my team and I typically work with includes:

Language: Python
Testing Frameworks: unittest, pytest
Automation & Configuration Management: Ansible, Docker, Vagrant
CI: Travis CI
Cloud Services: AWS, Google Cloud, MS Azure
APIs: Bottle, CherryPy, Flask
Databases: MySQL, SQLite, MSSQL, RDF stores, Neo4j, Elasticsearch, MongoDB, Redis
Editors: Sublime, PyCharm

I have architected multiple commercial NLP solutions in the areas of healthcare, food & beverages, finance and retail. I am deeply involved in functionally architecting large-scale business process automation and in deriving deep insights from structured and unstructured data using Natural Language Processing and Machine Learning. I have contributed to multiple NLP libraries, including Gensim and ConceptNet5. I blog regularly on NLP on multiple forums such as Data Science Central, LinkedIn and my blog NLP Wave.

I love teaching and mentoring students. I speak regularly on NLP and text analytics at conferences and meetups like PyCon India and PyData. I have also taught multiple hands-on sessions at IIM Lucknow and MDI Gurgaon. I have mentored students from schools like ISB Hyderabad, BITS Pilani and Madras School of Economics. When bored, I like to fall back on Asimov to lead me into an alternate reality.

Slides

https://docs.google.com/presentation/d/1MhCp3Q0voxkbH1HviuEzjhRKzSIFJfTPNCihu83pHC8/edit?usp=sharing


Hosted by

Jump starting better data engineering and AI futures