The Fifth Elephant 2017

On data engineering and application of ML in diverse domains

How we are building serverless architectures for Deep Learning & NLP at Episource

Submitted by Manas Ranjan Kar (@manasrkar-episource) on Sunday, 30 April 2017

Technical level

Intermediate

Section

Crisp talk for data engineering track

Status

Confirmed & Scheduled

Abstract

Serverless is the new kid on the block, and an exciting one at that! As Anand Chitipothu puts it, it’s rapidly becoming the Uber of cloud computing resources.

At Episource, we have been building a scalable NLP pipeline for information extraction from medical discharge summaries. However, processing millions of charts can be expensive. In this talk, we will show how we created a serverless, event-driven architecture for deep learning. The platform requires no manual intervention and minimal upkeep. We will share the challenges in creating such an architecture, especially when requirements such as immutable configuration and heavy data workloads don’t go away easily. We will also give a detailed demo of the architecture.
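
As a rough illustration of the event-driven pattern only (not the actual Episource pipeline), the sketch below shows a function-as-a-service style handler in Python that reacts to a storage upload notification and runs a text-extraction step on each new document. The event layout follows the AWS S3 notification format; `extract_entities`, `make_handler`, `load_text`, the bucket name and the sample text are all hypothetical names and data introduced for this example, and the "model" is a trivial stand-in.

```python
from typing import Callable, Dict, List
from urllib.parse import unquote_plus


def extract_entities(text: str) -> List[str]:
    """Hypothetical stand-in for the NLP model: keeps capitalised tokens."""
    return [tok.strip(".,;") for tok in text.split() if tok[:1].isupper()]


def make_handler(load_text: Callable[[str, str], str]):
    """Build a Lambda-style handler.

    load_text(bucket, key) is injected (an assumption of this sketch) so the
    example runs locally without any cloud access.
    """

    def handler(event: Dict, context=None) -> List[Dict]:
        results = []
        # One notification can carry several records, one per uploaded object.
        for record in event.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            # Object keys arrive URL-encoded in S3 notifications.
            key = unquote_plus(record["s3"]["object"]["key"])
            text = load_text(bucket, key)
            results.append(
                {"bucket": bucket, "key": key, "entities": extract_entities(text)}
            )
        return results

    return handler


# Local dry run against an in-memory "bucket" (hypothetical data).
fake_store = {("charts", "summary 1.txt"): "patient reports Hypertension and Diabetes."}
handler = make_handler(lambda bucket, key: fake_store[(bucket, key)])
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "charts"}, "object": {"key": "summary+1.txt"}}}
    ]
}
print(handler(sample_event))
```

Because the handler keeps no state between invocations and is triggered purely by the upload event, the platform can run as many copies as there are incoming documents and none when there is nothing to process, which is where the cost argument for serverless comes from.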

The audience can expect to learn from our experiences and maybe go serverless in their own roles as well!

Outline

  • Problems & Challenges
  • Why Serverless?
  • Components of this serverless NLP architecture
  • Towards an immutable configuration
  • Architecture Diagram & Details
  • Impact of the serverless architecture
  • Demo

The link to the slides has been shared, and the slides will be updated regularly.

Speaker bio

I am currently leading the NLP & Data Science practice at Episource, a US healthcare company. My daily work revolves around working on semantic technologies and computational linguistics (NLP), building algorithms and machine learning models, researching data science journals and architecting secure product backends in the cloud.

The tech stack that my team and I typically work with includes:

Language: Python
Testing Frameworks: unittest, pytest
Automation & Configuration Management: Ansible, Docker, Vagrant
CI: Travis CI
Cloud Services: AWS, Google Cloud, MS Azure
APIs: Bottle, CherryPy, Flask
Databases: MySQL, SQLite, MSSQL, RDF stores, Neo4j, Elasticsearch, MongoDB, Redis
Editors: Sublime Text, PyCharm

I have architected multiple commercial NLP solutions in healthcare, food & beverages, finance and retail. I am deeply involved in architecting large-scale business process automation and in deriving deep insights from structured and unstructured data using Natural Language Processing and Machine Learning. I have contributed to multiple NLP libraries, such as Gensim and Conceptnet5. I blog regularly on NLP on forums such as Data Science Central and LinkedIn, and on my own blog, NLP Wave.

I love teaching and mentoring students. I speak regularly on NLP and text analytics at conferences and meetups such as PyCon India and PyData. I have also taught multiple hands-on sessions at IIM Lucknow and MDI Gurgaon, and have mentored students from schools such as ISB Hyderabad, BITS Pilani and Madras School of Economics. When bored, I like to fall back on Asimov to lead me into an alternate reality.

Links

Slides

https://docs.google.com/presentation/d/1MhCp3Q0voxkbH1HviuEzjhRKzSIFJfTPNCihu83pHC8/edit?usp=sharing

Preview video

https://drive.google.com/open?id=0B4HOxXZQF1yzZ0UwY1JPazZiT0k

Comments

  • Zainab Bawa (@zainabbawa) Reviewer a year ago

    Share a two-minute preview video explaining what this talk is about and why participants should attend.

    • Manas Ranjan Kar (@manasrkar-episource) Proposer a year ago

      Hi Zainab - I have posted the video as a Google Drive link. Please let me know if you require any more information.

      Have a nice evening!
