How we are building serverless architectures for Deep Learning & NLP at Episource
Manas Ranjan Kar
Serverless is the new kid on the block, and an exciting one at that! As Anand Chitipothu puts it, it’s rapidly becoming the Uber of cloud computing resources.
At Episource, we have been building a scalable NLP pipeline for information extraction from medical discharge summaries. However, processing millions of charts can be expensive. In this talk, we will show how we created a serverless, event-driven architecture for deep learning. The platform requires no manual intervention and minimal upkeep and maintenance. We will share the challenges in creating such an architecture, especially when requirements such as immutable configuration and heavy data workloads don’t go away easily. We will also give a detailed demo of the architecture.
The audience can expect to learn from our experiences and perhaps go serverless in their own roles as well!
- Problems & Challenges
- Why Serverless?
- Components of this serverless NLP architecture
- Towards an immutable configuration
- Architecture Diagram & Details
- Impact of the serverless architecture
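As a rough illustration of the event-driven and immutable-configuration ideas in the outline above, here is a minimal sketch in Python. It is not the architecture presented in the talk: the abstract does not name the services used, so the S3-style notification event, the Lambda-style `handler` signature, the `PipelineConfig` fields and the `NLP_MODEL_NAME` / `NLP_MAX_CHARS` environment variables are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Any, Dict, List
from urllib.parse import unquote_plus


@dataclass(frozen=True)
class PipelineConfig:
    """Immutable settings read once at start-up (field names are hypothetical)."""
    model_name: str
    max_chars: int


def load_config() -> PipelineConfig:
    # Environment variable names are assumptions for this sketch.
    return PipelineConfig(
        model_name=os.environ.get("NLP_MODEL_NAME", "baseline"),
        max_chars=int(os.environ.get("NLP_MAX_CHARS", "100000")),
    )


# Built once per process; frozen=True blocks later attribute assignment.
CONFIG = load_config()


def chart_locations(event: Dict[str, Any]) -> List[Dict[str, str]]:
    """Pull bucket/key pairs out of an S3-style notification event."""
    locations = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        locations.append({
            "bucket": s3["bucket"]["name"],
            # Object keys arrive URL-encoded in S3 notifications.
            "key": unquote_plus(s3["object"]["key"]),
        })
    return locations


def handler(event, context=None):
    """Entry point in the style of an AWS Lambda handler (assumed platform)."""
    jobs = chart_locations(event)
    # A real function would fetch each chart and run the NLP model here;
    # this sketch only reports what it would process.
    return {"model": CONFIG.model_name, "queued": jobs}
```

In a setup like this, each uploaded chart triggers one short-lived invocation, so nothing runs (or is billed) while no charts arrive, and the frozen configuration cannot drift between invocations.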
The slides link has been shared, and will be updated regularly.
I am currently leading the NLP & Data Science practice at Episource, a US healthcare company. My daily work revolves around working on semantic technologies and computational linguistics (NLP), building algorithms and machine learning models, researching data science journals and architecting secure product backends in the cloud.
The tech stack that my team and I typically work with includes:
Testing Frameworks: unittest, pytest
Automation & Configuration Management: Ansible, Docker, Vagrant
CI: Travis CI
Cloud Services: AWS, Google Cloud, MS Azure
APIs: Bottle, CherryPy, Flask
Databases: MySQL, SQLite, MSSQL, RDF stores, Neo4j, Elasticsearch, MongoDB, Redis
Editors: Sublime, PyCharm
I have architected multiple commercial NLP solutions in the areas of healthcare, food & beverages, finance and retail. I am deeply involved in functionally architecting large-scale business process automation and deriving deep insights from structured and unstructured data using Natural Language Processing and Machine Learning. I have contributed to multiple NLP libraries such as Gensim and ConceptNet5. I blog regularly about NLP on forums such as Data Science Central and LinkedIn, and on my own blog, NLP Wave.
I love teaching and mentoring students. I speak regularly on NLP and text analytics at conferences and meetups such as PyCon India and PyData. I have also taught multiple hands-on sessions at IIM Lucknow and MDI Gurgaon, and I have mentored students from schools such as ISB Hyderabad, BITS Pilani and Madras School of Economics. When bored, I like to fall back on Asimov to lead me into an alternate reality.
- LinkedIn: https://in.linkedin.com/in/manasranjankar
- Contribution to Gensim (PR #625): https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/scripts/glove2word2vec.py
- Blog: http://unlocktext.com/
- Related Blog Article: http://unlocktext.com/index.php/2015/12/14/using-glove-vectors-in-gensim/
- Context oriented NLP: https://www.linkedin.com/pulse/context-extraction-better-sentiment-analysis-manas-ranjan-kar?trk=prof-post
- Analysing product reviews for context cues: http://www.datasciencecentral.com/profiles/blogs/impactful-text-analytics-for-smarter-businesses