The Fifth Elephant 2020 edition
On data governance, engineering for data privacy and data science
Alok Kumar
A successful data science project is not just about building powerful models; it also depends on executing the entire project life cycle efficiently. Unfortunately, data science is often treated as an art practised by artists who rely on hard-to-guess, unexplainable tricks. The purpose of this talk is to make the “art” a “science” again.
In this talk, I will introduce a collection of powerful, open-source tools that help build an effective data science pipeline. While all the code and examples will be in Python, most recommendations and ideas are language-agnostic and can be applied to a range of programming languages and concepts. I use these tools at my workplace, and they have helped immensely in making the pipeline repeatable, reliable, accurate and fast.
By the end of the talk, you will have a good collection of tools for the different project stages, such as project organization, data import, data exploration, analysis, debugging and monitoring.
Author, speaker and creative technologist! I am a Director of Engineering and Innovation with extensive experience in leading strategic initiatives and driving cutting-edge, fast-paced innovation to help build AI-first organisations. I am a full-stack AI practitioner and a certified architect with a demonstrated track record of architecting and engineering large-scale solutions ranging from products to platforms.
I have authored a book on the topic and run local chapters of the not-for-profit international organizations AI Saturdays and Creative morning in NCR.