The Fifth Elephant 2017

On data engineering and application of ML in diverse domains

How to build a scalable and robust data pipeline iteratively

Submitted by Danish M (@pixelgenie) on Sunday, 4 June 2017

Technical level

Intermediate

Section

Full talk for data engineering track

Status

Submitted

Total votes:  +3

Abstract

I will drill down into how startups can build a scalable data pipeline using open source tools. What do all these tools do, and how do they fit into the ecosystem? And how do you iteratively build a scalable and robust data engineering pipeline as your company grows?

Outline

Companies, non-profit organizations, and governments are all starting to realize the huge value that data can provide to customers, decision makers, and concerned citizens. What is often neglected is the amount of engineering required to make that data accessible. Simply using SQL is no longer an option for large, unstructured, or real-time data. Building a system that makes data usable becomes a monumental challenge for data engineers.

There is no plug-and-play solution that solves every use case. A data pipeline meant for serving ads will look very different from a data pipeline meant for retail analytics. Since open-source technologies can be cobbled together in countless combinations, the options can be overwhelming when you first encounter them.

This talk drills down into how startups can build a scalable data pipeline using open source tools: what each of these tools does, how they fit into the ecosystem, and how to iteratively build a scalable and robust data engineering pipeline as the company grows.

Requirements

NA

Speaker bio

Co-founder & Growth Marketer at PixelGenie. Former Insight Data Science Fellow 2016, NYC.

Started as a self-taught product geek. Worked with many startups and helped them build their analytics infrastructure from the ground up. The amalgamation of tech and marketing is what I am most interested in. I believe a product is not just the technology. It is the whole experience around an offering, from tech to marketing to sales and post-sales experience.

I read and write about growth and data science. I believe the Internet is the most democratic system ever created. A person sitting in his or her bedroom with willingness, a laptop, and a good internet connection can change this world for the better. This is possible in the age of the Internet.

Comments

  • 1
    Zainab Bawa (@zainabbawa) Reviewer a year ago

    This proposal is too open-ended. Please share slides and a preview video explaining the content that will be covered and the key takeaways for the audience.
