The Fifth Elephant round the year submissions for 2019
Submit a talk on data, data science, analytics, business intelligence, data engineering and ML engineering
Anshul Singhle
In this talk, we will explore the advantages and challenges of running an in-house data platform on Spark and S3. We will also discuss how to add essential features such as autoscaling and access control to your platform. The latter part of the talk covers ways to organise data in S3, storage formats for big data, and indexing to improve read performance for big-data use cases. Overall, the intention of this talk is to share the problems we faced while scaling our data platform and some of the solutions that worked for us.
I have been working on big-data pipelines for the past 5 years, first at my startup, retention.ai, and later at inShorts. I currently work as a backend engineer at Zendrive.