The Fifth Elephant 2016

India's most renowned data science conference

High performance computing using Spark

Submitted by Anand Katti (@anandkatti) on Friday, 29 April 2016

Technical level

Intermediate

Status

Submitted

Vote on this proposal

Total votes:  +2

Abstract

Spark has revolutionized the way big data computations are done, providing an efficient framework for distributed data processing. In this session, I will cover our experience of implementing a large-scale big data platform (> 100 TB) using Spark, along with the challenges faced and lessons learned.

Speaker bio

Over 17 years of IT industry experience in data technologies
More than 3 years of experience in Big Data
Extensive experience in Hadoop, Spark and NoSQL
Architected and delivered multiple end-to-end Big Data projects

Comments

  • 1
    t3rmin4t0r (@t3rmin4t0r) Reviewer 2 years ago

    our experience of implementing a large scale big data platform

    Would be a good idea to add more detail on the plural “our” there.

    The 100 TB+ problems in ad-tech are very different from those faced in fin-tech.

    Are the problems discussed primarily about high-volume, but low-value click-streams or extremely dense high value data like insurance filings?
