Submit a talk on data

Submit talks on data engineering, data science, machine learning, big data and analytics through the year – 2019

Generating Data Analytics Reports using Scalable Config Driven Framework

Submitted by Satish Gopalani (@satishg) on Tuesday, 4 September 2018

Technical level: Intermediate


Generating a large number of analytics reports across hundreds of dimensions and metrics, for customers and internal stakeholders, is a critical part of the work of the BigData Analytics team at PubMatic.
Writing a custom job for each analytics report leads to repetitive effort and to the same business logic being duplicated across many jobs.
Another challenge is scaling the platform, which already processes 500 billion transactions (50 terabytes of data) per day on a 900-node cluster, with ever-growing volume.
Therefore, we built a platform for creating configuration-driven data processing pipelines out of highly reusable business functions. It is also extensible, so it can adopt new technologies as the big data ecosystem changes. The platform enables our development teams to build robust batch data processing pipelines that power analytics dashboards. It also lets novice users supply a configuration of facts and dimensions to generate ad-hoc reports in a single data processing job. The framework identifies and reuses existing business functions based on user inputs. It also provides an abstraction layer that keeps core business logic unaffected by technology changes. The framework is currently powered by Spark, but it can be configured to run on other technologies.
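
To make the idea concrete, here is a minimal sketch of a configuration-driven report, written in plain Python rather than Spark. It is not PubMatic's framework: the registry, the `InMemoryEngine` class, the `run_report` function and the row fields (`publisher`, `country`, `bid_price`, `won`) are all hypothetical names chosen for illustration. It shows reusable business functions being looked up from a user-supplied configuration, and a small engine interface standing in for the abstraction layer.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Row = Dict[str, object]

# Hypothetical registry of reusable business functions, keyed by metric name.
METRICS: Dict[str, Callable[[Row], float]] = {}


def metric(name: str):
    """Register a business function under the name used in report configs."""
    def register(fn: Callable[[Row], float]) -> Callable[[Row], float]:
        METRICS[name] = fn
        return fn
    return register


@metric("revenue")
def revenue(row: Row) -> float:
    # Illustrative rule only: a transaction earns its bid price when won.
    return float(row["bid_price"]) if row["won"] else 0.0


@metric("impressions")
def impressions(row: Row) -> float:
    return 1.0 if row["won"] else 0.0


class InMemoryEngine:
    """Stand-in backend. A Spark-backed engine could expose the same method,
    so the business functions above would not need to change."""

    def group_sum(
        self,
        rows: Iterable[Row],
        dimensions: List[str],
        metric_fns: Dict[str, Callable[[Row], float]],
    ) -> Dict[Tuple, Dict[str, float]]:
        totals = defaultdict(lambda: defaultdict(float))
        for row in rows:
            key = tuple(row[d] for d in dimensions)
            for name, fn in metric_fns.items():
                totals[key][name] += fn(row)
        return {key: dict(values) for key, values in totals.items()}


def run_report(config: Dict[str, List[str]], rows: Iterable[Row], engine):
    """Resolve the configured metrics to registered functions and run them
    all in a single pass over the data."""
    unknown = [m for m in config["metrics"] if m not in METRICS]
    if unknown:
        raise KeyError(f"no business function registered for: {unknown}")
    metric_fns = {m: METRICS[m] for m in config["metrics"]}
    return engine.group_sum(rows, config["dimensions"], metric_fns)


# An ad-hoc report is just a configuration of dimensions and metrics.
config = {"dimensions": ["publisher", "country"],
          "metrics": ["revenue", "impressions"]}
rows = [
    {"publisher": "p1", "country": "IN", "bid_price": 2.0, "won": True},
    {"publisher": "p1", "country": "IN", "bid_price": 1.5, "won": False},
    {"publisher": "p2", "country": "US", "bid_price": 3.0, "won": True},
]
report = run_report(config, rows, InMemoryEngine())
```

In this sketch a new report needs only a new `config`, and a different execution backend needs only another class with a `group_sum` method. How the real framework splits these responsibilities is not stated in the proposal.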


  • Overview of Data Pipelines @ PubMatic
  • Scale and its issues
  • Data Framework Details
  • Uses of the framework and future use cases

Speaker bio

Satish Gopalani
A Machine Learning/AI and Distributed Systems engineer who enjoys solving complex problems and designing applications and systems that work at scale. I have worked on a variety of complex engineering projects, including building a predictive ML project for online advertising, deriving interesting insights on the IPL (Indian Premier League), building connectors to offload data to Hadoop, and even modifying the Hadoop HDFS source code to make the NameNode more scalable. I have a B.Tech in Computer Science from VIT, Pune and a specialization in “Big Data Analytics” from IIM Bangalore.

Akshay Habbu
A Big Data Engineer with ample experience working at scale with Spark, MapReduce and HDFS. Has handled more than 60 TB of data streaming every day into a 900-node cluster with 45 PB under management. Deeply interested in designing and implementing complex, scalable data processing pipelines. Has varied interests, ranging from big data, analytics and software engineering to food blogging.


  • Zainab Bawa (@zainabbawa) Reviewer 10 months ago (edited 10 months ago)

    We only accept one speaker per session. Let us know who the primary contact is and who will present if this proposal is selected. Also, submit draft slides and a preview video for this proposal no later than 5 October.

  • Satish Gopalani (@satishg) Proposer 10 months ago

    Hi Zainab Bawa,

    Will upload the draft slides and preview video soon.

    Regarding speakers: my colleague and I presented together at ML Mini Conference 2017 Pune.
    You can refer to this proposal:
    Akshay and I worked on this together and prepared the slides and other material jointly, so we wanted to present together.

    Satish Gopalani

    • Zainab Bawa (@zainabbawa) Reviewer 8 months ago

      Our policy is one speaker per session. If you can’t comply with it, please withdraw your proposal.
