The Fifth Elephant 2018

The seventh edition of India's best data conference

Qubole Sparklens: understanding the scalability limits of Spark applications

Submitted by Rohit Karlupia (@sixthelephant) on Monday, 26 March 2018

Section: Full talk · Technical level: Intermediate · Status: Confirmed & Scheduled


One of the common requests we receive from customers at Qubole is help with debugging slow Spark applications. This is usually done by trial and error, which takes time, and it doesn't tell us where to look for further improvements. At Qubole we are working to make this process more self-serve.

Towards this goal we have built Sparklens, an open-source tool based on the Spark event listener framework. From a single run of an application, Sparklens provides insights into that application's scalability limits. In this talk we will cover what Sparklens does and the theory behind it. We will discuss how the structure of a Spark application places important constraints on its scalability, how to find these structural constraints, and how to use them as a guide in solving performance and scalability problems of Spark applications.
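As a rough illustration of the event-listener idea, here is a minimal sketch in plain Python (a toy model I wrote for this page; the class and method names are hypothetical and not Sparklens' actual API): a listener-style collector records task-end events during a single run and aggregates per-stage timing, which is the raw material for any scalability analysis.

```python
from collections import defaultdict

class TaskEventCollector:
    """Toy stand-in for a Spark event listener: records task
    durations per stage as the application runs."""

    def __init__(self):
        # stage_id -> list of task durations in milliseconds
        self.stage_tasks = defaultdict(list)

    def on_task_end(self, stage_id, duration_ms):
        self.stage_tasks[stage_id].append(duration_ms)

    def stage_summary(self, stage_id):
        tasks = self.stage_tasks[stage_id]
        return {
            "num_tasks": len(tasks),
            "total_ms": sum(tasks),   # total compute demanded by the stage
            "max_ms": max(tasks),     # the slowest task bounds the stage
        }

# Feed it a few synthetic task-end events for stage 0.
collector = TaskEventCollector()
for d in [100, 250, 90]:
    collector.on_task_end(stage_id=0, duration_ms=d)
print(collector.stage_summary(0))  # num_tasks 3, total_ms 440, max_ms 250
```

The key point is that everything needed for the analysis comes from events emitted during one ordinary run, with no instrumentation of the application code itself.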

This talk will help the audience answer the following questions about their Spark applications:
1) Will the application run faster with more executors?
2) How will cluster utilization change as the number of executors changes?
3) What is the absolute minimum time this application will take, even given infinite executors?
4) What is the expected wall-clock time for the application once its most important structural limits are fixed?
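One simple way to reason about question 3 above (a hedged sketch of my own, not necessarily Sparklens' actual computation): with unlimited executors every task in a stage can run in parallel, so each stage takes at least as long as its slowest task, and the driver's own time is irreducible. Under that simplified model:

```python
def min_wall_clock_ms(driver_ms, stages):
    """Lower bound on wall-clock time with infinite executors:
    driver time plus, for each stage, the duration of its slowest
    task. Stages are assumed sequential here; a real DAG may
    overlap independent stages."""
    return driver_ms + sum(max(task_durations) for task_durations in stages)

# Two stages: even with infinite executors, the first stage
# cannot finish before its 900 ms straggler task does.
stages = [[100, 200, 900], [300, 300]]
print(min_wall_clock_ms(driver_ms=500, stages=stages))  # 1700
```

This makes concrete why adding executors eventually stops helping: the bound is set by driver time and per-stage skew, not by cluster size.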


1) Single threaded applications
2) Multi-threaded applications
3) Distributed applications using Spark
4) When the application “does nothing” and why?
5) Driver, Parallelism & Skew
6) Critical path of a Spark application
7) Defining the ideal Spark application
8) Introduction to Sparklens
9) Understanding Sparklens report
10) Where to fish for further improvements
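To illustrate the second question above, how cluster utilization changes with executor count, here is a hedged greedy-scheduling simulation (my own toy model, not Sparklens' algorithm): each task is assigned to whichever executor frees up first, and utilization is total busy time divided by executors times wall-clock time.

```python
import heapq

def simulate_stage(task_durations, num_executors):
    """Greedily schedule a stage's tasks onto executors; return
    (wall_clock, utilization). Each task goes to the executor
    that becomes free earliest."""
    free_at = [0] * num_executors  # time at which each executor is next free
    heapq.heapify(free_at)
    for d in task_durations:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + d)
    wall_clock = max(free_at)
    busy = sum(task_durations)
    return wall_clock, busy / (num_executors * wall_clock)

# A stage with one skewed task: more executors shorten the wall
# clock at first, but utilization falls as executors sit idle
# waiting for the straggler.
tasks = [10, 10, 10, 10, 40]
for n in (1, 2, 5):
    wc, util = simulate_stage(tasks, n)
    print(f"executors={n} wall_clock={wc} utilization={util:.2f}")
```

Running the loop shows the trade-off the talk promises to quantify: scaling out buys less and less speedup while wasting more and more of the cluster.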



Speaker bio

Rohit Karlupia has been writing high-performance server applications ever since completing his Bachelor of Technology in Computer Science and Engineering at IIT Delhi in 2001. He has deep expertise in messaging, API gateways and mobile applications. His primary research interests are performance and scalability. At Qubole, his focus is making Big Data as a Service debuggable, scalable and performant. His current work includes Sparklens (an open-source Spark profiler), GC/CPU-aware task scheduling for Spark, and the Qubole Chunked Hadoop File System.





  • Elmer John (@elmerjohn) 2 months ago

    This profiling tool helps in understanding how efficiently a given Spark application uses the compute resources provided to it. Maybe the application will run faster with more executors, and maybe it won't; Sparklens can answer this question by looking at a single run of the application.
