The Fifth Elephant 2016

India's most renowned data science conference

Bharani

@bharanisub

Timely Dataflow

Submitted Mar 22, 2016

Many data processing tasks require low-latency interactive access to results, iterative sub-computations, and consistent intermediate outputs so that sub-computations can be nested and composed. Timely Dataflow is a computational model that addresses these challenges in a single unified system, as opposed to bolting batch and stream processing systems together. It was first presented as part of Naiad (SOSP 2013).

Outline

What are the challenges faced in stream processing: Imagine a system where the data is continuously updated and you need to support both historical data and the recent stream while avoiding costly recomputation

How does timely dataflow fit into the stream processing model: I will cover what timely dataflow offers, namely cyclic computation, a notification mechanism, and the concept of time in stream processing

Why is it different from other stream processing systems like Spark/Storm/Flink: Not all computations can be easily expressed as the Directed Acyclic Graphs that most stream processing systems offer. One such example is cyclic computation, which can be elegantly modelled in timely dataflow

Pros & Cons: I will take a practical example of an aggregation and showcase the pros and cons of the timely dataflow model, with code and time taken
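
To make the cyclic computation and notification ideas in the outline a little more concrete, here is a minimal sketch in plain Python. It is not the Naiad API or any timely dataflow library; the function name run_reachability and the whole structure are assumptions made purely for illustration. Each batch of edges is treated as one epoch, messages inside the loop carry an (epoch, iteration) timestamp, and a "notification" is modelled as the point where no message for that epoch is still outstanding, so the result for the epoch is consistent. State is kept across epochs, so earlier batches are not recomputed.

```python
from collections import defaultdict, deque


def run_reachability(edge_batches, root):
    """Toy simulation of a cyclic dataflow (hypothetical, not a real library API).

    edge_batches: list of lists of (src, dst) edges; each list is one epoch.
    Returns one (epoch, rounds, reachable_nodes) tuple per epoch.
    """
    edges = defaultdict(set)   # graph state kept across epochs
    reached = {root}           # nodes known to be reachable from root so far
    notifications = []

    for epoch, batch in enumerate(edge_batches):
        for src, dst in batch:
            edges[src].add(dst)

        # Messages entering the loop get iteration 0. Only new edges that
        # start from an already reachable node can change the result.
        pending = deque(
            (dst, (epoch, 0)) for src, dst in batch if src in reached
        )

        rounds = 0
        while pending:
            node, (e, iteration) = pending.popleft()
            if node in reached:
                continue  # nothing new to propagate
            reached.add(node)
            rounds = max(rounds, iteration + 1)
            # Feedback edge: results go back into the loop with the
            # iteration counter advanced by one.
            for nxt in edges.get(node, ()):
                pending.append((nxt, (e, iteration + 1)))

        # The queue is empty, so no message with this epoch can still arrive.
        # This is the stand-in for a notification: the snapshot is consistent.
        notifications.append((epoch, rounds, sorted(reached)))

    return notifications


if __name__ == "__main__":
    batches = [
        [("a", "b"), ("b", "c")],
        [("c", "d"), ("x", "y")],
        [("d", "x")],
    ]
    for epoch, rounds, nodes in run_reachability(batches, "a"):
        print(f"epoch {epoch}: done after {rounds} round(s), reachable = {nodes}")
```

In this sketch the third epoch only follows the new edge from d, then picks up the x to y edge that arrived earlier, without revisiting a, b, or c. A real timely dataflow system tracks outstanding timestamps across distributed workers; the single queue here only stands in for that bookkeeping.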

Speaker bio

I am a passionate developer and speaker. I regularly speak at the monthly geeknight meetup in Chennai and spoke at GIDS in both 2014 and 2015 on dealing with systems that handle large volumes of data with the unique challenges of near-real-time processing. I have built and maintained systems for the Banking, Media, and Retail domains. I continuously challenge the status quo and constantly strive to improve on the solutions I have built in the past. This journey has led me to build and rebuild real-time analytics solutions that crunch large volumes of data while carefully balancing throughput and low latency.

  • alldaycoding.blogspot.in

Slides

https://docs.google.com/presentation/d/1mtGyIWsdEEHvcCnMVOobMp78S6FvBlgoZ58JRMSeZ-c/edit?usp=sharing

Hosted by

Jump starting better data engineering and AI futures