The Fifth Elephant round the year submissions for 2019

Submit a talk on data, data science, analytics, business intelligence, data engineering and ML engineering

Real-Time Data Quality on Flink

Submitted by Jaydeep Vishwakarma on Monday, 17 June 2019


Session type: Full talk of 40 mins

Abstract

My use case is to monitor and improve overall search data quality, to detect unusual patterns in users' search behavior, and to report on-site intent back to the relevant business stakeholders. To achieve this, I explored several big data processing engines that can process large volumes of data with complex business logic in real time. Eventually, I chose Flink stream processing. This talk will showcase how I used Flink to accomplish this goal.

Outline

What is Real-Time Aggregation?
System Requirement
Flink vs Spark
Flink Cluster setup
Flink on Yarn
Architecture
100% data completeness
Batch vs Realtime
Open Items

Speaker bio

I am a Staff Software Engineer at Walmart and an Apache Oozie committer. I am currently working on search problems. I have worked in the big data space for the last 10 years.

Slides

https://www.slideshare.net/jaydeepmail/real-time-data-quality-on-flink

Preview video

https://youtu.be/1iqFRvW4wrQ

Comments

  • Abhishek Balaji (@booleanbalaji) Reviewer 5 months ago

    Hi Jaydeep,

    Thank you for submitting a proposal. We need to see detailed slides and a preview video to evaluate your proposal. Your slides must cover the following:

    • Problem statement/context, which the audience can relate to and understand. The problem statement has to be a problem (based on this context) that can be generalized for all.
    • What were the tools/frameworks available in the market to solve this problem? How did you evaluate these, and what metrics did you use for the evaluation? Why did you pick the option that you did?
    • Explain what the situation was before the solution you picked/built and how it changed after you implemented it. Show before-after scenario comparisons & metrics.
    • What compromises/trade-offs did you have to make in this process?
    • What is the one takeaway that you want participants to go back with at the end of this talk? What is it that participants should learn/be cautious about when solving similar problems?

    We need your updated slides and preview video by Jun 28, 2019 to evaluate your proposal. If we do not receive an update, we’d be moving your proposal for evaluation under a future event.

    • Abhishek Balaji (@booleanbalaji) Reviewer 5 months ago

      Marked as rejected since the proposer hasn't responded to comments or updated the content before the deadline. Will be considered for a future event if the content is updated.

  • Jaydeep Vishwakarma Proposer 5 months ago

    Hi Abhishek, I had updated the content and video by Thursday itself. Please have a look at it.

  • Jaydeep Vishwakarma Proposer 5 months ago

    Hi Abhishek, I updated the content and video before the deadline. Please check whether you can still consider it.
