Scalable Realtime Analytics using Druid

Jul 2016

28 Thu 08:30 AM – 06:25 PM IST

29 Fri 08:30 AM – 06:15 PM IST

30 Sat 08:45 AM – 05:00 PM IST

31 Sun 08:15 AM – 06:00 PM IST

NIMHANS Convention Centre

This submission has been added to the schedule

Submitted Jul 6, 2016

Section: Full talk Technical level: Intermediate

Traditional SaaS solutions built on Hadoop data stores such as Hive and HBase, or on a classical RDBMS, work well for storing data. However, they are not optimized for ingesting data and making it immediately available for interactive, ad hoc, low-latency queries at very high scale. Their long query latencies make them suboptimal choices for powering interactive applications. This talk introduces Druid as a complementary solution for scalable real-time ingestion and analytics.

Druid is an open-source distributed data warehouse designed to support OLAP-like queries, and it is used in production at numerous companies. Its design was inspired by Google’s Dremel and PowerDrill, and by search infrastructure. This talk will cover Druid's architecture, its storage internals, and the common use cases for which Druid is a good fit.

Outline

History and Motivation
Live Demo
Druid Architecture
Storage Internals
Druid in Practice
Common Use Cases

Speaker bio

Nishant is an active contributor to Druid and a member of its PMC. He is part of the Business Intelligence team at Hortonworks. Prior to that, he was part of the Metamarkets backend team, where he was responsible for analytics infrastructure, including real-time analytics in Druid. He holds a B.Tech in Computer Science from the National Institute of Technology, Kurukshetra, India.