Baking a cloud-native data warehouse from enterprise database leftovers

26 Jul 2018, Thu 07:45 AM – 06:15 PM IST

27 Jul 2018, Fri 07:45 AM – 05:35 PM IST

NIMHANS Convention Centre, Bengaluru

Submitted Mar 27, 2018

Section: Crisp talk
Technical level: Intermediate

dataxu® deals with the collection, storage, processing, analysis, and projection of data at massive scale.

To meet the growing need for interactive analysis and querying, we adopted an MPP database as our warehouse solution. This on-premises solution served us well as the cluster scaled over the initial years. However, as the business grew, we ran into significant operational challenges, such as constant hardware and software maintenance and unpredictable load times that resulted in SLA misses. To address these pain points and deliver sustainably, we chose to leverage the cloud for its scaling capabilities. The bedrock of the new Reporting system is an Apache Spark-driven ETL solution over an S3 data lake.

In this talk, we will focus on the challenges posed by the existing system and the design choices we made in our quest for a new system.

Outline

What we do at dataxu
dataxu’s Reporting infrastructure
Key challenges
Design choices for new system
Recipe for a cloud-native data warehouse
Key benefits
Takeaways

Speaker bio

Vineeti Louis has been working with dataxu for the last 3 years. She has worked with the Reporting Services team at dataxu, where she was involved in building the reporting system on an enterprise data warehouse and later moving it to the cloud.
https://www.linkedin.com/in/vineeti-louis-b8275546/

Slides

https://docs.google.com/presentation/d/1a_euHDzzONwJXAkcuUFm9591lkRJOexM3jJ_SYXc16w/edit?usp=sharing

The Fifth Elephant 2018