The Fifth Elephant 2013

An Event on Big Data and Cloud Computing

Abhishek Kona

@sheki

The database cannot be better than the underlying data structure

Submitted May 1, 2013

Understand the common data structures underlying current storage engines, their trade-offs, and why these should drive your choice of database for your next app.

Outline

  • Look at the data structures behind different storage engines, with examples.
  • An introduction to the three data structures in use today: B-Trees (InnoDB); LSM Trees (LevelDB, BigTable, HBase, Cassandra); Fractal Trees (TokuDB, TokyoTyrant).
  • Understand the trade-offs when you pick the next database for your application.
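
To make the trade-off concrete, here is a toy sketch of the LSM idea in Python. Writes land in an in-memory table that is flushed as a sorted run, so they stay cheap, while a read may have to check several runs until they are compacted. The class name `ToyLSM`, the `memtable_limit` parameter, and the in-memory runs are illustrative assumptions; this is not how LevelDB or any engine listed above is actually implemented.

```python
import bisect


class ToyLSM:
    """Toy log-structured merge store. Illustrative only: everything stays in memory."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.memtable = {}                    # recent writes, unsorted
        self.runs = []                        # flushed sorted runs, oldest first

    def put(self, key, value):
        # A write only touches the memtable, so it is cheap.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # A real engine would write this sorted run sequentially to disk.
        self.runs.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        # A read checks the memtable, then every run from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            keys = [k for k, _ in run]
            i = bisect.bisect_left(keys, key)
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one; values from newer runs overwrite older ones.
        merged = {}
        for run in self.runs:
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []
```

The more runs accumulate, the more places `get` has to look, which is the read cost paid for cheap writes; `compact` trades background work for faster reads. A B-Tree makes the opposite choice by updating a single sorted structure in place.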

Previously, I chose databases for my applications by treating them as black boxes. I believe many engineers start this way.
Knowing about the underlying data structures and their optimization parameters makes the decision more scientific.

I will also briefly explain how Facebook benchmarks databases and decides whether to invest in a database technology.

Requirements

  • Basic understanding of binary trees.
  • Familiarity with basic performance terminology: throughput and latency.

Speaker bio

I currently work on the Database Engineering team at Facebook.
We are hacking on a next-generation storage engine.

I have worked on data storage problems for over two years.
I have been on both sides of the coin: first as a database user, and now as a database engineer.
