The Fifth Elephant 2012

Finding the elephant in the data.

What are your users doing on your website or in your store? How do you turn the piles of data your organization generates into actionable information? Where do you get complementary data to make yours more comprehensive? What tech, and what techniques?

The Fifth Elephant is a two-day conference on big data.

Early Geek tickets are available from fifthelephant.doattend.com.

The proposal funnel below lets you submit a session and vote on proposed sessions. It is good practice to introduce yourself and share details about your work, as well as the subject of your talk, when proposing a session.

Each community member can vote for or against a talk. A vote from each member of the Editorial Panel is equivalent to two community votes. Both types of votes will be considered for final speaker selection.

It’s useful to keep a few guidelines in mind while submitting proposals:

  1. Describe how to use something that is available under a liberal open source license. Participants can use this without having to pay you anything.

  2. Tell a story of how you did something. If it involves commercial tools, please explain why they made sense.

  3. Buy a slot to pitch whatever commercial tool you are backing.

Speakers will get a free ticket to both days of the event. Proposers whose talks are not on the final schedule will be able to purchase tickets at the Early Geek price of Rs. 1800.

Hosted by

The Fifth Elephant - known as one of the best data science and Machine Learning conferences in Asia - has transitioned into a year-round forum for conversations about data and ML engineering, data science in production, and data security and privacy practices.

Sunil Sayyaparaju

How big data moved the needle from monolithic SQL RDBMS to distributed NoSQL

Submitted Jun 11, 2012

This talk covers different types of DBMS solutions (SQL/NoSQL, monolithic/distributed) and the kinds of applications each type suits. It will highlight the design choices developers made for these different types of DBMS.

Outline

As we know, we are in an age of data explosion. More and more data is pouring in, new kinds of data such as unstructured data have appeared, and new use cases such as real-time queries have emerged. In this talk we will take a journey from the monolithic SQL RDBMS, to distributed shared-disk SQL, to distributed shared-nothing SQL, and finally to distributed shared-nothing NoSQL solutions.

Along the way we will see which factors drove each step of this evolution and which design choices engineers made at each stage. We will also see what was given up (the tradeoffs) at each step, and which kinds of applications are best suited to each type of database.
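
To make the shared-nothing idea in the outline concrete, here is a minimal sketch. It is not taken from the talk or from Citrusleaf: the names `owner_node` and `ShardedStore` are hypothetical, and plain modulo hashing is an assumption chosen for brevity. Each key is owned by exactly one node, and nodes share neither disk nor memory, so a read or write touches only the owning node.

```python
import hashlib
from typing import Dict, List, Optional


def owner_node(key: str, nodes: List[str]) -> str:
    """Pick the single node that owns a key by hashing the key (illustrative only)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


class ShardedStore:
    """Toy in-process model of a shared-nothing key-value store (hypothetical)."""

    def __init__(self, nodes: List[str]) -> None:
        self.nodes = list(nodes)
        # Each node keeps its own private dictionary; nothing is shared.
        self.data: Dict[str, Dict[str, str]] = {node: {} for node in self.nodes}

    def put(self, key: str, value: str) -> None:
        # Route the write to the one node that owns this key.
        self.data[owner_node(key, self.nodes)][key] = value

    def get(self, key: str) -> Optional[str]:
        # Route the read to the same owning node.
        return self.data[owner_node(key, self.nodes)].get(key)


store = ShardedStore(["node-a", "node-b", "node-c"])
store.put("user:1", "alice")
print(store.get("user:1"))
```

With modulo hashing, changing the number of nodes remaps most keys, which is one reason real distributed stores tend to use more elaborate partitioning schemes.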

Speaker bio

I have been in the database field for the past 8 years. I have worked on the internals of different types of SQL RDBMS solutions: single-machine (monolithic), in-memory, distributed shared-disk, and distributed shared-nothing architectures. I worked primarily on transaction management, storage, access, performance tuning, and recovery.

I am currently working on a distributed shared-nothing NoSQL solution (Citrusleaf). Citrusleaf is a high-performance, self-balancing, immediately consistent, distributed NoSQL database. We developed an add-on product for replication across data centers over WAN, which supports complex topologies.

Slides

http://www.slideshare.net/sunilvirus/how-big-data-moved-the-needle-from-monolithic-sql-rdbms-to-distributed-nosql

