The Fifth Elephant 2014

A conference on big data and analytics

Akash Mishra

Real Time Secure API delivering data @ scale

Submitted Jun 4, 2014

At ThoughtWorks, we used a hybrid approach to design a real-time, secure API that supports ad hoc queries over large amounts of data.

This talk will focus on the pros and cons of traditional database systems [RDBMS] and big data systems [Hadoop/Hive] for building an API. I will also present how these two kinds of system offset each other's weaknesses to deliver large amounts of data in real time.

Outline

Traditional databases are known for their fast response and security, whereas big data systems are known for their scalability and ability to handle large amounts of data.

What happens when we want to build a system that requires the following?

  • Real Time Response

  • Security

  • Handle GBs of Data

This presentation will walk you through the requirements, system architecture, and key design decisions behind a secure real-time API. The hybrid approach combines batch processing on Hadoop with real-time processing in a Java application backed by an RDBMS.

The technologies used for the API are:

  1. Hadoop/Hive
  2. Oozie
  3. Sqoop
  4. RDBMS
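
The hybrid approach described above can be pictured with a minimal Java sketch. It is not the architecture presented in the talk, only an illustration under stated assumptions: the batch layer (Hive, scheduled by Oozie) is stood in for by an in-memory aggregation, the export step (Sqoop) by a method that reloads a table, and the RDBMS by a `HashMap`. The event schema, the token-based access check, and every class and method name here are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

public class HybridApiSketch {

    // One raw event, as it might sit in a Hive table (hypothetical schema).
    static final class Event {
        final String customerId;
        final long bytes;

        Event(String customerId, long bytes) {
            this.customerId = customerId;
            this.bytes = bytes;
        }
    }

    // Batch layer stand-in: collapse raw events into one total per customer.
    // In the stack listed above this would be a Hive query run on a schedule.
    static Map<String, Long> batchAggregate(List<Event> events) {
        Map<String, Long> totals = new HashMap<>();
        for (Event e : events) {
            totals.merge(e.customerId, e.bytes, Long::sum);
        }
        return totals;
    }

    // Serving layer stand-in: a small table of precomputed results plus an
    // access check. A real service would query an RDBMS instead of a HashMap.
    static final class ServingApi {
        private final Map<String, Long> table = new HashMap<>();
        private final Map<String, Set<String>> grants; // token -> allowed customers

        ServingApi(Map<String, Set<String>> grants) {
            this.grants = grants;
        }

        // Export step stand-in: replace the table with the latest batch output.
        void load(Map<String, Long> aggregates) {
            table.clear();
            table.putAll(aggregates);
        }

        // Fast lookup path: authorize first, then read the precomputed row.
        Optional<Long> totalBytes(String token, String customerId) {
            Set<String> allowed = grants.getOrDefault(token, Set.of());
            if (!allowed.contains(customerId)) {
                throw new SecurityException("token not authorized for " + customerId);
            }
            return Optional.ofNullable(table.get(customerId));
        }
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
                new Event("c1", 100L), new Event("c1", 200L), new Event("c2", 50L));

        ServingApi api = new ServingApi(Map.of("token-a", Set.of("c1")));
        api.load(batchAggregate(events));

        System.out.println(api.totalBytes("token-a", "c1").orElse(0L));
    }
}
```

The point of the split is that the expensive scan over raw data happens offline, while each API request is a single keyed lookup behind an authorization check.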

Speaker bio

I am a developer at ThoughtWorks, where I have been helping our clients on delivery projects using big data technologies. I have a deep interest in distributed computing and real-time systems.

