Overcoming the problems you will face when trying to break the speed limit
Submitted by Sunil Sayyaparaju (@sunils) on Sunday, 15 June 2014
Section: Full talk
Technical level: Intermediate
Everyone is on a continuous quest to improve the speed at which we do things. What was fast in the past is no longer fast, so we need to keep improving. Along the way we run into problems, but evolving technologies also give us new opportunities. This talk shares what we learned: the opportunities we used and how we overcame some of the problems.
Scaling can be done in two ways:
1. Vertical Scaling: Vertical scaling is about how much more we can achieve on a single system. It is mostly about making better use of the system's resources: CPU, disk I/O, network I/O, RAM, interrupts, etc. Each of these resources deserves attention. Newer kernels, and the libraries built on them, provide better hooks for controlling and using these resources.
2. Horizontal Scaling: Horizontal scaling is about using multiple machines as a single unit (a cluster) to split a problem into parts and solve them in parallel. Distributed systems such as Hadoop and NoSQL databases fall into this category. Distributed systems do not magically give speed. A few principles and some discipline need to be followed, or else a distributed system may perform worse than a single system.
We will discuss these two aspects in the talk.
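As a rough illustration of the horizontal-scaling idea above (split the problem, solve the parts in parallel, combine the results), here is a minimal Python sketch. It is not taken from the talk and does not describe how Aerospike or Hadoop distribute data. The function names, the hash-modulo placement rule, and the per-shard "work" are all assumptions chosen for illustration, and threads stand in for separate machines.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def node_for_key(key: str, num_nodes: int) -> int:
    """Map a key to a node index with a stable hash (hypothetical placement rule)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def partition(keys, num_nodes):
    """Split the keys into one shard per node."""
    shards = [[] for _ in range(num_nodes)]
    for key in keys:
        shards[node_for_key(key, num_nodes)].append(key)
    return shards


def process_shard(shard):
    """Stand-in for the work a single node does on its own shard."""
    return sum(len(key) for key in shard)


def process_in_parallel(keys, num_nodes):
    """Process every shard concurrently, then combine the partial results."""
    shards = partition(keys, num_nodes)
    # Threads play the role of cluster nodes in this sketch.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(process_shard, shards))
    return sum(partials)
```

Each shard is processed without reference to any other shard, which is what allows the work to run in parallel. A placement rule this simple also moves most keys whenever the number of nodes changes, one example of how a careless design choice can cost a distributed system its advantage.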
Sunil Sayyaparaju, Engineering Lead at Aerospike, has over 9 years of experience working on different types of SQL RDBMS solutions, including single-machine (monolithic), in-memory, distributed shared-disk, and distributed shared-nothing architectures, with an emphasis on transaction management, storage, access, performance tuning, and recovery.
Sunil currently leads Aerospike's Bangalore office, working on its distributed shared-nothing NoSQL solution. Aerospike is a high-performance, self-balancing, immediately consistent, distributed NoSQL database. Aerospike also has an add-on product for replication across data centers over WAN, which supports a range of complex topologies.