The Fifth Elephant 2017

On data engineering and application of ML in diverse domains

Sriram R

@brewkode

Suuchi - Toolkit to build distributed systems

Submitted Apr 26, 2017

At Indix, we have a number of services that need to operate on large volumes of product data. We started out using open source distributed systems (such as Hadoop, HBase, Solr, and Spark) to build some of our solutions. Along the way, we ran into problems where existing solutions did not fit our requirements and the operational cost associated with them started to shoot up. This pushed us to build a couple of “simple” distributed systems of our own.

As we built them, some common functionality (or abstractions) started to emerge. We grouped it into the following buckets:

  • Communication
  • Membership
  • Data Sharding and Request Routing
  • Replication
  • (Optional) Storage Abstraction
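
To make the sharding and request routing bucket concrete, here is a minimal sketch of one common technique, consistent hashing. It is an illustration only and not Suuchi's API: the `HashRing` class, the `stable_hash` helper, the `vnodes` parameter, and the node names are all hypothetical, and Python is used purely for brevity.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Map a string to a 64-bit integer that is stable across processes."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring; not part of Suuchi."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (hash, node) entries
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node is placed at several points ("virtual nodes") on the ring
        # so that keys spread more evenly across nodes.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (stable_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def route(self, key):
        """Return the node that owns `key`: the first ring entry at or after
        the key's hash, wrapping around to the start of the ring."""
        if not self._ring:
            raise ValueError("ring has no nodes")
        idx = bisect.bisect_left(self._ring, (stable_hash(key),))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.route("product-42")
```

With this scheme, removing a node only moves the keys that node owned; every other key keeps routing to the same owner, which is why it is a popular choice for sharded services.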

In parallel, we started to embrace microservices as a practice for rolling out new systems. This meant that new systems needed to scale by default. Instead of solving these problems separately in every system, we wanted to provide these building blocks so that new systems could be built in a distributed fashion with far less effort. This also reinforced our engineering culture, in which developers own everything end to end, including scale and distribution.

With this context, we started out building Suuchi⁰, a toolkit for building distributed data systems.

Take Aways

As a developer building a new system, you would write the service specification in protobuf and the service definition in gRPC. That service can then be turned into a distributed service using the primitives Suuchi provides: set up membership support, plug in the partitioner / router, decide the replication strategy, and bootstrap your distributed service.
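
The way membership, partitioning, and replication fit together can be sketched in a few lines. This is a toy, in-memory illustration under stated assumptions, not Suuchi's API: the `Node` and `Cluster` classes are hypothetical, membership is a static list, partitioning is a simple hash modulo the member count, and replication is a synchronous write to the next nodes in the list.

```python
import hashlib


class Node:
    """Hypothetical cluster member with an in-memory key-value store."""

    def __init__(self, name):
        self.name = name
        self.store = {}


class Cluster:
    """Toy wiring of membership, partitioning, and replication."""

    def __init__(self, names, replicas=2):
        if not 1 <= replicas <= len(names):
            raise ValueError("replicas must be between 1 and the node count")
        # Membership: a static, sorted list of nodes (assumption; a real
        # system would discover and track members dynamically).
        self.nodes = [Node(name) for name in sorted(names)]
        self.replicas = replicas

    def replica_set(self, key):
        # Partitioning: hash the key to pick a primary, then take the
        # following nodes in list order as the remaining replicas.
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        start = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return [
            self.nodes[(start + i) % len(self.nodes)]
            for i in range(self.replicas)
        ]

    def put(self, key, value):
        # Replication: write synchronously to every node in the replica set.
        for node in self.replica_set(key):
            node.store[key] = value

    def get(self, key):
        # Read from the primary (first node in the replica set).
        return self.replica_set(key)[0].store.get(key)


cluster = Cluster(["n1", "n2", "n3"], replicas=2)
cluster.put("sku-1", {"price": 10})
value = cluster.get("sku-1")
```

Each of the three concerns sits behind its own method here, which mirrors the idea of pluggable building blocks: a different membership source, partitioner, or replication strategy could be swapped in without touching the others.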

As an ops person, if most of your team's systems are built on a common set of primitives, it becomes easier to build elegant and detailed tooling around them, such as metrics, request tracing, monitoring, and alerting.

⁰ - Suuchi is an open source project written in Scala, available under the Apache v2 License.

Outline

  • Context setting for the topic: the notion of distributed systems and the need for state
  • Complexities involved & advantages if done right
  • Common problems to be solved when building them
  • What Suuchi provides
  • Suuchi @ Indix
  • Example code
  • Learnings & takeaways

Speaker bio

Sriram R is one of the early members of the Engineering team at Indix and has been part of various systems that are live in production today. Apart from writing code to foot his bills, he does it to get some adrenaline going, when he is not riding his bike on uncharted terrain. He is a co-author of the Suuchi project along with his fellow engineer Ashwanth Kumar. Currently, he works on problems involving ML at Indix.

Slides

https://speakerdeck.com/brewkode/suuchi-fifthelephant-talk-outline

