The Fifth Elephant 2017

On data engineering and application of ML in diverse domains

Suuchi - Toolkit to build distributed systems

Submitted by Sriram R (@brewkode) on Wednesday, 26 April 2017

Technical level

Intermediate

Section

Full talk for data engineering track

Status

Confirmed & Scheduled

Total votes:  +26

Abstract

At Indix, we run a number of services that operate on large volumes of product data. We started out using open source distributed systems (Hadoop, HBase, Solr, Spark, etc.) to build some of our solutions. Along the way, we hit problems where existing solutions did not fit our requirements and the operational cost of running them started to shoot up. This pushed us to build a couple of “simple” distributed systems of our own.

As we built them, some common functionality (or abstractions) started to emerge. We put them into the following buckets:

  • Communication
  • Membership
  • Data Sharding and Request Routing
  • Replication
  • (Optional) Storage Abstraction
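
To make the sharding and request routing bucket concrete, here is a minimal sketch in Python. It assumes consistent hashing as the partitioning scheme, which the proposal does not specify. The names `HashRing`, `route` and `vnodes` are hypothetical and are not Suuchi's API.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Stable 64-bit hash so every process computes the same ring positions.
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Illustrative consistent-hash ring (hypothetical, not Suuchi's API)."""

    def __init__(self, nodes, vnodes=64):
        if not nodes:
            raise ValueError("at least one node is required")
        # Each node is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def route(self, key: str) -> str:
        # The owner is the first ring point clockwise from the key's hash.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.route("product:12345")  # one of the three node names
```

With this scheme, removing a node only moves the keys that node owned, and every other key keeps its owner. That property is what makes it a common choice for routing requests in a cluster whose membership changes.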

In parallel, we started to embrace microservices as a practice for rolling out new systems. This meant that new systems needed to scale by default. Instead of solving these problems afresh in every system, we wanted to provide these building blocks so that new systems can be built in a distributed fashion with far less effort. This also reinforced our engineering culture, where developers own everything end to end, including scale & distribution.

With this context, we started out building Suuchi⁰, a toolkit for building distributed data systems.

Take Aways

As a developer building a new system, you would write a service specification using protobuf and a service definition using gRPC. This service can then be turned into a distributed service using the primitives provided by Suuchi: set up membership support, plug in the partitioner / router, decide the replication strategy, and bootstrap your distributed service.
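
As a rough illustration of how a membership list, a partitioner and a replication strategy fit together, the Python sketch below picks a replica set for a key and writes to each replica in turn. It is written under stated assumptions and is not Suuchi's API. `Node`, `replicas_for` and `replicated_put` are hypothetical names, membership is a static list, and a write to an in-memory dict stands in for the gRPC call a real service would make.

```python
import hashlib


class Node:
    """Hypothetical stand-in for a service instance with a local key-value store."""

    def __init__(self, name):
        self.name = name
        self.store = {}


def replicas_for(key, members, replication_factor):
    """Pick the primary by hashing the key, then take the next members in order."""
    ordered = sorted(members, key=lambda node: node.name)
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(ordered)
    # Never ask for more replicas than there are members.
    count = min(replication_factor, len(ordered))
    return [ordered[(start + i) % len(ordered)] for i in range(count)]


def replicated_put(key, value, members, replication_factor=2):
    """Write to every replica sequentially and return the names written to."""
    targets = replicas_for(key, members, replication_factor)
    for node in targets:
        node.store[key] = value  # a real service would issue an RPC here
    return [node.name for node in targets]


cluster = [Node("node-a"), Node("node-b"), Node("node-c")]
written_to = replicated_put("product:12345", {"price": 499}, cluster)
```

The sketch uses simple modulo placement to stay short. A production system would more likely combine the replica walk with a consistent-hash ring and handle failed or slow replicas.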

As an ops person, if most of your team's systems are built on a common set of primitives, it becomes easier to build elegant and detailed tooling around them: metrics, request tracing, monitoring, alerting, etc.

⁰ - Suuchi is an open source project written in Scala, available under the Apache v2 License.

Outline

  • Context setting for the topic: bring in notion of distributed systems and the need for state
  • Complexities involved & advantages if done right
  • Common problems to be solved when building them
  • What does Suuchi provide
  • Suuchi @ Indix
  • Example code
  • Learning & take aways

Speaker bio

Sriram R is one of the early members of the Engineering team at Indix and has been part of various systems that are ‘live’ in production today. Apart from writing code to foot his bills, he does it to get some adrenaline going, when he is not riding his bike on uncharted terrain. He is co-author of the Suuchi project along with his fellow engineer Ashwanth Kumar. Currently, he works on problems involving ML at Indix.

Links

Slides

https://speakerdeck.com/brewkode/suuchi-fifthelephant-talk-outline

Preview video

https://youtu.be/I8Hsan4PDGE

Comments

  • 1
    Zainab Bawa (@zainabbawa) Reviewer a year ago

    Thanks for an interesting proposal. Please share draft slides and a two-min preview video explaining what this talk is about and the key takeaway for the audience. Upload this information here.

    • 1
      Sriram R (@brewkode) Proposer a year ago

      Zainab,

      Will do the needful in a day or so. Thank you.

  • 1
    Sriram R (@brewkode) Proposer a year ago

    Zainab,
    Both the preview video & outline slides are put up here. Do take a look. I’ve made the links publicly accessible too - checked it from incognito too!

    • 1
      Zainab Bawa (@zainabbawa) Reviewer a year ago

      Perfect!

  • 1
    Sriram R (@brewkode) Proposer a year ago

    Zainab,
    Any updates on this? Wondering if there’s anything that you guys are waiting on from my side.

    • 1
      Zainab Bawa (@zainabbawa) Reviewer a year ago

      Sriram, the content on the slides is sparse. When do you plan to fill in more details? You will hear from us on 10 June about the status of your proposal.

      • 1
        Sriram R (@brewkode) Proposer a year ago (edited a year ago)

        Zainab, are you looking for the full talk slides? Right now, I’ve just shared with you the talk outline. Please help me with what exactly you would like to see and I will do the needful.

        • 1
          Zainab Bawa (@zainabbawa) Reviewer a year ago

          Hi Sriram, yes, I noticed this is an outline. We have an editorial review call tomorrow. My colleagues from the editorial team will update you on the nature of details required.

          • 1
            Sriram R (@brewkode) Proposer a year ago

            Sounds good, thanks Zainab.
