Rootconf 2017

On service reliability

##Submit proposals for flash talks
Rootconf is on 11-12 May. If you have:

  1. Tips and tricks for simplifying infrastructure management and maintenance;
  2. Experiences with new tools to share;
  3. Cool demos;

then propose a flash talk here, or on the spot, at the venue.

The flash talk session is on 11 May, from 17:20 to 18:20. We have room for about 12 flash talks. Each presentation should be no more than 5 minutes.

A final note of caution when presenting at flash talks: we have a code of conduct at the conference. You must refrain from making remarks that may be perceived as sexist or derogatory. If you want to double check your presentation, contact Sandhya Ramesh, Karthik B. or Zainab Bawa at the venue.

The theme for the 2017 edition is service reliability. The conference will feature talks on state-of-the-art deployment strategies and appropriate monitoring technologies at different scales. Rootconf this year will broadly cover topics like toil, on-call, outage handling, and post-mortem analysis. We are inviting presentation proposals from academics and practitioners on these topics.

Rootconf aims to appeal to the widest possible range of DevOps practitioners: from embryonic startups to the largest established enterprises. We are keen to schedule presentations that appeal both to attendees’ current needs and to their future aspirations.

##About the Conference
Rootconf is India’s principal conference where systems and operations engineers share real world knowledge about building reliable systems. We are now accepting submissions for our next edition which will take place in Bangalore on 11-12 May 2017.

Topics for Round 2 of the CfP were:

  1. Capacity planning.
  2. Deploying microservices, and issues concerning monitoring and reliability of microservices.
  3. Deployment and orchestration of container based infrastructures.
  4. Open tracing.

Topics for Round 1 of the CfP were:

  1. Monitoring strategies
  2. Deployment strategies
  3. Capacity planning
  4. Automation beyond deployment and monitoring
  5. Eliminating toil
  6. On-call outage handling
  7. Postmortem / root cause analysis
  8. Incident response

Rootconf is a three track conference:

We are inviting proposals for:

  • Full-length 40-minute talks – which cover conceptual topics and include case studies.
  • Crisp 15-minute how-to talks or introduction to a new technology.
  • Sponsored sessions, of 15 minutes and 40 minutes duration (limited slots available; subject to editorial scrutiny and approval).
  • Hands-on workshop sessions of 3 and 6 hour duration where participants follow the instructors on their laptops.

##Selection Process
Proposals will be filtered and shortlisted by an Editorial Panel. Please make sure to add links to videos / slide decks when submitting proposals. This will help us understand your speaking experience and delivery style. Blurbs or blog posts covering the relevance of a particular problem statement and how it is tackled will help the Editorial Panel better judge your proposals. We might contact you to ask if you’d like to repost your content on the official conference blog.

We expect you to submit an outline of your proposed talk, either in the form of a mind map or a text document or draft slides within two weeks of submitting your proposal.

Selection Process Flowchart

You can check back on this page for the status of your proposal. We will notify you if we either move your proposal to the next round or if we reject it. Selected speakers must participate in one or two rounds of rehearsals before the conference. This is mandatory and helps you to prepare well for the conference.

A speaker is NOT confirmed a slot unless we explicitly mention so in an email or over any other medium of communication.

There is only one speaker per session. Entry is free for selected speakers.

##Travel Grants
As our budget is limited, we prefer speakers from locations closer home, but will do our best to cover for anyone exceptional. HasGeek provides these limited grants where applicable:

  • Two grants covering travel and accommodation for international speakers.
  • Three grants covering travel and accommodation for domestic speakers.

Grants will be made available to speakers delivering full sessions (40 minutes or longer).
*Speaker travel grants will be given in the order of preference to students, women, persons of non-binary genders, and speakers from Asia and Africa.

##Commitment to Open Source
HasGeek believes in open source as the binding force of our community. If you are describing a codebase for developers to work with, we’d like for it to be available under a permissive open source licence. If your software is commercially licensed or available under a combination of commercial and restrictive open source licences (such as the various forms of the GPL), please consider picking up a sponsorship. We recognise that there are valid reasons for commercial licensing, but ask that you support us in return for giving you an audience. Your session will be marked on the schedule as a “sponsored session”.

##Important Dates:

  • Deadline for submitting proposals: 10 April, 2017
  • Final conference schedule: 15 April 2017
  • Conference dates: 11-12 May, 2017

For more information about speaking proposals, tickets and sponsorships, contact or call +91-7676332020.


Hosted by

Rootconf is a community-funded platform for activities and discussions on the following topics: Site Reliability Engineering (SRE); infrastructure costs, including cloud costs and optimization; and security, including cloud security.

Imran Basha


AWS Simple Workflow Service as an architectural solution for building Distributed Scalable Background Scheduled Jobs

Submitted Feb 6, 2017

This talk is about Amazon Simple Workflow Service (SWF) as infrastructure support for developing distributed, scalable, scheduled background jobs. We will show how to replace a Cron-based workflow with Amazon Simple Workflow Service.

Key takeaways
What is a Cron-based workflow?
Issues in a Cron-based workflow
What is SWF?
How to replace a Cron-based workflow using SWF
How to monitor SWF-based workflows using AWS CloudWatch


Problem statement

We need background jobs that run on multiple clusters on a daily schedule. These jobs process millions of data records every day. To balance the load across clusters, we need to divide the data across all the clusters and then trigger the jobs to run on their portion of the data within a cluster. This is explained in Slide #3 of the video. We were spawning worker processes on the cluster machines from a master machine where the crontab files are set up. This is a typical requirement for running background jobs in a distributed setup.
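The data-division step described above can be sketched in a few lines. This is a minimal illustration, not the scheme used in the talk: the function name `partition_records`, the cluster names, and the choice of contiguous near-equal chunks are all assumptions made for the example.

```python
from typing import Dict, List, Sequence


def partition_records(record_ids: Sequence[int],
                      clusters: Sequence[str]) -> Dict[str, List[int]]:
    """Split record_ids into contiguous, near-equal chunks, one per cluster."""
    if not clusters:
        raise ValueError("at least one cluster is required")
    base, extra = divmod(len(record_ids), len(clusters))
    shares: Dict[str, List[int]] = {}
    start = 0
    for index, cluster in enumerate(clusters):
        # The first `extra` clusters take one additional record each.
        size = base + (1 if index < extra else 0)
        shares[cluster] = list(record_ids[start:start + size])
        start += size
    return shares


if __name__ == "__main__":
    # Hypothetical cluster names, used only for illustration.
    shares = partition_records(range(10), ["cluster-a", "cluster-b", "cluster-c"])
    for cluster, ids in shares.items():
        print(cluster, ids)
```

Each cluster's job would then be triggered with only its own share, so no two clusters process the same record.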

Issues with Cron based scheduling

  1. Lack of failure handling
  2. Lost tasks
  3. Scale
  4. Not an option on shared hosting setup
  5. Single point of failure

How SWF helped in our use case

Cron is good for running a job on one particular machine on a schedule. Distributed execution across a cluster, however, needs coordination, failure handling, scalability and so on, none of which comes out of the box with Cron-based solutions. SWF helps us create a distributed workflow that runs at scheduled intervals and submits tasks to worker processes on individual machines, which pick up the tasks and execute them.
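As a rough sketch of what such a worker process could look like, the function below polls for one activity task and reports the outcome. It is not the code from the talk. It assumes a client object exposing the SWF operations `poll_for_activity_task`, `respond_activity_task_completed` and `respond_activity_task_failed` (as the boto3 SWF client does); the function name, the `identity` value, and the JSON encoding of task input are assumptions made for the example.

```python
import json
from typing import Any, Callable, Dict


def run_worker_once(swf_client: Any, domain: str, task_list: str,
                    handler: Callable[[Dict[str, Any]], Dict[str, Any]]) -> bool:
    """Poll for one activity task and report its outcome.

    Returns True if a task was processed, False if the poll came back empty.
    """
    task = swf_client.poll_for_activity_task(
        domain=domain,
        taskList={"name": task_list},
        identity="worker-1",  # hypothetical worker name
    )
    token = task.get("taskToken")
    if not token:
        # A long poll that times out returns a response with no task token.
        return False
    try:
        # Assumption: the workflow passes task input as a JSON object.
        payload = json.loads(task.get("input") or "{}")
        result = handler(payload)
    except Exception as exc:
        # Report the failure so the workflow logic can decide whether to retry.
        swf_client.respond_activity_task_failed(
            taskToken=token, reason=type(exc).__name__, details=str(exc)[:1000])
        return True
    swf_client.respond_activity_task_completed(
        taskToken=token, result=json.dumps(result))
    return True
```

With boto3 installed and AWS credentials configured, the client would typically be created with `boto3.client("swf")` and the function called in a loop on each cluster machine. Passing the client in as a parameter keeps the worker logic testable without AWS access.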

Benefits of using SWF

  1. SWF takes care of coordination between workflow and worker processes.
  2. The architecture becomes scalable because state management is owned by SWF and the workflow and worker processes are loosely coupled.
  3. SWF is a better way of handling distributed execution because it provides the Flow Framework for managing issues in distributed applications, such as failures and retries.
  4. The solution has a clear separation of concerns.
  5. If any machine goes down, its load is automatically transferred to another machine.
  6. SWF provides an end-to-end solution, including monitoring metrics.
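The retry and failover behaviour listed above can be illustrated with a small in-memory simulation. This is not SWF and does not use any AWS API: the coordinator function `run_workflow`, the round-robin dispatch, and the machine names are assumptions chosen to show the pattern, in which a central coordinator owns task state and reschedules failed work onto other workers.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple


def run_workflow(tasks: List[str],
                 workers: Dict[str, Callable[[str], bool]],
                 max_attempts: int = 3) -> Tuple[Dict[str, str], List[str]]:
    """Dispatch tasks to workers round-robin, rescheduling failed attempts.

    Each worker callable returns True on success and False on failure.
    Returns (task -> worker that completed it, tasks that exhausted retries).
    """
    if not workers:
        raise ValueError("at least one worker is required")
    pending: Deque[Tuple[str, int]] = deque((task, 1) for task in tasks)
    names = list(workers)
    completed: Dict[str, str] = {}
    failed: List[str] = []
    turn = 0
    while pending:
        task, attempt = pending.popleft()
        name = names[turn % len(names)]
        turn += 1
        if workers[name](task):
            completed[task] = name
        elif attempt < max_attempts:
            # The coordinator, not the worker, owns the retry state.
            pending.append((task, attempt + 1))
        else:
            failed.append(task)
    return completed, failed


if __name__ == "__main__":
    workers = {
        "machine-1": lambda task: True,
        "machine-2": lambda task: False,  # simulates a machine that is down
    }
    done, lost = run_workflow(["chunk-0", "chunk-1", "chunk-2"], workers)
    print("completed:", done)
    print("lost:", lost)
```

Because the workers hold no state of their own, a task that fails on one machine can simply be handed to another. A plain crontab entry on a single machine has no equivalent of this rescheduling step.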

This way we end up with a loosely coupled, highly scalable, distributed solution, with coordination and state management taken care of by Amazon SWF.

SWF cannot be described as a job scheduler. A better description is that it provides the framework and services that enable us to create a distributed workflow that can be executed on multiple machines in a loosely coupled and highly scalable manner.

I have attempted to explain the above in the submitted video. I would be happy to answer any follow-up questions.

Speaker bio

I am Imran. I have been working with Intuit for 6 years, and I have around 13 years of experience in total. I have been fortunate to explore and contribute across a broad range of technologies, and in depth. My work has primarily been full-stack web application development, in .NET using Web APIs and in Java using Jersey, and single-page application development based on Backbone, Marionette, and React + Relay + GraphQL. I am a technology enthusiast. I was the architect involved in migrating a Cron-based workflow to AWS SWF. I learned a lot while migrating from an unreliable Cron-based infrastructure to a reliable, distributed and highly scalable architecture based on AWS SWF, and I want to share those learnings so that they can benefit other people.

Slides - RootConf.pptx

