Of the building of a Postgres cluster

Apr 2016

11 Mon

12 Tue

13 Wed

14 Thu 08:30 AM – 10:05 PM IST

15 Fri 08:30 AM – 05:30 PM IST

16 Sat 08:00 AM – 05:30 PM IST

17 Sun 08:30 AM – 01:00 PM IST


MLR Convention Centre, J P Nagar, Bangalore

Rootconf is India’s principal conference where systems and operations engineers share real world knowledge about building resilient and scalable systems.

We are now accepting submissions for our next edition which will take place in Bangalore 14-15 April 2016.

Theme

The theme for this edition will be learning from failure. We are keen to explore how devops think about failure when designing, building and scaling their systems. We invite presentations related to failure in database systems, servers and network infrastructure.

We encourage presentations that relate to failure not only in terms of avoidance but also in terms of mitigation and education. How do we decide which parts of our systems cannot fail? What measures do we take to mitigate failure when it does inevitably happen? And most importantly: what lessons can be learned from failure?

Format

This year’s edition spans two days of hands-on workshops and conference. We are inviting proposals for:

Full-length 40 minute talks.
Crisp 15-minute talks.
Sponsored sessions, 15 minute duration (limited slots available; subject to editorial scrutiny and approval).
Hands-on Workshop sessions, 3 and 6 hour duration.

Selection process

Proposals will be filtered and shortlisted by an Editorial Panel. We urge you to add links to videos / slide decks when submitting proposals. This will help us understand your past speaking experience. Blurbs or blog posts covering the relevance of a particular problem statement and how it is tackled will help the Editorial Panel better judge your proposals.

We expect you to submit an outline of your proposed talk – either in the form of a mind map or a text document or draft slides within two weeks of submitting your proposal.

We will notify you about the status of your proposal within three weeks of submission.

Selected speakers must participate in one or two rounds of rehearsals before the conference. This is mandatory and helps you prepare well for the conference.

There is only one speaker per session. Entry is free for selected speakers. As our budget is limited, we will prefer speakers from locations closer to home, but will do our best to cover costs for anyone exceptional. HasGeek will provide a grant to cover part of your travel and accommodation in Bangalore. Grants are limited and made available to speakers delivering full sessions (40 minutes or longer).

Commitment to open source

HasGeek believes in open source as the binding force of our community. If you are describing a codebase for developers to work with, we’d like it to be available under a permissive open source licence. If your software is commercially licensed or available under a combination of commercial and restrictive open source licences (such as the various forms of the GPL), please consider picking up a sponsorship. We recognise that there are valid reasons for commercial licensing, but ask that you support us in return for giving you an audience. Your session will be marked on the schedule as a sponsored session.

Key dates and deadlines

Paper submission deadline: 31 January 2016
Schedule announcement: 29 February 2016
Conference dates: 14-15 April 2016

Venue

Rootconf will be held at the MLR Convention Centre, J P Nagar.

Contact

For more information about speaking proposals, tickets and sponsorships, contact info@hasgeek.com or call +91-7676332020.

Hosted by

Rootconf

Rootconf is a community-funded platform for activities and discussions on the following topics: Site Reliability Engineering (SRE); infrastructure costs, including cloud costs and their optimization; and security, including cloud security.


This submission has been added to the schedule

Of the building of a Postgres cluster

Submitted Jan 17, 2016

Section: Full talk Technical level: Intermediate

Learn about the problems you will encounter while building or using Postgres clusters for high availability, and how to solve them.

Outline

What this talk is about

We engineered a Postgres database cluster last year. It was a lot of learning and a lot of fun! This talk is about the failure scenarios we designed for, the times when the designed system failed, and what we learnt from them.

A brief introduction to the talk:

- Database clusters are built for one purpose – dealing with
  failure. Thinking about what can go wrong, designing for failure
  scenarios, and building multiple lines of defence was most of the
  work involved.
- Building, instrumenting, monitoring and automating the setup of a
  database cluster isn’t easy. It involves many moving parts, each of
  which is subject to a certain amount of failure. We had to do this
  ourselves because there isn’t an existing solution out there.
- Obviously, we ran into issues: The failover wasn’t quick enough,
  there were network issues, we had multiple masters, we had to
  recover from filesystem snapshots, wait days for standbys to catch
  up, etc. Each of these circumstances helped us understand and refine
  our cluster setup.
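
To make the "multiple masters" failure mentioned above concrete, here is a minimal sketch of how a monitoring script might classify the state of a cluster. It is not the setup described in this talk. It assumes each node has already been asked `SELECT pg_is_in_recovery()` (which returns false on a primary and true on a standby); the function name `classify_cluster` and the node names are hypothetical.

```python
from typing import Dict, Optional


def classify_cluster(in_recovery: Dict[str, Optional[bool]]) -> str:
    """Classify a cluster from per-node pg_is_in_recovery() results.

    in_recovery maps a node name to False (acting as primary),
    True (standby), or None (node could not be reached).
    """
    # A node that is NOT in recovery accepts writes, i.e. it is a master.
    primaries = sorted(n for n, r in in_recovery.items() if r is False)
    if len(primaries) > 1:
        return "multiple masters: " + ", ".join(primaries)
    if not primaries:
        return "no master"
    return "ok: master is " + primaries[0]


if __name__ == "__main__":
    # Hypothetical snapshot: db2 was promoted while db1 still accepts writes.
    print(classify_cluster({"db1": False, "db2": False, "db3": True}))
```

An unreachable node (`None`) may still be acting as a master, so a "no master" or "ok" result is only as trustworthy as the reachability of every node in the snapshot.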

As an aside

Given the theme “learning from failure”, and given that database
systems are the first category mentioned, this talk should fit
like a hand in a glove.

Skeleton of the talk:

Introduction to Postgres clusters
- Introduce the cluster setup, its purpose, how it is expected to
  work, and the moving parts in the system.
- [5 minutes]
Postgres replication
- Briefly explain “streaming replication”, then explain what can go
  wrong here. Hardware constraints, WAL config, long running
  queries on standbys, and timeouts. This will broadly cover the
  cases involving two databases.
- [10 minutes]
Failover setup
- Briefly explain what repmgr does, then explain what can go
  wrong. Multiple masters, no masters, automatic failover doesn’t
  work, node isn’t reachable, node is partially reachable,
  etc. This will cover the cases involving at least 3 databases.
- [10 minutes]
Application <=> Database communication
- Explain what can go wrong here, and then the Push/Pull mechanisms
  we built to deal with it.
- [5 minutes]
Disaster scenarios
- What to do when the cluster is down, what to do to save your
  data, which backup/restore mechanism will work best for you, how
  to use filesystem backups, when not to rely on them.
- [10 minutes]
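
As a small illustration of the replication part of the skeleton (standbys falling behind, or taking days to catch up), the sketch below computes how many bytes of WAL a standby is behind the master. It is an assumption-laden example, not the monitoring used in this talk. It assumes the two WAL positions were read with functions such as `pg_current_xlog_location()` on the master and `pg_last_xlog_replay_location()` on the standby (renamed `pg_current_wal_lsn()` and `pg_last_wal_replay_lsn()` from Postgres 10). The helper names `lsn_to_int` and `lag_bytes` are hypothetical.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a Postgres WAL position such as '16/B374D848' to bytes.

    The text form is two hexadecimal numbers: the high 32 bits and
    the low 32 bits of a 64-bit byte position in the WAL stream.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(master_lsn: str, standby_lsn: str) -> int:
    """Bytes of WAL the standby has not yet replayed (never negative)."""
    return max(0, lsn_to_int(master_lsn) - lsn_to_int(standby_lsn))


if __name__ == "__main__":
    # Hypothetical positions read from a master and one standby.
    print(lag_bytes("16/B374D848", "16/B374D000"))
```

Postgres can do the same subtraction on the server with `pg_xlog_location_diff()` (`pg_wal_lsn_diff()` from Postgres 10); doing it client-side is only useful when the two positions come from different connections.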

Speaker bio

Srihari is a FOSS enthusiast. He has contributed to Gimp, Eclipse, Diaspora and is excited about opportunities to give back. Over the last couple of years, he has worked on building an experimentation platform, delving into a particularly dense domain, meeting tight latency SLAs, and engineering assembly lines in software using Clojure.

He loves postgres – he has worked on implementing a high availability solution using repmgr and postgres’ streaming replication, and has spent an inordinate amount of time optimizing queries.

He is a partner at nilenso, a hippie tree hugging bicycle riding software cooperative based in Bangalore. He blogs, plays basketball, and performs carnatic music occasionally.

Slides

https://speakerdeck.com/srihari/on-the-building-of-a-postgres-cluster
