Submissions for Data Stores track

Guide on how to select datastores to solve different problems

This is a call for submissions for the Data Stores conferences that will be held between September 2021 and August 2022 under the Data Stores track of Rootconf. The Choosing Datastores conferences aim to help technology practitioners learn how to select a datastore and be aware of its limitations when applied to the problem at hand.

We are accepting experiential talks on:

  • Hidden criteria of database selection, such as operations or the impact on other teams within an organization (for example, data platform teams).
  • Data modelling and developer experience with a datastore.
  • Running datastores at scale - the true meaning of tuning, debugging and operations.
  • Solving specific use cases with a certain datastore.
  • DatastoreOps workflows.
  • Use of datastores in novel ways to comply with legal regulations.

We invite engineers who work closely with datastores to speak about their experience in selecting, using and scaling these technologies.

About Rootconf

Rootconf is a community-funded organization. If you like the work that Rootconf does and want to support meet-ups and activities - online and in-person - contribute by picking up a membership.

Contact information

Join the Rootconf Telegram group at https://t.me/rootconf or follow @rootconf on Twitter.
For inquiries, contact Rootconf at rootconf.editorial@hasgeek.com or call 7676332020.

Hosted by

Rootconf is a community-funded platform for activities and discussions on the following topics: Site Reliability Engineering (SRE). Infrastructure costs, including Cloud Costs - and optimization. Security - including Cloud Security.

Amandeep Singh

@devey

When, why and what database to choose for time-series data analytics?

Submitted Aug 24, 2021

Introduction
A time series database (TSDB) is optimized for storing and serving data as associated pairs of time and value. TSDBs differ from other datastores in that they track changes to the overall system as INSERTs, not UPDATEs. They largely help in forecasting and anomaly detection, supporting techniques such as moving averages, exponential smoothing, stationarity and autocorrelation analysis, and SARIMA. Hence, knowing where and how to store time-series data can have a dramatic impact on future success.
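To make two of the techniques named above concrete, here is a minimal sketch of a simple moving average and single exponential smoothing over a plain list of values. The function names and the `alpha` parameter are illustrative assumptions, not part of any particular TSDB's API.

```python
from collections import deque


def moving_average(values, window):
    """Simple moving average: one output per full window of inputs."""
    if window <= 0:
        raise ValueError("window must be positive")
    buf = deque(maxlen=window)  # holds the most recent `window` values
    total = 0.0
    out = []
    for v in values:
        if len(buf) == window:
            total -= buf[0]  # the oldest value is about to be evicted
        buf.append(v)
        total += v
        if len(buf) == window:
            out.append(total / window)
    return out


def exponential_smoothing(values, alpha):
    """Single exponential smoothing; alpha (assumed name) weights new data."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    out = []
    smoothed = None
    for v in values:
        # The first observation seeds the series; later ones are blended in.
        smoothed = v if smoothed is None else alpha * v + (1 - alpha) * smoothed
        out.append(smoothed)
    return out
```

Many TSDBs expose comparable functions in their query languages, so in practice this work is often pushed down to the database rather than done in application code.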
With so many TSDBs available, and with TSDBs being the fastest-growing segment in the market, it becomes difficult to choose the one that fits and solves the data challenge efficiently. In my presentation I’ll compare the top 3 TSDBs in the context of the following features.

Comparison methodology

Scale - We are looking for a TSDB that can capture change efficiently and later forecast it accurately, all at scale. With the ever-increasing need to store everything and anything, scalability becomes a top priority for any data system.

Performance - TSDBs must be designed to handle exceptionally large volumes, specifically performant ingestion (WRITE operations).
The ability to handle large time-dependent scans over many records, along with time-series-aware queries, is the second priority on my list.

Precision - Data systems these days capture time series at second, millisecond, microsecond, or nanosecond precision. Being able to accurately capture change at such precise levels should be a key feature of any TSDB.
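As a small illustration of why precision matters, the sketch below truncates integer nanosecond timestamps to coarser units and counts how many distinct points survive. The sample timestamps and helper names are made up for illustration.

```python
# Nanoseconds per unit, for integer timestamps expressed in nanoseconds.
NS_PER_UNIT = {"s": 1_000_000_000, "ms": 1_000_000, "us": 1_000, "ns": 1}


def truncate(ts_ns, unit):
    """Round an integer nanosecond timestamp down to the given unit."""
    step = NS_PER_UNIT[unit]
    return ts_ns - ts_ns % step


def distinct_points(timestamps_ns, unit):
    """Count timestamps that remain distinguishable at the given precision."""
    return len({truncate(t, unit) for t in timestamps_ns})


# Four hypothetical events within the same second.
events = [
    1_629_763_200_000_000_100,
    1_629_763_200_000_000_900,
    1_629_763_200_000_450_000,
    1_629_763_200_730_000_000,
]

for unit in ("ns", "us", "ms", "s"):
    print(unit, distinct_points(events, unit))
```

If the store keeps only one value per timestamp, events that collapse onto the same truncated timestamp can overwrite each other, so the precision the database supports bounds how fine a change it can record.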

Compression - The achievable compression varies with the level of precision the user needs, and it also depends on the way a database stores the data.
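One widely used idea behind time-series compression is delta encoding: store the first timestamp and then only the differences between consecutive ones. The sketch below is a generic illustration of that idea and is not claimed to be the scheme used by any of the databases compared in the talk.

```python
def delta_encode(timestamps):
    """Keep the first timestamp, then the gap to each following one."""
    if not timestamps:
        return []
    encoded = [timestamps[0]]
    for prev, cur in zip(timestamps, timestamps[1:]):
        encoded.append(cur - prev)
    return encoded


def delta_decode(encoded):
    """Rebuild the original timestamps by accumulating the gaps."""
    out = []
    total = 0
    for i, delta in enumerate(encoded):
        total = delta if i == 0 else total + delta
        out.append(total)
    return out
```

For regularly sampled data the gaps are small and repetitive, so they need far fewer bits than full timestamps. Higher-precision timestamps tend to produce noisier gaps, which is one reason compression depends on precision.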

Storage - All this data has to be stored in a manner that is easy to write, retrieve, analyse, summarize and aggregate. Database architectural design plays a major role here, and we will see how a purpose-built TSDB performs better than a re-engineered one.

Optimization - Data cleaning, lifecycle management and indexing are important too and should be considered when selecting a database.

Trade-offs - No system is perfect, and there are many trade-offs to weigh when choosing the best one for the problem at hand.
I’ll discuss how downsampling, interval search and partitioning help in overcoming the common trade-offs.
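For readers unfamiliar with two of these terms, here is a minimal sketch of downsampling and interval search over in-memory `(timestamp, value)` pairs. The function names and the half-open interval convention are assumptions made for illustration; a real TSDB implements these inside its storage engine.

```python
from bisect import bisect_left


def downsample(points, bucket_seconds):
    """Average (timestamp, value) pairs into fixed-width time buckets."""
    sums = {}
    for ts, value in points:
        bucket = ts - ts % bucket_seconds  # start of the bucket holding ts
        total, count = sums.get(bucket, (0.0, 0))
        sums[bucket] = (total + value, count + 1)
    return [(b, total / count) for b, (total, count) in sorted(sums.items())]


def interval_search(sorted_points, start, end):
    """Return points with start <= timestamp < end.

    Assumes sorted_points is already ordered by timestamp.
    """
    keys = [ts for ts, _ in sorted_points]
    lo = bisect_left(keys, start)
    hi = bisect_left(keys, end)
    return sorted_points[lo:hi]


points = [(0, 1.0), (30, 3.0), (60, 5.0), (90, 7.0), (120, 9.0)]
print(downsample(points, 60))
print(interval_search(points, 30, 90))
```

Downsampling trades resolution for storage and query speed, while interval search relies on data being ordered by time. Time-based partitioning typically uses the same bucketing arithmetic as `downsample` to decide which partition a point belongs to.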

Takeaway
Attendees should be able to independently compare and select a time series database for their data systems and analytics.

Submitted by: Amandeep Singh (asingh26@paypal.com)
Company: PayPal India
Designation: Software Developer 3
