The Fifth Elephant 2012

Finding the elephant in the data.

Ashok Banerjee

@ashokbanerjee

Exponential Growth Models and Impact on Sales Forecast, Data Volume, Query Latency, Capacity Planning and Search Latency

Submitted Jun 14, 2012

We often talk loosely about exponential growth. In this talk we will delve into the mathematical models that determine when a domain or market will undergo exponential growth. We often mistakenly believe that one company executes better than another, when in fact the two markets follow fundamentally different mathematical growth models.

Exponential growth and exponential decay appear in many domains, not just business. These mathematical models have great fertility: from the growth of bacteria in your mouth every night, to population growth, the spread of infections, the distribution of allergens, dust or mosquitoes, radioactive decay, revolutions in the Middle East, the decay of interest in topics on Twitter, the decay of your sorrows and infatuation, and many more domains.
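As a minimal sketch of the model behind these examples, assume a quantity N that changes at a rate proportional to its current size, dN/dt = k * N, which gives N(t) = N0 * exp(k * t). The function names and any rates you plug in are illustrative assumptions, not figures from the talk.

```python
import math


def exponential(n0: float, k: float, t: float) -> float:
    """Closed-form solution of dN/dt = k * N with N(0) = n0.

    k > 0 models growth (bacteria, infections, word of mouth);
    k < 0 models decay (radioactivity, fading interest in a topic).
    """
    return n0 * math.exp(k * t)


def doubling_time(k: float) -> float:
    """Time for a growing quantity (k > 0) to double: ln(2) / k."""
    if k <= 0:
        raise ValueError("doubling time needs a positive rate")
    return math.log(2) / k


def half_life(k: float) -> float:
    """Time for a decaying quantity (k < 0) to halve: ln(2) / |k|."""
    if k >= 0:
        raise ValueError("half-life needs a negative rate")
    return math.log(2) / -k
```

The same function covers both regimes: only the sign of k differs, which is why one model can describe such different domains.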

This lecture will help attendees connect the spaces and domains of their current businesses, and of their lives, to a fundamental mathematical model.

This understanding informs everything: scaling of databases, scaling of message systems (the two have very different challenges), demand forecasting, inventory planning, operations planning (for base, trend, seasonality and spike) and even staffing. Organizations undergoing these changes often cannot comprehend the challenges barreling at them, but this structure enables deeper thinking.
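One way to picture the four planning components named above (base, trend, seasonality and spike) is a simple additive forecast. This is only a sketch: the additive combination, the sinusoidal seasonality, the exponentially decaying spike and every parameter name are assumptions made for illustration, not the model presented in the talk.

```python
import math
from typing import Optional


def forecast_demand(day: int,
                    base: float,
                    trend_per_day: float,
                    seasonal_amplitude: float,
                    season_length: int,
                    spike_day: Optional[int] = None,
                    spike_size: float = 0.0,
                    spike_decay: float = 0.5) -> float:
    """Hypothetical additive demand model: base + trend + seasonality + spike."""
    # Trend: steady linear drift on top of the base level (assumed linear).
    trend = trend_per_day * day
    # Seasonality: a repeating cycle, assumed sinusoidal for simplicity.
    seasonality = seasonal_amplitude * math.sin(2 * math.pi * day / season_length)
    # Spike: a one-off surge that decays exponentially after spike_day.
    spike = 0.0
    if spike_day is not None and day >= spike_day:
        spike = spike_size * math.exp(-spike_decay * (day - spike_day))
    return base + trend + seasonality + spike
```

Keeping the components separate makes it easier to see which one drives a capacity or staffing decision on a given day.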

Time permitting, we will also talk at the end about when exponential growth really ends and how the “epidemic” stabilizes.
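A common way to model growth that stabilizes is the logistic curve, dN/dt = r * N * (1 - N / K), which is also the form taken by the simplest (SI) epidemic model. The sketch below assumes that model; the talk may well use a richer epidemic model, and the names `logistic`, `inflection_time` and `capacity` are illustrative.

```python
import math


def logistic(n0: float, r: float, capacity: float, t: float) -> float:
    """Closed-form solution of dN/dt = r * N * (1 - N / capacity), N(0) = n0.

    Early on (N much smaller than capacity) this is nearly exponential
    with rate r; later it flattens out as N approaches capacity.
    """
    return capacity / (1.0 + ((capacity - n0) / n0) * math.exp(-r * t))


def inflection_time(n0: float, r: float, capacity: float) -> float:
    """Time at which N reaches capacity / 2 and growth starts slowing."""
    return math.log((capacity - n0) / n0) / r
```

In this model, growth looks exponential until roughly half the reachable population is covered, after which each new period adds fewer users than the one before.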

This talk will not work in just 30 minutes. The minimum we can target is 40-45 minutes, with an opportunity for some questions.

I would strongly advise against being late: we will spend a lot of time on the first slide, building up a few basic concepts. Leave early if you need to, but don't come late :)

Outline

-Exponential Growth markets
---- Analyze mathematically the market spaces
Word of mouth: Facebook, Foursquare, Twitter, Flipkart
Advertising-driven growth and its mathematical model

---- Model Fertility in other domains
- Radioactive decay
- Revolution, Love, Infatuation
- Mosquito/Allergens distribution with height

---- Basic Demand Forecasting
Base demand
Trends on demand
Seasonality on demand
Peak Demand Modelling (Exponential + Poisson)

---- Impact to OLTP
Web Scaling
Scaling Message Systems - traditional databases -> custom solutions
Caching
Large DB scaling (compression, indexing, archiving, sharding and federated query)

---- Impact to OLAP
Hadoop and Scaling operations
Recommendation systems and the impact from time series

When does exponential growth end?
Epidemic models applied here
How to prevent exponential growth of a competitor (vaccine models in disease spread)

Q&A

Requirements

I will be working from the basics, so attendees need not prepare anything special. But missing the first slide will significantly diminish understanding of the entire flow.

To repeat: please leave early if you need to, but do avoid coming late, as you may then miss the entire thing.

Speaker bio

Ashok Banerjee is VP of Data Platform and Supply Chain Engineering at Flipkart and has 22 patents approved to date, and counting. Prior to Flipkart, Ashok worked at Twitter in San Francisco and Google in Mountain View.

Experience Summary (reverse chronologically)

Ashok today leads the technology team for Data Platform and the largest online Supply Chain infrastructure in India (Flipkart)

  • At Google he led a large-scale data warehouse infrastructure which converts SQL (approximately) into execution on a platform built on MapReduce, GFS and columnar compressed data, using block-oriented computing. This was at the scale of many billion rows added per day (cannot disclose how many billions)
  • At Google Ashok led the payment processing infrastructure which processes payments for AdWords, AdSense, Checkout and Google Apps
  • At BEA he worked on WebLogic Server and led infrastructure teams on EJB Container, Web Container, Classloading, Application Deployment within a Server etc.
  • At Oracle Ashok led the Oracle Application Server Clustering infrastructure and also worked on EJB container and RMI-IIOP Protocols

Ashok takes an interest in Large Data Systems (Databases and alternative databases - NoSQL, Message Systems), Parallel Computing, Distributed Systems, Fault Tolerant Computing, Recommendation Systems, Supply Chain, Mathematical Models and Investments.

Outside work, Ashok enjoys sailing, windsurfing, horse riding, German Shepherd dogs and soccer.

