The Fifth Elephant 2015

A conference on data, machine learning, and distributed and parallel computing

Bhasker Kode

@bhaskerkode

POC: How to slice, dice & search billions of users events in seconds (from scratch)

Submitted Jun 14, 2015

Results from a proof-of-concept business intelligence tool in which each bit in a multi-billion-bit bitmap represents a user performing an event. A minimal 100 LOC implementation gave encouraging results and also showed areas that could improve. The talk covers caveats and ideas for rolling out your own BI tool.
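As a rough illustration of the representation described above, here is a minimal Python sketch. It is not the talk's implementation: the `EventBitmap` class, its method names, and the use of a Python integer as the bitmap are all assumptions made for the example. Each event gets one bitmap, and bit `i` is set when user `i` has performed that event.

```python
class EventBitmap:
    """Hypothetical helper: one bitmap per event, one bit per user id."""

    def __init__(self) -> None:
        # A Python int grows as needed, so it can stand in for a long bitmap.
        self.bits = 0

    def mark(self, user_id: int) -> None:
        """Record that the user with this numeric id performed the event."""
        self.bits |= 1 << user_id

    def has(self, user_id: int) -> bool:
        """Return True if the user's bit is set."""
        return (self.bits >> user_id) & 1 == 1

    def cardinality(self) -> int:
        """Count the set bits, i.e. the distinct users who performed the event."""
        return bin(self.bits).count("1")
```

Marking the same user twice sets the same bit, so the count stays a count of distinct users.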

Outline

Supported actions:

Counting the number of users performing an event (cardinality)
Counting the number of users performing a combination of 1000 events
How to support searching events across time ranges (to suit your case)
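The three actions above can be sketched with plain bitwise operations. This is one possible approach under stated assumptions, not the method from the talk: the layout of one bitmap per (event, day) pair, the integer day index, and the function names are all hypothetical.

```python
from functools import reduce
from typing import Dict, Iterable, Tuple


def popcount(bits: int) -> int:
    """Number of set bits, i.e. number of users in the bitmap."""
    return bin(bits).count("1")


def users_in_all(bitmaps: Iterable[int]) -> int:
    """AND the bitmaps together: users who performed every event."""
    items = list(bitmaps)
    if not items:
        return 0
    return reduce(lambda a, b: a & b, items)


def event_over_days(daily: Dict[Tuple[str, int], int],
                    event: str, first_day: int, last_day: int) -> int:
    """OR the per-day bitmaps: users who performed the event on any day in range."""
    merged = 0
    for day in range(first_day, last_day + 1):
        merged |= daily.get((event, day), 0)
    return merged


# Hypothetical data: four users, two events, two days.
daily = {
    ("login", 1): 0b0111,
    ("login", 2): 0b1000,
    ("purchase", 1): 0b0100,
    ("purchase", 2): 0b1000,
}
logins = event_over_days(daily, "login", 1, 2)
purchases = event_over_days(daily, "purchase", 1, 2)
both = users_in_all([logins, purchases])
print(popcount(both))  # users who logged in and purchased within days 1 to 2
```

The same `users_in_all` call works for a combination of 1000 events, since each extra event is one more AND over the running result.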

  1. bits to be compared
  2. bits intersection
  3. time for running intersection
  4. time for counting intersection
  5. hacks & tricks when working with bits, popcount & bitmaps
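Reading the numbered items above as the measurements to report for each query (an assumption, since the outline does not say so), a small harness could look like the sketch below. The function names and the random test data are hypothetical, and the byte lookup table is one common popcount trick, not necessarily the one shown in the talk.

```python
import random
import time

# Lookup table: number of set bits for every possible byte value.
BYTE_POPCOUNT = bytes(bin(i).count("1") for i in range(256))


def popcount_table(bits: int) -> int:
    """Count set bits by summing a per-byte lookup over the bitmap's bytes."""
    raw = bits.to_bytes((bits.bit_length() + 7) // 8, "little")
    return sum(BYTE_POPCOUNT[b] for b in raw)


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def measure(n_bits: int, seed: int = 0) -> dict:
    """Intersect two random n_bits-wide bitmaps and report the four numbers."""
    rng = random.Random(seed)
    a = rng.getrandbits(n_bits)
    b = rng.getrandbits(n_bits)
    both, intersect_seconds = timed(lambda: a & b)
    count, count_seconds = timed(popcount_table, both)
    return {
        "bits_compared": n_bits,
        "bits_in_intersection": count,
        "intersect_seconds": intersect_seconds,
        "count_seconds": count_seconds,
    }


print(measure(1_000_000))
```

Timing the intersection and the count separately shows which of the two steps dominates. On Python 3.10 and later, `int.bit_count()` is a built-in alternative to the table.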

The first version was up in two days. I will show you how to get it up and running for your internal team as well.

Speaker bio

Bosky (@bhaskerkode) leads a product engineering team at Helpshift and works on Erlang, Clojure and Go.

Building distributed systems since ‘06 across edtech, adtech and mobile in Erlang, Clojure and Go.

More talks at http://slideshare.net/bosky101

More about Bosky at http://in.linkedin.com/in/bhaskerkode

Slides

https://www.dropbox.com/s/80ou4xrqwm2am1j/fifthel15-bhaskerkode-billion-bits.png?dl=0
