The Fifth Elephant 2015

A conference on data, machine learning, and distributed and parallel computing

POC: How to slice, dice & search billions of user events in seconds (from scratch)

Submitted by Bhasker Kode (@bhaskerkode) on Sunday, 14 June 2015

Technical level

Beginner

Section

Crisp Talk

Status

Confirmed & Scheduled

Total votes:  +11

Objective

Results from a proof-of-concept business intelligence tool in which each bit in a multi-billion-bit bitmap represents a user performing an event. A minimal 100 LOC implementation gave encouraging results and also showed areas that could improve. The talk covers caveats and ideas for rolling out your own BI tool.
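
As a rough illustration of the layout described above, the sketch below keeps one bitmap per event and sets bit i when user i performs that event. It is not the speaker's implementation: the class name `EventBitmap`, its methods, the helper `bitmap_bytes`, and the assumption that user IDs are dense integers starting at 0 are all hypothetical.

```python
class EventBitmap:
    """One bitmap per event; bit i is set when user i performed the event.

    Hypothetical sketch: assumes user IDs are dense integers starting at 0.
    """

    def __init__(self, num_users: int) -> None:
        # One bit per user, rounded up to a whole number of bytes.
        self.bits = bytearray((num_users + 7) // 8)

    def mark(self, user_id: int) -> None:
        # Byte index is user_id // 8, bit position inside the byte is user_id % 8.
        self.bits[user_id >> 3] |= 1 << (user_id & 7)

    def has(self, user_id: int) -> bool:
        return bool(self.bits[user_id >> 3] & (1 << (user_id & 7)))


def bitmap_bytes(num_users: int) -> int:
    """Uncompressed size in bytes of one event bitmap covering num_users users."""
    return (num_users + 7) // 8
```

Under these assumptions, an uncompressed bitmap covering one billion users takes 125,000,000 bytes (about 125 MB) per event.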

Description

Supported actions:

Counting the number of users performing an event (cardinality)
Counting the number of users performing a combination of 1000 events
How to support searching events across time ranges (to suit your case)
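
The three actions above could be sketched as follows, assuming one bitmap per (event, day) pair stored as a Python integer. The per-day bucketing is only one possible way to support time ranges and is an assumption here, as are the names `record`, `users_in_range`, `count_users` and `count_users_all`.

```python
from collections import defaultdict
from functools import reduce

# (event name, day) -> bitmap held in a Python int; bit i stands for user i.
# Daily buckets are an assumption; use whatever granularity suits your case.
bitmaps = defaultdict(int)


def record(event: str, day: str, user_id: int) -> None:
    """Set the bit for user_id in the bitmap of this event on this day."""
    bitmaps[(event, day)] |= 1 << user_id


def popcount(bitmap: int) -> int:
    """Number of set bits, i.e. the number of users in the bitmap."""
    return bin(bitmap).count("1")


def users_in_range(event: str, days: list) -> int:
    """OR the per-day bitmaps: users who performed the event on any listed day."""
    return reduce(lambda acc, day: acc | bitmaps.get((event, day), 0), days, 0)


def count_users(event: str, days: list) -> int:
    """Cardinality of a single event over a range of days."""
    return popcount(users_in_range(event, days))


def count_users_all(events: list, days: list) -> int:
    """AND across events: users who performed every listed event in the range."""
    if not events:
        return 0
    combined = users_in_range(events[0], days)
    for event in events[1:]:
        combined &= users_in_range(event, days)
    return popcount(combined)
```

A combination of many events is then a loop of bitwise ANDs followed by a single popcount, so the cost grows with the number of events times the bitmap size.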

1) number of bits to be compared
2) number of bits in the intersection
3) time for running the intersection
4) time for counting the intersection
5) hacks & tricks when working with bits, popcount & bitmaps
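
Reading items 1 to 4 as measurements, a small harness along these lines could collect them, together with three common ways to count set bits. This is a sketch on random data, not the talk's benchmark: the function names and the dictionary keys are hypothetical, and no timings are claimed here.

```python
import random
import time

# 256-entry lookup table: POPCOUNT_TABLE[b] is the number of set bits in byte b.
POPCOUNT_TABLE = bytes(bin(b).count("1") for b in range(256))


def popcount_kernighan(x: int) -> int:
    """Clear the lowest set bit until none remain; only suits sparse bitmaps."""
    count = 0
    while x:
        x &= x - 1
        count += 1
    return count


def popcount_table(x: int) -> int:
    """Split the bitmap into bytes and sum per-byte counts from the table."""
    data = x.to_bytes((x.bit_length() + 7) // 8, "little")
    return sum(POPCOUNT_TABLE[b] for b in data)


def popcount_string(x: int) -> int:
    """Count '1' characters in the binary representation."""
    return bin(x).count("1")


def measure(num_bits: int, seed: int = 0) -> dict:
    """Intersect two random bitmaps of num_bits bits and time both steps."""
    rng = random.Random(seed)
    a = rng.getrandbits(num_bits)
    b = rng.getrandbits(num_bits)

    t0 = time.perf_counter()
    both = a & b                      # step 3: run the intersection
    t1 = time.perf_counter()
    count = popcount_string(both)     # step 4: count the intersection
    t2 = time.perf_counter()

    return {
        "bits_compared": num_bits,
        "bits_in_intersection": count,
        "intersect_seconds": t1 - t0,
        "count_seconds": t2 - t1,
    }
```

Calling `measure(1_000_000_000)` would exercise a billion-bit case, at the cost of a few hundred megabytes of memory for the random inputs.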

The first version was up in two days. The talk will also show how to get it up and running for your internal team.

Speaker bio

Bosky (@bhaskerkode) leads a product engineering team at Helpshift and works with Erlang, Clojure and Go.

He has been building distributed systems since 2006 across edtech, adtech and mobile in Erlang, Clojure and Go.

More talks at http://slideshare.net/bosky101

More about Bosky at http://in.linkedin.com/in/bhaskerkode

Links

Slides

https://www.dropbox.com/s/80ou4xrqwm2am1j/fifthel15-bhaskerkode-billion-bits.png?dl=0
