BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//HasGeek//NONSGML Funnel//EN
DESCRIPTION:A conference on big data and analytics
X-WR-CALDESC:A conference on big data and analytics
NAME:The Fifth Elephant 2014
X-WR-CALNAME:The Fifth Elephant 2014
REFRESH-INTERVAL;VALUE=DURATION:PT12H
SUMMARY:The Fifth Elephant 2014
TIMEZONE-ID:Asia/Kolkata
X-PUBLISHED-TTL:PT12H
X-WR-TIMEZONE:Asia/Kolkata
BEGIN:VEVENT
SUMMARY:Check-in
DTSTART:20140723T040000Z
DTEND:20140723T043000Z
DTSTAMP:20260801T124109Z
UID:session/41sXfBL1gH1MGWQpydvtzG@hasgeek.com
SEQUENCE:0
CREATED:20140527T025158Z
DESCRIPTION:\n
LAST-MODIFIED:20140527T054718Z
LOCATION:Auditorium - TERI\, Domlur\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Check-in in Auditorium in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Machine Learning using R :  Crash course in Classification Methods
DTSTART:20140723T043000Z
DTEND:20140723T073000Z
DTSTAMP:20260801T124109Z
UID:session/QvLc1FqscSazCoHocAEeWP@hasgeek.com
SEQUENCE:2
CATEGORIES:Workshops,Beginner
CREATED:20140623T124249Z
DESCRIPTION:The following topics would be covered. The format would be a b
 it of theory and then implementation using R\n\nIntroduction to Machine le
 arning\n1)	Types of Learning (Supervised/Unsupervised/Reinforced)\n2)	Intr
 oduction to Generalization\n3)	Train/Test/Validation Datasets\n4)	Bias –
  Variance tradeoff\n5)	Overfitting\n6)	Cross-validation\n7)	Regularization
 \n8)	Grid Search\n9)	Hyperparameter Optimization\n10)	Feature Selection/Tr
 ansformation\na.	        Greedy feature selection (forward\, backward\, st
 epwise)\nb.	        Non-linear transformations\, Kernels \n\n\nClassificat
 ion Techniques covered:\n1)	Linear Regression\n2)	Logistic Regression\n3)	
 LASSO\, Ridge and Elastic net regression\n4)	kNN\n5)	Discriminant Analysis
 \n6)	Decision Trees\, CART\, CHAID\n7)	Support Vector Machines\n8)	Naïve 
 Bayes\n9)	Ensemble Methods\na.	Boosting\nb.	Bagging\nc.	Random Forest\nd.	
 Regularized Random Forest\ne.	Gradient Boosting Machines\n\n\nUnsupervised
  learning techniques covered:\n1)	Dimensionality Reduction: Principal Comp
 onent Analysis\n2)	K-Means clustering\n\nIllustrating common pitfalls\n1)	
 Data snooping\n2)	Occam’s Razor\n\nBig Data Analytics (*need AWS credit 
 for implementation.)\n1)	Introduction to Big Data and Hadoop\n2)	R and Big
  Data\na.	Hadoop\nb.	Linear Model\nc.	Random Forest\n\n### Speaker bio\n\n
 Data Analytics professional at Cisco Systems India Pvt Ltd.\n
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium - TERI\, Domlur\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/machine-learning-using
 -r-crash-course-in-classification-methods-QvLc1FqscSazCoHocAEeWP
BEGIN:VALARM
ACTION:display
DESCRIPTION:Machine Learning using R :  Crash course in Classification Met
 hods in Auditorium in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Lunch
DTSTART:20140723T073000Z
DTEND:20140723T083000Z
DTSTAMP:20260801T124109Z
UID:session/BfzHg9yLtKUXnpvMwLPfCS@hasgeek.com
SEQUENCE:0
CREATED:20140527T025251Z
DESCRIPTION:\n
LAST-MODIFIED:20140527T025254Z
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Lunch in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Building distributed search applications using Apache SOLR
DTSTART:20140723T083000Z
DTEND:20140723T113000Z
DTSTAMP:20260801T124109Z
UID:session/4xr7gMXN72Da6BYn9JZfy4@hasgeek.com
SEQUENCE:2
CATEGORIES:Workshops,Beginner
CREATED:20140617T095810Z
DESCRIPTION:For the workshop\, we will index and search data from 'StackEx
 change' sites using dumps available [here](https://archive.org/details/sta
 ckexchange) and build backend for following demo application: [saumitra.me
 /solrdemo/](http://saumitra.me/solrdemo/). (Tested only on Chrome and Fire
 fox).\n\nAgenda:\n\n1. What is Solr? Use cases and architecture\n2. Solr s
 chema\, config\, tokenizers and filters\n3. Indexing data:\n    * From dis
 k using SolrJ\n    * Importing from database(MySQL) with DataImport Handle
 r\n4. Querying Solr (Filters\, Faceting\, highlighting\, sorting\, groupin
 g\, boosting\, range\, function and fuzzy queries)\n5. Using 'More Like Th
 is' component to show similar docs\n6. Adding 'Auto Suggest' component to 
 auto complete user queries\n7. Using 'Clustering' component to cluster sim
 ilar results.\n8. SolrCloud\n    * Architecture\n    * Setting up a multin
 ode cluster with Zookeeper\n    * Creating a distributed index\n    * Coll
 ections API\n9. Solr Admin UI\n10. Understanding Solr performance factors\
 n11. Solr vs. ElasticSearch\n\n### Speaker bio\n\n* Engineer at [Glassbeam
 ](http://glassbeam.com) working on machine data search and analytics\n* Or
 ganizer of [Bangalore-Baby-Apache-Solr](http://www.meetup.com/Bangalore-Ba
 by-Apache-Solr-Group/) meetup group\, where we organize sessions to help p
 eople get started with Solr\n* [https://twitter.com/_saumitra_](https://tw
 itter.com/_saumitra_)\n
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium - TERI\, Domlur\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/building-distributed-s
 earch-applications-using-apache-solr-4xr7gMXN72Da6BYn9JZfy4
BEGIN:VALARM
ACTION:display
DESCRIPTION:Building distributed search applications using Apache SOLR in 
 Auditorium in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Check-in
DTSTART:20140724T041500Z
DTEND:20140724T043000Z
DTSTAMP:20260801T124109Z
UID:session/XTGEmnqSpvypDuUiMsZ7UF@hasgeek.com
SEQUENCE:0
CREATED:20140527T025345Z
DESCRIPTION:\n
LAST-MODIFIED:20140527T025348Z
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Check-in in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Real world machine learning
DTSTART:20140724T043000Z
DTEND:20140724T073000Z
DTSTAMP:20260801T124109Z
UID:session/E5Ph8Qcbdmn8mMwYXityNN@hasgeek.com
SEQUENCE:2
CATEGORIES:Workshops,Intermediate
CREATED:20140721T050718Z
DESCRIPTION:Machine learning has evolved to a very popular\, rapidly chang
 ing\, sometimes over-hyped domain with extremely diverse set of ideas. Dis
 cussion about machine learning often tends to get lost into jargon of tool
 s\, market buzzwords\, libraries and diverted from real purpose which is i
 nsights! This workshop will focus on insights and practical applications.\
 n\n### What background is essential to understand this workshop ?\n\nThis 
 is introductory session and participants are not expected to know 100 mach
 ine learning algorithms or have a PhD in Maths. That said\, the following 
 are bare minimum requirements\,\n\n* Reasonable knowledge of at least one
  programming language (C\, C++\, Python\, Ruby\, Java\, R\, Matlab\, Julia
  and so on). \n\n* Some familiarity with machine learning (for example\, i
 t should be enough to vaguely understand what linear regression is)\n\n* C
 uriosity!\n\n[Those who are not familiar with Python are advised to go thr
 ough the tutorial given here](https://docs.python.org/2/tutorial). Python 
 is an extremely elegant and simple language\, you'd get started in no time
 !\n\n### Speaker bio\n\nHarshad is senior data scientist at Sokrati\, a di
 gital advertising startup based out of Pune\, where he works closely with 
 the engineering team to extract meaning out of millions of data points fro
 m advertising world. He's been applying machine learning to real world pro
 blems in telecom\, banking and advertising for the last 4 years. He has
  most
 ly worked with tools like R\, Python\, SAS and lately fallen in love with 
 Clojure ecosystem too. He conducted a similar session in Fifth Elephant 20
 13\, focussed on R and text mining. Harshad holds a master's degree in Ope
 rations Research from IIT\, Mumbai\n
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium - TERI\, Domlur\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/real-world-machine-lea
 rning-E5Ph8Qcbdmn8mMwYXityNN
BEGIN:VALARM
ACTION:display
DESCRIPTION:Real world machine learning in Auditorium in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Lunch
DTSTART:20140724T073000Z
DTEND:20140724T083000Z
DTSTAMP:20260801T124109Z
UID:session/L1bQDYpPx6RCQsWCMhHVUB@hasgeek.com
SEQUENCE:0
CREATED:20140527T025418Z
DESCRIPTION:\n
LAST-MODIFIED:20140527T025421Z
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Lunch in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Getting your hands dirty with Aerospike
DTSTART:20140724T083000Z
DTEND:20140724T113000Z
DTSTAMP:20260801T124109Z
UID:session/2DHysqhBrKdNssKizDY7Ph@hasgeek.com
SEQUENCE:2
CATEGORIES:Sponsored workshop,Intermediate
CREATED:20140702T032014Z
DESCRIPTION:We will go over the following in the workshop\n1. Talk about t
 he architecture of Aerospike\n2. Install\, setup\, configure\n3. Use tools
  built around aerospike\n4. Coding using its API (java/node.js)\n5. Using 
 Aerospike in existing app frameworks (TBD)\n6. Using Aerospike on Amazon c
 loud.\n7. Unleash the high performance of Aerospike\n\nWe will organize th
 e people into teams to encourage discussion and mutual help. Also\, in the
  later part of the workshop\, the team will use their individual EC2 machi
 nes to form a cluster and learn how to deal with cluster.\n\n### Speaker b
 io\n\nSunil Sayyaparaju\, Engineering Lead at Aerospike\, has over 9 years
  experience working on different types of SQL RDBMS solutions\, such as si
 ngle machine (monolithic)\, in-memory\, distributed shared-disk\, and dist
 ributed shared-nothing architectures with emphasis in transaction manageme
 nt\, storage\, access\, performance tuning\, and recovery areas.\n\nSunil 
 currently leads Aerospike's Bangalore office\, working on their distribute
 d shared-nothing NoSQL solution. Aerospike is a high-performance\, self-ba
 lancing\, immediately consistent\, distributed NoSQL database. Aerospike a
 lso has an add on product for replication across data-centers over WAN whi
 ch supports different complex topologies.\n\n* we will have multiple prese
 nters\n
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium - TERI\, Domlur\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/getting-your-hands-dir
 ty-with-aerospike-2DHysqhBrKdNssKizDY7Ph
BEGIN:VALARM
ACTION:display
DESCRIPTION:Getting your hands dirty with Aerospike in Auditorium in 5 min
 utes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Check-in and breakfast
DTSTART:20140725T030000Z
DTEND:20140725T034500Z
DTSTAMP:20260801T124109Z
UID:session/Xw8no16jGrxpAw4cD4aatS@hasgeek.com
SEQUENCE:0
CREATED:20140527T025500Z
DESCRIPTION:\n
LAST-MODIFIED:20140630T165817Z
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Check-in and breakfast in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Introductions to HasGeek\, The Fifth Elephant 2014\, EP
DTSTART:20140725T034500Z
DTEND:20140725T040000Z
DTSTAMP:20260801T124109Z
UID:session/KwcELTLZgGnnHtEU5Zr5qQ@hasgeek.com
SEQUENCE:0
CREATED:20140527T025528Z
DESCRIPTION:\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20140630T165953Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Introductions to HasGeek\, The Fifth Elephant 2014\, EP in Aud
 itorium 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:'Know Your Customer!' - Advanced Data Science for Audience Segment
 ation
DTSTART:20140725T040000Z
DTEND:20140725T044500Z
DTSTAMP:20260801T124109Z
UID:session/KY1U9RqALnVCMfD9enNYyB@hasgeek.com
SEQUENCE:2
CATEGORIES:Full talk,Advanced
CREATED:20140704T081723Z
DESCRIPTION:Audience Segmentation is a very important practical necessity 
 in pretty much every field. Whether it is internet subscribers or paytv su
 bscribers there is an intense need from the advertisers and service provid
 ers to know who is living in a household and what their demography profile
 s are and what their interests are?\n\nAn ensemble of techniques which inc
 lude advanced linear and non-linear dimensionality reduction\, unsupervise
 d learning algorithms\, bayesian predictive analytics and expert systems c
 ome together to form a compelling pragmatic data science solution stack fo
 r the big data environment.\n\n### Speaker bio\n\nI am Prabhakar Srinivasa
 n. I work as a data scientist at Cisco. I have invented a unique technique
  to do customer segmentation which works for the PayTv domain of Cisco pro
 ducts running on Big data infrastructure. I wish to share this success sto
 ry with the community with the hope that others who are trying to solve a 
 similar problem can gain some practical insight on doing data science on b
 ig data stack that really gives business value.\n\nI have successfully dep
 loyed my invention in Europe for some Telecom giants. I have presented thi
 s technique during talks in Monaco during Cisco's internal conferences. It
  is time to share this success story with the wider Big data community.\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 2 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/know-your-customer--ad
 vanced-data-science-for-audience-segmentation-KY1U9RqALnVCMfD9enNYyB
BEGIN:VALARM
ACTION:display
DESCRIPTION:'Know Your Customer!' - Advanced Data Science for Audience Seg
 mentation in Auditorium 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Unified analytics platform for Bigdata
DTSTART:20140725T040000Z
DTEND:20140725T044500Z
DTSTAMP:20260801T124109Z
UID:session/NQcBUreyeVpP2oKdNLxo8g@hasgeek.com
SEQUENCE:2
CATEGORIES:Full talk,Intermediate
CREATED:20140704T110528Z
DESCRIPTION:Conventional columnar databases (RDBMS) systems lend themselve
 s well for interactive SQL queries over reasonably small datasets in the o
 rder of 10-100s of GB\, while hadoop based warehouses operate well over la
 rge datasets in the order of TBs and PBs and scale fairly linearly. Thoug
 h there have been some improvements recently in storage structures in the 
 Hadoop warehouses such as ORC\, queries over hadoop still typically adopt
  a full scan approach. Choosing between these different data stores based 
 on cost of storage\, concurrency\, scalability and performance is fairly c
 omplex and not easy for most users. This talk presents Grill\, the new ana
 lytics platform for InMobi\, a system built at InMobi to precisely solve t
 his problem on top of the Hive metastore.\n\nThe Hive metastore in its
  current s
 tate allows users to represent structured data in simple tables. However\,
  it does not allow expressing relationships or richer DWH concepts like fa
 cts\, dimensions\, etc. With Hive data cubes\, users can query data stor
 ed in HDFS\, S3\, Redshift etc\, with a single query language and schema. 
 Underlying execution engines like Hive\, Impala\, Shark etc can be plugged
  in and utilized at run time. The execution engine used is transparent to 
 the user. The system provides a unified logical schema to users consisting
  of cubes\, facts and dimensions\; and users can issue queries at a concep
 tual level without knowing about roll-up intervals\, partitions\, data typ
 es\, underlying storage and table relationships\; they will be figured out
  automatically.\n\n### Speaker bio\n\nAmareshwari is currently working as 
 Architect in platform team at Inmobi\, where she works on Hadoop and relat
 ed projects for data collection and analytics. She is member of Apache Had
 oop PMC and is Apache Hive committer. She has been working on Hadoop and i
 ts eco system since 2007. Prior to Inmobi\, she was working with Yahoo! in
  core Hadoop team. She holds bachelor's degree in computer science and eng
 ineering from National Institute of Technology\, Warangal\, India\; and
  master's degree in Internet science and engineering from Indian
  Institute of Science (IISc)\, Bangalore\, India.\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/unified-analytics-plat
 form-for-bigdata-NQcBUreyeVpP2oKdNLxo8g
BEGIN:VALARM
ACTION:display
DESCRIPTION:Unified analytics platform for Bigdata in Auditorium 1 in 5 mi
 nutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:How to build a Data Stack from scratch
DTSTART:20140725T044500Z
DTEND:20140725T053000Z
DTSTAMP:20260801T124109Z
UID:session/V2AjYGfQfnUtN1LZhLBKWL@hasgeek.com
SEQUENCE:2
CATEGORIES:Full talk,Intermediate
CREATED:20140704T111158Z
DESCRIPTION:In the talk I will talk about my experience of how to build a 
 data stack from scratch. I have built a big data analytics stack at Akamai
  and Inmobi before and am currently building one now at Helpshift. These a
 re three different domains - Content Delivery Networks (Akamai)\, Mobile A
 dvertising (Inmobi) and now Customer Service (Helpshift).\n\nMore specific
 ally\, my talk will try to cover these questions and more\n\n* What are th
 e different components of an analytics stack and what function does each l
 ayer have ?\n* How do you choose the right software for different layers o
 f your analytics data stack ?\n* Do you use real-time analytics or batch p
 rocessing is right for you ? What are the costs/benefits of both ?\n* What
  is the relation between statistical and probabilistic techniques ? Which 
 to choose when ?\n* How to decide on the right structure and storage for y
 our data and how they influence your analytics stack ?\n* How to decide on
  the right metrics for your business and how they influence your analytics
  stack ?\n\nI will use specific industry examples how each of these questi
 ons were answered differently in different contexts. I will also talk
  about the factors that influenced these decisions and how they
  influenced the final output and architecture.\n\n### Speaker bio\n\n
 Vinayak is an early adopter
  of technologies having worked across diverse and complex computer systems
  including embedded systems\, networking\, large-scale distributed systems
  and data-processing systems. He has more than a decade of experience in h
 ardcore product development & software/deployment architecture.\n\nHe has 
 led engineering teams at Akamai\, Inmobi and Helpshift to build big data s
 tacks from scratch. He organised one of the first Cloudcamps and Barcamps 
 in India. He co-founded Headstart\, a grass-roots community driven by volu
 nteers for helping startups. Other than his interests in tech and startups
 \, he is an avid traveller and amateur photographer.\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/how-to-build-a-data-st
 ack-from-scratch-V2AjYGfQfnUtN1LZhLBKWL
BEGIN:VALARM
ACTION:display
DESCRIPTION:How to build a Data Stack from scratch in Auditorium 1 in 5 mi
 nutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Data sciences (is) in fashion @ Myntra
DTSTART:20140725T044500Z
DTEND:20140725T053000Z
DTSTAMP:20260801T124109Z
UID:session/EH9Bf9MvUtq9MESMancEft@hasgeek.com
SEQUENCE:2
CATEGORIES:Full talk,Intermediate
CREATED:20140704T081650Z
DESCRIPTION:Fashion is a hard category to sell online given most of the pu
 rchases happen on impulse. It fundamentally differs from selling categorie
 s like mobiles which are more driven through reviews and ratings. Data sci
 ences help bring a differentiating angle to selling fashion and can be app
 lied in a variety of fashion e-tailing problems ranging from product ranki
 ngs (the extreme form of which is personalised store for every user)\, sto
 re organisation and navigation\, better customer engagement\, better offer
  creation\, better merchandising decisions\, etc. We @ Myntra are working 
 towards these specific problems and would love to talk about the approache
 s that have worked for us.\n\nWe are personalising the customer experience
  in multiple ways\n- by personalising customer communications through vari
 ous channels\n- by personalising the website to tailor to customers' prefe
 rences \n- by creating unique customer specific offers\n\nWe would as well
  be talking about the data platform which powers all these efforts.\n\n###
  Speaker bio\n\nDivya Alok\, Devashish\, Debdoot\nData scientists working 
 at myntra. We look at tons of data and actively look towards creating data
  products used by our customers - external and internal.\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 2 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/data-sciences-is-in-fa
 shion-myntra-EH9Bf9MvUtq9MESMancEft
BEGIN:VALARM
ACTION:display
DESCRIPTION:Data sciences (is) in fashion @ Myntra in Auditorium 2 in 5 mi
 nutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Morning tea break
DTSTART:20140725T053000Z
DTEND:20140725T060000Z
DTSTAMP:20260801T124109Z
UID:session/UFEFoZbszDfYYQSup71puR@hasgeek.com
SEQUENCE:0
CREATED:20140527T025658Z
DESCRIPTION:\n
LAST-MODIFIED:20140529T032006Z
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Morning tea break in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Sponsored session: What enterprises can learn from real-time biddi
 ng
DTSTART:20140725T060000Z
DTEND:20140725T064500Z
DTSTAMP:20260801T124109Z
UID:session/XisbAqoWV8Hpy6kjV3dEEs@hasgeek.com
SEQUENCE:0
CREATED:20140527T025635Z
DESCRIPTION:\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20200619T062516Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Sponsored session: What enterprises can learn from real-time b
 idding in Auditorium 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Serving user intent : Facebook style notifications using HBase and
  Event streams  
DTSTART:20140725T064500Z
DTEND:20140725T073000Z
DTSTAMP:20260801T124109Z
UID:session/6E5cBiTfjoM6VZmGjJGJUd@hasgeek.com
SEQUENCE:2
CATEGORIES:Full talk,Intermediate
CREATED:20140704T114013Z
DESCRIPTION:Relevant and Personalized notifications in near real-time are a
  great way of serving user intent. The intent may vary - say liking a Face
 book update as compared to a price drop for a browsed product on an e-comm
 erce website. The system characteristics and solution patterns in both the
 se instances may be very similar though.\n\nThis talk will cover the desig
 n of the [Flipkart](http://www.flipkart.com) Notifications platform. The t
 echniques and technologies used to serve product related intent can be eas
 ily applied to a different domain. This talk will also introduce projects 
 that were Open Sourced while building the platform.\n\nArchitecture\, Desi
 gn patterns and technologies used in this system include:\n\n* Pre-creatin
 g data that matches user intent - so as to significantly reduce data servi
 ng latencies\n* Storing immutable events and interpreting change\n* Event 
 driven architectures(EDA) and its variant Staged EDA (SEDA) using technolo
 gies like RabbitMQ and Mule.\n* Complex Event Processing (CEP) using techn
 ologies like Esper\n* Data stores like HBase that organize data between me
 mory and disk as Log-Structured Merge (LSM) trees - leveraging Disk trans
 fer better over Disk seek\n* A data serving API that is resilient to failu
 res and latencies - using Hystrix and Netty\n\nThe talk uses a typical e-c
 ommerce experience where user intent is either implicit or interpreted fro
 m actions - for example a user browsing a product of interest\, adding an 
 item to a shopping cart or adding it for future reference via a wish-list.
  In a dynamic e-commerce marketplace\, product data (such as price\, stock
  quantity) is constantly changing across millions of listed products even 
 as user intent is being expressed on the website. User intent may be seen 
 as one Event stream while Product attribute changes is another. An interse
 ction of these two streams is the Notification data. An efficient data sto
 re that can store and serve tens of millions of such notifications with ve
 ry low latencies is the Notification service.\n\nThe following projects we
 re open sourced before or when building the Notifications Platform : \n\n*
  [Trooper](https://github.com/regunathb/Trooper)\n* [Phantom](https://gith
 ub.com/Flipkart/phantom)\n\nThe talk will also feature a live view of the 
 data serving metrics with millisecond response times.\n\n### Speaker bio\n
 \nArchitect and Open source committer. My areas of interest are Distribute
 d Systems\, Big Data\, Text Mining and Data Stores.\n\nMy experience as Ar
 chitect includes:\n\n* Building the World's largest biometric identity pla
 tform in [Aadhaar](http://uidai.gov.in/)\n* Customer facing Mobile and Web
  platforms at India's leading e-Commerce company - [Flipkart](http://www.f
 lipkart.com)\n\nMost of my work in recent years has been around OSS - usin
 g it to build large scale systems and in contributing projects back to the
  community. Some of my OSS work is downloaded and used worldwide:\n\n* [Ph
 antom](https://github.com/Flipkart/Phantom)\n* [Trooper](https://github.co
 m/regunathb/Trooper)\n* [Insight](http://sourceforge.net/projects/mindtree
 insight/)\n\nActive projects on github : https://github.com/regunathb\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/serving-user-intent-fa
 cebook-style-notifications-using-hbase-and-event-streams-6E5cBiTfjoM6VZmGj
 JGJUd
BEGIN:VALARM
ACTION:display
DESCRIPTION:Serving user intent : Facebook style notifications using HBase
  and Event streams   in Auditorium 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:How we do research on high frequency finance
DTSTART:20140725T064500Z
DTEND:20140725T073000Z
DTSTAMP:20260801T124109Z
UID:session/UGEarTipc6Ycs4x7tNhHq@hasgeek.com
SEQUENCE:2
CATEGORIES:Full talk,Intermediate
CREATED:20140704T113532Z
DESCRIPTION:It is the story of the Finance Research Group at Indira Gandhi
  Institute of Development Research. How a set of researchers from non-co
 mputer science backgrounds helped build a computing facility which is one
  of the largest amongst finance labs in India.\n\nI will be talking about 
 our challenges and solutions we came up with:\n\n** Problem statement\n\n-
  Analyse Indian financial markets data to assess market quality after intr
 oduction of algorithmic trading.\n- Do computation to judge high frequency
  trading activity in the Indian markets.\n- Manage and validate gigabytes 
 of data inflow every day.\n- No metadata for existing data.\n- Utilise exi
 sting infrastructure.\n- Prepare for disasters.\n- Keep it all simple: the
 y're not geeks!\n- Automate!\n- Low budget (!!)\n\n** Managing new giganti
 c data\n\n   - From the National Stock Exchange\, Mumbai.\n   - \\#1 excha
 nge in the world in terms of volumes traded daily.\n   - Around 15GB compr
 essed data every day.\n   - Derived data uncompresses to around 120GB.\n  
  - 170 million orders daily in Futures and Options markets.\n   - Flat fil
 e structure over NFS.\n\n** Deciding the hardware\n\n   - Computations req
 uire loading entire files into memory.\n   - Need fast network to transfer
  this data for computation.\n   - Reliable storage\n   - Low cost cluster:
  beowulf\n   - A Storage Area Network with in-house Network Attached Stora
 ge\n\n** Building a new server room\n\n   - Heating issues\n   - Space con
 straints\n   - Adequate power supply\n\n** Software\n\n   - Metadata syste
 m: home-brewed using R and MySQL\n   - Home-grown systems for daily proces
 sing: downloading\, verification\, checksumming\, and validation.\n   - Mo
 nitoring system: Icinga\n   - Batch scheduler: SLURM\n   - Logging server:
  Sentry\n   - Data shared over NFS\n   - Cluster computation: R + TCP sock
 ets/MPI\n   - Backups on Amazon Glacier\n   - Languages used: R\, C\, Pyth
 on\, Shell\, Java\n   - Use C where R gives up\n   - Better algorithms\n\n
 ** Open source contribution\n\n   - ifrogs R package\n   - eventstudies R 
 package\n   * Future\n     - LDDB\n\n### Speaker bio\n\nI am a Research Pr
 ogrammer in the Finance Research Group at the Indira Gandhi Institute of D
 evelopment Research in Mumbai. My experience includes working with a start
 up\, a corporate and research groups in Delhi and Mumbai. My areas of inte
 rest include high performance computing\, high frequency trading\, and fin
 ancial data management.\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 2 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/how-we-do-research-on-
 high-frequency-finance-UGEarTipc6Ycs4x7tNhHq
BEGIN:VALARM
ACTION:display
DESCRIPTION:How we do research on high frequency finance in Auditorium 2 i
 n 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Lunch
DTSTART:20140725T073000Z
DTEND:20140725T083000Z
DTSTAMP:20260801T124109Z
UID:session/YGZcVMCdtiTBbGjXPWDXiU@hasgeek.com
SEQUENCE:0
CREATED:20140527T025814Z
DESCRIPTION:\n
LAST-MODIFIED:20140527T025824Z
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Lunch in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Large Scale Modelling and Analytics Challenges at a Payments Compa
 ny
DTSTART:20140725T083000Z
DTEND:20140725T091500Z
DTSTAMP:20260801T124109Z
UID:session/ScCVnB78jwk1id7hf37NPE@hasgeek.com
SEQUENCE:2
CATEGORIES:Full talk,Intermediate
CREATED:20140704T113500Z
DESCRIPTION:At American Express\, we serve tens of millions of credit card
  members who transact at several million merchants globally. Apart from th
 is  transaction level data\, there is petabyte scale data generated from c
 ard members' other interactions with American Express:-  through visiting 
 and interacting with our website\, phone and chat interactions with custom
 er care representatives\, the surveys we conduct to gauge the pulse of our
  customers\, etc. All these result in  large scale data coming from multip
 le modalities. In this talk\, we will focus on some key challenges which a
 rise from dealing with this data. The talk will first give a broad overvie
 w of the various challenges in the context of machine learning in the paym
 ents industry and then focus in some detail on a particular problem of mo
 delling purchase intent of credit card holders.\n\n### Speaker bio\n\nhttp
 s://www.linkedin.com/in/subhajitsanyal\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 2 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/large-scale-modelling-
 and-analytics-challenges-at-a-payments-company-ScCVnB78jwk1id7hf37NPE
BEGIN:VALARM
ACTION:display
DESCRIPTION:Large Scale Modelling and Analytics Challenges at a Payments C
 ompany in Auditorium 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Circuitscape - A Case Study on Scientific Computing
DTSTART:20140725T083000Z
DTEND:20140725T091500Z
DTSTAMP:20260801T124109Z
UID:session/LungitQdoGFWLxLtrKEhC9@hasgeek.com
SEQUENCE:2
CATEGORIES:Full talk,Intermediate,Lecture
CREATED:20140704T114000Z
DESCRIPTION:Circuitscape is an open-source program\, which borrows algorit
 hms from electronic circuit theory to predict patterns of movement\, gene 
 flow\, and genetic differentiation among plant and animal populations in h
 eterogeneous landscapes. It is used by academics\, policy makers\, and gov
 ernments around the world in conservation planning.\n\nCreated by Brad McR
 ae and Viral B. Shah\, it models life forms as electrons\, the landscape
  as a grid of resistances and the movement of life forms across a
  landscape
  as current flowing through a circuit. Interestingly\, this has been able 
 to model reality much better than many other approaches. Circuitscape has 
 been used to model raster landscapes containing as many as 20 million cell
 s\, covering vast geographies over thousands of square kilometres\, result
 ing in jobs that run over days to compute wildlife corridors.\n\nThis talk
  is about Circuitscape and its application. I will cite the example of Cir
 cuitscape usage: "Connectivity of Tiger (Panthera tigris) Populations in t
 he Human-Influenced Forest Mosaic of Central India" during the talk\, an
 d how this concept can be applied across other domains.\n\n["Connectivity 
 of Tiger (Panthera tigris) Populations in the Human-Influenced Forest Mosa
 ic of Central India"](http://www.plosone.org/article/info%3Adoi%2F10.137
 1%2Fjournal.pone.0077980)\n\n### Speaker bio\n\nI am one of the co-creator
 s of the Julia programming language and Circuitscape.\nhttp://in.linkedin.
 com/in/viralbshah\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/circuitscape-a-case-st
 udy-on-scientific-computing-LungitQdoGFWLxLtrKEhC9
BEGIN:VALARM
ACTION:display
DESCRIPTION:Circuitscape - A Case Study on Scientific Computing in Auditor
 ium 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Why we built the most adopted Polyglot Object Mapper for NoSQL?
DTSTART:20140725T091500Z
DTEND:20140725T100000Z
DTSTAMP:20260801T124109Z
UID:session/5TJkNk4tffhYVbv9rUswFv@hasgeek.com
SEQUENCE:2
CATEGORIES:Full talk,Intermediate
CREATED:20140704T115220Z
DESCRIPTION:NoSQL datastores are tough. But the JPA interface for dealing
  with datastores is easy\, well known and has long been popular with
  developers. So we combined the two and created Kundera.\n\nKundera
  is innovative in many ways:\n•	Kundera supports polyglot persistence i.
 e. an application can use multiple and any combination of datastores. All 
 the hard work related to mapping\, persisting\, reading\, indexing and tra
 nsacting object model across multiple datastores is handled by Kundera.\n
 •	Kundera supports 8 NoSQL datastores and any RDBMS.\n•	Kundera provid
 es very easy and well known interfaces like JPA and REST to interact with 
 all the datastores. Hence it hides all the complexities of NoSQL stores an
 d reduces the learning curve enormously.\n•	Kundera also facilitates eas
 y migration of your existing hibernate/JPA powered applications to NoSQL.\
 n•	Kundera is extremely extensible and one can easily add support for ne
 w datastores with very minimal effort.\n\nImpact of this product:\n•	It 
 has made the task of a NoSQL developer extremely easy. \n•	It has simpli
 fied working with 8 leading NoSQL datastores: Cassandra\, MongoDB\, HBase
 \, Neo4j\, Redis\, Oracle NoSQL\, CouchBase and ElasticSearch.\n•	It has
  enabled 10 very big scale applications to use NoSQL and move to productio
 n.\n•	It has enabled a number of people in the open source community
  to build exciting solutions (like NoSQL Datastore Migrator\, NoSQL Data
  Viewer) on top of Kundera.\n\nWhile coming up with such a product that
  talks to so ma
 ny different types of datastore\, there have been challenges. And the lear
 nings. The talk would cover these challenges and learnings.\n\n### Speaker
  bio\n\nVivek Shrivastava\, Technical Architect & NoSQL Evangelist\, Impet
 us Technologies\n\nVivek Shrivastava has over 14 years of experience in
  product development\, designing and engineering enterprise grade s
 olutions for Finance\, Travel\, Retail and Interactive Television domains.
  His expertise includes NoSQL Data-stores\, Cloud Computing and BigData. V
 ivek has spearheaded the design & architecture of several large & cloud sc
 ale solutions for data lifecycle management at petabyte scale. Vivek also 
 leads the development at Open source project Kundera which is a JPA 2.0 co
 mpliant Object-NoSQL Mapping Library and is a major advocate of Polyglot P
 ersistence.\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 2 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/why-we-built-the-most-
 adopted-polyglot-object-mapper-for-nosql-5TJkNk4tffhYVbv9rUswFv
BEGIN:VALARM
ACTION:display
DESCRIPTION:Why we built the most adopted Polyglot Object Mapper for NoSQL
 ? in Auditorium 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Scaling Spatial Data - OpenStreetMap as Infrastructure.
DTSTART:20140725T091500Z
DTEND:20140725T100000Z
DTSTAMP:20260801T124109Z
UID:session/VTpsPK2QwTrr7nKqV6gEFF@hasgeek.com
SEQUENCE:2
CATEGORIES:Full talk,Intermediate
CREATED:20140704T115806Z
DESCRIPTION:OpenStreetMap has over 9 years of lessons in managing large sc
 ale crowdsourced spatial data with open source infrastructure. The project
  now has over 1\,600\,000 contributors out of which about 3000 users map
  every day. They create over 16\,35\,608 nodes. The changes are replicated
  across several official and unofficial instances and all the maps
  rendered with
  OpenStreetMap data are updated in under a minute on the Internet.\n\nWith
  PostgreSQL and PostGIS as the backbone of data storage\, OpenStreetMap ha
 s [created an ecosystem of tools](http://wiki.openstreetmap.org/wiki/Compo
 nent_Overview) that is now available for anyone to use for different kinds
  of spatial data. The data can be processed\, queried and styled in severa
 l ways. We will discuss setting up the OpenStreetMap infrastructure for cu
 stom data models and explore the edit-style-render toolchain. A rich API
  and database dump/restore mechanisms are built into the infrastructure.
  OpenStreetMap also takes quality assurance seriously: there is a process
  in place for identifying and rectifying vandalism or errors.\
 n\n### Speaker bio\n\n[Sajjad Anwar](http://twitter.com/geohacker) is a ha
 cktivist and programmer based in Bangalore. He works in the research and d
 esign of data infrastructure\, analytics and infographics. Being involved 
 with OpenStreetMap for over 5 years\, he has extensive experience working 
 with spatial data and advocates open geographic data. He helps organisatio
 ns to build and maintain their data infrastructure. He is found working wi
 th other technologists\, social activists and researchers in education\, h
 uman rights and policy making. Along with two others\, he runs the [geohac
 kers.in](http://geohackers.in) collective.\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/scaling-spatial-data-o
 penstreetmap-as-infrastructure-VTpsPK2QwTrr7nKqV6gEFF
BEGIN:VALARM
ACTION:display
DESCRIPTION:Scaling Spatial Data - OpenStreetMap as Infrastructure. in Aud
 itorium 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:High-tea – sponsored by Flipkart
DTSTART:20140725T100000Z
DTEND:20140725T104500Z
DTSTAMP:20260801T124109Z
UID:session/55fN7Pqiby5KWqqdrtHVbe@hasgeek.com
SEQUENCE:0
CREATED:20140612T000056Z
DESCRIPTION:\n
LAST-MODIFIED:20140721T050920Z
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:High-tea – sponsored by Flipkart in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Using Cascalog and Clojure to make the elephant move!
DTSTART:20140725T104500Z
DTEND:20140725T110000Z
DTSTAMP:20260801T124109Z
UID:session/6X9nev2diTWy8ewkGTBied@hasgeek.com
SEQUENCE:2
CATEGORIES:Crisp talk,Intermediate
CREATED:20140705T025546Z
DESCRIPTION:Machine learning practitioners often spend 70%-80% of their ti
 me on cleaning\, transforming and scrubbing data. Quick iterations and the
  ability to create a large number of features out of raw data are
  pre-requisites for any serious machine learning activity. There is also a
  large gap between the skill sets and objectives of data scientists and
  programmers. Data s
 cientists need to slice and dice through real world _big data_. Implementi
 ng all the low level plumbing of multiple\, dependent Hadoop jobs is a ser
 ious impediment to this goal. On the other hand\, companies cannot afford 
 to have _lab_ tools running\, as they have a large impedance mismatch
  with production environments.\nWe have seen the Cascalog library and
  functional abst
 ractions of Clojure providing a sweet spot. It lets data scientists focus 
 on their job of understanding data without worrying about complicated clas
 s hierarchies or un-intuitive domain specific languages. On the other hand
 \, Cascalog and Clojure seamlessly integrate with a JVM based stack. The
  res
 ult is a fun and productive way to process data at scale!\n\n### Speaker b
 io\n\nHarshad Saykhedkar is a senior data scientist at Sokrati\, a digital
  a
 dvertising startup based out of Pune. He has experience of applying machin
 e learning to problems in advertising\, banking and telecom sectors for
  the pa
 st four years. He has used multiple tools (R\, Python\, SAS) in this journ
 ey and lately fallen in love with the Clojure ecosystem\, hacking his way 
 to create data processing tools at Sokrati. Harshad holds a master's degre
 e in Operations Research from IIT\, Mumbai.\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 2 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/using-cascalog-and-clo
 jure-to-make-the-elephant-move-6X9nev2diTWy8ewkGTBied
BEGIN:VALARM
ACTION:display
DESCRIPTION:Using Cascalog and Clojure to make the elephant move! in Audit
 orium 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Analytics on Large Scale\, Unstructured\, Dynamic Data using Lambd
 a Architecture
DTSTART:20140725T104500Z
DTEND:20140725T113000Z
DTSTAMP:20260801T124109Z
UID:session/2Cm7dJhKzTqkVCqKzxVUGN@hasgeek.com
SEQUENCE:2
CATEGORIES:Full talk,Intermediate
CREATED:20140704T115156Z
DESCRIPTION:Indix is a product intelligence platform. Our catalog has seve
 ral million products and billions of price points collected from thousands
  of e-commerce web sites and is constantly growing. We collect product dat
 a via crawling product pages from these web sites. Our parsers extract pro
 duct attributes from these pages which are then run through a series of ma
 chine learning algorithms to classify and extract deeper product attribute
 s. This data gets deduped between stores and then matched across stores an
 d is finally fed into our analytics engine which provides insights to our 
 customers.\n\nOur first attempt at building this system around two years a
 go was chaotic. We were dealing with e-commerce sites whose pages were uns
 tructured and were constantly changing. Our parsers and machine learning
  al
 gorithms were also improving regularly. All this meant that we had to run 
 our algorithms on the entire data set very often. It was not uncommon for 
 our data refreshes to run for days which meant high latency for product an
 d price data. In addition to that\, our data systems were not human fault 
 tolerant. We had issues where an incorrect algorithm would get accidentall
 y deployed to production and corrupt the data we were serving. Since our d
 ata store was mutable and did not maintain these changes\, it was not easy
  to fix these corruption issues.\n\nWe soon realized that we had to
  rethink our data system from the ground up. We needed a simpler approach
  that would scale\, be tolerant of human errors and evolve with our
  product.
 \n\nLambda architecture\, coined by Nathan Marz\, the creator of Storm and
  Cascalog\, seemed like a step in the right direction for us. \nThe system
  has been in production for more than a year now\, handling 3X more data t
 han our older system and most importantly is more robust. \n\nLambda archi
 tecture\, at its core\, is a set of architecture principles that allows bo
 th batch and real-time or stream data processing to work together while bu
 ilding immutability\, recomputation and human fault tolerance into the sys
 tem.\n\nIt has three layers - batch\, serving and speed.\n\nThe batch laye
 r is responsible for computing arbitrary views on the master data. Our mas
 ter data is an immutable store in HDFS and we compute views using a series
  of Map Reduce jobs using Scalding and Spark. Our batch system runs recom
 putation every day on our entire data set. \n\nThe serving layer indexes a
 nd exposes precomputed views to be queried ad-hoc with low latency. We us
 e HBase\, Solr and our own in-house in-memory implementation for the
  serving
  layer.\n\nThe speed layer deals only with new data and compensates for th
 e high latency updates of the serving layer by creating realtime views. Ou
 r real time latency requirements are a few hours\, not seconds\, whi
 ch allows us to use a micro-batch architecture that is a stripped down ver
 sion of our batch layer and uses the same technologies.\n\nTo get the fina
 l result\, the batch and realtime views must be queried and the results me
 rged together.\n\nTopics I will cover \n\n- Why Lambda Architecture? What 
 problems did it solve for us? \n- Technical Challenges encountered in buil
 ding the lambda architecture\n    - Schema Evolution\n    - HDFS Small Fil
 es Issue\n    - Code re-use between batch and real time systems \n- Modeli
 ng the data pipelines for each layer\n- Open problems\n\n### Speaker bio\n
 \nRajesh Muppalla is a co-founder and Director of Engineering at Indix\, w
 here he leads the data platform team that is responsible for collecting\, 
 organizing and structuring all the product related data collected from the
  web.\n\nHe is passionate about big data\, large scale distributed systems
 \, continuous delivery and algorithms. He also likes mentoring and coachin
 g developers in pursuit of building better software.\n\nPrior to Indix\, h
 e was a technical lead on Go-CD\, an agile and release management product\
 , at Thoughtworks. The product has been recently open-sourced. \n\nHe is a
  gold medallist in Computer Science from Pune University. In his final yea
 r of graduation\, his team represented India at Asia finals of Microsoft I
 magine Cup (then called Microsoft .NET campus challenge).\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/analytics-on-large-sca
 le-unstructured-dynamic-data-using-lambda-architecture-2Cm7dJhKzTqkVCqKzxV
 UGN
BEGIN:VALARM
ACTION:display
DESCRIPTION:Analytics on Large Scale\, Unstructured\, Dynamic Data using L
 ambda Architecture in Auditorium 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Storing relationships in large data-sets using Graphs
DTSTART:20140725T110000Z
DTEND:20140725T113000Z
DTSTAMP:20260801T124109Z
UID:session/DXy2rEmNjvB1Pzr18m9hSM@hasgeek.com
SEQUENCE:2
CATEGORIES:Crisp talk,Advanced
CREATED:20140705T025640Z
DESCRIPTION:This work was motivated by the need to store large amounts of
  linked data in an ad system and make it available for
  programmatic/analytics consumptio
 n. \n\nThis talk outlines our journey which started from researching exist
 ing graph DBs/processing frameworks\, why they didn't work for us at our s
 cale and then moving on to build something.\nWe will go in depth to explai
 n the data structures used and how we supported the TinkerPop graph API
  specification (used by all graph databases). We will also touch upon how
  our ad system's unique data model allowed us to come up with a fairly
  simplis
 tic technique to shard the entire thing and query over it.\n\nTakeaways fr
 om this talk -\n\n* what are graph DBs\, when should you choose one.\n* di
 fferent use-cases require different stores.\n* what it takes to build a gr
 aph store for allo-centric (like OLAP) graph traversals.\n\n### Speaker bi
 o\n\nInder Singh has been working on solving data-related problems at
  InMobi (the world's largest independent ad network) for the past ~3
  years.\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 2 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/storing-relationships-
 in-large-data-sets-using-graphs-DXy2rEmNjvB1Pzr18m9hSM
BEGIN:VALARM
ACTION:display
DESCRIPTION:Storing relationships in large data-sets using Graphs in Audit
 orium 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Keynote: Realizing Large-scale Distributed Deep Learning Networks 
 over GraphLab
DTSTART:20140725T113000Z
DTEND:20140725T121500Z
DTSTAMP:20260801T124109Z
UID:session/HmYDFDLLXnQCKK9eaqsfpB@hasgeek.com
SEQUENCE:2
CATEGORIES:Full talk,Intermediate
CREATED:20140703T095601Z
DESCRIPTION:Large scale distributed deep learning networks are the holy gr
 ail of the machine learning/AI/data science fields. The applications of de
 ep learning networks are in image processing\, speech recognition and vide
 o analytics. We have implemented such a network over GraphLab\, the open s
 ource graph processing framework. As can be expected\, several extensions 
 were made to GraphLab for implementing such deep learning networks. The ke
 y extension was the ability to run multiple instances of the distributed G
 raphLab engine in the same cluster – this allows the training data to be
  parallelized. Another missing abstraction was that of a layer – which c
 omprises/aggregates several nodes (vertices of GraphLab). An additional re
 quirement was mass communication between two layers\, the ability to send 
 a message to all vertices that belong to a layer. We have implemented thes
 e abstractions in GraphLab and have consequently realized the deep learnin
 g network over a cluster of nodes. We have used this network for Arrhythmi
 a detection from ECG images. The talk would also articulate some of the pe
 rformance studies we have conducted on our distributed deep learning netwo
 rk.\n\n### Speaker bio\n\nDr. Vijay Srinivas Agneeswaran has a Bachelor's 
 degree in Computer Science & Engineering from SVCE\, Madras University (19
 98)\, an MS (By Research) from IIT Madras in 2001 and a PhD from IIT Madra
 s (2008). He was a post-doctoral research fellow in the LSIR Labs\, Swiss 
 Federal Institute of Technology\, Lausanne (EPFL) for a year. He has spent
  the last seven years creating intellectual property and building products
  in the big data area in Oracle\, Cognizant and Impetus\, where he is now
  Director\, Big Data Labs. He has built PMML support into Spark/Storm and 
 is building a big data governance product for a role-based fine-grained ac
 cess control inside of Hadoop YARN. He has been a professional member of
  the ACM and the IEEE for the last 8+ years. He has filed patents with US
  and European patent offices (with two issued US patents) and published
  in leading
  journals and conferences\, including IEEE transactions. His research inte
 rests include distributed systems - cloud\, grid\, peer-to-peer computing 
 as well as machine learning for Big-Data and other emerging technologies.\
 n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/realizing-large-scale-
 distributed-deep-learning-networks-over-graphlab-HmYDFDLLXnQCKK9eaqsfpB
BEGIN:VALARM
ACTION:display
DESCRIPTION:Keynote: Realizing Large-scale Distributed Deep Learning Netwo
 rks over GraphLab in Auditorium 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:CIO Panel
DTSTART:20140725T121500Z
DTEND:20140725T134500Z
DTSTAMP:20260801T124109Z
UID:session/QWLoSnZYM1WoUQvsGfE6T3@hasgeek.com
SEQUENCE:0
CREATED:20140705T030533Z
DESCRIPTION:\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20200619T062516Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:CIO Panel in Auditorium 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Check-in and breakfast
DTSTART:20140726T030000Z
DTEND:20140726T034500Z
DTSTAMP:20260801T124109Z
UID:session/QwiEUPZWzvndb9BiVuhVWr@hasgeek.com
SEQUENCE:0
CREATED:20140527T030358Z
DESCRIPTION:\n
LAST-MODIFIED:20140527T030405Z
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Check-in and breakfast in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Summary of day 1
DTSTART:20140726T034500Z
DTEND:20140726T040000Z
DTSTAMP:20260801T124109Z
UID:session/Biw9YFt1jw1mGPQmc9EC5A@hasgeek.com
SEQUENCE:0
CREATED:20140527T030415Z
DESCRIPTION:\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20140527T061240Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Summary of day 1 in Auditorium 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Keynote: Personalized medicine and big data - Anu Acharya
DTSTART:20140726T040000Z
DTEND:20140726T044500Z
DTSTAMP:20260801T124109Z
UID:session/Sg8JMtyCtioKcLpVWnxhBh@hasgeek.com
SEQUENCE:0
CREATED:20140703T094729Z
DESCRIPTION:\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20200619T062516Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Keynote: Personalized medicine and big data - Anu Acharya in A
 uditorium 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Lessons from Elasticsearch in production
DTSTART:20140726T044500Z
DTEND:20140726T053000Z
DTSTAMP:20260801T124109Z
UID:session/XcoVLT7xnPJL55pEzL7ZxR@hasgeek.com
SEQUENCE:2
CATEGORIES:Full talk,Intermediate
CREATED:20140705T025737Z
DESCRIPTION:The customer-facing side of the Helpshift product is a simple
  chat
  feature within the app using the Helpshift mobile SDK. The business-facin
 g side is a complex agent dashboard that helps the agent in processing as 
 many issues as quickly as possible. We will be focusing on this business-f
 acing side\, the designs we built on top of Elasticsearch and the problems
  we faced and how we went about solving them. A few of them are:\n\n1. Ar
 chitecture 101 - Importance of separating master-only and data-only nodes\
 , etc.\n2. How we index documents for each customer - the flaw in having o
 ne index per customer\, and possible solutions\n3. Multilingual data - the
  importance of the phonetic plugin\n4. Complex views - the importance of u
 nderstanding filters and how to combine them\n5. The importance of benchma
 rking - how we implemented using percolators for live notifications of new
  issues\n6. Restarts & Upgrades - the importance of disabling shard alloca
 tion and clustering\n7. Bulk Indexing - the importance of controlling repl
 ica count\, etc.\n8. Runtime debugging - the importance of cat APIs\, etc.
 \n\n### Speaker bio\n\nI work in the backend team at Helpshift.com\, a cus
 tomer service platform for mobile apps. Our customers include Flipboard\,
  Supercell\, Flipkart\, and several others.\n\nI have previously worked at
  Automatic.com (the "UI for your car" company)\, Infibeam.com\, Adobe and 
 Yahoo!.\n\nI have also written a couple of Creative Commons-licensed books
  - A Byte of Python and A Byte of Vim.\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 2 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/lessons-from-elasticse
 arch-in-production-XcoVLT7xnPJL55pEzL7ZxR
BEGIN:VALARM
ACTION:display
DESCRIPTION:Lessons from Elasticsearch in production in Auditorium 2 in 5 
 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Live analytical dashboards at scale - SQL style
DTSTART:20140726T044500Z
DTEND:20140726T053000Z
DTSTAMP:20260801T124109Z
UID:session/KGiBPJjfdPy9tUpWDWTPER@hasgeek.com
SEQUENCE:2
CATEGORIES:Full talk,Intermediate
CREATED:20140703T095400Z
DESCRIPTION:Fireball is a stream processing engine at Flipkart. It powers 
 real time analytical dashboards to enable the business to make
  time-sensitive dec
 isions\, at scale. Fireball can process millions of events (with flexible\
 , JSON-like schema) per hour that require:\n* executing a custom process
  (us
 ually SQL-like) to derive business metrics from the incoming events\n* ove
 r a large number of dimensions (on average 10 dimensions for each measure
 )\n* with very low latency and ensuring correctness all the time (enabling
  time-sensitive decision making)\n\nSo how do you build such a system? How
  do you store such a large amount of time-series data to ensure roll-ups\,
  drill-downs on different dimensions? In this talk we'll go over the trans
 formation of a standard stream processing platform and a CEP library into 
 Fireball.\n\n### Speaker bio\n\nI am an Architect @ Flipkart and am part
  of t
 he Data Platform effort.\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/live-analytical-dashbo
 ards-at-scale-sql-style-KGiBPJjfdPy9tUpWDWTPER
BEGIN:VALARM
ACTION:display
DESCRIPTION:Live analytical dashboards at scale - SQL style in Auditorium 
 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Morning tea break
DTSTART:20140726T053000Z
DTEND:20140726T060000Z
DTSTAMP:20260801T124109Z
UID:session/BHLSMDeFD88hpdf3HHhJtt@hasgeek.com
SEQUENCE:0
CREATED:20140527T030451Z
DESCRIPTION:\n
LAST-MODIFIED:20140529T032136Z
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Morning tea break in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Scaling real time visualisations for Elections 2014
DTSTART:20140726T060000Z
DTEND:20140726T064500Z
DTSTAMP:20260801T124109Z
UID:session/Zh77RPVHduD3RNwJXSeAY@hasgeek.com
SEQUENCE:2
CATEGORIES:Full talk,Intermediate
CREATED:20140703T095420Z
DESCRIPTION:The CNN-IBN Microsoft Election Analytics Center\, whose live
  visualisations you can see at www.bing.com/elections\, served over 10 m
 illion requests on election day. \n\nThis includes real-time filtering of 
 the election commission results – based on turnouts\, margins\, computat
 ions of anti-incumbency factors\, alliance groupings\, etc.\n\nThis talk i
 s about the engineering from Gramener that went into making the site fast 
 and responsive.\n\n### Speaker bio\n\nAnand is the chief data scientist at
  Gramener. He explores data stories visually with Python and Javascript.\n
 \nHe blogs at http://www.s-anand.net/\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/scaling-real-time-visu
 alisations-for-elections-2014-Zh77RPVHduD3RNwJXSeAY
BEGIN:VALARM
ACTION:display
DESCRIPTION:Scaling real time visualisations for Elections 2014 in Auditor
 ium 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Scaling SolrCloud to a large number of collections
DTSTART:20140726T060000Z
DTEND:20140726T064500Z
DTSTAMP:20260801T124109Z
UID:session/Ht9FgEpK1cyo2tL1pTMNhZ@hasgeek.com
SEQUENCE:2
CATEGORIES:Full talk,Advanced
CREATED:20140705T025754Z
DESCRIPTION:The traditional and typical search use case is the one large s
 earch collection distributed among many nodes and shared by all users. How
 ever\, there is a class of applications which need a large number of small
  or medium collections which can be used\, managed and scaled separately. 
 This talk will cover our effort in helping a client set up a large scale S
 olrCloud setup with thousands of collections running on hundreds of nodes.
  I will describe the bottlenecks that we found in SolrCloud when running a
  large number of collections. I will also take you through the multiple fe
 atures and optimisations that we contributed to Apache Solr to reduce or r
 emove the choke points in the system. Finally\, I will talk about the benc
 hmarking process and the lessons learned from the exercise.\n\n### Speaker
  bio\n\nI have been a committer on Apache Lucene/Solr since 2008 as well
  as a mem
 ber of the Lucene/Solr project management committee. I've worked at AOL fo
 r five years on vertical search\, content management systems\,
  social/commu
 nity platforms and anti-spam systems as well as AOL WebMail's Inbox Search
  system which uses a highly customized version of Apache Solr to service t
 ens of millions of users and more than a billion index/search operations a
  day. I currently work at LucidWorks Inc. on Apache Solr and LucidWorks S
 earch mostly on the SolrCloud side of things. I also help organize the Ban
 galore Apache Solr/Lucene Meetup Group which has 350+ members and holds re
 gular meetings of people interested in Lucene\, Solr and search in general
 .\n\nhttps://twitter.com/shalinmangar\nhttp://www.meetup.com/Bangalore-Apa
 che-Solr-Lucene-Group/\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 2 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/scaling-solrcloud-to-a
 -large-number-of-collections-Ht9FgEpK1cyo2tL1pTMNhZ
BEGIN:VALARM
ACTION:display
DESCRIPTION:Scaling SolrCloud to a large number of collections in Auditori
 um 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:The state of Julia - a fast language for technical computing
DTSTART:20140726T064500Z
DTEND:20140726T070000Z
DTSTAMP:20260801T124109Z
UID:session/5qAHzZFv1QqepNWGrJzJdW@hasgeek.com
SEQUENCE:2
CATEGORIES:Crisp talk,Intermediate
CREATED:20140705T030042Z
DESCRIPTION:I will give a talk introducing Julia for those who have not he
 ard about it. I will talk about the progress of the language over the
  last year\, and offer some glimpses into where we are headed. I will also
  discuss the g
 rowth of the Julia community.\n\nJulia is a high-level\, high-performance 
 dynamic programming language for technical computing\, with syntax that is
  familiar to users of other technical computing environments. It provides 
 a sophisticated compiler\, distributed parallel execution\, numerical accu
 racy\, and an extensive mathematical function library. The library\, large
 ly written in Julia itself\, also integrates mature\, best-of-breed C and 
 Fortran libraries for linear algebra\, random number generation\, signal p
 rocessing\, and string processing. In addition\, the Julia developer commu
 nity is contributing a number of external packages through Julia’s built
 -in package manager at a rapid pace. IJulia\, a collaboration between the 
 IPython and Julia communities\, provides a powerful browser-based graphica
 l notebook interface to Julia.\n\nJulia programs are organized around mult
 iple dispatch\; by defining functions and overloading them for different c
 ombinations of argument types\, which can also be user-defined. For a more
  in-depth discussion of the rationale and advantages of Julia over other s
 ystems\, see the following highlights or read the introduction in the onli
 ne manual.\n\nhttp://www.julialang.org/\nhttp://docs.julialang.org/\n\n###
  Speaker bio\n\nI am one of the co-creators of the Julia programming langu
 age.\n\nhttp://in.linkedin.com/in/viralbshah\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 2 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/the-state-of-julia-a-f
 ast-language-for-technical-computing-5qAHzZFv1QqepNWGrJzJdW
BEGIN:VALARM
ACTION:display
DESCRIPTION:The state of Julia - a fast language for technical computing i
 n Auditorium 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr. Hadoop – Diagnose your Hadoop Jobs
DTSTART:20140726T064500Z
DTEND:20140726T070000Z
DTSTAMP:20260801T124109Z
UID:session/Eu9TzPSSyBnK6ZFfb8GGxo@hasgeek.com
SEQUENCE:2
CATEGORIES:Crisp talk,Intermediate
CREATED:20140705T030019Z
DESCRIPTION:This talk is about a tool that we have developed within Intuit
  – Dr. Hadoop\, which analyzes your job\, identifies areas for improv
 ement and gives recommendations to improve its performance. It collects a
 ll the history logs\, counters and configuration of your job\, applies a s
 et of rules and provides recommendations with suggested values and severit
 y.\n\n### Speaker bio\n\nI am a Hadoop performance engineer at Intuit. I h
 ave been working on Hadoop performance for more than 3 years.\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/dr-hadoop-diagnose-you
 r-hadoop-jobs-Eu9TzPSSyBnK6ZFfb8GGxo
BEGIN:VALARM
ACTION:display
DESCRIPTION:Dr. Hadoop – Diagnose your Hadoop Jobs in Auditorium 1 in 5 
 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:De-dup on Hadoop
DTSTART:20140726T070000Z
DTEND:20140726T073000Z
DTSTAMP:20260801T124109Z
UID:session/JR4pHCJJfoux6JbSkkLzWH@hasgeek.com
SEQUENCE:2
CATEGORIES:Crisp talk,Beginner
CREATED:20140705T030003Z
DESCRIPTION:In many enterprises it's commonly seen that business data has
  a lot of client\, customer\, vendor or product lists in different formats
  and systems\, many of which are near duplicates. MDM solutions on RDBMS
  have been prominent for many years in almost every enterprise to support
  master data management by removing duplicates\, standardizing data and
  incorporating rules to prevent incorrect data from entering the system
  in order to create an authoritative source of master data. MDM on Big
  data platforms like Hadoop has benefits as well as its own set of
  challenges when compared with the RDBMS counterparts. I will cover them
  in detail\, primarily focusing on building this solution on Hadoop.\n\n
 ### Speaker bio\n\nI am a Data Architect at Intuit with 13+ years of
  experience in BI and Data Analytics. Prior to Intuit\, I worked at
  Intel\, Oracle and EMC applying BI in the Manufacturing\, Finance and
  Storage Analytics domains.\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/de-dup-on-hadoop-JR4pH
 CJJfoux6JbSkkLzWH
BEGIN:VALARM
ACTION:display
DESCRIPTION:De-dup on Hadoop in Auditorium 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Using Data for Art
DTSTART:20140726T070000Z
DTEND:20140726T073000Z
DTSTAMP:20260801T124109Z
UID:session/CnBvtcD5PenztzbfKengZZ@hasgeek.com
SEQUENCE:2
CATEGORIES:Crisp talk,Beginner
CREATED:20140705T030055Z
DESCRIPTION:A talk on Data Art for a tech-focused audience must introduce 
 the essence of art (vs design) and the purpose of creating art. I'll intro
 duce the field with a few teasers (conventional art vs data art) and then 
 introduce the modes in which data can be mapped in the final art form (mos
 tly visual attributes\, and a bit of other forms of media too). This will 
 be followed by an introduction to Generative Systems in art\, where an art
  form evolves from data (and isn't completely controlled by an artist) and
  a categorization of these systems (with hand picked examples) that I had 
 explored recently.\n\n### Speaker bio\n\nI'm Rasagy Sharma\, and I visuali
 ze data at Microsoft IDC. I'm a recent post graduate from National Institu
 te of Design\, Bangalore where I pursued Information & Interface Design\, 
 which had a heavy focus on Data Visualization.\n\nI've recently been inter
 ested in Generative Systems to create art\, and this talk will draw on
  my research for my Colloquium paper on Creative Coding\, in which I categ
 orized ~100 generative art projects and the popular media used.\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 2 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/using-data-for-art-CnB
 vtcD5PenztzbfKengZZ
BEGIN:VALARM
ACTION:display
DESCRIPTION:Using Data for Art in Auditorium 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Lunch
DTSTART:20140726T073000Z
DTEND:20140726T083000Z
DTSTAMP:20260801T124109Z
UID:session/ipgsvK7pcyY49zaci1wXe@hasgeek.com
SEQUENCE:0
CREATED:20140527T031015Z
DESCRIPTION:\n
LAST-MODIFIED:20140704T082535Z
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Lunch in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Apache Tez: Accelerating Hadoop Data Pipelines
DTSTART:20140726T083000Z
DTEND:20140726T091500Z
DTSTAMP:20260801T124109Z
UID:session/5ALMUy1sYQXsoHkm3qgHCa@hasgeek.com
SEQUENCE:2
CATEGORIES:Full talk,Beginner
CREATED:20140705T030129Z
DESCRIPTION:Apache Tez is a modern data processing engine designed for YAR
 N on Hadoop 2. Tez aims to provide high performance and efficiency out of 
 the box\, across the spectrum of low latency queries and heavy-weight batc
 h processing. With a clear separation between the logical app layer and th
 e physical data movement layer\, Tez is designed from the ground up to be 
 a platform on top of which a variety of domain specific applications can b
 e built. Tez has pluggable control and data planes that allow users to plu
 g in custom data transfer technologies\, concurrency-control and schedulin
 g policies to meet their exact requirements. The talk will elaborate on th
 ese features via real use cases from early adopters like Hive\, Pig and Ca
 scading.\n\n### Speaker bio\n\nGopal works on performance problems in the
  Hadoop ecosystem. He's involved with the Stinger effort from Hortonworks
  to im
 prove the SQL data access layers in Hadoop. He is a contributor to the Apa
 che Hive project and a committer for the Apache Tez project.\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/apache-tez-acceleratin
 g-hadoop-data-pipelines-5ALMUy1sYQXsoHkm3qgHCa
BEGIN:VALARM
ACTION:display
DESCRIPTION:Apache Tez: Accelerating Hadoop Data Pipelines in Auditorium 1
  in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Crafting Visual Stories with Data
DTSTART:20140726T083000Z
DTEND:20140726T091500Z
DTSTAMP:20260801T124109Z
UID:session/UgTNaWHpRRoyg8Sjtt42xr@hasgeek.com
SEQUENCE:2
CATEGORIES:Full talk,Beginner
CREATED:20140705T030115Z
DESCRIPTION:*"I think people have begun to forget how powerful human stori
 es are\, exchanging their sense of empathy for a fetishistic fascination w
 ith data\, networks\, patterns\, and total information... Really\, the da
 ta is just part of the story. The human stuff is the main stuff\, and the 
 data should enrich it." - Jonathan Harris*\n\nStories have been recognized
  for their power of communication & persuasion for centuries. There is an 
 increasing realisation that we need to operate at this intersection of d
 ata\, visual and stories to fully harness the power of big data. \n\nIn th
 is session\, I will showcase the basic building blocks of storytelling acr
 oss different mediums - oral storytelling\, journalistic written stories\,
  graphic comics and movies. And then explore the idea that we can integra
 te narrative storytelling lessons from these mediums with our data visuali
 sation to start crafting visual stories with data. \n\nI will summarize ba
 sic design principles that can help us in our crafting journey\, as we tak
 e the data through the layers of abstraction - See the Data | Show the Vis
 ual | Tell the Story | Engage the Audience. The focus would be on sharing 
 'why' stories work and aim to unpack the six dimensions of creating a data
 -visual-story.\n  \n1. Abstraction (data patterns)\n2. Framing (perspectiv
 e\, genre)\n3. Representation (visual encoding including color)\n4. Messag
 ing (verbal\, text annotation)\n5. Flow (arrangement\, transition)\n6. Int
 eractivity\n\nI will be using exemplars from my work and other real-world 
 data-stories to explore this topic.\n\n### Speaker bio\n\nI am interested 
 in learning and teaching the craft of telling visual stories with data. I 
 use storytelling and data visualization as tools for improving communicati
 on\, persuasion and leadership. I am a partner at [narrativeviz Consulting
 ](http://narrativeviz.com) where I conduct workshops and trainings for cor
 porates\, non-profits\, colleges\, and individuals. I also teach sessions 
 on storytelling with data as invited expert / guest faculty in data visual
 ization and analytics related courses at IIM Bangalore and IIM Ahmedabad. 
 My background is in strategy consulting in using data-driven stories to dr
 ive change across organizations and businesses. I have more than 12 years 
 of consulting experience\, first with AT Kearney in India and then with Bo
 oz & Company in Europe. I did my B.Tech from IIT\, Delhi and PGDM from IIM
 \, Ahmedabad. You can find more about me at http://amitkaps.com and tweet 
 me at @amitkaps\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 2 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/crafting-visual-storie
 s-with-data-UgTNaWHpRRoyg8Sjtt42xr
BEGIN:VALARM
ACTION:display
DESCRIPTION:Crafting Visual Stories with Data in Auditorium 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Flash talks
DTSTART:20140726T091500Z
DTEND:20140726T094500Z
DTSTAMP:20260801T124109Z
UID:session/8HvJkH4EmB1SomYkoLwcPa@hasgeek.com
SEQUENCE:0
CREATED:20140705T030352Z
DESCRIPTION:\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20140721T051246Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Flash talks in Auditorium 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Flash talks
DTSTART:20140726T091500Z
DTEND:20140726T100000Z
DTSTAMP:20260801T124109Z
UID:session/BzpmRMNaq7FVJ9YLReKf2N@hasgeek.com
SEQUENCE:0
CREATED:20140721T051232Z
DESCRIPTION:\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20140721T051256Z
LOCATION:Auditorium 2 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Flash talks in Auditorium 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Real Time User-Scoring for Bidding in Display Retargeting
DTSTART:20140726T094500Z
DTEND:20140726T100000Z
DTSTAMP:20260801T124109Z
UID:session/DtsEqkAxm5hqpscQJHonb5@hasgeek.com
SEQUENCE:2
CATEGORIES:Crisp talk,Beginner
CREATED:20140705T030152Z
DESCRIPTION:To participate in an auction\, we must come up with a dollar v
 alue that we are ready to bid for each user. To do this\, we analyze the s
 ite-activity of the users in real-time\, along with any meta-data we may h
 ave. In this session\, we walk through how we have broken this problem int
 o smaller pieces\, and through optimizations based on CTR\, conversion pro
 bability and expected revenue\, our campaigns are designed towards achievi
 ng business goals measured in terms of achieved Return On Ad Spend (ROAS
 ). We also handle ad-serving\, and deciding items to recommend in the ad u
 nits\, in case of bids won.\n\n### Speaker bio\n\nI\, Ambuj Singh\, gradua
 ted in Computer Science from IIT Kanpur in 2012. Since then\, I have been 
 working in @WalmartLabs\, first in the Twitter analysis team\, and current
 ly in the Display Ads team.\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/real-time-user-scoring
 -for-bidding-in-display-retargeting-DtsEqkAxm5hqpscQJHonb5
BEGIN:VALARM
ACTION:display
DESCRIPTION:Real Time User-Scoring for Bidding in Display Retargeting in A
 uditorium 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Evening tea break
DTSTART:20140726T100000Z
DTEND:20140726T103000Z
DTSTAMP:20260801T124109Z
UID:session/JuStfEjHgSoYinjZu5x7YP@hasgeek.com
SEQUENCE:0
CREATED:20140527T070132Z
DESCRIPTION:\n
LAST-MODIFIED:20140703T191341Z
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Evening tea break in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Application of analytics in retail - Anindo Chakraborthy
DTSTART:20140726T103000Z
DTEND:20140726T111500Z
DTSTAMP:20260801T124109Z
UID:session/5hG3fZ7VCZ8gHqsTjtxDbm@hasgeek.com
SEQUENCE:0
CREATED:20140724T135511Z
DESCRIPTION:\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20140724T135523Z
LOCATION:Auditorium 2 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Application of analytics in retail - Anindo Chakraborthy in Au
 ditorium 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Experimentation to Productization : developing a Dynamic Bidding s
 ystem for a location aware Mobile landscape 
DTSTART:20140726T103000Z
DTEND:20140726T111500Z
DTSTAMP:20260801T124109Z
UID:session/Hmp28v3Z3DZByPCmkhNTCw@hasgeek.com
SEQUENCE:2
CATEGORIES:Full talk,Intermediate
CREATED:20140705T030249Z
DESCRIPTION:I will start with the obvious\, ie. how do advertisers reach y
 ou on mobile\, fetching your comprehensive digital footprint\, all in 6 mi
 lliseconds\, or less. Then look through sample digital footprint(weblogs)\
 , laying the ground to understand the data\, and algorithms to derive stat
 istical relationships. \n\nTowards this\, I will then talk about identifyi
 ng quick wins to deliver outcomes\, all through data and in this introduce
  Hypothesis based Engineering - ie how not to go down a bottomless pit.\n\
 nI will then spend the majority of the time talking about 3 problems we
  addressed a
 t AdNear to increase the bottom line for our clients -\n\n1. Algorithm to 
 develop a Dynamic bidding system which prices each opportunity to bid\, ba
 sed on the quality of that "specific" inventory - Towards this\, I will fo
 cus on how we built meta-data for otherwise\, not so useful attributes li
 ke "user-agent"\, data for creatives\, besides the obvious "features"\n\n2
 . Characterizing user-mobility patterns to generate user profiles - i.e.
  given a cross section of users\, how do we map the activities associated
  with their geographical footprint - and generate a probabilistic picture
  of their activity patterns & affinity towards general activities\n\n3.
  Developing a comprehensive app-ranking system : How we use the web to
  increase the informat
 ion content of the apps to deliver Business outcomes that matter. The syst
 em updates the snapshot across multiple dimensions for each of the unique 
 appids in the system every hour\, to deliver a self aligning machine learn
 ing system at scale\n\nFinally I will close this with the framework we bui
 lt to measure all this in real time - A/B testing framework\, Simulation &
  Reporting - which supported the Experimentation phase\, created
  stickiness
  that pushed the productization of data into our Production systems\, whil
 e doing so at Scale.\n\n### Speaker bio\n\nEkta is a Data Scientist with
  AdN
 ear Pte.\, where she is designing Dynamic Bidding systems and A/B testing 
 framework for bidding in location based mobile targeting space\, to increa
 se the bottom line for clients across Asia-Pacific. She has a background i
 n Quantitative Economics(MS) from Goethe-University\, Frankfurt and Comput
 er Science(BS) from Bangalore\, India and enjoys Monetizing and leveraging
  technology to solve abstract Business problems. While at Grad school she 
 became passionately interested in rationality\, framing problems and how w
 e human beings respond to ambiguous choices\, something she sews in
  technic
 al dimensions with a scientific rigour.\n\nPrior to AdNear\, she was with 
 [24]7 Inc.\, Innovation Labs\, where she was responsible for end to end so
 lutioning\, statistical analysis and deployment of Analytic models for e-c
 ommerce clients and designing intuitive customer experiences. Before that 
 she has worked in roles across Quality Engineering (VMware Inc.)\, Program
  Management (SAP Labs) and Experimentation methods\, Auctions & Macroecono
 mics while pursuing her Masters at Goethe University.\n\nShe presented a t
 alk at PyCon 2013\, Bangalore\, and was selected as a speaker for PyCon
  APAC 2014\, Taipei. She was also accepted to present the same talk at
  the Grace Hopper conference for Women in Computing 2014 in Phoenix
  (USA).\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/experimentation-to-pro
 ductization-developing-a-dynamic-bidding-system-for-a-location-aware-mobil
 e-landscape-Hmp28v3Z3DZByPCmkhNTCw
BEGIN:VALARM
ACTION:display
DESCRIPTION:Experimentation to Productization : developing a Dynamic Biddi
 ng system for a location aware Mobile landscape  in Auditorium 1 in 5 minu
 tes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:The ART of Data Mining - Practical Learnings from Real-world Data 
 Mining applications
DTSTART:20140726T111500Z
DTEND:20140726T120000Z
DTSTAMP:20260801T124109Z
UID:session/C1cLLT5Z24fPr3XhZfAQBV@hasgeek.com
SEQUENCE:2
CATEGORIES:Full talk,Intermediate
CREATED:20140705T030304Z
DESCRIPTION:The role of a data scientist has evolved in the last few years
  from someone who can "put-together" a "modelling pipeline" to someone who
  can: (a) "understand" the data beyond basic statistics and simple visuali
 zations\, (b) extract "deep" and "novel" insights from the data\, (c) engi
 neer "better features" to fairly distribute complexity between features an
 d models\, (d) visualize and make sense of complex data types like network
 s\, unstructured text corpora\, etc.\, and (e) create innovative ways of h
 arnessing data to make smarter decisions.\n\nIn order to create "magic fro
 m data"\, a data scientist must go beyond the SCIENCE\, ENGINEERING\, and 
 PROCESS and delve into the ART of data mining. In this talk I will share a
  number of "mistakes" and "innovations" in this context that helped me bui
 ld better models in domains as diverse as remote sensing\, text classifica
 tion\, text clustering\, fraud detection\, information retrieval\, bioinfo
 rmatics\, retail data mining\, and image understanding\, etc. \n\nThese pr
 actical insights might help the audience pay attention to the right detail
 s in the modelling process\, look for model improvements in the right plac
 es\, be more creative with their data and use its full potential\, and eve
 n overcome the limitations of their modeling tools.\n\n### Speaker bio\n\n
 http://www.linkedin.com/in/shaileshk\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20230810T072606Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2014/schedule/the-art-of-data-mining
 -practical-learnings-from-real-world-data-mining-applications-C1cLLT5Z24fP
 r3XhZfAQBV
BEGIN:VALARM
ACTION:display
DESCRIPTION:The ART of Data Mining - Practical Learnings from Real-world D
 ata Mining applications in Auditorium 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Feedback
DTSTART:20140726T120000Z
DTEND:20140726T121500Z
DTSTAMP:20260801T124109Z
UID:session/QkwX63qTco5syPVrADjTbe@hasgeek.com
SEQUENCE:0
CREATED:20140705T030502Z
DESCRIPTION:\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20140705T030520Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Feedback in Auditorium 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:HasGeek-Nexus powered panel on "Real-time analytics – technolog
 ies of today\, for tomorrow"
DTSTART:20140726T121500Z
DTEND:20140726T134500Z
DTSTAMP:20260801T124109Z
UID:session/SZhQn9PD8xADSxkoFRbk2T@hasgeek.com
SEQUENCE:0
CREATED:20140630T185331Z
DESCRIPTION:\n
GEO:12.943181274247097;77.59629179723562
LAST-MODIFIED:20200619T062516Z
LOCATION:Auditorium 1 - NIMHANS Convention Centre\nBangalore\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:HasGeek-Nexus powered panel on "Real-time analytics – techn
 ologies of today\, for tomorrow" in Auditorium 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
END:VCALENDAR