BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//HasGeek//NONSGML Funnel//EN
DESCRIPTION:An Event on Big Data and Cloud Computing
X-WR-CALDESC:An Event on Big Data and Cloud Computing
NAME:The Fifth Elephant 2013
X-WR-CALNAME:The Fifth Elephant 2013
REFRESH-INTERVAL;VALUE=DURATION:PT12H
SUMMARY:The Fifth Elephant 2013
TIMEZONE-ID:Asia/Kolkata
X-PUBLISHED-TTL:PT12H
X-WR-TIMEZONE:Asia/Kolkata
BEGIN:VEVENT
SUMMARY:Neo4j Graph Workshop
DTSTART:20130711T040000Z
DTEND:20130711T060000Z
DTSTAMP:20260421T123255Z
UID:session/XJUcVQYHaNjEZe8CSHrnkX@hasgeek.com
SEQUENCE:2
CATEGORIES:Workshops,Beginner
CREATED:20190705T045836Z
DESCRIPTION:This hands-on session will be a crash course in using the Neo4j
  graph database. Assuming nothing\, we'll learn about working with Neo4j t
 hrough a progressive series of exercises. \n\nWe will use Neo4j's query la
 nguage\, Cypher\, to:\n\n- create a simple graph\n- import a larger sample
  graph\n- run basic queries to get known data\n- discover new data with gr
 aph patterns\n\nYou will leave with a foundation of how to begin working w
 ith Neo4j\, ready to explore more with language-specific drivers.\n\n### S
 peaker bio\n\nWith NASA\, for the love of technology. Then Zambia\, using 
 technology for social good. Now with Neo4j\, making the world a better pla
 ce from a graph perspective.\n\nAndreas has been part of the Neo4j communi
 ty since having his own graph epiphany while working on medical informatic
 s in Zambia. He joined as an early member of core engineering\, and has no
 w taken on the role of Product Experience Designer\, responsible for matur
 ing that fantastic codebase into an industrial strength product.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 2 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/neo4j-graph-workshop-X
 JUcVQYHaNjEZe8CSHrnkX
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:Neo4j Graph Workshop in Audi 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Big Data\, Real-time Processing and Storm
DTSTART:20130711T040000Z
DTEND:20130711T060000Z
DTSTAMP:20260421T123255Z
UID:session/RY88ckWWu7P2CqMMwpa9ys@hasgeek.com
SEQUENCE:2
CATEGORIES:Workshops,Beginner
CREATED:20190705T045858Z
DESCRIPTION:Hadoop is predominantly for batch processing. Did you ever won
 der how to process Big Data in real-time? If yes\, this workshop is for yo
 u. \n\nTo give an example\, trends in Twitter are powered by Storm\; Tweet
 s are analyzed in real-time to find the trending topics / hashtags using S
 torm.\n\nThis workshop will introduce the basics of Storm and its salient 
 features. We will discuss how Storm is similar / different from Hadoop. We
  will also run through the source of WordCount example and its demo. And f
 inally we will discuss how Hadoop and Storm together can help process Big 
 Data seamlessly.\n\nIf time permits\, we will also check a simple demo of 
 real-time processing of tweets using Storm.\n\nA brief outline of the sessio
 n has been uploaded to [Slideshare](http://www.slideshare.net/prashanthvvb
 abu/big-data-realtime-processing-and-storm)\, which is also embedded in the sl
 ides section below.\nPlease check the slidedeck and let me know if you hav
 e any feedback and / or comments on the outline of the workshop.\n\n**Note
 **: For this session\, we will be using [Storm Local Mode](https://github.
 com/nathanmarz/storm/wiki/Local-mode) for developing and testing the code.
  So\, any laptop with JDK and Maven should suffice.\n\n### Speaker bio\n\n
 Prashanth Babu is a Research Engineer with [NTT DATA](http://www.nttdata.c
 om). He is working on an R & D initiative on Big Data using Apache Hadoop 
 Ecosystem. He is also a Cloudera Certified Developer for Apache Hadoop [CCDH
 ].\n\n[About Prashanth](http://About.Me/Prashanth "About Prashanth")\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 3 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/big-data-real-time-pro
 cessing-and-storm-RY88ckWWu7P2CqMMwpa9ys
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:Big Data\, Real-time Processing and Storm in Audi 3 in 5 minut
 es
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:MongoDB: An Overview
DTSTART:20130711T063000Z
DTEND:20130711T080000Z
DTSTAMP:20260421T123255Z
UID:session/MS1PirWpVx3zXNu5mYGtN7@hasgeek.com
SEQUENCE:2
CATEGORIES:Workshops,Beginner
CREATED:20190705T045933Z
DESCRIPTION:This will be a practical demonstration of MongoDB\, covering:
 \n- Schema design\n- MongoDB query language\n- Indexing options\n- Aggrega
 tion functionality\n- Basics of replication\n- Basics of sharding\n\n### S
 peaker bio\n\nDr. Edouard Servan-Schreiber is Director for Solution Archit
 ecture at 10gen\, advising customers on how to use MongoDB to make their bus
 iness simpler\, faster\, and better.\nPreviously\, Edouard was director fo
 r cross-channel analytics at Teradata\, leading projects in advanced analy
 tics and predictive modeling with customers in all heavily data-driven ind
 ustries such as telco\, retail\, finance\, high tech manufacturing.\nEdoua
 rd’s specialty is to help customers extract business value from their da
 ta through the effective use of technology and analytics.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 2 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/mongodb-an-overview-MS
 1PirWpVx3zXNu5mYGtN7
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:MongoDB: An Overview in Audi 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Finding order in the chaos: machine learning for web text analyti
 cs using R
DTSTART:20130711T063000Z
DTEND:20130711T080000Z
DTSTAMP:20260421T123255Z
UID:session/TbPaZeQQErycKsz6upXW7c@hasgeek.com
SEQUENCE:2
CATEGORIES:Workshops,Beginner
CREATED:20190705T045946Z
DESCRIPTION:Do you get the feeling of ‘the cart before the horse’ on h
 earing buzz-words like social data mining or sentiment analysis and so on?
  Fundamental text mining methods are the real ‘workhorses’ behind thes
 e buzz-words. This workshop aims to give an understanding of the fundament
 als in a ‘learning by doing’ fashion.\n\nThe Internet\, the information beast\
 , largely consists of unstructured text data. The R environment provides a
 n excellent set of tools to deal with this. We will take up a realistic prob
 lem of finding topics in web-documents and touch upon a number of relevant
  machine learning methods using R.\n\nWe will also cover some relevant and
  interesting business problems which can be tackled using these methods.\n
 \n### Speaker bio\n\nAn avid R user\, I work on applying machine learning 
 methods to the field of digital advertising\, @ Sokrati Inc. I have a prio
 r experience of applying these methods to telecom and banking sector probl
 ems. I hold a master's in Operations Research from IIT\, Mumbai.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 3 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/finding-order-in-the-c
 haos-machine-learning-for-web-text-analytics-using-r-TbPaZeQQErycKsz6upXW7
 c
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:Finding order in the chaos: machine learning for web text ana
 lytics using R in Audi 3 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Data Analysis and Visualization using R
DTSTART:20130711T090000Z
DTEND:20130711T110000Z
DTSTAMP:20260421T123255Z
UID:session/NgqaSKfqJUfSH5qSBqDP5e@hasgeek.com
SEQUENCE:2
CATEGORIES:Workshops,Intermediate
CREATED:20190705T050019Z
DESCRIPTION:This workshop will use R to uncover patterns\, anomalies and i
 nsights in public data sets. It will demonstrate how to use R for statist
 ical analysis using different modules and techniques. A significant part
  of the workshop will also be dedicated to visually expl
 oring patterns and anomalies in data using R modules such as ggplot2.\n\n#
 ## Speaker bio\n\nVinayak Hegde has been working with large scale data and
  analytics for several years for MNCs such as Inmobi and Akamai. He used R
  as a part of Marketplace team in Inmobi to improve Ad-serving relevance a
 nd performance. \n\nHis areas of expertise are data analytics and large sc
 ale networks. He is a polyglot and often works with multiple languages to
  build robust and scalable software systems.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 2 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/data-analysis-and-visu
 alization-using-r-NgqaSKfqJUfSH5qSBqDP5e
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:Data Analysis and Visualization using R in Audi 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Workshop: Learning ElasticSearch and using it to analyze Aadhaar's
  Public Datasets
DTSTART:20130711T090000Z
DTEND:20130711T110000Z
DTSTAMP:20260421T123255Z
UID:session/Tz2CVvrGLKui3yEYqR64w@hasgeek.com
SEQUENCE:2
CATEGORIES:Workshops,Beginner
CREATED:20190705T050035Z
DESCRIPTION:ElasticSearch is a flexible and powerful open source\, distrib
 uted real-time search and analytics engine. This workshop is planned for b
 eginners and consists of two sections:\n\n* Learning ElasticSearch\n\nThis
  section covers the basics of ElasticSearch. Installation\, common configu
 ration options\, what are indexes\, documents\, type mappings\, aliases\, 
 querying using curl\, tire and pyelasticsearch would be some of the topics
  covered.\n\n* Implementing all that into something useful\n\nThe Aadhaar 
 project provides some publicly available data. The set is large enough to 
 begin manipulating. We will be importing the dataset and trying to find if
  the oldest Indian is really 179 years old. Or\, any other interesting que
 ries we can come up with.\n\nThe intended take-away from the workshop is a
  deeper understanding of ElasticSearch and an appreciation of simple und
 erlying technologies that power the "Big Data" aspects. Plus\, you'll get 
 to learn about queries - their cost\, and how to plan for the cheapest or
  fastest maps.\n\n### Speaker bio\n\nAnurag works with Red Hat at their P
 une office. He's a part of Engineering Services group and loves to play wi
 th APIs.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 3 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/workshop-learning-elas
 ticsearch-and-using-it-to-analyze-aadhaars-public-datasets-Tz2CVvrGLKui3yE
 YqR64w
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:Workshop: Learning ElasticSearch and using it to analyze Aadha
 ar's Public Datasets in Audi 3 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Agility and Innovation vs IT: how new data platforms can overcome 
 this neverending struggle
DTSTART:20130712T044500Z
DTEND:20130712T053000Z
DTSTAMP:20260421T123255Z
UID:session/H2nKrmAKnGeX6Wbt9hWYuL@hasgeek.com
SEQUENCE:2
CATEGORIES:Storage and Databases,Intermediate
CREATED:20190705T050146Z
DESCRIPTION:IT organizations confront business champions and developers ab
 out the scale and number of their projects. \nDevelopers want the ability 
 to get new applications and features up and running as fast as possible. A
 nd IT attempts to enforce a robust deployment process.\nThe core of the is
 sue is enabling a scalable data layer. Traditional RDBMS technology enforc
 es very strict data modeling practices in order to scale. Similarly\, trad
 itional hardware deployment demands a very high upfront cost\, in term
 s of both time and expense. \nNoSQL technology and private cloud practices
  are enabling a deep shift towards deployments based on variable cost\, pr
 ivileging fast time to market and scalable growth. We will discuss some us
 e cases from 10gen-MongoDB to illustrate this pattern.\n\n### Speaker bio\
 n\nDr. Edouard Servan-Schreiber is Director for Solution Architecture at 1
 0gen\, advising customers on how to use MongoDB to make their business simpl
 er\, faster\, and better.\nPreviously\, Edouard was director for cross-cha
 nnel analytics at Teradata\, leading projects in advanced analytics and pr
 edictive modeling with customers in all heavily data-driven industries suc
 h as telco\, retail\, finance\, high tech manufacturing.\n\nEdouard’s sp
 ecialty is to help customers extract business value from their data throug
 h the effective use of technology and analytics.\n\nEdouard began practici
 ng artificial intelligence and statistical learning models at Carnegie Mel
 lon University for his bachelor’s degree\, before going to UC Berkeley f
 or his PhD in Computer Science.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 1 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/agility-and-innovation
 -vs-it-how-new-data-platforms-can-overcome-this-neverending-struggle-H2nKr
 mAKnGeX6Wbt9hWYuL
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:Agility and Innovation vs IT: how new data platforms can overc
 ome this neverending struggle in Audi 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Latency and Fault tolerance in OLTP @ 1.5 billion/day service call
 s
DTSTART:20130712T060000Z
DTEND:20130712T064500Z
DTSTAMP:20260421T123255Z
UID:session/C7WN3Qqxw86cS8j18Y25Ht@hasgeek.com
SEQUENCE:2
CATEGORIES:Storage and Databases,Intermediate
CREATED:20190705T050211Z
DESCRIPTION:A good eCommerce web-site would serve millions of pages per da
 y with a fair mix of static and dynamic content per page. Services built o
 n SOA often serve the dynamic content and a request might depend on dozens
  of these services to render a single page and require MBs of data read fr
 om various data sources. Website availability and user experience are affe
 cted by latency variance and failures of these services. \nOne needs to wo
 rry about the 75th and 90th percentile response times\; good median and me
 an responses just do not suffice.\n\nCompact protocols - Thrift\, Prot
 obuf\, Avro and Transports - TCP\, HTTP do not address latency variance or
  provide for fallbacks and graceful degradation. \n\nA number of design pa
 tterns and technologies may be used to stop cascading failures\, fail fast
  and recover rapidly. \n\nThis talk describes how Flipkart built smart Ser
 vice Proxies to handle this problem for apps and services running on a num
 ber of Platforms - PHP and JVM based\, Protocols - Custom\, Thrift\, JSON-
 REST\, Data Sources - SQL and NoSQL. The talk also covers database technol
 ogy selection for a number of use cases - MySQL\, Couchbase\, Redis\, incl
 uding HBase for serving on-line content.\n\nThe Flipkart Service Proxie
 s are built using technologies like [Netty](http://netty.io/)\, [Hystrix](
 https://github.com/Netflix/Hystrix)\, [Trooper](https://github.com/regunat
 hb/Trooper) and is influenced by projects like [Finagle](https://github.co
 m/twitter/finagle).\n\nThe talk will also feature a demo of the Service Pr
 oxy. The links in this proposal also has slides on the Flipkart website te
 ch stack evolution. The actual talk will feature the next gen version of t
 he fk-w3-agent mentioned in the slides.\n\n### Speaker bio\n\nRegunath is a
 n architect\, developer and mentor with a career span of 16 years. He is c
 urrently responsible for building a long-term technology vision across Cust
 omer Platform teams at Flipkart. Prior to Flipkart\, he was Chief Architec
 t at MindTree where he played a number of roles including leading an Archi
 tecture services group\, building IP based solutions and implementing larg
 e scale systems\; notable among them was architecting the Govt. of India's
  Aadhaar project - the world's largest biometric identity database.\n\nHe 
 is passionate about Open Source and technology trends - recent ones are Bi
 g Data and deriving insights from Social Media. He has contributed to Open
  Source that is used in 90+ countries worldwide.\nRegunath has been an inv
 ited speaker in various technology forums such as HasGeek Fifth Elephant\,
  OSI days\, Microsoft Architecture Days\, iCMGWorld Architecture Summit an
 d others. He also blogs frequently and was a guest columnist for CIOUpdate.co
 m.\n\nMore about him at:\n[LinkedIn](http://in.linkedin.com/in/regunathb)\
 n[Twitter](https://twitter.com/RegunathB)\n\nOSS projects:\n[Sift](https:/
 /github.com/regunathb/Sift/ )\n[Trooper](https://github.com/regunathb/Troo
 per/ )\n[MindTreeInsight](http://sourceforge.net/projects/mindtreeinsight/
 )\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 1 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/latency-and-fault-tole
 rance-in-oltp-1-5-billion-day-service-calls-C7WN3Qqxw86cS8j18Y25Ht
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:Latency and Fault tolerance in OLTP @ 1.5 billion/day service 
 calls in Audi 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Cloud based low cost\, low maintenance\, scalable data platform
DTSTART:20130712T060000Z
DTEND:20130712T064500Z
DTSTAMP:20260421T123255Z
UID:session/TAgwCVnmSMPeJeMY9r1Zxc@hasgeek.com
SEQUENCE:2
CATEGORIES:Storage and Databases,Beginner
CREATED:20190705T050224Z
DESCRIPTION:Taming the big data beast involves collecting\, storing\, usi
 ng and reusing it. In this session I'll explain how we at Myntra have add
 ressed these. In the big data world\, storage and maintenance costs are t
 he biggest stumbling block\; I'll focus on how we've kept these to a mini
 mum. I'll also give an overview of how our transactional systems interact
  with the data platform.\n\nI'll talk about some of the technologies we a
 re using (primarily Amazon EMR\, Amazon S3\, Apache Kafka and Twitter Fin
 agle)\, why we chose them and how they are treating us.\n\nI'll also ment
 ion some of the business problems we are trying to solve\, like personali
 zing the user experience on our website\, measuring the effectiveness of
  marketing campaigns\, understanding the life cycle of any product\, etc.
 \, and how the platform has helped us.\n\n### Speaker bio\n\nI'm working
  as an Associate Architect at Myntra.com\, India's largest online fashion
  store. I've been building web-scale systems for nine years and have been
  working on NoSQL systems for two years now. I built a MongoDB-based anal
 ytics engine which used to power our web analytics. Working on it\, I rea
 lized the shortcomings of technologies which are not inherently distribut
 ed. I've also been a key member behind scaling and speeding up Myntra's p
 ortal. Working in a startup\, I've realized that the biggest boon is time
  to market and the biggest bane is maintenance overhead. I've tried to us
 e these learnings while building this data platform.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 2 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/cloud-based-low-cost-l
 ow-maintenance-scalable-data-platform-TAgwCVnmSMPeJeMY9r1Zxc
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:Cloud based low cost\, low maintenance\, scalable data platfor
 m in Audi 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Analyzing Terabytes of Data with Google BigQuery
DTSTART:20130712T064500Z
DTEND:20130712T073000Z
DTSTAMP:20260421T123255Z
UID:session/VSudV2hs1tKcHpvi3rNkGT@hasgeek.com
SEQUENCE:2
CATEGORIES:Analytics and Visualization,Beginner
CREATED:20190705T050236Z
DESCRIPTION:Google BigQuery is a product that allows you to do interactive
  analysis on very large data sets containing billions of rows using a subs
 et of SQL. We will give an introduction to Dremel and ColumnIO\, which allow
  us to do this extremely quick analysis. We will explain how this product 
 can be used with a simple demonstration. We will also compare this to othe
 r Google products such as Google Cloud SQL.\n\n### Speaker bio\n\nChandram
 ouli Mahadevan leads the Site Reliability Engineering team at Google Banga
 lore. He has worked at Google Bangalore for the past 8 years in a variety 
 of engineering roles on multiple products such as Google Transliteration\,
  Google News\, and Google Adwords. He received a B Tech and Ph.D in Comput
 er Science from IIT Bombay.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 1 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/analyzing-terabytes-of
 -data-with-google-bigquery-VSudV2hs1tKcHpvi3rNkGT
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:Analyzing Terabytes of Data with Google BigQuery in Audi 1 in 
 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:HOWTO run a hadoop cluster on a laptop
DTSTART:20130712T064500Z
DTEND:20130712T073000Z
DTSTAMP:20260421T123255Z
UID:session/MJXSnLiqD6MmcJmThztwip@hasgeek.com
SEQUENCE:2
CATEGORIES:Storage and Databases,Beginner
CREATED:20190705T050246Z
DESCRIPTION:Hadoop has nearly become synonymous with Big Data. And this is
  because of the large community that stands behind the project. But this i
 s a tough project to contribute to for someone who wants to work on hadoo
 p in their spare time.\n\nIn a short session over half-hour\, I want to sh
 are some of the convenience scripts I have accumulated over the last 6 mon
 ths\, which help me work with hadoop - at dev-scales. \n\nFor the purposes
 of a clean build env\, we will use an Ubuntu LXC container to isolate th
 e hadoop install from the rest of the system. \n\nThis provides the base c
 ontainer for your install\, which we can clone later for multiple nodes of
  the cluster.\n\nNow\, with the help of a few convenience scripts & pre-pa
 ckaged config files\, you can download hadoop\, build it and set up a sing
 le node cluster without much trouble.\n\nThis brings up a very brain-dead 
 simple\, non-secure hadoop cluster - easily extensible to a few nodes.\n\n
 The multiple node setup is only useful to debug node locality and s
 chedulers\, but for most of the HDFS/Hadoop development\, the single node 
 cluster works wonders.\n\nAnd all that takes you from a clean laptop to ru
 nning a private hadoop instance that you can recompile & redeploy in sec
 onds. \n\nThe next step is contributing patches\, which is left as an exer
 cise to the reader.\n\n### Speaker bio\n\nGopal Vijayaraghavan is a late e
 ntry into the hadoop game\, having started working on it last year. Workin
 g with hadoop as part of the Stinger/Tez initiatives\, he has gathered a l
 ot of what used to be tribal knowledge in the hadoop community & discovere
 d that most of it has never been written down. Having been exposed to some
  of the secrets behind working on hadoop for profit\, he wants to share th
 at with people who want to do that for fun.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 2 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/howto-run-a-hadoop-clu
 ster-on-a-laptop-MJXSnLiqD6MmcJmThztwip
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:HOWTO run a hadoop cluster on a laptop in Audi 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:15 Billion value at risk computations in 187 milliseconds
DTSTART:20130712T083000Z
DTEND:20130712T091500Z
DTSTAMP:20260421T123255Z
UID:session/mCT9Z5B22xQEn3VAmJd2T@hasgeek.com
SEQUENCE:2
CATEGORIES:Storage and Databases,Intermediate
CREATED:20190705T050306Z
DESCRIPTION:In this session we will present our learnings and share how w
 e addressed the following problems:\n1. Extreme parallel processing in HB
 ase\n2. Network jams while aggregating billions of records\n3. Disk laten
 cy\nThe session will conclude with a demo.\n\n### Sp
 eaker bio\n\nAbinasha Karana works as Director - Technology at Bizosys Tec
 hnologies. He is a committer of HSearch. His passion is "Real time nature 
 of Big Data Computation".  He has been a speaker at "Apache Hadoop India S
 ummit"\, "Microsoft TechEd - Hadoop on Azure Track"\, "Cloudera Bigdat
 a BigQuestions Event" and "Hadoop Track at IIT Mumbai\, Computer Science B
 Tech Curriculum". Abinasha has 15 years of industry experience in the ar
 eas of  Application Servers\, Search Technology\, Database Systems\, Inter
 net Applications\, Mobility and Data Integration Tools.\n\nAbinasha gradua
 ted in Engineering from NIT\, Rourkela. Prior to Bizosys\, he co-founded D
 rapa Technologies. In his career with Infosys Technologies\, Bangalore\, h
 e was involved in various initiatives such as - starting the Infosys Mobil
 ity Solutions and Enterprise Search practice. One of his architected proje
 cts was featured in the 2004 InfoWorld 100.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 1 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/15-billion-value-at-ri
 sk-computations-in-187-milliseconds-mCT9Z5B22xQEn3VAmJd2T
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:15 Billion value at risk computations in 187 milliseconds in A
 udi 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Strategic advantages of MongoDB
DTSTART:20130712T083000Z
DTEND:20130712T091500Z
DTSTAMP:20260421T123255Z
UID:session/4c5ExaPNUWhsnQJwPdoP3f@hasgeek.com
SEQUENCE:2
CATEGORIES:Storage and Databases,Intermediate
CREATED:20190705T050317Z
DESCRIPTION:This session will present the strategic advantages of MongoDB 
 as an operational data store. I will discuss the range of use cases curren
 tly in production\, such as:\n- Content Management Store\n- User Profile M
 anagement\n- Product Catalog and Reference Data Store\n- High speed data s
 tore\n- Operational analytics.\n\n### Speaker bio\n\nDr. Edouard Servan-Sc
 hreiber is Director for Solution Architecture at 10gen\, advising customer
 s on how to use MongoDB to make their business simpler\, faster\, and better
 .\nPreviously\, Edouard was director for cross-channel analytics at Terada
 ta\, leading projects in advanced analytics and predictive modeling with c
 ustomers in all heavily data-driven industries such as telco\, retail\, fi
 nance\, high tech manufacturing.\nEdouard’s specialty is to help custome
 rs extract business value from their data through the effective use of tec
 hnology and analytics.\nEdouard began practicing artificial intelligence a
 nd statistical learning models at Carnegie Mellon University for his bache
 lor’s degree\, before going to UC Berkeley for his PhD in Computer Scien
 ce.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 2 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/strategic-advantages-o
 f-mongodb-4c5ExaPNUWhsnQJwPdoP3f
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:Strategic advantages of MongoDB in Audi 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:SolrCloud and NoSQL
DTSTART:20130712T091500Z
DTEND:20130712T100000Z
DTSTAMP:20260421T123255Z
UID:session/4GGGVQGPXtfWBHAsGAsNJQ@hasgeek.com
SEQUENCE:2
CATEGORIES:Storage and Databases,Intermediate
CREATED:20190705T050329Z
DESCRIPTION:With NoSQL and Search both trying to add features that have tr
 aditionally belonged to the other's domain\, where are they headed? Will o
 ne be victorious over the other\, or will each finally provide a bridge to
  plug into the other?\n\nI'll speak about data stores and compare "integra
 ting search into a data store" vs "doubling up a search engine as the data
 store".\n\n### Speaker bio\n\nEngineer at [LucidWorks](htt
 p://www.lucidworks.com). With almost 8 years of experience in search\, inf
 ormation retrieval and recommendation engines\, Anshum has been a part of 
 the core Search team at Naukri.com and Cleartrip.\n\nAt Naukri\, he was on
 e of the core contributors and an architect for the JobSearch and ResumeSe
 arch (Resdex) platforms\, the largest resume database in India. He also de
 signed and developed the recommendation engine for Job Seekers. While at Cl
 eartrip\, he designed and developed the data aggregator and search for one
  of the most interesting travel products\, [SmallWorld](http://www.cleartr
 ip.com/smallworld). As a part of the core team that developed and launched
 the first AWS Search service\, CloudSearch\, he has experience with designi
 ng and developing a very large scale service.\n\nHe is now back to the ope
 n source world and is currently working with [LucidWorks](http://www.lucid
 works.com)\, the leading developer of search\, discovery and analytics sof
 tware based on Apache Lucene and [Apache Solr](http://lucene.apache.org/so
 lr/) technology.\n\nMore about him at: [LinkedIn](http://www.linkedin.com/
 in/anshumgupta) [Twitter](https://twitter.com/anshumgupta)\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 1 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/solrcloud-and-nosql-4G
 GGVQGPXtfWBHAsGAsNJQ
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:SolrCloud and NoSQL in Audi 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Neo4j Graphs: What\, When\, How
DTSTART:20130712T091500Z
DTEND:20130712T100000Z
DTSTAMP:20260421T123255Z
UID:session/M7Dwvro6TWJYEwpAvdWRHf@hasgeek.com
SEQUENCE:2
CATEGORIES:Storage and Databases,Beginner
CREATED:20190705T050341Z
DESCRIPTION:Google's knowledge graph\, Facebook's graph search\, and even 
 Amazon's recommendation engine are hugely successful applications of basic
  graph theory. We'll take a look at Neo4j\, the world's leading graph data
 base\, to understand what graphs offer and how to use them in real applic
 ations. \n\nWe'll start with a basic introduction to Neo4j and graph datab
 ases\, placing them in context of NOSQL and Big Data. Then we'll explore a
  few business use cases which illustrate different advantages of using gra
 phs. While we'll touch on some technical details\, the focus will be on under
 standing what a graph database is\, when to use one\, and how it helps.\n\
 nYou'll leave knowing how to identify a graph problem when you see one\, a
 nd be ready to do something about it.\n\n### Speaker bio\n\nWith NASA\, fo
 r the love of technology. Then Zambia\, using technology for social good. 
 Now with Neo4j\, making the world a better place from a graph perspective.
 \n\nAndreas has been part of the Neo4j community since having his own grap
 h epiphany while working on medical informatics in Zambia. He joined as an
  early member of core engineering\, and has now taken on the role of Produ
 ct Experience Designer\, responsible for maturing that fantastic codebase 
 into an industrial strength product.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 2 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/neo4j-graphs-what-when
 -how-M7Dwvro6TWJYEwpAvdWRHf
BEGIN:VALARM
ACTION:display
DESCRIPTION:Neo4j Graphs: What\, When\, How in Audi 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Workflow Schedulers: The Heart Beat of a Big Data Stack
DTSTART:20130712T103000Z
DTEND:20130712T111500Z
DTSTAMP:20260421T123255Z
UID:session/DbGB42v7E5hVdbcKvJHcDZ@hasgeek.com
SEQUENCE:2
CATEGORIES:Storage and Databases,Intermediate
CREATED:20190705T050400Z
DESCRIPTION:At Qubole\, we use Apache Oozie as the scheduler. I'll address
  which features are more important than others based on the usage of the S
 cheduler product in the Qubole platform. With an insight into the salient 
 features\, I'll compare other open source schedulers such as Azkaban (Link
 edIn)\, Luigi (Spotify) and Chronos (Airbnb). This information will provid
 e a platform for attendees to make more informed decisions on which of the
 se technologies to choose to schedule ETL and reporting processes on top o
 f Hadoop.\n\n### Speaker bio\n\nRajat Venkatesh is a developer at Qubole\,
  a company that provides data analysis tools on the cloud. He is responsib
 le for the Scheduler product at Qubole. Before Qubole\, he worked as a dat
 abase kernel developer at Vertica - a big data analytics platform.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 1 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/workflow-schedulers-th
 e-heart-beat-of-a-big-data-stack-DbGB42v7E5hVdbcKvJHcDZ
BEGIN:VALARM
ACTION:display
DESCRIPTION:Workflow Schedulers: The Heart Beat of a Big Data Stack in Aud
 i 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:What Happens When Firefox Crashes?
DTSTART:20130712T111500Z
DTEND:20130712T120000Z
DTSTAMP:20260421T123255Z
UID:session/LpmmjMv3rYqxjrLaM7Exed@hasgeek.com
SEQUENCE:2
CATEGORIES:Storage and Databases,Intermediate
CREATED:20190705T050413Z
DESCRIPTION:Receiving and organizing every Firefox crash in the world is a
  big job. Concurrency\, realtime constraints\, and a volume of data 110TB 
 strong all contribute to the challenge of giving Firefox engineers what th
 ey need to find and squash browser bugs.\n\nFollow a Firefox crash from it
 s genesis in a collapsing browser process through the dizzying array of co
 llection\, storage\, and reporting systems that make up Socorro\, our open
 -source crash collector. Enjoy war stories of weird\, interlocking failure
 s\, and see how we nevertheless continue to fulfill our mandate: “Never 
 lose a crash.”\n\n### Speaker bio\n\nErik Rose coordinates the impact of
  108 spring-loaded buttons at Mozilla\, venting a byproduct of static anal
 ysis\, search\, and pattern-finding software. His past selves have done re
 altime fuzzy matching against the corpus of U.S. voters at Votizen\, cause
 d the Django community's tests to run in funny orders\, written a book abo
 ut Zope and Plone\, and released a bevy of eclectic Python libraries. When
  not speaking or coding\, Erik retreats to his volcanic fortress in the wi
 lds of North Carolina\, where he discusses formal language theory with his
  dog\, Max.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 1 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/what-happens-when-fire
 fox-crashes-LpmmjMv3rYqxjrLaM7Exed
BEGIN:VALARM
ACTION:display
DESCRIPTION:What Happens When Firefox Crashes? in Audi 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Extracting consumer trends in real time using 100 billion tweets.
DTSTART:20130713T044500Z
DTEND:20130713T053000Z
DTSTAMP:20260421T123255Z
UID:session/VbrBbjGdBFy2F6ChVrz5aY@hasgeek.com
SEQUENCE:2
CATEGORIES:Analytics and Visualization,Intermediate
CREATED:20190705T050428Z
DESCRIPTION:We will outline a system to extract volume\, sentiment\, geo t
 rends and related words about any arbitrary topic (defined as a boolean qu
 ery) from a corpus of 100B tweets increasing at ~500M a day.\n\n### Speake
 r bio\n\nPankaj Risbood is Director of Engineering at @WalmartLabs where h
 e leads social media analytics effort. Prior to @WalmartLabs Pankaj spent 
 6 years at Google leading various efforts in cloud computing\, enterprise 
 and speech processing. He was Member of Tech Staff at Bell Labs where he s
 pecialized in optical and IP networking. Pankaj has co-authored 16 issued 
 patents and several research papers in premier conferences. An alumnus of 
 IISc\, Pankaj is an avid hiker and a marathon runner.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 1 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/extracting-consumer-tr
 ends-in-real-time-using-100-billion-tweets-VbrBbjGdBFy2F6ChVrz5aY
BEGIN:VALARM
ACTION:display
DESCRIPTION:Extracting consumer trends in real time using 100 billion twee
 ts. in Audi 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Evaluating SSD Performance for Databases Handling Real-Time Big D
 ata
DTSTART:20130713T044500Z
DTEND:20130713T053000Z
DTSTAMP:20260421T123255Z
UID:session/Sj1XNuw3bTtK9MYprPZNps@hasgeek.com
SEQUENCE:2
CATEGORIES:Storage and Databases,Intermediate
CREATED:20190705T050437Z
DESCRIPTION:Unprecedented volumes of structured and unstructured data are 
 being generated at high velocity via sensors\, and Web\, social and mobile
  interactions. This is giving rise to new applications that respond immedi
 ately to what people\, devices and systems are doing now. Increasingly\, t
 he developers of these applications are turning to a combination of SSD st
 orage and NoSQL databases to deliver highly relevant\, high-value data and
  actionable insights in milliseconds. However\, not all SSDs are equal whe
 n it comes to database performance. In this session\, Brian Bulkowski will
  discuss how he developed the open source ACT tool to evaluate SSD drive p
 erformance in handling large-scale\, real-time database loads\; how severa
 l popular SSD drives have performed on the ACT benchmark\; and how develop
 ers can use the ACT benchmark to evaluate SSDs for their own high-velocity
  big data demands.\n\n### Speaker bio\n\nBrian Bulkowski\, founder and CTO
  of Aerospike Inc.\, has 20-plus years' experience designing\, developing and
  tuning networking systems and high-performance Web-scale infrastructures.
  He founded Aerospike after learning first-hand the scaling limitations of
  sharded MySQL systems at Aggregate Knowledge as director of performance
  at this media intelligence SaaS company. Brian developed the open source 
 Aerospike Certification Tool (ACT) to evaluate the performance of flash-ba
 sed SSDs in supporting database functions after existing tools proved to b
 e unreliable predictors of how well SSDs would support the Aerospike real-
 time database in real-world deployments. Today\, ACT is used both by enter
 prise IT teams to determine which SSDs to deploy with their databases\, an
 d by storage providers to help tune their SSDs for enterprise database dem
 ands. Previously\, Brian also has served as a founding member of the digit
 al TV team at Navio Communications\, chief architect of Cable Solutions at
  Liberate Technologies\, and lead engineer at Novell.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 2 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/evaluating-ssd-perform
 ance-for-databases-handling-real-time-big-data-Sj1XNuw3bTtK9MYprPZNps
BEGIN:VALARM
ACTION:display
DESCRIPTION:Evaluating SSD Performance for Databases Handling Real-Time B
 ig Data in Audi 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Building large scale Analytics Platform
DTSTART:20130713T060000Z
DTEND:20130713T064500Z
DTSTAMP:20260421T123255Z
UID:session/PSrevMbJ5xcD8jfCGYUzKJ@hasgeek.com
SEQUENCE:2
CATEGORIES:Analytics and Visualization,Intermediate
CREATED:20190705T050449Z
DESCRIPTION:I will start with a brief introduction to Media IQ\, the world
  of online display advertising and the business need for a big data platfo
 rm. I will then go into the options we considered and the current technolo
 gy stack and infrastructure that we have the platform on. I will give an o
 verview of our data pipeline\, the technologies we are using - S3\, Elasti
 c MapReduce\, Hadoop/Hive and the components that we built to put it all t
 ogether. I will then talk about batch processing vs AdHoc Querying\, give 
 some perspective from the users of our platform and why we had to evolve t
 he platform to facilitate these two kinds of querying. I will talk about o
 ur experience with Amazon's Redshift\, HBase and will also give a sense of
  costs (storage\, processing) vs performance (querying/processing times) a
 nd the trade-offs.\n\n### Speaker bio\n\nPrabhu heads the technology team a
 t MEDIA iQ Digital\, a next-generation digital advertising trading
  specialist. He has over a decade of experience in the software industr
 y and has designed and developed high performing large scale backend platf
 orms and complex enterprise applications. Prior to MEDIA iQ\, he was at Do
 w Jones where he built an extensive and scalable search platform that
  handled more than a billion documents.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 1 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/building-large-scale-a
 nalytics-platform-PSrevMbJ5xcD8jfCGYUzKJ
BEGIN:VALARM
ACTION:display
DESCRIPTION:Building large scale Analytics Platform in Audi 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:MapReduce and the "Art of Thinking Parallel"
DTSTART:20130713T060000Z
DTEND:20130713T064500Z
DTSTAMP:20260421T123255Z
UID:session/GyvyHFeFjQwoLMMco5asj1@hasgeek.com
SEQUENCE:2
CATEGORIES:Analytics and Visualization,Advanced
CREATED:20190705T050500Z
DESCRIPTION:MapReduce is a ubiquitously used framework for large-scale numb
 er crunching in BigData analytics. While it is quite general\, it is not u
 niversal. There are a lot of analytics problems that cannot be ported to t
 he MapReduce framework "naturally" (e.g. finding similarity between all pa
 irs of documents in their Bag-of-Words representation). \n\nIn this talk\,
  through a series of such problems\, we will highlight both the limitation
 s of MapReduce and how to overcome those limitations by being "smart" abou
 t "transforming those problems" to be more "amenable to MapReduce".\n\nAs 
 a concrete example we will develop an end-to-end solution in MapReduce for
  a very important and NP-hard Graph Theory problem - finding all Maximal C
 liques in a graph.\n\n### Speaker bio\n\nDr. Shailesh Kumar is a Member of
  Technical Staff at Google\, Hyderabad where he works on large scale data 
 mining problems for various Google products. Prior to joining Google\, he 
 has worked as a Principal Dev. Manager at Microsoft (Bing) Hyderabad\, Sr.
  Scientist at Yahoo! Labs Bangalore\, and Principal Scientist at Fair Isaa
 c Research in San Diego\, USA.\n\nDr. Kumar has over fifteen years of expe
 rience in applying and innovating machine learning\, statistical pattern r
 ecognition\, and data mining algorithms to hard prediction problems in a w
 ide variety of domains including information retrieval\, web analytics\, t
 ext mining\, computer vision\, retail data mining\, risk and fraud analyti
 cs\, remote sensing\, and bioinformatics. He has published over 20 confere
 nce papers\, journal papers\, and book chapters and holds over a dozen pat
 ents in these areas.\n\nHe has two keen passions - first creating "magic f
 rom data" and second understanding functionally how the brain works!\n\nDr
 . Kumar received his PhD in Computer Engineering in 2000 (with a specializ
 ation in statistical pattern recognition and data mining) and Masters in C
 omputer Science in 1997 (with a specialization in artificial intelligence 
 and machine learning)\, both from the University of Texas at Austin\, USA.
  He received his B.Tech. in Computer Science and Engineering from the Inst
 itute of Technology\, Banaras Hindu University in 1995.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 2 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/mapreduce-and-the-art-
 of-thinking-parallel-GyvyHFeFjQwoLMMco5asj1
BEGIN:VALARM
ACTION:display
DESCRIPTION:MapReduce and the "Art of Thinking Parallel" in Audi 2 in 5 mi
 nutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Analytics using Hadoop ecosystem on AWS
DTSTART:20130713T064500Z
DTEND:20130713T073000Z
DTSTAMP:20260421T123255Z
UID:session/VuLK8qoGM4CMekCAmw8wQh@hasgeek.com
SEQUENCE:2
CATEGORIES:Workshops,Intermediate
CREATED:20190705T050544Z
DESCRIPTION:Organizations who want to perform analytics in the AWS Cloud n
 eed to figure out the following:\nHow do we get our log data sets into the
  cloud (AWS S3)?\nHow do we import data to Amazon S3 from on-premise or on
 -cloud databases such as MySQL\, MongoDB or Postgres?\nDo I need a persist
 ent Hadoop Cluster? How do I setup the system so that multiple users withi
 n the organization can run M/R\, Pig or Hive commands?\nWhat are the best 
 practices for organizing data on S3 for long term storage and query?\nWhat
  about security? What are the security risks of doing analytics in the clo
 ud?\nWhat about cost?\nWhat is the role of Hadoop versus traditional data 
 warehouses like Vertica and AWS Redshift? What about data visualization?\n
 How do I build reports using this infrastructure and where do I host them?
 \nWe will lay out some common design patterns and alternatives for these qu
 estions. For some of the questions - we may highlight features in the Qubo
 le platform - and similarly where we go through live examples - we may be u
 sing Qubole Data Service.\nAfter this workshop\, attendees will be better 
 informed on the process to get data analytics up and running on AWS.\n\n##
 # Speaker bio\n\nRajat Venkatesh is an engineer at Qubole and has experienc
 e in all aspects of helping users analyze their data on AWS.  Before Qubol
 e\, he worked as a database kernel developer at Vertica - a big data analy
 tics platform.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 1 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/analytics-using-hadoop
 -ecosystem-on-aws-VuLK8qoGM4CMekCAmw8wQh
BEGIN:VALARM
ACTION:display
DESCRIPTION:Analytics using Hadoop ecosystem on AWS in Audi 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Visualising networks
DTSTART:20130713T083000Z
DTEND:20130713T091500Z
DTSTAMP:20260421T123255Z
UID:session/XXLJRkE8NmNzdUSnPYMrz2@hasgeek.com
SEQUENCE:2
CATEGORIES:Analytics and Visualization,Intermediate
CREATED:20190705T050557Z
DESCRIPTION:Network structures are everywhere: in social media\, telecom\,
  airports\, banking\, as well as a number of somewhat unusual places: frau
 dulent tea auctions\, examination halls\, and works of fiction.\n\nWe'll s
 tart with traditional hierarchical and node-link visuals\, but a number of
  new techniques have emerged over the last few years -- such as attribute-
 driven layouts\, pivot graphs\, grouped means\, etc.\n\nYou'll see example
 s of these applied to real-life datasets\, and the stories that these tell
 .\n\n### Speaker bio\n\nAnand is a data scientist at Gramener\, a data vis
 ualisation company.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 1 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/visualising-networks-X
 XLJRkE8NmNzdUSnPYMrz2
BEGIN:VALARM
ACTION:display
DESCRIPTION:Visualising networks in Audi 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Similar entity detection in large data
DTSTART:20130713T083000Z
DTEND:20130713T091500Z
DTSTAMP:20260421T123255Z
UID:session/QmnBAJ6ihJm9kWAudd3Zyp@hasgeek.com
SEQUENCE:2
CATEGORIES:Analytics and Visualization,Intermediate
CREATED:20190705T050608Z
DESCRIPTION:One of the fundamental issues across industries is the
  presence of many similar entities registered under different names. For
  example\, different groups of insurance companies offer different
  policies to the same customers\, yet these policies are registered under
  different customer IDs. This leads to multiple issues\, including the
  inability to cross-sell or up-sell and to identify fraudulent claim
  patterns. The same is true in banks\, where the same customer could make
  different loan requests under different names. This presentation is
  based on our experiences with similar entity detection in Big Data. It
  will cover:\n\n1. What is similar entity detection\n2. Where is the need
  for this\n3. Techniques for similar entity detection and their
  applicability\n4. Supervised\, unsupervised and continuous learning
  modes\n5. Use of semantic techniques\n6. Implementation challenges:
  handling large data\, handling a large number of comparisons\, and how to
  relate similar entities\n7. Sample results of our experiments\n\nThe
  above is the outline of what I intend to cover. There will be enough
  time for questions and answers\; however\, if you would like something
  more to be covered\, do post a comment and I will see how it can be
  incorporated.\n\n### Speaker bio\n\n- Arthi Venkataraman has more than
  16.5 years of experience in the design\, development and testing of
  projects in different domains\n- She is currently a Senior Architect in
  the Chief Technology Office of Wipro Technologies\n- Her current role
  involves solution development for different business problems spanning
  Big Data\, Machine Learning and Semantics Technologies\n- She has a B.E.
  degree in Computer Science from University Visvesvariah College of
  Engineering\, Bangalore\, and an MBA (PGDSM) from IIM Bangalore. She is
  also a PMP.\n- She has previously presented papers and spoken at other
  international conferences\nThis presentation is based on Arthi's
  experience in the area of similar entity identification.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 2 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/similar-entity-detecti
 on-in-large-data-QmnBAJ6ihJm9kWAudd3Zyp
BEGIN:VALARM
ACTION:display
DESCRIPTION:Similar entity detection in large data in Audi 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Interactive analysis of data live\, using Pandas\, Matplotlib and 
 IPython
DTSTART:20130713T091500Z
DTEND:20130713T100000Z
DTSTAMP:20260421T123255Z
UID:session/RvyHR3SmMUoJ6HP77mNeow@hasgeek.com
SEQUENCE:2
CATEGORIES:Analytics and Visualization,Beginner
CREATED:20190705T050622Z
DESCRIPTION:One of the data sets that is going to be used is the datas
 et parsed from usesthis.com: The hardware and software used by people to g
 et their work done. (permission for the same from the site owner has been 
 obtained.)\n\nThe audience is going to be a part of the whole process of p
 arsing it and converting it into numpy arrays whereupon it can be analysed
  to find various answers.\n\nAnother dataset would be the names of people 
 in the US social security database since 1880 with 3 million published nam
 e records.\n\n### Speaker bio\n\nThe speaker has been working on Python fo
 r years and SciPy tools have always interested him. Recently he took the t
 ime to dive into it and has been pleased with what he learnt so far\, whic
 h he can't wait to share!\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 1 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/interactive-analysis-o
 f-data-live-using-pandas-matplotlib-and-ipython-RvyHR3SmMUoJ6HP77mNeow
BEGIN:VALARM
ACTION:display
DESCRIPTION:Interactive analysis of data live\, using Pandas\, Matplotlib 
 and IPython in Audi 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Julia: A fresh approach to technical computing and data science
DTSTART:20130713T091500Z
DTEND:20130713T100000Z
DTSTAMP:20260421T123255Z
UID:session/YUPYKfyn3RfTsQeSpVgUKJ@hasgeek.com
SEQUENCE:2
CATEGORIES:Analytics and Visualization,Beginner
CREATED:20190705T050628Z
DESCRIPTION:Julia provides a sophisticated compiler\, distributed parallel
  execution\, numerical accuracy\, and an extensive mathematical function l
 ibrary. The library\, largely written in Julia itself\, also integrates ma
 ture\, best-of-breed C and Fortran libraries for linear algebra\, random n
 umber generation\, signal processing\, and string processing. Performance 
 of Julia programs is often within a factor of two of C programs and in man
 y cases as good as C. This obviates the need to write computationa
 l kernels in C or Fortran and leads to higher programmer productivity. The
  base Julia repository has received contributions from over 135 contribut
 ors. In addition\, the Julia developer community has contributed over 125 
 external packages through Julia’s built-in package manager.\n\n### Speak
 er bio\n\nI am one of the co-creators of the Julia programming language\, 
 along with Jeff Bezanson\, Stefan Karpinski\, and Alan Edelman.\n\nMy Link
 edIn profile is provided in the links below.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 2 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/julia-a-fresh-approach
 -to-technical-computing-and-data-science-YUPYKfyn3RfTsQeSpVgUKJ
BEGIN:VALARM
ACTION:display
DESCRIPTION:Julia: A fresh approach to technical computing and data scienc
 e in Audi 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:A Billion Snapshots- Principles and Processes in the Census of Ind
 ia
DTSTART:20130713T103000Z
DTEND:20130713T111500Z
DTSTAMP:20260421T123255Z
UID:session/T2FYoTbbKPpo75rHrBTx9x@hasgeek.com
SEQUENCE:2
CATEGORIES:Analytics and Visualization,Beginner
CREATED:20190705T050639Z
DESCRIPTION:The Census of India 2011 is unmatched in the world not only in
  the size of the population covered\, but also in the quantity of data acr
 oss various demographic dimensions collected per person. The session expla
 ins the basic principles of Census taking in India\, which ensure quality 
 results\; how those principles were applied in process and design decision
 s\, and how these decisions were implemented. It ends with a brief discuss
 ion of released Census results to date and their analysis and utility.\n\n
 ### Speaker bio\n\nVarsha Joshi is an IAS officer from the Union Territori
 es Cadre\, 1995 batch. She is presently Director Census Delhi as well as D
 irector in the office of the Registrar General of India. She has conducted
  the Census of India 2011 for the NCT of Delhi.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 1 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/a-billion-snapshots-pr
 inciples-and-processes-in-the-census-of-india-T2FYoTbbKPpo75rHrBTx9x
BEGIN:VALARM
ACTION:display
DESCRIPTION:A Billion Snapshots- Principles and Processes in the Census of
  India in Audi 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Unlocking the Potential of Data for Everyday Developers and Produc
 t Managers
DTSTART:20130713T103000Z
DTEND:20130713T111500Z
DTSTAMP:20260421T123255Z
UID:session/GGxt4f8PvASKHuQ5DWqUj8@hasgeek.com
SEQUENCE:2
CATEGORIES:Analytics and Visualization,Intermediate
CREATED:20190705T050648Z
DESCRIPTION:An introduction to instrumenting the application on both the c
 lient and server for better visibility into the application behaviour in p
 roduction environments\, and exploring the various tools that enable you t
 o build cheap/reliable monitoring into your application. \n\nThe talk will
  touch on some of the following areas: \n\n1. Why analyse everything? \n2. I
 ntroduction to Etsy's StatsD (https://github.com/etsy/statsd) and Graphite.
  \n3. Instrumenting Applications\, and what to Instrument and what not to 
 Instrument. \n4. Real time vs Delayed Analytics. \n5. Visualisation of the
  Collected Metrics\, and exploring a few opensource and commercial visuali
 sation solutions. \n6. Unlocking the Potential of Data. Use Cases and Succ
 ess Stories.\n\n### Speaker bio\n\nI'm the Performance Guy at PayPal India
 (eBay Inc.)\, building Performance as a key attribute into the Next Genera
 tion of PayPal's presence on the Web.\n\nIn my free time I dabble with Web
  Frameworks\, researching on the various things that go into making applic
 ations fast and responsive.\n\nI previously spoke at MetaRefresh 2012 on Bui
 lding High Performance Web Applications. \n\nI also occasionally blog at h
 ttp://karthik.kastury.in/\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 2 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/unlocking-the-potentia
 l-of-data-for-everyday-developers-and-product-managers-GGxt4f8PvASKHuQ5DWq
 Uj8
BEGIN:VALARM
ACTION:display
DESCRIPTION:Unlocking the Potential of Data for Everyday Developers and Pr
 oduct Managers in Audi 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Co-occurrence Analytics: A versatile framework for finding interes
 ting needles in crazy haystacks!
DTSTART:20130713T111500Z
DTEND:20130713T120000Z
DTSTAMP:20260421T123255Z
UID:session/5wHnv1cvvm6SDTFjr1mYMf@hasgeek.com
SEQUENCE:2
CATEGORIES:Analytics and Visualization,Advanced
CREATED:20190705T050709Z
DESCRIPTION:Most data around us can be thought of as "things co-occurring 
 with other things in certain contexts". Whether it is products co-occurrin
 g with other products in retail market baskets\, words occurring before or
  after other words in unstructured text\, tags co-occurring with other tag
 s in social tagging systems\, people co-occurring with other people in var
 ious social networking scenarios\, or objects occurring in various 2-D geo
 metrical juxtapositions of other objects in images\, etc. \n\nWhile there 
 have been silos of efforts in each research community - retail\, text\, so
 cial networking\, and vision\, etc. - in dealing with "their" data\, there
  has been no unifying framework to tame such a wide variety of co-occurren
 ce data systematically - a theme for this session.\n\nWe will present a si
 mple\, intuitive\, yet powerful co-occurrence analytics framework to dea
 l with a wide variety of data of the form "things co-occurring with other 
 things in some context". After describing the framework we will demonstrat
 e how to adapt and apply the core principles of the framework to a variety
  of large real-world datasets to find novel and actionable insights even i
 n the presence of significant noise in the data.\n\nWhat makes this approa
 ch attractive is that it is:\n\n(1) Unsupervised: No cost of getting label
 ed data. Just point it to the data and crunch.\n\n(2) Unbiased: No prior a
 ssumptions about data distributions\, etc.\n\n(3) High Precision: Generate
 s very high quality insights.\n\n(4) High Recall: Generates exhaustively m
 any insights.\n\n(5) Parameter Poor: Very few parameters to play with.\n\n
 (6) Scalable: Highly parallelizable in the MapReduce sense.\n\n### Speaker b
 io\n\nDr. Shailesh Kumar is a Member of Technical Staff at Google\, Hydera
 bad where he works on large scale data mining problems for various Google 
 products. Prior to joining Google\, he has worked as a Principal Dev. Mana
 ger at Microsoft (Bing) Hyderabad\, Sr. Scientist at Yahoo! Labs Bangalore
 \, and Principal Scientist at Fair Isaac Research in San Diego\, USA. \n\n
 \nDr. Kumar has over fifteen years of experience in applying and innovatin
 g machine learning\, statistical pattern recognition\, and data mining alg
 orithms to hard prediction problems in a wide variety of domains including
  information retrieval\, web analytics\, text mining\, computer vision\, r
 etail data mining\, risk and fraud analytics\, remote sensing\, and bioinf
 ormatics. He has published over 20 conference papers\, journal papers\, an
 d book chapters and holds over a dozen patents in these areas.\n\nHe has t
 wo keen passions - first creating "magic from data" and second understandi
 ng functionally how the brain works!\n \nDr. Kumar received his PhD in Com
 puter Engineering in 2000 (with a specialization in statistical pattern re
 cognition and data mining) and Masters in Computer Science in 1997 (with a
  specialization in artificial intelligence and machine learning)\, both fr
 om the University of Texas at Austin\, USA. He received his B.Tech. in Com
 puter Science and Engineering from the Institute of Technology\, Banaras H
 indu University in 1995.\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 2 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/co-occurrence-analytic
 s-a-versatile-framework-for-finding-interesting-needles-in-crazy-haystacks
 -5wHnv1cvvm6SDTFjr1mYMf
BEGIN:VALARM
ACTION:display
DESCRIPTION:Co-occurrence Analytics: A versatile framework for finding int
 eresting needles in crazy haystacks! in Audi 2 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Smart Analytics in Smartphones
DTSTART:20130713T111500Z
DTEND:20130713T120000Z
DTSTAMP:20260421T123255Z
UID:session/MBqyKdGo6qRDCFwSSeCd2Q@hasgeek.com
SEQUENCE:2
CATEGORIES:Analytics and Visualization,Intermediate
CREATED:20190705T050729Z
DESCRIPTION:Current smartphones have quad- or octa-core processors along
  with more than ten sensors. Using these sensors\, a large amount of data
  is collected and processed. This data explosion opens up the classical
  debate – how much data processing should be performed on the device vs.
  the server? In this talk\, I would like to discuss my thoughts and
  findings on this question
  along with other interesting questions on performing “Smart Analytics
 ”. Smart Analytics is a form of data mining and machine learning that ca
 n be done in Smartphones with specific use cases in mind. I will discuss m
 y findings with an example of performing activity recognition in smartphon
 es. Activity recognition uses smartphone’s accelerometer sensor values t
 o detect physical activity of the user e.g. running\, jogging\, walking\, 
 sitting\, climbing up/down. I will show the results on few datasets using 
 machine learning techniques.\n\n### Speaker bio\n\nCurrently I am working 
 as a Senior Data Scientist in Samsung Research India-Bangalore. I enjoy
  applying data mining and machine learning in real world applications. My
  detailed profile is available at LinkedIn. \nhttp://in.linkedin.com/pub/s
 atnam-singh-phd/2/349/347\n
GEO:12.9431582;77.5964488824009
LAST-MODIFIED:20230810T072606Z
LOCATION:Audi 1 - Nimhans Convention Centre\nIN
ORGANIZER;CN="The Fifth Elephant":MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/fifthelephant/2013/schedule/smart-analytics-in-sma
 rtphones-MBqyKdGo6qRDCFwSSeCd2Q
BEGIN:VALARM
ACTION:display
DESCRIPTION:Smart Analytics in Smartphones in Audi 1 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
END:VCALENDAR
