The Fifth Elephant 2017
On data engineering and application of ML in diverse domains
Jul 2017
27 Thu 08:15 AM – 10:00 PM IST
28 Fri 08:15 AM – 06:25 PM IST
Streaming for life, universe and everything using Confluent Platform
When Kafka arrived, it made streaming and our lives a lot easier. But there were still gaps to fill: how to validate the schema of incoming events, how to stream data from languages other than Java while keeping the streaming setup central, whether we can use Kafka to stream to tables and vice versa, and more. Confluent Platform (CP) is a one-stop centre for all our streaming needs. It is built on top of K…
Section: Crisp talk for data engineering track
Technical level: Intermediate
|
Blockchain for business and government
The talk will focus on how blockchain technology has matured to enable trust-based systems and is hence apt for implementation in businesses and government programs. We will also cover the early adopters of blockchain, with use cases and industry solutions. Finally, we will discuss how open standards are key to blockchain adoption in enterprises and government.
Section: Crisp talk for Data in Government track
Technical level: Beginner
|
How Paytm uses k8s for global expansion
At Paytm, we are constantly engaged in creating new environments and aligning infrastructure for standard services such as Authentication, Access, Logging/Monitoring etc. There is also the case of dynamic resource allocation, high-availability, scalability, security - then factor in ‘x’ number of environments and you have a fairly complex problem to solve. This is especially the case for big data…
Section: Full talk for data engineering track
Technical level: Intermediate
|
From a recommendations carousel to personalizing entire app - personalization story at paytm
At Paytm we value user experience, and we want to pre-emptively show users the types of products they would want to buy. In this talk, we will walk our audience through how we personalize every pixel on our app: how we use deep learning on tens of terabytes of data every day to sort long-tail merchandise, and how we use an ensemble of several models to generate every recommendation. We will shar…
Section: Full talk in Payment Analytics track
Technical level: Advanced
|
How to engineer a personalization system that can handle Paytm scale
When we say we value customer experience, we mean it! When you have to personalize every pixel on the app, your standard caching techniques go out of the window and you need a very fast and scalable system that can generate content for users in unnoticeable time. In this talk we will share how we built our real-time personalization engine, which evaluates and serves over 10 billion recommended produ…
Section: Full talk for data engineering track
Technical level: Advanced
|
Machine Learning Applications in Cisco Spark Collaboration SaaS
A use-case-driven technical overview of the applications of machine learning in the Cisco Spark Collaboration SaaS offering, including Webex (see http://www.ciscospark.com).
Section: Crisp talk for data engineering track
Technical level: Intermediate
|
micro-ATMs: The what, the why and the how
The aftermath of demonetization has led to a scramble for digital or cashless payments. Enter the era of the micro-ATM, the superhero of payment devices and a solution to the dearth of ATM networks in India, where all you need to transact is your fingerprint. This session will detail the journey of deploying mATMs across India and the leverage provided by deriving a data-driven strategy to do so.
Section: Full talk in Payment Analytics track
Technical level: Intermediate
|
Credit where Credit is due: Using data science to lend to customers without a credit history
Traditional loans are based on banking history, leaving a large segment of people ineligible. These people, however, represent a largely untapped segment with large purchasing potential. How do you judge whether someone is trustworthy when you have no information to base your decision on? This session will detail methods of evaluating people and extending loans irrespective through leveraging technology …
Section: Crisp talk for data engineering track
Technical level: Intermediate
|
ML For Personalization At Scale @ Nearbuy
I will explain how we use ML to give personalized recommendations to customers, and how we set up our big data pipeline using Kafka, Spark and HBase. I will cover the amount of data we process daily, how we handle anomalies, and our learning track. I will also discuss the various ML algorithms we use and how to use them in Spark. Understanding of Collabora…
Section: Full talk for data engineering track
Technical level: Advanced
|
Working with Apache Spark in Eta
Eta is a high-level, purely functional programming language and also the newest member of the JVM world. It has been gaining traction as an alternative to Scala for solving Big Data problems. In this talk, I would like to discuss why Eta is ideal for writing Apache Spark jobs by considering the following aspects:
Section: Full talk for data engineering track
Technical level: Intermediate
|
Big Data Computations: Comparing Apache HAWQ, Druid, Google Spanner and GPU Databases
We needed to build a class of big data computations known as the distributed merge tree to aggregate user information across multiple data sources in the media domain. This class is characterized by non-scalar aggregates all the way to the root of the merge tree, the equivalent of a set union operation in SQL at every level of the tree. Typical big data technologies were mostly supporting only sca…
Section: Full talk for data engineering track
Technical level: Intermediate
|
Distributed Consensus and Data Safety: NewSQL Perspective
We explore data safety issues in designing large distributed systems. Though data safety issues have been addressed in traditional complex software systems such as aircraft engineering systems, ensuring data safety in distributed systems is a complex and arduous task. The complexity is due to the necessity of ensuring the safety of various data such as configuration data, state changes at individual nodes,…
Section: Full talk for data engineering track
Technical level: Intermediate
|
Learning representations of text for NLP
Think of your favorite NLP application you wish to build - sentiment analysis, named entity recognition, machine translation, information extraction, summarization, recommender system. A key step in building it is using the right technique to represent the text in a form that a machine can understand. In this workshop, we will focus on the key concepts, maths, and code behind state-of-the-art tec…
Section: Workshops
Technical level: Intermediate
|
Application of AI in e-commerce industry from product search to customer satisfaction
Artificial Intelligence (AI) was introduced to create “thinking machines” capable of mimicking, learning and replacing human intelligence. Over the last 20 years, AI has shown great promise in improving human decision-making processes and the resulting productivity.
Section: Crisp talk for data engineering track
Technical level: Intermediate
|
Large scale business stats aggregation using Kafka
At Indix we collect and process a lot of data. We monitor the correct behaviour of our system by collecting business metrics. Over time, we moved most of our system from batch MapReduce jobs to Kafka stream tasks, so the stats had to become more real-time too. We therefore built a system called Abel, which aggregates the millions of events it receives and collects stats for them.
Technical level: Intermediate
|
Application of machine learning in oil and gas industry
This talk describes the various machine learning algorithms used in the public SEG (Society of Exploration Geophysicists) challenge held in December 2016 to identify lithofacies based on well log measurements. Lithofacies are the different rock layers encountered during drilling, which are used to characterize the sub-surface. Correct classification of lithological facies helps in identifying targ…
Section: Crisp talk for data engineering track
Technical level: Beginner
|
Optimising Model performance using automated ML pipeline for predicting purchase propensity @ Fractal Analytics
Ensemble learning is the process by which multiple machine-learning models are evaluated and combined to build a model that provides better results. Building these models requires experimenting not just with multiple machine-learning models, but also with the various model parameters that help build good individual models.
Section: Full talk for data engineering track
Technical level: Advanced
|
Autonomous Grid using Machine Learning
In this talk we take a deep dive into how we are assisting energy utilities using IoT and Machine Learning to build the next generation of autonomous grid. The potential impact of applying Machine Learning, IoT and IIoT is estimated at 2-4% of annual revenue, 3-5% of annual accounts receivable, and a cost improvement of 4-8% per campaign against their consumers. The topics include application of Machine Learning…
Section: Full talk for data engineering track
Technical level: Intermediate
|
Machine Learning from Practice to Production
With AI research and machine learning systems growing at great speed, companies require significant effort to keep up or risk losing their relevance in this brave new world. The new tide also brings with it numerous tools to tackle previously intractable problems. However, there does seem to exist a gulf between appreciating these developments and subsequently deploying them. Despite the global p…
Section: Full talk for data engineering track
Technical level: Beginner
|
Discovery tools for Government data analytics
This talk will focus on data discovery tools such as Tableau and QlikView in the context of Government data. Invariably, Government data is complex, and most of the effort goes into getting and using this data. This session will focus on the challenges encountered while analyzing Government data and how to address these challenges based on my experiences working with various Government d…
Section: Crisp talk for Data in Government track
Technical level: Intermediate
|
Suuchi - Toolkit to build distributed systems
At Indix, we have a bunch of services that need to operate on top of a large volume of product data. We started out using open source distributed systems (like Hadoop, HBase, Solr, Spark, etc.) to build some of our solutions. Along the way, we’ve also had problems where existing solutions wouldn’t really work for our requirements and the operational cost associated with them started to shoot up. Th…
Section: Full talk for data engineering track
Technical level: Intermediate
|
Search Infrastructure @ Slack using Lambda Architecture
Slack is a collaboration tool for teams. We’re on a mission to make your working life simpler, more pleasant, and more productive. Search is the core feature of Slack’s offerings, as Slack itself is an acronym for “Searchable Log of all conversation & knowledge”. At Slack, we experiment frequently with various machine learning models to improve the search experience, so rebuilding search indexes is crit…
Section: Full talk for data engineering track
Technical level: Intermediate
|
A Recommender for Match-making: Item-based CF, PageRank, Evaluation techniques & Deep-Learning
Online match-making has a lot of challenges where Machine-Learning can help. When we look at a profile, what is it that makes us swipe right or left? Is there something about a profile that attracts us, and if so, what can a person’s historical interactions say about their preferences? I believe the contents would resonate with the audience quite well and help them appreciate the challenges of doing…
Section: Full talk for data engineering track
Technical level: Advanced
|
The Python ecosystem for data science - Landscape Overview
In their day-to-day jobs, data science teams and data scientists face challenges in many overlapping yet distinct areas such as Reporting, Data Processing & Storage, Scientific Computing, ML Modelling, and Application Development. To succeed, data science teams, especially small ones, need a deep appreciation of how their success depends on these areas.
Section: Full talk for data engineering track
Technical level: Beginner
|
Using data pipelines to navigate your data ocean
One of the main challenges facing companies adopting a data-driven, analytics-based approach to their business is how to scale the development and adoption of data products throughout the company. In our experience, managed data pipelines are one approach that has emerged to address these challenges. This talk will introduce data pipelines, and illustrate how the challenges are addressed. The talk w…
Section: Full talk for data engineering track
Technical level: Beginner
|
Fraud Detection & Risk Management in Payment Systems implemented using a Hybrid Memory Database
In this talk, we will describe key real-time use cases in the areas of fraud detection, risk management and revenue assurance for payment systems and other related systems. We will then present a brief overview of a database platform that has proven to be well suited for handling such use cases.
Section: Full talk in Payment Analytics track
Technical level: Intermediate
|
Dr. Elephant: Achieving Quicker, Easier, and Cost-effective Big Data Analytics
Open Source: https://github.com/linkedin/dr-elephant
Section: Crisp talk for Data in Government track
Technical level: Intermediate
|
Designing Machine Learning Pipelines for Mining Transactional SMS Messages
Much of data science involves using data for some practical, business purpose. The data usually needs to be cleaned and processed and that might take a while, but it is generally close to where it needs to be. It can be incredibly exciting and engaging to work at one level back, where data is far from where it needs to be. At this level real work has to be done to transform data into a form ready…
Section: Full talk for data engineering track
Technical level: Intermediate
|
Causal Analytics in Retail and Telco
In this talk, I will discuss causal analytics using machine learning in the retail and telco domains. This talk should provide a brief overview of the value machine learning can provide in these domains along with the associated challenges and opportunities.
Section: Crisp talk for data engineering track
Technical level: Intermediate
|
Real-time Monitoring of Big Data Workflows
Do you want to know the real-time status of your big data job? Not sure of how to collect all the metrics from these jobs and make sense out of them? Want to track and monitor the metrics in real time? Want to track the historical performance of your job? Want to build business reporting dashboards?
Section: Full talk for data engineering track
Technical level: Intermediate
|
Making data scientists' life easy with Docker
The life of a data scientist is hard: they have to worry not only about algorithms and analysis but also about the environment and dependencies they must build to get there in the first place. Collaboration, deployment and scaling are also hard for them. Introducing Docker into the data science workflow can significantly reduce these issues. While Docker has…
Section: Full talk for data engineering track
Technical level: Intermediate
|
Beyond unit tests: Deployment and testing for Hadoop/Spark workflows
As a Hadoop developer, do you want to quickly develop your Hadoop/Spark workflows? Do you want to test your workflows in a sandboxed environment similar to production? Do you want to write unit tests for your workflows and add assertions on top of them?
Section: Full talk for data engineering track
Technical level: Intermediate
|
Out of Stone age : Why investing in developer tools is necessary for big data development to scale.
Do you wish Hadoop development was as easy as any other application development? Do you wish we had comprehensive tools that are well-integrated with each other for Hadoop development? At LinkedIn, we have 1000s of nodes spread across multiple clusters. We have 1000s of active users who use the cluster on an ongoing basis and 100s of flows that run on a regular schedule powering the data to ou…
Section: Full talk for data engineering track
Technical level: Intermediate
|
Transforming India's Budgets into Open Linked Data
Indian Budget documents across various tiers of government consist of detailed information on allocations made and resources raised in a financial year. Unfortunately, these documents are published as unstructured PDFs, which makes it difficult for researchers, economists and the general public to analyse and use this crucial data. This session will delve into our journey of developing OpenBudgetsInd…
Section: Full talk for Data in Government track
Technical level: Intermediate
|
Human Centric API Design
In the last decade, with the advent of big data technologies, the amount of data produced and processed has been increasing exponentially. This data is meaningless if the insights from it are not exposed in the right manner.
Section: Crisp talk for data engineering track
Technical level: Beginner
|
How we are building serverless architectures for Deep Learning & NLP at Episource
Serverless is the new kid on the block, and an exciting one at that! As Anand Chitipothu puts it, it’s rapidly becoming the Uber of cloud computing resources.
Section: Crisp talk for data engineering track
Technical level: Intermediate
|
Processing mission critical events in real time
If you have an event-driven, mission-critical application, you are always worried about it failing and leading to opportunity or revenue loss. For a data-based adtech company like Zapr Media Labs, one such application deducts costs in real time for displayed advertisements and stops displaying them when daily or hourly caps are reached. Such applications have challenges of scalability…
Section: Crisp talk for data engineering track
Technical level: Intermediate
|
Designing Cost Effective Cloud Native Applications
Designing applications for a cloud environment requires thinking about design in a different paradigm. In this talk, I will be discussing design principles, taking examples of applications that we have developed at Zapr Media Labs. How to make applications idempotent, immutable, stateless, resilient and elastic will be the core of the talk. I will also discuss how this design helps us save costs by leve…
Section: Crisp talk for data engineering track
Technical level: Intermediate
|
Adapting Bandit Algorithms to optimise user experience at Practo Consult
The art of trading off between exploiting the best arm and exploring for further knowledge of other arms has long been studied as Bandit Algorithms in fields such as clinical trials and financial portfolio design. Recently, in website optimization, these algorithms have been used for optimizing click-through rates and performing A/B testing. However, these algorithms have the potential to …
Section: Crisp talk for data engineering track
Technical level: Intermediate
|
Seamless Hadoop Deployments - Myth or Reality?
Continuous deployment of Hadoop workflows is by and large a distant dream for every Hadoop engineer. Reducing wastage of compute resources, improving developer productivity, eliminating costly bugs and avoiding data corruption are basic goals for every deployment. Yet these goals are often not achieved due to a lack of comprehensive test coverage and standard best practices. This in turn res…
Section: Crisp talk for data engineering track
Technical level: Beginner
|
Learnings from building TV viewership platform for 100 Million users at zapr
Zapr Media Labs has come a long way, from tracking the TV viewership of around 5 million users two years ago to around 100 million users currently. We want to share our learnings from building a complex audio-signal-processing-based platform that has gone through this sort of hyper growth, which involves processing more than a billion signals per day, producing terabytes of raw organic data and processi…
Section: Full talk for data engineering track
Technical level: Intermediate
|
Gabbar: Machine learning to guard OpenStreetMap
OpenStreetMap is the largest free and open map of the world! An average of 2 million features are touched by volunteers around the world every single day. Amazing, isn’t it? The global scale and the local diversity bring in a host of challenges for maintaining a high quality of data on OpenStreetMap.
Section: Full talk for data engineering track
Technical level: Intermediate
|
Interestingness of interestingness measures
Analysis of relationships between entities is at the heart of data mining problems. Many metrics are used for association mining, such as support, confidence, lift and mutual information. However, many of these measures provide conflicting results about the interestingness of an association. It therefore becomes very important to understand how to evaluate metrics for an application.
Section: Full talk for data engineering track
Technical level: Advanced
|
Wait, I can explain this! (ML models explaining their predictions)
Today ML/AI is being used in mission-critical applications. However, it is still difficult for a human being to trust a black-box ML algorithm. Wouldn’t it be cool if an algorithm could also explain why it had predicted a particular result and thereby strengthen its voice? That is exactly what this talk is about. We will walk you through how we implemented a model explainer for ZOHO’s ML suite…
Section: Crisp talk for data engineering track
Technical level: Intermediate
|
Plumbing data science pipelines
Data: there is a lot of it. But organizing it can be challenging, and analysis/consumption cannot begin until data is aggregated and massaged into compatible formats. These challenges grow more difficult as your dataset increases and as your needs approach the fabled “real time” status. Here, we’ll talk about how Python can be leveraged to collect data that is organized from many sources, stand…
Section: Crisp talk for data engineering track
Technical level: Intermediate
|
What database? - a practical guide to selection from NoSQL, SQL and Polyglot data stores
In system building, data store choices affect system scalability more often than language platforms. Frequently it is also the single most constrained resource in the application stack. While most database vendors will want you to believe their solution is the panacea for database scalability problems, it only leaves a developer confused among the plethora of SQL and NoSQL databases. This talk wi…
Section: Full talk for data engineering track
Technical level: Intermediate
|
Scalability truths and serverless architectures - why it is harder with stateful, data-driven systems
Building scalable systems is not easy. It is not as simple as deploying on a cloud and expecting it to scale along with the cloud’s elasticity. Many systems and solutions that claim elasticity of scale often indirectly limit their claims to stateless services. Serverless architecture is a recent addition to the developer programming/deployment toolset that offers the convenience of zero server dep…
Section: Full talk for data engineering track
Technical level: Intermediate
|
Developing and Deploying Analytics for Internet of Things (IoT)
The combination of smart connected devices with data analytics and machine learning is enabling a wide range of applications, from home-grown traffic monitors to sophisticated predictive maintenance systems and futuristic consumer products. While the potential of the Internet of Things (IoT) is virtually limitless, designing IoT systems can seem daunting, requiring a complex web infrastructure an…
Section: Sponsored session
Technical level: Intermediate
|
How to read a user's mind? Designing algorithms for contextual recommendations
The human mind goes through thousands of thoughts every day. A perfect recommender system needs to know what is going on and suggest something useful - at all times, without being perceived as intrusive or noisy. After slicing every possible sensor within the reach of a digital system - from the GPS, Accelerometer, Time of day, Temperature, Browsing History, TV Viewing, Sound, a “perfect recom…
Section: Crisp talk for data engineering track
Technical level: Beginner
|
Do you know what's on TV?
The mobile has made tremendous progress, but it is still referred to as the “second screen” to the Television. Television (specifically Linear TV) will continue to be the most efficient way to get high quality content to millions of homes. Even though all the devices around us have gotten smarter, people still watch TV by memorizing channel numbers and moving between painful guides. At the root of this …
Section: Full talk for data engineering track
Technical level: Intermediate
|
What explains our marks?
The NCERT put together a large-scale survey called the National Achievement Survey. This captured student performance across 4 subjects via 100 questions each, and the demographics and behaviour of students, teachers and schools through 300 more questions.
Section: Crisp talk for Data in Government track
Technical level: Beginner
|
Lessons learned from building a globally distributed database service from the ground up
Dharma and his team have spent the past 7 years building Azure Cosmos DB (http://cosmosdb.com), a massively scalable, multi-tenant, globally distributed database service, from the ground up. The system they have built is currently operating across more than thirty-four geographical regions, managing hundreds of petabytes of indexed data, and serving 100s of trillions of requests every day…
Section: Full talk for data engineering track
Technical level: Intermediate
|
Apache Atlas Introduction: Need for Governance and Metadata management
Apache Atlas is the one-stop solution for data governance and metadata management on enterprise Hadoop clusters. Atlas has a scalable and extensible architecture which can plug into many Hadoop components to manage their metadata in a central repository. Vimal Sharma will review the challenges associated with managing large datasets on Hadoop clusters and demonstrate how Atlas solves the problem.…
Section: Full talk for data engineering track
Technical level: Intermediate
|
How to prepare your language for Machine Learning and NLP with an open audio documentation toolkit
Pronunciation libraries are key to building machine learning tools and to much Natural Language Processing research and product development. In the age of personal assistant apps, human voice-based apps can help people with visual disability, and everyone else, access information and contribute back to the knowledge commons. There is a need for a range of native-language-based solutions, from talkin…
Section: Full talk for Data in Government track
Technical level: Intermediate
|
Machine Learning as a Service
You code, you test, you ship and you maintain. This workshop addresses one of the most common pain points we have come across with data scientists at many organizations: last-mile delivery of data science applications, that is, moving data science solutions to production.
Section: Workshops
Technical level: Beginner
|
Building a Generic but highly customizable and scalable Anomaly Detection System @ Badoo
Badoo is a data-driven company with 340 million users across 190 countries. It provides a number of apps and white-label services across multiple platforms. Badoo crunches through around 23 billion events per day, with 600 different types of events. Automated tracking of a large number of events, and reporting of observations which do not conform to an expected pattern, is an essential part of our data dr…
Section: Full talk for data engineering track
Technical level: Intermediate
|
Reality of Data Modelling: Many analysts, one dataset: Multiple Results
There is a study that gave the same data set to many teams competent to analyse it and asked them all the same question: “whether soccer referees are more likely to give red cards to dark skin toned players than light skin toned players”: http://home.uchicago.edu/~npope/crowdsourcing_paper.pdf
Section: Full talk for data engineering track
Technical level: Intermediate
|
Saving taxes without breaking laws using Machine Learning
Novel use cases for machine learning in the taxation and accounting areas. These are particularly important given the push towards GST and digitization of taxes in India.
Section: Full talk in Payment Analytics track
Technical level: Beginner
|
Talk Less, Chat More
Conversational interfaces are the new channels coming up for business. These channels are new for both users and businesses. For a business it’s a new kind of user behaviour they have to understand! This new behaviour generates a completely new kind of data.
Section: Full talk for data engineering track
Technical level: Beginner
|
How to build scalable and robust data pipeline iteratively.
I will drill down into how startups can build a scalable data pipeline using open source tools. What do all these tools do and how do they fit into the ecosystem? And how do you iteratively build a scalable and robust data engineering pipeline as you grow as a company?
Section: Full talk for data engineering track
Technical level: Intermediate
|
Interactive Realtime Dashboards on Data Streams using Kafka, Druid and Superset
To achieve a smooth user experience when interacting with analytics dashboards, the two key requirements are quick response time and data freshness. Organizations creating fast interactive BI dashboards over streaming data often struggle to select a proper serving layer that meets these requirements.
Section: Full talk for data engineering track
Technical level: Intermediate
|
Unlock sub-second SQL analytics over terabytes of data with Hive and Druid
Druid is an open-source analytics data store designed for business intelligence OLAP queries on timeseries data. Druid provides low latency real-time data ingestion, flexible data exploration and fast data aggregation. Many organizations have deployed Druid to analyze ad-tech, dev-ops, network traffic, website traffic, finance, sensor and IoT data.
Section: Full talk for data engineering track
Technical level: Beginner
|
Modeling intent of the user using Probabilistic Machine Learning
Understanding the user’s intent can help the product team dramatically improve the user’s experience. Be it adding the right products to a shopping cart, stocks to the portfolio or packages to a software stack, the user’s intent drives the choices and products added. When designing recommendation systems, modelling intent is non-trivial. The intent behind the action is hidden. This talk is about …
Section: Full talk for data engineering track
Technical level: Intermediate
|
Unless you measure it; you can’t improve it - Data pipelines for your business KPIs and KRAs
Any business can gain an unfair advantage through actionable insights using data pipelines and some common sense. We’re already experiencing this through our interactions online (Amazon, medium.com) and through mobile apps (Uber, Ola and many more).
Section: Workshops
Technical level: Intermediate
|
Lessons Learnt building and optimizing a self service Data Platform on Apache Spark at Indix
In this talk I will describe how we used Apache Spark to build a self-service data platform at Indix that helped democratise access to several of our datasets for our customers and for our internal engineering and data science teams. I will also share some of the lessons learnt while optimizing performance and tuning the Spark jobs that run on these datasets.
Section: Full talk for data engineering track
Technical level: Intermediate
|
Recommendation Engine for Wide TransactionsMany applications we use today are powered by cloud and mobile. One of the critical components that drives engagement for the platforms on cloud is the recommendation engine. Recommendation systems are becoming all-pervasive. The transactions/interactions we have with the platform decide the next set of recommended items. As both users and the number of products offered on the platform scale, we … more
Section: Full talk for data engineering track
Technical level: Beginner
|
Near Real time indexing/search in E-commerce marketplace : Approaches and LearningsKey takeaways of the talk: 0. Demystifying Lucene, showing an inside view of it and how to extend its core components. more
Section: Full talk for data engineering track
Technical level: Intermediate
|
Augmenting Solr’s NLP Capabilities with Deep-Learning Features to Match ImagesMatching images with human-like accuracy is typically extremely expensive. A lot of GPU resources and training data are required for the deep-learning model to perform image-matching. While GPUs are something that most companies can afford, training data is hard to obtain. more
Section: Crisp talk for data engineering track
Technical level: Intermediate
|
Using Probabilistic Data Structures to Build Real-Time Monitoring DashboardsPerforming basic operations like finding an element in a set or calculating its cardinality for a few thousand data points is child’s play. However, it becomes complex and prohibitively expensive as the data-set grows into the millions and covers multiple dimensions. more
Section: Crisp talk for data engineering track
Technical level: Beginner
|
Data in drug discoveryData is being used to solve some of the greatest challenges in medicine today. Advances in technology mean that scientists have access to data that was impossible to acquire just 5 years ago. Modeling and analysis are driving improved understanding about how our bodies work. This in turn is helping scientists find cures for deadly diseases. Curing diseases now requires combined efforts of data sc… more
Section: Full talk for data engineering track
Technical level: Beginner
|
Leonardo Machine Learning Foundation - Adding Intelligence to your Enterprise BusinessMachine learning and the larger world of artificial intelligence (AI) are no longer martial arts. As a new breed of software that is able to learn without being explicitly programmed, machine learning (deep learning and supervised learning) can access, analyse, and find patterns in, Big Data in a way that is beyond human capabilities. We all know that the world is moving to a more data driven dec… more
Section: Crisp talk for data engineering track
Technical level: Beginner
|
Application Dependency Data Performance Mapping tool - DynatraceMore companies today are adopting cloud services and related technologies like microservices architecture and containerization to build and deliver digital services faster and achieve greater IT agility. Monitoring and managing the performance of these dynamic application environments spanning the cloud and other third-party services is difficult, however, without the right tools. Leveraging an a… more
Section: Crisp talk for data engineering track
Technical level: Beginner
|
How Machine Learning Algorithms evolved at Haptik while its Chatbot catered to 200 million messagesEvolution of automated messaging, which started in 1966 with the first Chatbot, ELIZA, has now reached a stage where Chatbots have found their application in several industry domains like personal assistance, customer care, banking, e-commerce, healthcare, etc. With early experiments showing positive results, we have reached a stage where chatbots are no longer merely an application to play around w… more
Section: Full talk for data engineering track
Technical level: Intermediate
|
ML Goes FruitfulIndustry is demanding real-time interactions, automation[2] and decision making. The latest trends like machine learning, Internet of Things, Artificial Intelligence, Virtual Reality, Digitization and Blockchain are booming in the market and can be leveraged to meet market demand. The best customer experience is key, and it can be achieved by minimizing defects in the product. Food processi… more
Section: Workshops
Technical level: Beginner
|
Making sense of Digital and Physical Documents using ML and Optical Character RecognitionHave you ever wondered what you could do with the piece of paper you are handed when you make a purchase at your local grocery store, fill your car’s tank, see a doctor when you are ill, go to a loan provider for a quick loan, and much more? more
Section: Full talk for data engineering track
Technical level: Intermediate
|
Building camera based intelligent applicationsCamera based intelligent applications are a lot of fun! They have many practical uses, such as Industrial Counters, Real Time Object Tracking, Object Classification, Road Traffic Estimation etc. While they are fun and interesting, building them is not that trivial. Generally, building camera based intelligent applications requires many modules in the pipeline and a data scientist may not … more
Section: Crisp talk for data engineering track
Technical level: Intermediate
|
Distributed Machine Learning - Challenges and OpportunitiesTraditional machine learning libraries like scikit-learn in Python are written to work on a single computer. While that is good enough for small datasets, training ML models on large datasets often takes a very long time. more
Section: Crisp talk for data engineering track
Technical level: Intermediate
|
Multi-channel conversational chatbot platform powered by NLP engineIn this talk, the speaker will talk about a chat engine/platform that enables human-to-machine interaction on multiple channels (web, Slack, HipChat, etc.), including social channels like Facebook, across text and voice. A user can seamlessly move across channels without losing the chat context and the conversation. This talk will also give insights into the NLP engine that powers this platform. more
Section: Crisp talk for data engineering track
Technical level: Beginner
|
Zero down time ML model swap using docker and kubernetesAt Gojek, we needed to improve the allocation of drivers to customers. Driver behaviour differs across regions. Models went stale depending on festivals and the influx of new drivers to the system. A safe environment for the data science team to play with the models was also lacking. more
Section: Full talk for data engineering track
Technical level: Beginner
|
Gen Z BI Paradigm - A Scalable, hybrid and collaborative Visualization Architecture using Spark, NoSQL and RESTful APIThe Business Intelligence (BI) landscape is constantly in a state of flux – there is a need for constant growth in order to cope with the exponential changes in the data and analytics space. In today’s world, everything is measured, and everything is interconnected. This has triggered our common goal to collate varied sources of information in different formats and make it available anywhere, any… more
Section: Crisp talk for data engineering track
Technical level: Intermediate
|
Democratising Data in the Microservices WorldIn the new world of microservices, every service lives independently with its own databases. But they still need data from other microservices to function. It becomes harder and harder to run any kind of analytics or data science on all this fragmented data. In this democratic, decentralized world how do you empower microservices teams to build their own data pipelines? How do you enab… more
Section: Full talk for data engineering track
Technical level: Intermediate
|
Maps ❤️ Data: A voyage across the world of geo-visualizationA talk on visualizing data with maps, with an aim to answer the following questions: more
Section: Full talk for data engineering track
Technical level: Intermediate
|
Building a converged platform for data analyticsThis talk will explain the approaches one must take to build a converged platform for data analytics. We at IQLECT have built a real-time analytics platform and would like to share the experience. This also helps answer an important question: build or buy? more
Section: Crisp talk for data engineering track
Technical level: Advanced
|
Interactive Data Visualisation using Markdown“A picture is worth a thousand words. An interface is worth a thousand pictures.” — Ben Shneiderman more
Section: Full talk for data engineering track
Technical level: Beginner
|
Open data in government: challenges, and the case of Telangana Open Data InitiativeThis talk will cover: The challenges involved in opening up government data. more
Section: Full talk for Data in Government track
Technical level: Beginner
|
5 Lessons I’ve Learned Tackling Product Matching for E-commerceProduct matching is the challenge of examining two different representations of retail products (think items that you see on e-commerce websites) and determining whether they both refer to the same product. Tackling this problem requires a mix of NLP (to deal with text data), computer vision (to deal with product images), ontology management and more (to ingest a host of other signals on offer). more
Section: Full talk for data engineering track
Technical level: Intermediate
|
How We Built Our Machine Intelligence To Help Humans Save Lives7.2 million people die of heart disease every year. 50% of these lives can be saved if heart attacks can be diagnosed quickly and treatment coordinated within the golden hour. Diagnosing heart disease requires a simple test called an ECG; unfortunately, interpreting the ECG accurately requires a specialist. But how do we put the skills of a cardiologist in every corner of the globe? How do we e… more
Section: Full talk for Data in Government track
Technical level: Beginner
|
Bits and joules: data-driven energy systemsThe electricity industry is going through a paradigm shift by moving from centralised generation to distributed energy resources. This talk will give an overview of this shift, discuss how data-driven energy systems are powering this shift, and illustrate the approach through a specific use case of solar plant management. I will also provide some pointers for exploring the space. more
Section: Full talk for Data in Government track
Technical level: Beginner
|