The Fifth Elephant winter edition 2019

Winter edition of India's most renowned conference on big data and data science

From an archived data field to GOJEK’s world class product feature for customer-experience

Submitted by Divya Choudhary (@divyachoudhary) on Friday, 5 October 2018


Technical level

Intermediate

Section

Full talk

Status

Submitted


Abstract

This talk focuses on how our team at GOJEK used machine learning and natural language processing on customer notes data, together with bookings data, to build a product feature that shows customers all the pickup gates near them, with names they can understand and choose from, while booking a car or ride service.

As at any other service company, customer experience while booking a service is of prime importance at GOJEK, a technology startup based in Jakarta, Indonesia that specialises in ride/car hailing along with 17+ other services. With an immense influx of data from more than 18 services, a data field that had earlier been archived turned out to be the data that most improved how customers book rides/cars on the GOJEK app. If you want to understand, through examples, the untapped value your data can add to your product through curated machine learning and natural language processing, this use case illustrates it well.

Divya Choudhary, the speaker, will cover this use case in detail: problem statement, solution outline, data processing, major algorithmic decisions and learnings, and the final output feature. The presentation will highlight how the product feature was built, focusing mainly on the two algorithmic pieces that made up the solution:

  • Machine learning clustering technique
      • DBSCAN vs K-means, how to know when to use what
  • Wonders of language modeling
      • Pre-processing of corpus is the key
      • Great potential of n-gram modeling
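To make the clustering item concrete, here is a minimal sketch of DBSCAN applied to pickup coordinates. It is not the implementation used at GOJEK, which this proposal does not describe. The sample points, the 30 m radius (`eps_m`) and the minimum of 3 points (`min_pts`) are assumptions chosen for illustration. The general reason DBSCAN is often preferred over K-means for this kind of data is that it does not need the number of clusters in advance and it labels isolated points as noise instead of forcing them into a cluster.

```python
import math

def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(h))

def dbscan(points, eps_m, min_pts):
    """Minimal DBSCAN. Returns one label per point; -1 marks noise."""
    NOISE, UNSEEN = -1, None
    labels = [UNSEEN] * len(points)

    def neighbours(i):
        # Brute-force radius query; includes the point itself.
        return [j for j in range(len(points))
                if haversine_m(points[i], points[j]) <= eps_m]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNSEEN:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            nearby = neighbours(j)
            if len(nearby) >= min_pts:  # j is a core point, keep expanding
                queue.extend(nearby)
    return labels

# Hypothetical pickup locations: two tight groups and one stray booking.
pickups = [
    (-6.22500, 106.80000), (-6.22505, 106.80004), (-6.22498, 106.79997),
    (-6.22700, 106.80300), (-6.22704, 106.80303), (-6.22697, 106.80298),
    (-6.23500, 106.81000),
]
labels = dbscan(pickups, eps_m=30, min_pts=3)
print(labels)
```

On this toy input the first three points form one cluster, the next three form a second, and the last point is marked as noise. K-means with k=2 would instead assign the stray point to one of the two clusters and pull that cluster's centre towards it.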

Key takeaways from the talk:

  • understanding how much choosing the correct machine learning algorithm for the data matters in driving business results
  • understanding that no data is waste: all you need is the correct algorithm to make the best use of it
  • understanding how to build a product feature using textual and geospatial data
  • seeing how this product feature, which is now live, was built in practice using machine learning and NLP
  • understanding the impact a data solution like this can have on a logistics service, with a live example of the metrics it improved at GOJEK
  • understanding possible applications of a similar solution in other scenarios across the logistics industry
  • a greater understanding of clustering geospatial data
  • understanding a practical use case of the n-gram language model

Outline

Who would have imagined that a random chat message or note, written in a local language and sent by customers to their drivers while waiting for a ride/car to arrive, could be used to extract information about pickup points and their names, some of which even Google Maps does not know, and finally to help create a world-class customer pickup experience feature?
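As a rough illustration of how customer notes might yield a pickup point name, the sketch below pre-processes a handful of notes and counts word n-grams, taking the most frequent phrase as the candidate name. This is an assumption about the general approach, not the pipeline described in the talk. The notes, the tiny `STOPWORDS` set, and the function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for Indonesian filler words.
STOPWORDS = {"saya", "di", "ya", "pak", "tunggu", "depan"}

def tokens(note):
    """Lowercase the note, keep alphanumeric words, drop stopwords."""
    words = re.findall(r"[a-z0-9]+", note.lower())
    return [w for w in words if w not in STOPWORDS]

def ngrams(words, n):
    """Return all runs of n consecutive words, joined by spaces."""
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def top_phrase(notes, n=2):
    """Most frequent n-gram across the notes, or None if there is none."""
    counts = Counter()
    for note in notes:
        counts.update(ngrams(tokens(note), n))
    return counts.most_common(1)[0][0] if counts else None

# Hypothetical notes sent from bookings that fall in one location cluster.
notes = [
    "Saya di lobby utara ya",
    "tunggu di Lobby Utara, pak",
    "lobby utara",
    "di depan pintu timur",
]
name = top_phrase(notes)
print(name)
```

Here "lobby utara" appears as a bigram in three of the four notes, so it is returned as the candidate name. Note that dropping stopwords before forming n-grams can join words that were not adjacent in the original note, which is one reason corpus pre-processing choices matter.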

Speaker bio

A computer science engineer turned decision scientist turned data scientist. I have ~4 years of experience.

Having worked closely with the boards of directors of 3 startups in India and Indonesia, I am known for my business understanding, my problem-solving approach and for driving data science problems through to final execution.

Personal :) A yoga lover, a poetess, a painter, an avid trekker and wanderer who is best at talking to people and learning about them.

Links

Slides

https://docs.google.com/presentation/d/12Q9f5V-SXILQepBPo2ktZJYDvv2nDEFO57kL83L_3Cw/edit?usp=sharing

Preview video

https://drive.google.com/file/d/1qLhG1VHsaMiXMCbh1JoEQvTjwAA3t0kB/view?usp=sharing
