The Fifth Elephant 2019

The eighth edition of India's best data conference


Using ML for Personalizing Food Search at Go-jek

Submitted by Maulik Soneji (@mauliks) on Sunday, 13 January 2019

Preview video

Session type: Full talk of 40 mins


GoFood, the food delivery product of Gojek, is one of the largest of its kind in the world. This talk summarizes the approaches considered and the lessons learnt during the design and successful experimentation of a search system that uses ML to personalize restaurant results based on the user’s food and taste preferences.

We formulated relevance estimation as a Learning To Rank ML problem. That formulation makes the next hurdle clear: performing ML inference for a very large number of customer-merchant pairs.
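To make the customer-merchant pair framing concrete, here is a minimal sketch of the scoring step only. It is not the GoFood model: the `Merchant` fields, the two features, and the hand-set `WEIGHTS` are all hypothetical, and a real Learning To Rank model would learn its weights from interaction data. The point is that every candidate merchant needs its own feature vector and score for each customer, which is where the inference volume comes from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Merchant:
    merchant_id: str
    cuisine: str
    rating: float  # assumed scale: 0.0 to 5.0


# Hypothetical, hand-set weights. A trained Learning To Rank model
# would learn these from click and order data instead.
WEIGHTS = {"cuisine_match": 2.0, "rating": 0.5}


def pair_features(preferred_cuisines, merchant):
    """Build the feature vector for one customer-merchant pair."""
    return {
        "cuisine_match": 1.0 if merchant.cuisine in preferred_cuisines else 0.0,
        "rating": merchant.rating / 5.0,
    }


def score(features):
    """Linear relevance score for one pair."""
    return sum(WEIGHTS[name] * value for name, value in features.items())


def rank(preferred_cuisines, candidates):
    """Score every candidate for this customer and sort by descending score."""
    return sorted(
        candidates,
        key=lambda m: score(pair_features(preferred_cuisines, m)),
        reverse=True,
    )


candidates = [
    Merchant("m1", "pizza", 4.8),
    Merchant("m2", "nasi goreng", 4.0),
    Merchant("m3", "sushi", 3.0),
]
ranked_ids = [m.merchant_id for m in rank({"nasi goreng"}, candidates)]
```

With these assumed weights, a customer who prefers nasi goreng sees `m2` first even though `m1` has the higher rating. One scoring call is made per candidate, so the cost grows with the number of customers times the number of merchants considered for each.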
The talk will cover our learnings and findings for the following:
a. Creating a Learning Model for Food Recommendations
b. Targeting experiments to a certain percentage of users
c. Training the model from real time data
d. Enriching Restaurant data with custom tags
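As an illustration of item (b), one common way to expose an experiment to a fixed percentage of users is deterministic hash bucketing. The sketch below is an assumption about how this could be done, not a description of Gojek's system; the function name `in_experiment` and the experiment name are hypothetical.

```python
import hashlib


def in_experiment(user_id: str, experiment: str, percentage: float) -> bool:
    """Return True if this user falls inside the experiment's rollout.

    percentage is expected in the range 0 to 100. Hashing the experiment
    name together with the user id keeps the assignment stable across
    requests and independent between experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % 10_000  # bucket in 0..9999
    return bucket < percentage * 100


# Roughly 10% of users should see the personalized ranking.
users = [f"user-{i}" for i in range(10_000)]
exposed = sum(in_experiment(u, "personalized-search", 10) for u in users)
fraction = exposed / len(users)
```

Because the bucket depends only on the user id and the experiment name, a user stays in the same group for the lifetime of the experiment, which keeps the personalized and non-personalized cohorts comparable.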

Our story should help the audience in making design decisions on the data pipelines and software architecture needed when using ML for relevance ranking in high throughput search systems.


  1. Brief introduction to the speaker and Gojek/GoFood
  2. Architecture considerations
  3. Modelling search as a relevance problem
  4. Creating Machine Learning Model for Personalized Search
  5. Aggregating real time customer interaction data
  6. Tracking Performance of the model
  7. Training current model with real time data points
  8. Enriching Restaurant Data with custom metrics
  9. Road Ahead for improving search experience


There are no prerequisites for this talk.
Familiarity with Elasticsearch and ML will help attendees grasp our use case better.

Speaker bio

Maulik Soneji is a Data Engineer at Gojek, where he works on different parts of the data pipelines of a hyper-growth startup. Outside of learning about data systems, he is interested in Elasticsearch, Golang and Kubernetes.





  • Anwesha Das (@anweshasrkr) Reviewer 3 months ago

    Thank you for submitting this proposal. We require slides and preview video by 11th March, latest, to evaluate your proposal and make a decision.

  • Maulik Soneji (@mauliks) Proposer 3 months ago

    Please find the slides here:
    Will be posting the video as well.

  • Anwesha Sarkar (@anweshaalt) Reviewer 2 months ago

    Submit your preview video by 20th April (latest); it helps us to close the review process.

  • Zainab Bawa (@zainabbawa) Reviewer 2 months ago

    Maulik, our policy is one speaker per session. This is non-negotiable. Between Jewel and yourself, you have to decide and let us know who is the person who will present if this proposal is shortlisted for the conference.

  • Maulik Soneji (@mauliks) Proposer a month ago

    I will be presenting at the conference. I have made the changes in the proposal and the slides.

  • Zainab Bawa (@zainabbawa) Reviewer a month ago

    Recapping the feedback from rehearsal held:

    1. Company introduction at the start should be removed. The context of Go-Food to be moved into the context and problem statement.
    2. Self introduction can be shortened too so that we move into the problem statement more quickly. Else, audience will switch off.
    3. The food classification needs to come in after the problem statement or be removed, because putting it before the problem statement makes it seem like information without any purpose.
    4. Refer to user as “they/them” rather than he/she in order to keep the language gender neutral.
    5. When describing the ML pipeline, why did you make this choice?
    6. Show pictures when explaining user journey.
    7. The metrics slide was unclear. The text was cut off, and in general, this wasn’t clear.
    8. How many clusters were there for which the partnerships, and scale of output?
    9. User research for rice, such entity matching and entity relationships – how do you come up with this decision?
    10. Direct connection between further work and current work is not clear.
    11. How are you learning to rank the model? What kind of choices are used?
    12. Merchant users – what does the number represent? Why is this the case?
    13. Equations are just stated, but not explained. End indexes are probably incorrect. This will help us understand the normalization flow.
    14. The problem statement, the challenge is unclear. Talk is too high level.
    15. What do you want the audience to take away? This is a black box.
    16. The story is interesting, and people can identify with the problem. But the deep dive is missing. Maybe helping the audience to understand how to rank, or why you use the open source plug-in or how you do what you do so that the audience can open their laptops and start trying something out – going deep dive into one of these could be a takeaway for the audience.
    17. Did you have a cold start problem? Or if you did not have to deal with this, then explain this as well.
    18. Tracking performance – how long did it take for you to take learning to rank to work well. Sessions from personalized to non-personalized search, and vice versa – showing some numbers will help.
    19. Abhishek will add his comments on the slides themselves.

    The next steps from here are:

    Submit your revised slides, incorporating particularly the three points of feedback:
    A. Defining the problem statement clearly.
    B. Anchoring the proposed talk in one key takeaway.
    C. Going deep dive, rather than giving high level details.

    Revised slides to be submitted as per the proposer’s timelines.

  • Abhishek Balaji (@booleanbalaji) Reviewer 18 days ago

    Moved to waitlisted; revised deadline is 3 June.

  • Maulik Soneji (@mauliks) Proposer 15 days ago
