The Fifth Elephant 2017

On data engineering and application of ML in diverse domains

5 Lessons I’ve Learned Tackling Product Matching for E-commerce

Submitted by Govind Chandrasekhar (@gc20) on Saturday, 29 April 2017

Technical level

Intermediate

Section

Full talk for data engineering track

Status

Confirmed & Scheduled

Abstract

Product matching is the challenge of examining two different representations of retail products (think items that you see on e-commerce websites) and determining whether they both refer to the same product. Tackling this problem requires a mix of NLP (to deal with text data), computer vision (to deal with product images), ontology management and more (to ingest a host of other signals on offer).
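As a toy illustration of the text side of this problem, the sketch below compares two product titles by the overlap of their word tokens (Jaccard similarity). It is a minimal sketch and not the approach used at Semantics3: the function names, the example titles and the 0.5 threshold are all assumptions made for illustration, and a real system would also draw on images, ontologies and other signals.

```python
import re


def tokenize(title):
    """Lowercase a product title and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def title_similarity(a, b):
    """Jaccard similarity of the two titles' token sets, in the range [0, 1]."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a and not tokens_b:
        return 0.0  # two empty titles carry no evidence of a match
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def is_probable_match(a, b, threshold=0.5):
    """Hypothetical decision rule: the threshold is an illustrative assumption."""
    return title_similarity(a, b) >= threshold


if __name__ == "__main__":
    listing_1 = "Apple iPhone 7 32GB Black"
    listing_2 = "iPhone 7 (32GB, Black) by Apple"
    listing_3 = "Samsung Galaxy S8 64GB Black"
    print(is_probable_match(listing_1, listing_2))
    print(is_probable_match(listing_1, listing_3))
```

Token overlap alone is brittle: titles that differ only in a storage size or model number share most of their tokens yet refer to different products, which is one reason the problem calls for richer signals.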

I’ve been working on this problem in various capacities for a few years now at Semantics3. During this period, I’ve made a fair number of mistakes, which in turn have taught me useful lessons about applying machine learning and deep learning in an industry setting.

During this talk, I’d like to walk you through 5 scenarios in which I set out to achieve a specific goal in the context of product matching, but ran into an unexpected problem that threw a spanner in the works. For each one, I’ll talk about the root cause of the problem and the lesson I learned from discovering it. Where relevant, I’ll bring in examples from outside the retail domain to broaden the perspective offered.

The goal of the talk isn’t to provide a guidebook for solving the product matching problem. The goal is to give you insight into the ups and downs of working through a specific data-science problem and, in the process, to deliver packaged lessons that you could potentially draw on in your own field of work.

Outline

Slides: https://docs.google.com/presentation/d/e/2PACX-1vQXiUXFJlyY15Q9y4dzgQZRrN808r2x6jm-5oahtczE_Mb4MpzErXychm2JXccGSS-3563Dk5QgPT-H/pub

Requirements

A basic understanding of deep learning and experience working on real-world problems are ideal, but beginners should be able to follow.

Speaker bio

Govind is a co-founder of Semantics3. Semantics3 provides Data APIs and AI APIs that help e-commerce-focused companies make better decisions and grow their businesses. We’re a Y Combinator-backed startup, more than 5 years old, based in Bengaluru, San Francisco and Singapore.

Our data-science team works on e-commerce data problems like product categorization, product matching, named entity recognition and unsupervised content extraction.

Comments

  • 1
    Zainab Bawa (@zainabbawa) Reviewer a year ago

    Share draft slides, detailing the content you will cover + two-min preview video explaining what this talk is about and why participants should attend.

    • 1
      Govind Chandrasekhar (@gc20) Proposer a year ago

      @zainabbawa: Added video + slides!
