The Fifth Elephant 2017

On data engineering and application of ML in diverse domains

Govind Chandrasekhar

@gc20

5 Lessons I’ve Learned Tackling Product Matching for E-commerce

Submitted Apr 29, 2017

Product matching is the challenge of examining two different representations of retail products (think items that you see on e-commerce websites) and determining whether they both refer to the same product. Tackling this problem requires a mix of NLP (to deal with text data), computer vision (to deal with product images), ontology management and more (to ingest a host of other signals on offer).

I’ve been working on this problem in various capacities for a few years now at Semantics3. During this period, I’ve made a fair number of mistakes which in turn have taught me useful lessons about applying deep/machine learning in an industry setting.

During this talk, I’d like to walk you through 5 scenarios in which I set out to achieve a specific goal in the context of product matching, but ran into an unexpected problem that threw a spanner in the works. For each scenario, I’ll talk about the root cause behind the problem and the lesson I learned from discovering it. Where relevant, I’ll bring in examples from outside the retail domain to broaden the perspective offered.

The goal of the talk isn’t to provide a guidebook for solving the product matching problem. Rather, it is to give you insight into the ups and downs of working through a specific data-science problem and, in the process, to deliver packaged lessons that you could draw on in your own field of work.

Outline

Slides: https://docs.google.com/presentation/d/e/2PACX-1vQXiUXFJlyY15Q9y4dzgQZRrN808r2x6jm-5oahtczE_Mb4MpzErXychm2JXccGSS-3563Dk5QgPT-H/pub

Requirements

A basic understanding of deep learning and experience working on real-world problems are ideal, but beginners should be able to follow.

Speaker bio

Govind is a co-founder of Semantics3. Semantics3 provides Data APIs and AI APIs for e-commerce-focused companies to make better decisions and grow their businesses. We’re a 5+ year old, Y Combinator-backed startup based in Bengaluru, San Francisco and Singapore.

Our data-science team works on e-commerce data problems like product categorization, product matching, named entity recognition and unsupervised content extraction.
