The Fifth Elephant round the year submissions for 2019
Submit a talk on data, data science, analytics, business intelligence, data engineering and ML engineering
Aayushi Pathak
Matching the same and similar products is a problem fundamental to the online retail industry with multiple applications spanning across price optimization, recommending similar or substitute products to customers, understanding gaps in product assortments, and counterfeit product detection.
Because there are no standard product identifiers and catalog data is often noisy, incomplete, and nonstandard, product matching is a challenging problem at scale. In this talk, we will define the product matching problem and discuss what makes it hard. We will then discuss our approaches to addressing it.
We use an ensemble of text- and image-based approaches: content-based image retrieval (using a novel hashing technique that we developed), CNNs, language-model-based word embeddings (BERT and Transformer), and techniques from classical machine learning.
We have built an automated pipeline that adapts based on the category of products it is handling.
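As a rough illustration of the ensemble idea, the sketch below combines one text signal and one image signal into a single match score, with weights that depend on the product category. Everything in it is an assumption made for illustration: the `Product` type, the weight table, and the tiny grayscale thumbnails are hypothetical. The classic average hash and token-set Jaccard overlap stand in for the hashing technique and the embeddings mentioned above, which are not described here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Product:
    # Hypothetical record; real catalog entries carry many more fields.
    title: str
    category: str
    pixels: List[List[int]]  # tiny grayscale thumbnail, values 0-255


def average_hash(pixels: List[List[int]]) -> List[int]:
    """Classic average hash: one bit per pixel, set when the pixel is above the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def image_similarity(a: List[List[int]], b: List[List[int]]) -> float:
    """Fraction of matching hash bits (1.0 means identical hashes)."""
    ha, hb = average_hash(a), average_hash(b)
    if len(ha) != len(hb):
        raise ValueError("thumbnails must have the same dimensions")
    return sum(x == y for x, y in zip(ha, hb)) / len(ha)


def text_similarity(a: str, b: str) -> float:
    """Jaccard overlap of lowercased title tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


# Hypothetical per-category weights as (text weight, image weight).
CATEGORY_WEIGHTS = {
    "electronics": (0.7, 0.3),  # titles tend to carry model numbers
    "fashion": (0.3, 0.7),      # appearance tends to matter more
}
DEFAULT_WEIGHTS: Tuple[float, float] = (0.5, 0.5)


def match_score(p: Product, q: Product) -> float:
    """Weighted combination of the two signals, chosen by the category of p."""
    w_text, w_image = CATEGORY_WEIGHTS.get(p.category, DEFAULT_WEIGHTS)
    return (w_text * text_similarity(p.title, q.title)
            + w_image * image_similarity(p.pixels, q.pixels))


if __name__ == "__main__":
    a = Product("Acme Phone X 64GB Black", "electronics", [[10, 200], [220, 30]])
    b = Product("acme phone x 64gb black", "electronics", [[12, 190], [210, 25]])
    c = Product("Zeta Blender 500W", "electronics", [[200, 10], [20, 220]])
    print(match_score(a, b), match_score(a, c))
```

A pair would be treated as a match when the score clears a threshold, which in practice would probably also be tuned per category. A production system would likely replace both similarity functions with learned models.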
Byom Kesh Jha, Data Scientist – Semantics, DataWeave
Byom designs and develops predictive modelling technologies in multiple domains, especially retail and education. He is extensively involved in the training and deployment of machine-learning models. His expertise lies in diverse NLP techniques, sequence learners - NERs, classifiers, building knowledge bases, deep learning, product aspect extraction, user-generated content analysis, and more.
https://drive.google.com/file/d/1f3Lz4RPf6sxPH-W8xJZ388TeGZ_oJkIu/view?usp=sharing