Taking Fashion and Lifestyle Commerce Towards SKUs Using Deep Image and Text Parsing


NIMHANS Convention Centre

Format

This year’s edition spans two days of hands-on workshops and conference. We are inviting proposals for:

Full-length 40-minute talks.

Crisp 15-minute talks.

Sponsored sessions of 15-minute duration (limited slots available; subject to editorial scrutiny and approval).

Hands-on workshop sessions of 3-hour and 6-hour duration.

Selection process

Proposals will be filtered and shortlisted by an Editorial Panel. We urge you to add links to videos / slide decks when submitting proposals. This will help us understand your past speaking experience. Blurbs or blog posts covering the relevance of a particular problem statement and how it is tackled will help the Editorial Panel better judge your proposals.

We expect you to submit an outline of your proposed talk – either in the form of a mind map or a text document or draft slides within two weeks of submitting your proposal.

We will notify you about the status of your proposal within three weeks of submission.

Selected speakers must participate in one or two rounds of rehearsals before the conference. This is mandatory and helps you prepare well for the conference.

There is only one speaker per session. Entry is free for selected speakers. As our budget is limited, we will prefer speakers from locations closer to home, but will do our best to cover anyone exceptional. HasGeek will provide a grant to cover part of your travel and accommodation in Bangalore. Grants are limited and made available to speakers delivering full sessions (40 minutes or longer).

Commitment to open source

HasGeek believes in open source as the binding force of our community. If you are describing a codebase for developers to work with, we’d like it to be available under a permissive open source licence. If your software is commercially licensed or available under a combination of commercial and restrictive open source licences (such as the various forms of the GPL), please consider picking up a sponsorship. We recognise that there are valid reasons for commercial licensing, but ask that you support us in return for giving you an audience. Your session will be marked on the schedule as a sponsored session.

Key dates and deadlines

Revised paper submission deadline: 17 June 2016

Confirmed talks announcement (in batches): 13 June 2016

Schedule announcement: 30 June 2016

Conference dates: 28-29 July 2016

Venue

The Fifth Elephant will be held at the NIMHANS Convention Centre, Dairy Circle, Bangalore.

Contact

For more information about speaking proposals, tickets and sponsorships, contact info@hasgeek.com or call +91-7676332020.

Submitted Apr 25, 2016

Section: Full talk Technical level: Intermediate

In this talk, I will describe challenges, insights, innovations and experiences in building a large-scale deep learning system to prepare SKUs (Stock Keeping Units) for millions of fashion products.

E-commerce is booming across the globe at an astonishing rate. India alone is expected to witness a CAGR of 50% by 2020. In such a fast-paced, mobile-first market, online commerce experiences (e.g., intent identification, search results, product recommendations) are steadily replacing deep discounts as the means to acquire and retain consumers. While consumers interact with buyer-side portals, the organization of products (i.e., catalogues) acquired through seller-side platforms plays a pivotal role in search, discovery and personalization experiences. Unfortunately, different sellers interpret the same product differently, and distributed onboarding of products produces poorly organized e-commerce catalogues. For instance, a large fraction of products in such catalogues have inconsistencies between product title and product image, missing product titles or keywords, incorrectly tagged keywords, or several duplicates. This in turn results in poor and irrelevant search, personalization and recommendations, degrading user experience significantly.

With the use of SKUs, normalizing and cleaning product data has been possible in the consumer electronics space. However, due to poorly organized catalogues and inherent difficulties in describing and quantifying product details, the problem of organizing products as SKUs in categories such as fashion (e.g., fashion apparel, fashion accessories) and lifestyle (e.g., home decor) has remained largely unsolved. The problem becomes even more critical if one has to build an aggregate fashion commerce application that ingests several such poorly tagged catalogues. In this talk, I will describe how deep learning has made it possible to prepare SKUs for fashion and lifestyle products. We innovate on and apply deep image parsing to extract detailed product information from product images. We further apply deep learning models originally conceived for images to text paragraphs, solving Named Entity Recognition and Disambiguation to produce structured outputs. Using the confidence scores of the two processes, we then combine the results of text and image parsing to merge records and create unique products. I will also describe evaluation criteria and several engineering challenges in building large-scale systems that process product streams and normalize millions of fashion products. In particular, I will share insights from our experiments on the amount and quality of labeled data needed to achieve the desired accuracy.
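The confidence-based combination of image and text parsing described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the method used in the talk: the function name `fuse_attributes`, the attribute names, and the combination rule (higher confidence wins; agreeing predictions are combined as if independent) are all hypothetical.

```python
def fuse_attributes(image_preds, text_preds):
    """Merge per-attribute predictions from an image parser and a text parser.

    Both arguments map an attribute name to a (value, confidence) pair,
    with confidence in [0, 1]. The fusion rule is an assumption made for
    illustration only.
    """
    fused = {}
    for attr in sorted(set(image_preds) | set(text_preds)):
        img = image_preds.get(attr)
        txt = text_preds.get(attr)
        if img is None:
            fused[attr] = txt            # only the text parser saw this attribute
        elif txt is None:
            fused[attr] = img            # only the image parser saw this attribute
        elif img[0] == txt[0]:
            # Both parsers agree: treat them as independent evidence.
            fused[attr] = (img[0], 1 - (1 - img[1]) * (1 - txt[1]))
        else:
            # Parsers disagree: keep the more confident prediction.
            fused[attr] = img if img[1] >= txt[1] else txt
    return fused


# Hypothetical parser outputs for one product listing.
image_preds = {"colour": ("blue", 0.9), "category": ("dress", 0.8)}
text_preds = {"colour": ("navy", 0.6), "category": ("dress", 0.5),
              "occasion": ("party", 0.7)}

fused = fuse_attributes(image_preds, text_preds)
for name, (value, confidence) in fused.items():
    print(f"{name}: {value} ({confidence:.2f})")
```

In this example the disagreement on colour is resolved in favour of the image parser, while the agreement on category raises the combined confidence above either individual score.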

Intended audience: intermediate and advanced technical audience

Outline

This talk will present recent innovations in deep neural networks for building business applications using large-scale data. We will take a deep dive into fashion and lifestyle online commerce data, and into the image and text processing needed to build large-scale deep learning models.

Motivation: What is the quality of commerce catalogues, and why is the product search experience not satisfactory?
How often do you open multiple tabs to search for an item? How often do you find the fashion item that you are looking for? Do the results match your query intent? Most often not. To dig deeper, we will first review some interesting statistics on the state of pollution in fashion and lifestyle commerce catalogues: for instance, the number of duplicate products, the number of products with missing keywords, and the number of products with a mismatch between image and text. No wonder that when you search for “blue evening cocktail party dress”, you get poor results on most commerce platforms. We did this analysis on more than 10 million products from different e-commerce portals in India and abroad.
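Catalogue statistics of the kind listed above could be computed along these lines. This is a minimal sketch, not the analysis behind the talk: the record fields (`title`, `keywords`, `image_hash`) and the function `catalogue_stats` are hypothetical, and duplicates are approximated here as records sharing the same normalized title and image fingerprint.

```python
from collections import Counter


def catalogue_stats(products):
    """Count simple quality problems in a list of product records.

    Each record is a dict with hypothetical fields: 'title', 'keywords'
    (a list of strings) and 'image_hash' (any hashable image fingerprint).
    """
    missing_title = sum(1 for p in products if not (p.get("title") or "").strip())
    missing_keywords = sum(1 for p in products if not p.get("keywords"))

    # Assumption: two records are duplicates when they share the same
    # normalized title and the same image fingerprint.
    keys = Counter(
        ((p.get("title") or "").strip().lower(), p.get("image_hash"))
        for p in products
    )
    duplicates = sum(count - 1 for count in keys.values() if count > 1)

    return {
        "total": len(products),
        "missing_title": missing_title,
        "missing_keywords": missing_keywords,
        "duplicates": duplicates,
    }


# Three made-up records: one duplicate pair and one record without a title.
sample = [
    {"title": "Blue Cocktail Dress", "keywords": ["dress", "blue"], "image_hash": "a1"},
    {"title": "blue cocktail dress ", "keywords": [], "image_hash": "a1"},
    {"title": "", "keywords": ["kurta"], "image_hash": "b2"},
]
stats = catalogue_stats(sample)
print(stats)
```

Detecting a mismatch between image and text would need the parsed attributes from both sources, so it is left out of this sketch.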

Challenges: Why has this state not been improved over the years?
Normalizing and cleaning unstructured image and text data into structured data poses several difficulties. Product images vary in size, shape, pose, content and other respects. They contain different product items that may or may not be relevant to the accompanying text. Text descriptions contain a mix of product description, complementary products and suitability criteria. Parsing such images and text snippets at scale and with high accuracy has traditionally been difficult for machine learning algorithms.

Rebirth of deep learning to utilize big data in fashion commerce:
I will first explain why previous attempts at using machine learning to parse e-commerce data have not been entirely successful. I will then describe what has changed with the rebirth of deep learning to solve the problems of deep image and text parsing. I will then show in detail how, by innovating on and applying deep learning models, we can extract structured data from unstructured image and text data to build SKUs for fashion and lifestyle products. Preparing SKUs for this category of products is especially challenging, since we have to extract a lot of visual and textual attributes via deep parsing of images and text, unlike the consumer electronics category, where product specifications are standardised.
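Once products are described by structured attributes, one simple way to form SKUs is to group products whose normalized attributes match. The sketch below assumes exact matching on a hypothetical attribute set (`brand`, `category`, `colour`, `pattern`); the talk does not state which attributes or matching rule are used, and the names `sku_key` and `group_into_skus` are illustrative only.

```python
from collections import defaultdict

# Hypothetical set of attributes that together identify a SKU.
SKU_ATTRIBUTES = ("brand", "category", "colour", "pattern")


def sku_key(attributes):
    """Build a canonical key from extracted attributes (case- and space-insensitive)."""
    return tuple(
        str(attributes.get(name, "")).strip().lower() for name in SKU_ATTRIBUTES
    )


def group_into_skus(products):
    """Group product ids whose canonical attribute keys are identical."""
    groups = defaultdict(list)
    for product_id, attributes in products.items():
        groups[sku_key(attributes)].append(product_id)
    return dict(groups)


# Made-up listings from two sellers describing the same dress differently.
products = {
    "seller1-001": {"brand": "Acme", "category": "Dress", "colour": "Blue", "pattern": "solid"},
    "seller2-417": {"brand": "acme ", "category": "dress", "colour": "blue", "pattern": "Solid"},
    "seller2-418": {"brand": "Acme", "category": "dress", "colour": "red", "pattern": "solid"},
}
skus = group_into_skus(products)
for key, ids in skus.items():
    print(key, ids)
```

A production system would likely need fuzzy matching and attribute-level confidence thresholds rather than exact key equality; exact matching is used here only to keep the idea visible.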

Deep Learning at scale:
I will then describe the deep learning engineering pipeline: collecting, cleaning and feeding data to deep learning models; training the models on GPUs; innovating on architectures and training procedures to achieve the desired accuracy; and deploying models in production to clean and normalize millions of products. I will especially talk about recent advances in fully convolutional and segmentation-based deep neural networks and their applications to image and text processing at Infilect.

Talk outline: https://www.dropbox.com/s/1ph6fslz9wn3inu/fifth-elephant-2.pdf?dl=0

Requirements

Speaker bio

Vijay Gabale is co-founder and CTO of Infilect, an AI-powered commerce platform. Infilect has been building a fashion commerce platform to provide exceptional shopping experiences to Internet consumers. The company has made several innovations in deep learning to process rich multimedia data (text, images, videos) to improve the discovery, search and personalization experiences of online consumers.

Prior to co-founding Infilect, he was a research scientist with IBM Research Labs. He graduated with a PhD from IIT Bombay, India in 2012. He has several top-tier research publications and software patents to his name. He is also a co-organizer of the ‘Deep Learning Bangalore’ meetup. He has been actively working in deep learning for the past several years and has given several talks in and outside India on the research and applications of deep learning in e-commerce.

Slides

https://www.dropbox.com/s/1ph6fslz9wn3inu/fifth-elephant-2.pdf?dl=0

The Fifth Elephant 2016
