Anthill Inside 2019

A conference on AI and Deep Learning

Make a submission

Accepting submissions till 01 Nov 2019, 04:20 PM

Taj M G Road, Bangalore

About the 2019 edition:

The schedule for the 2019 edition is published here: https://hasgeek.com/anthillinside/2019/schedule

The conference has three tracks:

  1. Talks in the main conference hall track
  2. Poster sessions featuring novel ideas and projects in the poster session track
  3. Birds of a Feather (BOF) sessions for practitioners who want to use the Anthill Inside forum to discuss:
    - Myths and realities of labelling datasets for Deep Learning.
    - Practical experience with using Knowledge Graphs for different use cases.
    - Interpretability and its application in different contexts; challenges with GDPR and interpreting datasets.
    - Pros and cons of using custom and open source tooling for AI/DL/ML.

Who should attend Anthill Inside:

Anthill Inside is a platform for:

  1. Data scientists
  2. AI, DL and ML engineers
  3. Cloud providers
  4. Companies which make tooling for AI, ML and Deep Learning
  5. Companies working with NLP and Computer Vision who want to share their work and learnings with the community

For inquiries about tickets and sponsorships, call Anthill Inside on 7676332020 or write to sales@hasgeek.com


Sponsors:

Sponsorship slots for Anthill Inside 2019 are open.


Anthill Inside 2019 sponsors:


Bronze Sponsor

iMerit, Impetus

Community Sponsor

GO-JEK, iPropal, LightSpeed, Semantics3, Google, Tact.AI, Amex

Hosted by

Anthill Inside is a forum for conversations about Artificial Intelligence and Deep Learning, including tools, techniques, and approaches for integrating AI and Deep Learning in products and businesses, and engineering for AI.

Govind Chandrasekhar

@gc20

Unsupervised Catalog Generation with Clustering, Reinforcement and More

Submitted Apr 5, 2019

This presentation will look at how you can generate product catalogs from ecommerce websites using just the homepage URL of the website. Techniques explored include URL clustering, regex generation, reinforcement learning and supervised classification.
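
As a rough illustration of the URL clustering and regex generation steps named above, the sketch below groups URLs by a coarse path signature and turns one signature into a regular expression. It is not the speaker's implementation: the function names (`url_signature`, `cluster_by_signature`, `signature_to_regex`), the token classes, and the `shop.example` URLs are all assumptions made for illustration, and the supervised classification step is left out.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical token classes and the regex fragment each one expands to.
PATTERNS = {"<num>": r"\d+", "<slug>": r"[a-z0-9]+(?:-[a-z0-9]+)+"}


def url_signature(url):
    """Map a URL to a coarse signature: one token per path segment."""
    path = urlparse(url).path.strip("/")
    segments = path.split("/") if path else []
    tokens = []
    for seg in segments:
        if seg.isdigit():
            tokens.append("<num>")
        elif re.fullmatch(PATTERNS["<slug>"], seg):
            tokens.append("<slug>")
        else:
            tokens.append(seg)  # keep literal segments such as "product"
    return tuple(tokens)


def cluster_by_signature(urls):
    """Group URLs that share the same signature."""
    clusters = defaultdict(list)
    for url in urls:
        clusters[url_signature(url)].append(url)
    return dict(clusters)


def signature_to_regex(signature):
    """Build a path regex that matches every URL with this signature."""
    parts = [PATTERNS.get(tok, re.escape(tok)) for tok in signature]
    return re.compile(r"^/" + "/".join(parts) + r"/?$")


urls = [
    "https://shop.example/product/red-shoe/123",
    "https://shop.example/product/blue-hat/456",
    "https://shop.example/about",
]
clusters = cluster_by_signature(urls)
product_regex = signature_to_regex(("product", "<slug>", "<num>"))
```

In this toy setup the two product URLs land in the same cluster, and `product_regex` can then be applied to the path of unseen URLs from the same site. A real system would need richer token classes and some way to decide which clusters are product pages.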

Outline

Presentation structure:

  • Intro: What the problem is, why it’s useful and its roots in the Semantic Web movement.

  • Identifying Product URLs: The need to identify product pages from just their URLs. Using URL signatures + clustering + regex generation + supervised classification to solve this problem.

  • Spidering Strategy: Optimal strategy for spidering through the website to find product URLs, using reinforcement learning techniques.

  • Context Extraction: Techniques for extracting structured data from HTML + rendered webpages, notably through the use of bounding boxes. Then, we look at variation identification and extraction through the use of headless browsers.
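
The spidering step above can be loosely sketched as a multi-armed bandit. The outline does not say which reinforcement learning method the talk uses, so this example swaps in a simple epsilon-greedy bandit as an assumption: each arm is a cluster of links, and the reward is 1 when a fetched page turns out to be a product page. The class name `EpsilonGreedySpider`, the `simulate` helper, and the per-cluster product rates are all hypothetical, and page fetching is replaced by a random simulation.

```python
import random


class EpsilonGreedySpider:
    """Pick which URL cluster to crawl next, treating each cluster as a bandit arm."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.pulls = {arm: 0 for arm in arms}
        self.value = {arm: 0.0 for arm in arms}  # running mean reward per arm

    def choose(self):
        # Explore a random cluster with probability epsilon, else exploit the best one.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.pulls))
        return max(self.value, key=self.value.get)

    def update(self, arm, reward):
        # Incremental update of the mean reward observed for this arm.
        self.pulls[arm] += 1
        self.value[arm] += (reward - self.value[arm]) / self.pulls[arm]


def simulate(product_rate, steps=500, seed=0):
    """Simulated crawl: product_rate[arm] is a made-up chance that a page is a product page."""
    spider = EpsilonGreedySpider(product_rate, seed=seed)
    env = random.Random(seed + 1)
    for _ in range(steps):
        arm = spider.choose()
        reward = 1.0 if env.random() < product_rate[arm] else 0.0
        spider.update(arm, reward)
    return spider


spider = simulate({"/product": 0.9, "/blog": 0.0, "/about": 0.0})
```

After the simulated crawl, `spider.pulls` shows how the crawl budget was split across clusters. A production crawler would replace the simulated reward with a real product-page check and would likely need a richer state than a single arm per cluster.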

Speaker bio

Govind is a co-founder of Semantics3. Semantics3 offers data- and AI-based enterprise solutions for ecommerce marketplaces (catalog generation & enrichment, seller on-boarding) and logistics companies (HTS/tariff classification, attribute enrichment). We’re a 7+ year old, Y Combinator-backed startup based in Bengaluru, San Francisco and Singapore.

Our data-science team works on problems like product categorization, product matching, named entity recognition and unsupervised content extraction.

Slides

https://docs.google.com/presentation/d/e/2PACX-1vSQoXGj0ZxG8tkWR-47oqABSsWjCq0rrecVlUgHtaHl9104FNImSHsUDp6h5IVJG9wJAHv_KNwt4-EK/pub?start=false&loop=false&delayms=3000
