How similar are two pieces of text? A moderately broad and deep dive in one of the fundamental topics in NLP.

Nov 2017

20 Mon

21 Tue

22 Wed

23 Thu

24 Fri 10:00 AM – 05:50 PM IST

25 Sat

26 Sun

Venue 1

##About the event

When it comes to Machine Learning (ML), Deep Learning (DL) and Artificial Intelligence (AI), three aspects are crucial:

Clarity of fundamental concepts.
Insights and nuances when applying concepts to solve real-world problems.
Knowledge of tools for automating ML and DL.

Anthill Inside Miniconf will provide understanding on each of these fronts.

##Format

This miniconf is a full-day event consisting of:

3-4 talks each, on concepts, applications and tools.
Birds of a Feather (BOF) sessions on focussed topics.

We are accepting proposals for:

10 to 40-minute talks explaining fundamental concepts in math, statistics and data science.
20 to 40-minute talks on case studies and lessons learned when applying ML, DL and AI concepts in different domains / to solve diverse data-related problems.
10 to 20-minute talks on tools for ML and DL.
Birds of a Feather (BOF) sessions on failure stories in ML, which problems / use cases ML and DL should be used for, and chatbots.
3-6 hour hands-on workshops on concepts and tools.

##Hands-on workshops

Hands-on workshops for 30-40 participants on 25 November will help participants internalize concepts and the practical aspects of working with tools.
Workshops will be announced shortly. Workshop tickets have to be purchased separately.

##Target audience, and why you should attend this event

ML engineers who want to learn concepts in maths and stats and strengthen their foundations.
ML engineers who want to learn from the experiences and insights of others.
Senior architects and decision-makers who want a quick run-through of concepts, implementation case studies, and an overview of tools.
Masters and doctoral candidates who want to bridge the gap between academia and practice.

##Selection process

Proposals will be shortlisted and reviewed by an editorial team consisting of practitioners from the community. Make sure your abstract contains the following information:

Key insights you will present, or takeaways for the audience.
Overall flow of the content.

You must submit links to videos of talks you have delivered in the past, or record and upload a two-minute video explaining what your talk is about and why it is relevant for this event.

Also consider submitting links to the following along with your proposal:

A detailed outline, or
Mindmap, explaining the structure of the talk, or
Draft slides.

##Honorarium for selected speakers; travel grants

Selected speakers and workshop instructors will receive an honorarium of Rs. 3,000 each, at the end of their talk. We do not provide free passes for speakers’ colleagues and spouses.

Travel grants are available for domestic speakers. We evaluate each case on its merits, giving preference to women, people of non-binary gender, and Africans.
If you require a grant, mention this in the field where you add your location. Anthill Inside Miniconf is funded through ticket purchases and sponsorships; travel grant budgets vary.

##Important dates

Anthill Inside Miniconf – 24 November, 2017.
Hands-on workshops – 25 November, 2017.

##Contact details:
For more information about speaking, Anthill Inside, sponsorships, tickets, or anything else, contact support@hasgeek.com or call 7676332020.

Hosted by

Anthill Inside

Anthill Inside is a forum for conversations about risk mitigation and governance in Artificial Intelligence and Deep Learning. AI developers, researchers, startup founders, ethicists, and AI enthusiasts are encouraged to:

This submission has been added to the schedule

How similar are two pieces of text? A moderately broad and deep dive in one of the fundamental topics in NLP.

Submitted Nov 12, 2017

Section: Full talk
Technical level: Intermediate

I will talk about the fundamental problem of measuring similarity between two pieces of text. This problem appears in many contexts, including search and information retrieval, natural language inference, plagiarism detection, answer scoring, machine translation and (near-)duplicate detection. I will give an overview of the fundamentals, key formulations and approaches present in the literature.

The talk will center on scenarios where the notion of similarity is application dependent. I will use the example of automatically grading student answers against instructor-provided model answers (Automatic Short Answer Grading, or ASAG): given a question and a model answer, how can we automatically grade short answers that are a sentence to a paragraph long? I will show various nuances of formulating text similarity for such applications, and the associated challenges.
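To make the ASAG setup concrete, here is a minimal sketch of a purely lexical baseline: score a student answer by its token overlap (Jaccard similarity) with the model answer. This is not the speaker's method; the function names and the two example answers are hypothetical and chosen only for illustration.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard overlap of the two token sets: |A & B| / |A | B|."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a and not tokens_b:
        return 1.0  # convention (an assumption): two empty texts are identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


# Hypothetical model answer and student answer.
model_answer = "Photosynthesis converts light energy into chemical energy."
student_answer = "Plants turn light energy into chemical energy."

score = jaccard_similarity(model_answer, student_answer)
print(f"lexical similarity: {score:.2f}")
```

A baseline like this rewards shared words only, so a correct answer phrased with different vocabulary scores low. That gap is one of the nuances that application-dependent similarity has to handle.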
I will introduce a range of generic unsupervised, semi-supervised and supervised techniques for measuring similarity. We will take a deep dive into a couple of state-of-the-art approaches: one based on classical pattern mining and word2vec, and the other based on Siamese LSTM networks with a new cost function inspired by Earth Mover's Distance (EMD).
This talk will be based on various papers published in 2016-17 in reputed conferences such as IJCAI, ECAI and COLING.
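The abstract does not spell out the EMD-inspired cost function, so the following is only a sketch of what Earth Mover's Distance computes in the simplest relevant case: two distributions over the same ordered grade levels, with unit distance between adjacent levels. In that one-dimensional case, EMD equals the sum of absolute differences between the two cumulative distributions. The function name and the example grade distributions are assumptions made for illustration.

```python
from itertools import accumulate


def emd_ordered(p, q):
    """EMD between two distributions over the same ordered bins.

    Assumes both inputs have equal total mass and that adjacent bins
    are one unit apart, so EMD is the L1 distance between their CDFs.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    cdf_p = accumulate(p)
    cdf_q = accumulate(q)
    return sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))


# Hypothetical grade scale 0, 1, 2; the true grade is 2.
true_grade = [0.0, 0.0, 1.0]
predicted_near = [0.0, 1.0, 0.0]  # all mass on grade 1 (off by one)
predicted_far = [1.0, 0.0, 0.0]   # all mass on grade 0 (off by two)

print(emd_ordered(true_grade, predicted_near))
print(emd_ordered(true_grade, predicted_far))
```

Unlike a plain classification loss, this distance grows with how far the predicted grade is from the true one, which is a plausible reason to prefer it when grades are ordinal.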

Outline

1. Text Similarity
   a. Definition and scope
2. Application Areas
   a. Information retrieval
   b. Paraphrase detection
   c. Natural language inference
   d. Plagiarism detection
3. Types of Similarity
4. Techniques
   a. Supervised
      i. Classical techniques
      ii. Deep neural network based techniques
   b. Unsupervised
      i. Lexical
      ii. Semantic
5. Automatic Short Answer Grading
   a. Context and motivation
   b. Word-similarity based techniques
      i. Wisdom of students
   c. Siamese LSTM-based supervised ASAG technique
6. Conclusion
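The lexical versus semantic distinction in the outline can be illustrated with a small sketch: two different words share no surface form, yet their vectors can be close. The three-dimensional vectors below are hand-written and hypothetical, not real word2vec output; in practice they would come from a trained embedding model.

```python
import math

# Hypothetical toy "embeddings", hand-written for illustration only.
TOY_VECTORS = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.8, 0.2, 0.0],
    "banana": [0.0, 0.1, 0.9],
}


def average_vector(words):
    """Average the toy vectors of the words that have one."""
    known = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not known:
        raise ValueError("none of the words has a vector")
    return [sum(dim) / len(known) for dim in zip(*known)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# "car" and "automobile" have no characters-as-words in common,
# but their toy vectors point in nearly the same direction.
sim_close = cosine(average_vector(["car"]), average_vector(["automobile"]))
sim_far = cosine(average_vector(["car"]), average_vector(["banana"]))
print(f"car vs automobile: {sim_close:.2f}")
print(f"car vs banana: {sim_far:.2f}")
```

Averaging word vectors is only one simple way to represent a longer text; it ignores word order, which is one motivation for sequence models such as the Siamese LSTM named in the outline.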

Speaker bio

Shourya Roy is the Head and Vice President of American Express Big Data Labs (BDL), a role he took up in December 2016. In this role he is responsible for establishing and executing the technical agenda for BDL, working closely with the broader Decision Science community and business units. Shourya leads a team of scientists and engineers in the areas of machine learning, artificial intelligence, deep learning and cloud computing.

Prior to joining American Express, Shourya spent nearly fifteen years in the labs of IBM and Xerox, playing several leadership roles in technical research, research and strategic management, and customer-facing business development. Shourya has a proven track record of conceptualizing and initiating (by influencing business group leaders), designing and developing (by participating in and leading research teams) and transferring (with software development partners) innovation from research labs to real-life operations and business.
Shourya’s technical expertise spans Text and Data Mining, Natural Language Processing, Machine Learning, and Big Data, in which he is a well-known thought leader in several communities. His work has led to more than 60 publications in premier journals and conferences. He has been granted about 15 patents, while tens of patent applications are currently in different stages of the patent lifecycle. He is an active member of the ACM and ACL communities, as part of which he has been associated with organising multiple conferences and workshops.
Shourya holds Ph.D., Master's and Bachelor's degrees in Computer Science from IISc Bangalore, IIT Bombay and Jadavpur University respectively. Shourya also has an MBA from the Faculty of Management Studies (FMS), Delhi University.
Beyond work, Shourya is passionate about meeting and getting to know people, as well as following and playing multiple sports.

Anthill Inside Miniconf – Pune