## About the 2019 edition:
The schedule for the 2019 edition is published here: https://hasgeek.com/anthillinside/2019/schedule
The conference has three tracks:
- Talks in the main conference hall track
- Poster sessions featuring novel ideas and projects in the poster session track
- Birds of a Feather (BOF) sessions for practitioners who want to use the Anthill Inside forum to discuss:
  - Myths and realities of labelling datasets for Deep Learning.
  - Practical experience with using Knowledge Graphs for different use cases.
  - Interpretability and its application in different contexts; challenges with GDPR and interpreting datasets.
  - Pros and cons of using custom and open source tooling for AI/DL/ML.
## Who should attend Anthill Inside:
Anthill Inside is a platform for:
- Data scientists
- AI, DL and ML engineers
- Cloud providers
- Companies which make tooling for AI, ML and Deep Learning
- Companies working with NLP and Computer Vision who want to share their work and learnings with the community
For inquiries about tickets and sponsorships, call Anthill Inside on 7676332020 or write to firstname.lastname@example.org
Sponsorship slots for Anthill Inside 2019 are open. Click here to view the sponsorship deck.
## Tutorial on Testing of Machine Learning Applications
## URL for workshop date, time, venue, schedule and tickets: https://hasgeek.com/anthillinside/testing-machine-learning-applications-workshop/
Rapid progress in Machine Learning (ML) has seen swift translation into real-world commercial deployment. While research and development of ML applications have progressed at an exponential pace, the required software engineering processes for ML applications, and the corresponding ecosystem of testing and quality assurance tools that make software reliable, trustworthy, safe and easy to deploy, have sadly lagged behind. Specifically, the challenges and gaps in quality assurance (QA) and testing of AI applications have largely remained unaddressed, contributing to a poor translation rate of ML applications from research to the real world. Unlike traditional software, which has a well-defined software testing methodology, ML applications have largely taken an ad hoc approach to testing. ML researchers and practitioners either fall back on traditional software testing approaches, which are inadequate for this domain because of its inherently probabilistic and data-dependent nature, or rely largely on non-rigorous, self-defined quality assurance methodologies. These issues have driven the ML and software engineering research communities to develop newer tools and techniques designed specifically for ML. These research advances need to be publicized and put into practice in real-world ML development and deployment to enable successful translation of ML from research prototypes to production. This tutorial intends to address this need.
This tutorial aims to:
- Provide a comprehensive overview of testing of ML applications
- Provide practical insights and share community best practices for testing ML software
The target audience for this tutorial is the data science and machine learning community, including:
- Industry Machine Learning practitioners and solution architects
- Software developers/ML Engineers who are developing production machine learning applications
- Software quality assurance and testing professionals who have to test ML applications
- Student ML enthusiasts
- ML researchers (industry/academic)
***A basic familiarity with ML concepts, as well as basic-to-intermediate experience in developing ML applications, is expected of the tutorial audience.*** The audience should be familiar with the general software development life cycle and have intermediate coding ability in one of the high-level programming languages, such as Python, R, Java, C, C++ or MATLAB, which they have used for developing ML applications. This tutorial does not require any prior knowledge of traditional software testing and quality assurance methodologies.
Key takeaways for the audience include:
- Overview of testing ML applications - How/Why/What
- Tools and Techniques available for testing ML applications
- Practical insights/tips for incorporating into their work on testing ML models
We have set up a survey for the tutorial participants so that we can fine-tune the content based on their responses.
This will be a half-day tutorial consisting of four parts. The first part of the tutorial will cover the fundamental concepts of ML testing, followed by coverage of state-of-the-art techniques and methods in each of the following sub-topics:
- How to Test – ML Testing Workflow
- What Components to Test
- What Properties to Test for
- Testing for different application scenarios
With this background in place, the second part of the tutorial will cover the stages of the Machine Learning life cycle from a software quality assurance perspective, outlining the key quality assurance requirements for each stage and the methods to meet them. We will cover existing open source and commercial tools available for ML testing, along with data sets available for ML testing. This session will also provide tips and actionable insights for improving software quality across the ML life cycle.
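As one example of a quality assurance step early in the ML life cycle, many teams gate training on a data-validation check. The sketch below is a hedged illustration, not a specific tool from the tutorial; the field names, required columns and ranges are invented for the example.

```python
# A minimal sketch of a data-validation gate, one common QA step
# before training in the ML life cycle. Field names and ranges
# here are illustrative assumptions, not a real dataset's schema.

def validate_rows(rows, required, ranges):
    """Return indices of rows that violate the expected schema."""
    bad = []
    for i, row in enumerate(rows):
        # Check that every required field is present and non-null.
        if any(k not in row or row[k] is None for k in required):
            bad.append(i)
            continue
        # Check that numeric fields fall within their expected ranges.
        for k, (lo, hi) in ranges.items():
            if not (lo <= row[k] <= hi):
                bad.append(i)
                break
    return bad

if __name__ == "__main__":
    rows = [
        {"age": 34, "income": 52000.0},
        {"age": -3, "income": 41000.0},   # out-of-range age
        {"age": 51},                      # missing income
    ]
    print(validate_rows(rows, required=("age", "income"),
                        ranges={"age": (0, 120)}))  # prints [1, 2]
```

In practice a check like this would run in the training pipeline and fail the build (or quarantine the offending rows) rather than print, so that schema drift is caught before it silently degrades the model.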
The third part of the tutorial will focus on the research challenges and open problems in this space, pointing out potential opportunities. This part will highlight the process of taking ML applications from research to real-world industry, pointing out the process and product gaps and the challenges to be addressed for successful translation of ML applications to real-world deployment.
- Part 1 – 50 minutes, followed by Q&A for 10 minutes
- Part 2 – 50 minutes, followed by Q&A for 10 minutes
- Part 3 – 50 minutes, followed by Q&A for 10 minutes
- Hands-on session – 45 minutes
Participants should bring their own laptops. Once we finalize the content, we will provide a list of open source libraries to install for the hands-on exercises before the tutorial session.
This tutorial will be organized by three of us:
1. Sandya Mannarswamy, Independent NLP Research Scientist. email@example.com
2. Shourya Roy, Head, American Express AI Labs, firstname.lastname@example.org
3. Saravanan Chidambaram, Independent NLP Researcher & Consultant, email@example.com
Sandya Mannarswamy is an independent NLP researcher. She was previously a senior research scientist in the Natural Language Processing research group at Conduent Labs India. She holds a Ph.D. in computer science from the Indian Institute of Science, Bangalore. Her research interests span natural language processing, machine learning and compilers. Her research career spans over 16 years at various R&D labs, including Hewlett Packard and IBM Research. She has co-organized a number of workshops, including workshops at the International Conference on Data Management and the Machine Learning Debates workshop at ICML 2018. Her current research is focused on software testing and rigorous evaluation of Natural Language Processing applications (using NLP to evaluate NLP). She has extensive experience in traditional software engineering, having worked on research and development of the developer tools ecosystem, including compilers, debuggers, performance analyzers and static source code analyzers, during her career at Hewlett Packard. Along with Shourya, she co-authored a paper at IJCAI 2018 on the challenges in taking AI applications from research to the real world. She is the author of the popular CodeSport column in Open Source For You magazine (https://opensourceforu.com/tag/codesport/).
Shourya Roy (https://www.linkedin.com/in/shouryaroy/) is Head and VP of American Express AI Labs, which is spearheading innovations in machine learning, NLP and document recognition, cloud computing, and AI product management for American Express. Shourya's research interests span text and web mining and natural language processing. He holds a Ph.D. in computer science from the Indian Institute of Science, Bangalore. Over the years, Shourya's work has led to 15 granted patents and about 70 publications in premier journals and conferences, from his current role and his prior association with the research labs of IBM and Xerox over 15 years. In recent times, Shourya has co-organized a number of workshops at tier-1 conferences, including ICML 2018, KDD 2018, SIGMOD 2016-18, ECML 2016 and ICDE 2016-17, and notably co-initiated and ran the Noisy Text Analytics (AND) series of workshops between 2007 and 2012. He is currently serving as Vice Chair of the India Chapter of SIGKDD (IKDD).
Saravanan Chidambaram (Saro) (https://www.linkedin.com/in/saravanan-chidambaram-saro-9a87ab5/) is an independent consultant in Machine Learning and AI technologies. Previously he was head of the Advanced Development Centre at Hewlett Packard Enterprise, where he led the research team exploring emerging technologies, including AI, Blockchain and ML. Over a career spanning 16 years at various R&D labs, including Hewlett Packard, Microsoft and Oracle, he has led many research and development projects in the areas of virtualization, compilers, kernels and big data, focusing on designing and deploying mission-critical enterprise software. Saro is passionate about educating the emerging ML software developer community in adopting rigorous software quality assurance techniques. He is currently working on developing a test suite for testing NLP applications.