The Fifth Elephant 2018

The seventh edition of India's best data conference


A study in classification

Submitted by Ramanan Balakrishnan (@ramananbalakrishnan) on Tuesday, 29 May 2018

Section: Crisp talk
Technical level: Intermediate



Let me ask you a question: is a watch a time-keeping device, an electrical gadget, a collectible item or a piece of jewelry? (You can pick only one.) Such queries, mandated by governments across the world, cause sleepless nights for the global trade industry. The astronomical penalties for classification errors in import/export declarations are one key reason for worry.

In this session, we will take this Harmonized System (HS) classification problem as an example and talk about how we can build ML systems that handle such complexity and still classify accurately.

The talk will be broken down into individual sections describing the various stages of development. By sticking to a specific use-case, the talk will list the decisions that need to be made, and hopefully some generalizations can be derived from them.

The aim of this talk is to convey the questions and approaches that need to be considered when making ML-driven solutions successful within traditional business workflows.


Introduction [3-4 mins]

An introduction to the ML problem at hand (an import/export related classification task). Examples will be presented to highlight the complexity of the tasks involved. This section will also explain the real-world implications of the system that we aim to develop. The use-case introduced in this section will be referred to throughout the talk.

Starting steps [5 mins]

This section will describe sensible first steps. Approaches to analyzing the dataset will be presented. Expected outcomes will be discussed, together with the need to establish baseline guarantees.

Topics Covered

  • dataset considerations
  • problem solving by pattern matching
  • analyzing existing workflows (aka the system you are looking to make redundant)
  • calibrating expectations
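
As one possible illustration of "problem solving by pattern matching" and of establishing a baseline, the sketch below compares a handful of keyword rules against a majority-class guess. It is not the speaker's method: the rules, the example descriptions and their labels are all made up for illustration, and the labels are coarse 2-digit HS chapters (91 for clocks and watches, 71 for jewelry, 85 for electrical equipment) rather than full HS codes.

```python
import re
from collections import Counter

# Hypothetical keyword rules mapping a description to a 2-digit HS chapter.
# These are illustrative assumptions, not real customs rulings.
RULES = [
    (re.compile(r"\b(watch|watches|clock|clocks)\b", re.I), "91"),
    (re.compile(r"\b(necklace|bracelet|earring)s?\b", re.I), "71"),
    (re.compile(r"\b(charger|battery|headphone)s?\b", re.I), "85"),
]


def rule_baseline(description):
    """Return the chapter of the first matching rule, or None if no rule fires."""
    for pattern, chapter in RULES:
        if pattern.search(description):
            return chapter
    return None


def majority_class(labels):
    """Return the most frequent label: the simplest possible baseline."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(predictions, labels):
    """Fraction of predictions that equal the reference label."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Toy labelled examples (made up for this sketch).
examples = [
    ("Stainless steel wrist watch", "91"),
    ("Gold plated necklace", "71"),
    ("USB wall charger", "85"),
    ("Smart watch with heart-rate sensor", "85"),  # the keyword rule gets this wrong
]

labels = [label for _, label in examples]
rule_predictions = [rule_baseline(text) for text, _ in examples]
majority_predictions = [majority_class(labels)] * len(labels)

rule_accuracy = accuracy(rule_predictions, labels)
majority_accuracy = accuracy(majority_predictions, labels)
```

Any ML model proposed later would need to beat both numbers on held-out data. The last example also shows why pattern matching alone runs out of steam: the same keyword can point to different classes depending on context.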

Advanced considerations [5 mins]

In more complicated scenarios, additional (business-driven) objectives need to be considered before making decisions. This section will talk about how involving other project stakeholders can drastically affect your own internal roadmap towards a successful ML product.

Topics covered

  • business context considerations
  • other stakeholder involvement

Deployment and continuous learning [5 mins]

Given the knowledge gained in the earlier sections, we can now focus on what makes an ML deployment successful. The advantages of a “human-in-the-loop” workflow will also be presented here. By introducing checkpoints at multiple stages, along with continuous monitoring, effective quantitative assessments can be carried out.

Topics covered

  • deployment scenarios
  • human-in-the-loop augmentation
  • effective monitoring outcomes
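
A minimal sketch of what a human-in-the-loop checkpoint with monitoring could look like, assuming the model emits a confidence score per prediction. The `Prediction` type, the `route` and `summarize` helpers, the 0.9 threshold and the sample data are all hypothetical; the proposal does not specify how routing or monitoring is actually done.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    item_id: str
    hs_chapter: str    # predicted label (2-digit chapter, for illustration)
    confidence: float  # model confidence in [0, 1]


def route(pred, threshold=0.9):
    """Auto-accept confident predictions; send the rest to a human reviewer."""
    return "auto" if pred.confidence >= threshold else "review"


def summarize(preds, human_labels, threshold=0.9):
    """Compute simple monitoring numbers for one batch of predictions.

    human_labels maps item_id -> label chosen by the reviewer, for the
    items that have been reviewed so far.
    """
    if not preds:
        raise ValueError("preds must not be empty")
    reviewed = [p for p in preds if route(p, threshold) == "review"]
    checked = [p for p in reviewed if p.item_id in human_labels]
    agreement = None
    if checked:
        agreed = sum(human_labels[p.item_id] == p.hs_chapter for p in checked)
        agreement = agreed / len(checked)
    return {
        "auto_rate": 1 - len(reviewed) / len(preds),
        "review_queue": [p.item_id for p in reviewed],
        "review_agreement": agreement,
    }


# Made-up batch: two confident predictions, two uncertain ones.
batch = [
    Prediction("a1", "91", 0.97),
    Prediction("a2", "71", 0.95),
    Prediction("a3", "91", 0.60),
    Prediction("a4", "85", 0.50),
]
# Reviewer decisions for the uncertain items (also made up).
reviewer = {"a3": "85", "a4": "85"}

report = summarize(batch, reviewer)
```

Tracking the auto-accept rate and the reviewer agreement over time gives two of the quantitative checkpoints mentioned above. The reviewer's corrections can also be fed back as new training data, and the threshold becomes a business decision that trades review cost against the risk of a wrong declaration.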

Conclusion [3-4 mins]

This section will serve as a recap of the entire talk. The steps followed through the earlier sections will be summarized and, hopefully, presented as an approach that others can generalize to their own problems.

Speaker bio

I am a member of the data science team at Semantics3 - building data-powered software for ecommerce-focused companies. Over the years, I have had the chance to dabble in various fields covering data processing, pipeline setup, database management and data science. When not picking locks or scuba diving, I usually blog about my technical adventures at our team’s engineering blog and sometimes speak at conferences.




