Anthill Inside 2019

A conference on AI and Deep Learning


Industrialized Capsule Networks for Text Analytics

Submitted by Vijay Srinivas Agneeswaran, Ph.D (@vijayagneeswaran) on Wednesday, 3 April 2019



Multi-label text classification is the problem of associating multiple tags or categories with a given text or document. It occurs in numerous real-world scenarios, for instance in news categorization and in bioinformatics (the gene function classification problem; see [Zafer Barutcuoglu et. al 2006]). Kaggle data set is representative of the problem:

Several other interesting problems exist in text analytics, such as abstractive summarization [Chen, Yen-Chun 2018], sentiment analysis, search and information retrieval, entity resolution, document categorization, document clustering and machine translation. Deep learning has been applied to many of these problems. For instance, [Rie Johnson et. al 2015] gives an early approach to applying a convolutional network that makes effective use of word order in text categorization. Recurrent Neural Networks (RNNs) have been effective in various text analytics tasks. Significant progress has also been achieved in language translation by modelling machine translation with an encoder-decoder approach in which the encoder is a neural network [Dzmitry Bahdanau et. al 2014].

However, as shown in [Dan Rosa de Jesus et. al 2018], certain cases require modelling the hierarchical relationships in text data. This is difficult to achieve with traditional deep learning networks, because linguistic knowledge may have to be incorporated into them to reach high accuracy. Moreover, such networks do not consider hierarchical relationships between local features, because the pooling operations of CNNs lose information about those relationships.
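The information loss caused by pooling can be illustrated with a toy example. The sketch below is not taken from the talk: the two feature channels and their activation values are hypothetical, and global max pooling stands in for the pooling layers of a CNN text classifier. It shows that two inputs whose features occur in a different order can pool to the same vector.

```python
def max_pool_over_time(features):
    """Global max pooling: keep the largest activation of each feature channel."""
    # zip(*features) regroups the per-position rows into per-channel columns.
    return [max(channel) for channel in zip(*features)]


# Each row is one token position; each column is a hypothetical feature detector
# (channel 0 responds to a negation word, channel 1 to a positive word).
negation_first = [[0.9, 0.0], [0.0, 0.8], [0.0, 0.0]]
negation_last = [[0.0, 0.8], [0.0, 0.0], [0.9, 0.0]]

pooled_a = max_pool_over_time(negation_first)
pooled_b = max_pool_over_time(negation_last)

# Both inputs pool to the same vector, so the relative position of the
# two features is no longer available to later layers.
print(pooled_a, pooled_b)
```

Capsule networks are motivated in part by this limitation: they replace pooling with a routing step that tries to preserve how lower-level features relate to higher-level ones.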

We show one industrial-scale use case of capsule networks that we have implemented for our client in the realm of text analytics: news categorization. Using the precision, recall and F1 metrics, we show the performance of capsule networks on the news categorization task. Importantly, we discuss how to tune key hyper-parameters of capsule networks, such as batch size, number and size of filters, initial learning rate, number of capsules and dimension of capsules. We also discuss the key challenges faced and how we have industrialized capsule networks using KubeFlow.
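As a reminder of how precision, recall and F1 are computed when each document can carry several labels, here is a minimal sketch. Micro-averaging is an assumption made for illustration (the proposal does not say which averaging scheme was used), and the label names and predictions are made up.

```python
def micro_prf(true_labels, predicted_labels):
    """Micro-averaged precision, recall and F1 over per-document label sets."""
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        tp += len(truth & pred)   # labels predicted and correct
        fp += len(pred - truth)   # labels predicted but wrong
        fn += len(truth - pred)   # labels missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical news articles, each with a set of gold and predicted categories.
y_true = [{"politics", "economy"}, {"sports"}, {"technology", "economy"}]
y_pred = [{"politics"}, {"sports", "economy"}, {"technology", "economy"}]

precision, recall, f1 = micro_prf(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Micro-averaging pools the counts over all labels before dividing, so frequent categories dominate the score; macro-averaging, which averages per-label scores, would weight rare categories equally.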

  1. History of impact of machine learning and deep learning on NLP.
  2. Motivation for capsule networks and how they can be used in text analytics.
  3. Implementation of capsule networks in TensorFlow.
  4. Benchmarking of capsule networks with dynamic routing for a real multi-label text classification use case for news categorization.
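To make the dynamic routing mentioned in the outline concrete, the following is a minimal pure-Python sketch of routing-by-agreement as it is commonly described for capsule networks. It is not the authors' TensorFlow implementation: the function names, the three routing iterations and the toy prediction vectors are all assumptions chosen for illustration.

```python
import math


def squash(s, eps=1e-9):
    """Shrink a vector so its length lies in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq + eps)
    return [scale * x for x in s]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, num_iterations=3):
    """u_hat[i][j] is the prediction of input capsule i for output capsule j."""
    num_in, num_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * num_out for _ in range(num_in)]  # routing logits
    for _ in range(num_iterations):
        c = [softmax(row) for row in b]  # coupling coefficients per input capsule
        v = []
        for j in range(num_out):
            # Weighted sum of the predictions for output capsule j, then squash.
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(num_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        for i in range(num_in):
            for j in range(num_out):
                # Agreement (dot product) between prediction and output raises the logit.
                b[i][j] += sum(p * q for p, q in zip(u_hat[i][j], v[j]))
    return v, c


# Toy input: 3 input capsules, 2 output capsules, 2-dimensional vectors.
# Input capsules 0 and 1 agree on output 0; input capsule 2 favours output 1.
u_hat = [
    [[1.0, 0.0], [0.1, 0.0]],
    [[1.0, 0.1], [0.0, 0.1]],
    [[-0.2, 0.1], [0.5, 0.5]],
]
outputs, couplings = dynamic_routing(u_hat)
print(outputs)
print(couplings)
```

The returned coupling coefficients show how strongly each input capsule is routed to each output capsule in the final iteration; in a real network, `u_hat` would come from learned transformation matrices applied to the lower-level capsule outputs.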

[Zafer Barutcuoglu et. al 2006] Zafer Barutcuoglu, Robert E. Schapire, and Olga G. Troyanskaya. 2006. Hierarchical multi-label prediction of gene function. Bioinformatics 22, 7 (April 2006), 830-836.

[Rie Johnson et. al 2015] Rie Johnson and Tong Zhang. 2015. Effective Use of Word Order for Text Categorization with Convolutional Neural Networks. HLT-NAACL 2015: 103-112.

[Dzmitry Bahdanau et. al 2014] Dzmitry Bahdanau et al. 2014. Neural Machine Translation by Jointly Learning to Align and Translate. CoRR abs/1409.0473.

[Dan Rosa de Jesus et. al 2018] Dan Rosa de Jesus, Julian Cuevas, Wilson Rivera, and Silvia Crivelli. 2018. Capsule Networks for Protein Structure Classification and Prediction.

[Yequan Wang et. al 2018] Yequan Wang, Aixin Sun, Jialong Han, Ying Liu, and Xiaoyan Zhu. 2018. Sentiment Analysis by Capsules. In Proceedings of the 2018 World Wide Web Conference (WWW ’18). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland, 1165-1174.

[Chen, Yen-Chun 2018] Yen-Chun Chen and Mohit Bansal. 2018. Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting. eprint arXiv:1805.11080.


We illustrate how capsule networks can be industrialized:

  1. Overview of NLP and how machine learning and deep learning have been used in various NLP tasks.
  2. Overview of capsule networks and how they help in handling spatial relationships between objects in an image.
  3. We also learn how capsule networks can be applied to text analytics.
  4. We show an implementation of capsule networks for text analytics. We also benchmark our implementation and discuss the hyper-parameters to be tuned.
  5. We also show how to industrialize capsule networks using KubeFlow.

This presentation will also be given at the O’Reilly Artificial Intelligence conference in New York in April 2019. At the Fifth Elephant, we shall showcase the further progress we make after that.

Please note that this session will have two speakers: myself and Abhishek Kumar.

Speaker bio

Dr. Vijay Srinivas Agneeswaran has a Bachelor’s degree in Computer Science & Engineering from SVCE, Madras University (1998), an MS (By Research) from IIT Madras (2001), a PhD from IIT Madras (2008) and a post-doctoral research fellowship in the LSIR Labs, Swiss Federal Institute of Technology, Lausanne (EPFL). He currently heads data sciences at Publicis Sapient, India. He has spent the last eighteen years creating intellectual property and building data-based products in industry and academia. In his current role, he has led the team that delivered real-time hyper-personalization for a global auto major, as well as other work for clients across domains such as retail, banking/finance, telecom and automotive. He has built PMML support into Spark/Storm and realized several machine learning algorithms, such as LDA and Random Forests, over Spark. He led a team that designed and implemented a big data governance product for role-based, fine-grained access control inside Hadoop YARN. He and his team have also built the first distributed deep learning framework on Spark. He has been a professional member of the ACM and the IEEE (Senior) for more than 15 years. He has five full US patents and has published in leading journals and conferences, including IEEE transactions. His research interests include distributed systems, data sciences, Big Data and other emerging technologies.





  • Anwesha Sarkar (@anweshaalt) 6 months ago

    Thank you for submitting the proposal. Please submit your preview video by 20th April at the latest; it helps us close the review process.

  • Vijay Srinivas Agneeswaran, Ph.D (@vijayagneeswaran) Proposer 6 months ago

    Hi Anwesha, have posted a video - please check it out and let me know the next steps in the review process.

  • Zainab Bawa (@zainabbawa) Reviewer 5 months ago

    Hello Vijay,

    Couple of quick questions:

    1. Who is the audience for this proposed talk? Or, what is the background knowledge anyone should have in order to participate in this talk?
    2. What was the format, including duration, for this talk at the O’Reilly conference? Was this a lecture or a paper discussion or a tutorial?
    • Abhishek Kumar (@meabhishekkumar) 5 months ago

      Hi Zainab,

      • Audience : Machine learning/ Deep learning practitioners/aspirants working in the field of text processing, NLP. The talk highlights challenges with existing deep learning techniques and how capsNets solve these issues
      • Our O’Reilly talk was of 45 mins duration ( standard lecture : ) where we talked about background, algorithms, also walked through the demo ( how to run CapsNet on Kubeflow - ML on Kubernetes)
      • Zainab Bawa (@zainabbawa) Reviewer 5 months ago (edited 5 months ago)

        Thanks for the clarification, Abhishek. The available format for presenting this at Anthill Inside is a tutorial. Let us know if you are interested in proceeding ahead.

  • Zainab Bawa (@zainabbawa) Reviewer 5 months ago

    Here is a way to proceed with this proposal: we turn this into a tutorial which is a 60-90 min session, without hands-on component. The idea here is to introduce the audience to a concept or a new approach, and open them up to considering it for their practice. The tutorial will have to show real-world applications and case studies in order for participants to internalize the learnings and take this back to their work.
    If this suggestion is agreeable to you, we will work with you on the next steps, including sharing the format of the tutorial and scheduling this.

  • Zainab Bawa (@zainabbawa) Reviewer 5 months ago

    Also, moving this proposal to Anthill Inside since this is relevant for a segment of the audience that comes to Anthill Inside.

  • Vijay Srinivas Agneeswaran, Ph.D (@vijayagneeswaran) Proposer 5 months ago

    Zainab, while I agree that this could be structured like a 90 minute tutorial, the audience is broader - data scientists, data engineers and data architects too would love to know about capsule networks and how they compare with deep learning and how they can be applied in text analytics.

    So, I would prefer to keep this in the Fifth Elephant and not move it to Anthill Inside.

    We can work on making this a 90 minute tutorial meanwhile.

  • Abhishek Balaji (@booleanbalaji) Reviewer 2 months ago (edited 2 months ago)

    Hi Vijay,

    A couple more points of feedback on the proposal from reviewers:

    • The topic is interesting and relevant, but there is very little information on capsule networks in the slides.
    • The slides currently focus too much on the introduction to the concept and do not really present much value to the audience.
    • The slides are not organized correctly and seem to be a mix of multiple slide decks.

    We’ll need to see a lot more details about capsule networks and how they fit in your workflow.

  • Vijay Srinivas Agneeswaran, Ph.D (@vijayagneeswaran) Proposer 2 months ago

    This review seems a bit funny, as this talk has now been given at two great conferences (O’Reilly’s AI conference and ODSC) and got great feedback at both. The proposal is hereby withdrawn from further consideration.
