Deep Learning Howlers: Downside of Learning only Statistical Regularities
Recent work (https://arxiv.org/pdf/1711.11561.pdf) has shown that deep convolutional networks do not learn higher-level abstract concepts, but only statistical regularities. We investigate this claim by testing open source deep learning libraries.
It turns out that deep learning networks for image annotation (especially neuraltalk2 and img2txt) commit howlers: funny mistakes. For instance, an image of goats climbing on trees is annotated with “birds flying in the air”, while an image of rocks is annotated as “elephant standing in the middle of a river”! This talk presents a few of these howlers, showing the actual images fed to these CNNs and the funny annotations they produce.
The intent is to explore deficiencies in existing deep learning networks. We also explore how CNNs fail to capture spatial relationships between objects in an image, which leads to more funny misclassifications. The talk outlines some of these misclassifications as well.
The talk gives a brief overview of capsule networks, which have been recently proposed and which help in capturing spatial relationships between objects in the image. A capsule outputs a vector, as opposed to a neuron, which outputs only a scalar. The length of the vector represents the probability that the capsule has detected a low-level object, while the orientation of the vector encodes the abstract internal state of that object. We show how capsule networks avoid some of the above gaffes of CNNs.
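To make the "length as probability" idea concrete, here is a minimal NumPy sketch of the squashing nonlinearity commonly used in the capsule network literature. It is an illustration under stated assumptions, not necessarily the exact formulation used in the talk; the names `squash` and `capsule_probability` and the `eps` constant are chosen here for the example.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Shrink a capsule's raw output so its length lies in [0, 1).

    The direction of `s` is preserved; only the length is rescaled.
    `eps` is an assumed small constant that avoids division by zero.
    """
    sq_norm = np.sum(np.square(s), axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)          # approaches 1 for long vectors
    return scale * s / np.sqrt(sq_norm + eps)  # unit direction times scale

def capsule_probability(v, axis=-1):
    """Read the detection probability off the length of a squashed vector."""
    return np.linalg.norm(v, axis=axis)

# A short raw vector maps to a low probability, a long one to a high one.
weak = squash(np.array([0.1, 0.0, 0.0]))
strong = squash(np.array([10.0, 0.0, 0.0]))
print(capsule_probability(weak), capsule_probability(strong))
```

Because only the length is rescaled, the direction of the vector remains free to encode the object's internal state.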
The talk ends with a dive into the deficiencies of Recurrent Neural Networks (RNNs), which are mainly computational. RNNs suffer from the vanishing gradient problem, which LSTMs overcome. LSTMs, however, are computationally expensive. We illustrate the use of hierarchical neural attention encoders as an alternative to LSTMs.
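The vanishing gradient problem mentioned above can be seen in a toy setting. The sketch below assumes a hypothetical one-unit RNN, h_t = tanh(w * h_{t-1}), with no inputs, and multiplies out the chain rule across time steps; the function name and the chosen weight are illustrative only.

```python
import math

def gradient_through_time(w, h0, steps):
    """Return d h_T / d h_0 for the scalar RNN h_t = tanh(w * h_{t-1})."""
    h = h0
    grad = 1.0
    for _ in range(steps):
        h = math.tanh(w * h)
        # Chain rule: d tanh(w * h_prev) / d h_prev = w * (1 - tanh(...)^2)
        grad *= w * (1.0 - h * h)
    return grad

# With |w| < 1 every factor is below 1, so the product shrinks geometrically.
for steps in (1, 10, 50):
    print(steps, gradient_through_time(0.5, 0.3, steps))
```

Each step contributes a factor of magnitude at most |w|, so the signal reaching early time steps decays exponentially with sequence length. This is the effect that gated architectures such as LSTMs are designed to mitigate.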
- Convolutional neural networks and their deficiencies.
- Image annotation howlers produced by neuraltalk2 and img2txt in TensorFlow, including goats climbing trees annotated as birds and people lifting goats annotated as people with dogs.
- Introduction to Capsule Networks.
- Deficiencies of RNNs and how hierarchical neural attention encoders overcome these problems.
Basics of deep learning.
Dr. Vijay Srinivas Agneeswaran has a Bachelor’s degree in Computer Science & Engineering from SVCE, Madras University (1998), an MS (By Research) from IIT Madras (2001), a PhD from IIT Madras (2008) and a post-doctoral research fellowship in the LSIR Labs, Swiss Federal Institute of Technology, Lausanne (EPFL). He is now a Senior Director of Technology and heads the data sciences team of SapientRazorfish in India. He has spent the last ten years creating intellectual property and building products in the big data area at Oracle, Cognizant and Impetus. He has built PMML support into Spark/Storm and implemented several machine learning algorithms, such as LDA and Random Forests, over Spark. He led a team that designed and implemented a big data governance product for role-based, fine-grained access control inside Hadoop YARN. He and his team have also built the first distributed deep learning framework on Spark. He has been a professional member of the ACM and the IEEE (Senior) for the last 10+ years. He has four full US patents and has published in leading journals and conferences, including IEEE transactions. His research interests include distributed systems, data sciences, Big-Data and other emerging technologies. He has been an invited speaker at several national and international conferences, such as O’Reilly’s Strata Big-data conference series. He was an editorial speaker at the Strata Data conference in London in May 2017 and will also be speaking at the Strata Data 2018 conference in San Jose. He is also on the program committee of Strata Data Singapore 2017 as well as Strata Data, San Jose, 2018. He lives in Bangalore with his wife, son and daughter and enjoys researching the history and philosophy of Egypt, Babylonia, Greece and India.