Deep learning: A convoluted overview with recurrent themes and beliefs.
The amount of data needed to represent the world around us is daunting. Yet somehow we need to capture much of that information, explicitly or implicitly, to create ‘intelligent’ machines. It was formerly believed that explicitly capturing all this information through clever programming would give rise to Artificial Intelligence. But with every passing decade, despite the rise in computational power, this approach only resulted in hype, and in at least two AI winters.
Instead, the field moved on to trying to represent the world implicitly. If a machine can mathematically infer the variance in the data, and then also infer how its different parts interact, it could be described as ‘understanding’ those factors. But we simply don’t have a large enough database to capture all that variance. A plausible alternative is a hierarchical model of the world, where each level of the hierarchy transforms the ‘raw’ data into a more abstract representation. The higher the level, the more abstract the representation, culminating in human-recognisable concepts such as scenes or language.
It would be preferable if these abstractions were learnt by the machines themselves, and we have, in a sense, known how to get started on that for several decades now. Pretty much every algorithm you will hear about in these talks was invented decades ago. But historically we had neither sufficient computational power nor large enough datasets to seed these machines and put those algorithms to use. Instead, Machine Learning was dominated by hand-crafted feature descriptors and support vector machines. These models were simpler, and used another kind of implicit knowledge, namely human intuition, to great effect.
With the arrival of graphics processors, and of programming tools to exploit them, in the mid-2000s, deep architectures that automatically learn these abstract representations suddenly became plausible and could be trained quickly. At the same time, the vast amount of data on the web, both images and text, was available and ready to be harnessed. Now, hopefully, we can start building machines that better ‘understand’ the world around us, and we could maybe, just maybe, ask them some intelligent questions.
Hopefully the answer is not 42.
This talk is meant to serve as an overview for the conference. It will briefly describe the history and origins of deep learning, contrast it with ‘traditional’ machine learning, and touch upon the dominant concepts that shape its practice today.
Dr. Anand Chandrasekaran is a founder and the CTO of Mad Street Den, an AI company specializing in computer vision. In addition to an academic background in the fields of neuroscience and neuromorphic engineering, he has been a member of teams working on DARPA projects in cognition and vision.