The Fifth Elephant 2016

India's most renowned data science conference

RNNs for multimodal information fusion

Submitted by Om Deshmukh (@omdesh) on Thursday, 9 June 2016


Technical level: Intermediate

Section: Crisp talk

Status: Submitted


Abstract

Data generated from real-world events are usually temporal and contain multimodal information, such as audio, visual, depth, and sensor streams, which must be intelligently combined for classification tasks. I will discuss a novel, generalized deep neural network architecture in which temporal streams from multiple modalities can be combined. The hybrid Recurrent Neural Network (RNN) exploits the complementary nature of the multimodal temporal information by allowing the network to learn both modality-specific temporal dynamics and the dynamics in a shared multimodal feature space. The efficacy of the proposed model is demonstrated on multiple datasets.
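To make the fusion idea concrete, below is a minimal sketch of such a hybrid architecture: modality-specific RNNs model each stream's own temporal dynamics, and a fusion RNN models dynamics in the joint feature space. All names, layer sizes, the choice of LSTM cells, the two example modalities (audio and video), and concatenation-based fusion are illustrative assumptions, not the configuration presented in the talk.

import torch
import torch.nn as nn

class HybridMultimodalRNN(nn.Module):
    """Hypothetical sketch: per-modality RNNs feeding a fusion RNN."""

    def __init__(self, audio_dim, video_dim, hidden_dim, num_classes):
        super().__init__()
        # Modality-specific RNNs capture per-stream temporal dynamics.
        self.audio_rnn = nn.LSTM(audio_dim, hidden_dim, batch_first=True)
        self.video_rnn = nn.LSTM(video_dim, hidden_dim, batch_first=True)
        # A fusion RNN models temporal dynamics in the joint feature space.
        self.fusion_rnn = nn.LSTM(2 * hidden_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, audio, video):
        # audio: (batch, time, audio_dim); video: (batch, time, video_dim)
        a_out, _ = self.audio_rnn(audio)
        v_out, _ = self.video_rnn(video)
        # Concatenate per-timestep hidden states across modalities, so the
        # fusion RNN sees a joint multimodal feature at every timestep.
        fused_in = torch.cat([a_out, v_out], dim=-1)
        _, (h_n, _) = self.fusion_rnn(fused_in)
        # Classify from the fusion RNN's final hidden state.
        return self.classifier(h_n[-1])

# Example usage with made-up feature sizes (40-d audio, 2048-d video frames).
model = HybridMultimodalRNN(audio_dim=40, video_dim=2048,
                            hidden_dim=128, num_classes=10)
audio = torch.randn(8, 50, 40)    # batch of 8, 50 timesteps of audio features
video = torch.randn(8, 50, 2048)  # aligned 50 timesteps of video features
logits = model(audio, video)      # shape: (8, 10)

The key design choice this sketch illustrates is that fusion happens at every timestep rather than only at the end, so the network can learn both within-modality and cross-modality temporal structure.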

Outline

Deep learning overview, RNN overview, the proposed hybrid model, performance and comparisons

Speaker bio

http://www.xrci.xerox.com/profile-main/114
