RNNs for multimodal information fusion

Jul 2016

25 Mon

26 Tue

27 Wed

28 Thu 08:30 AM – 06:25 PM IST

29 Fri 08:30 AM – 06:15 PM IST

30 Sat 08:45 AM – 05:00 PM IST

31 Sun 08:15 AM – 06:00 PM IST

NIMHANS Convention Centre

All submissions

Previous Next

RNNs for multimodal information fusion

Submitted Jun 9, 2016

Section: Crisp talk Technical level: Intermediate

Data generated from real world events are usually temporal and contain multimodal information such as audio, visual, depth, sensor etc. which are required to be intelligently combined for classification tasks. I will discuss a novel generalized deep neural network architecture where temporal streams from multiple modalities can be combined. The hybrid Recurrent Neural Network (RNN) exploits the complimentary nature of the multimodal temporal information by allowing the network to learn both modality-specific temporal dynamics as well as the dynamics in a multimodal feature space. The efficacy of the proposed model is also demonstrated on multiple datasets.