Advances in Learning Multimodal Attention Networks


Multimodal attention networks, inspired by the human attention mechanism and increasingly popular in the machine learning literature, reduce the redundancy in multi-source information so that multimodal tasks can be performed efficiently. Motivated by this, neural networks adopt attention networks as a critical component for solving multimodal problems efficiently, usually by reducing the number of channels. Attention networks are probabilistic models: a probability distribution over multiple sources supplies the coefficients of a convex combination, which reduces the channels. This seminar begins with introductory remarks on attention networks for beginners, followed by an introduction to bilinear attention networks, which learn the multimodal interaction between two multi-channel modalities and are among the successful methods for visual question answering. Finally, recent advances in multimodal attention networks for graph reasoning will be discussed, along with their implications.
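
For readers new to the topic, the idea of a probability distribution providing the coefficients of a convex combination can be illustrated with a minimal sketch. It is not the method presented in the seminar: the dot-product scoring function, the softmax normalization, and the names query, channels, and attend are all assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into a probability distribution (non-negative, sums to 1)."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, channels):
    """Reduce several channel vectors to one via a convex combination.

    query:    list of d floats (hypothetical conditioning vector)
    channels: list of n vectors, each a list of d floats
    """
    # One relevance score per channel; dot product is an assumed choice.
    scores = [sum(q * c for q, c in zip(query, ch)) for ch in channels]
    # The attention distribution over the n channels.
    weights = softmax(scores)
    d = len(query)
    # Weighted sum: the weights act as convex-combination coefficients,
    # so n channels are reduced to a single d-dimensional vector.
    pooled = [sum(w * ch[i] for w, ch in zip(weights, channels)) for i in range(d)]
    return weights, pooled


# Three 2-dimensional channels reduced to one vector.
channels = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]
weights, pooled = attend(query, channels)
print(weights, pooled)
```

Because the weights are non-negative and sum to one, each coordinate of the pooled vector lies between the smallest and largest value of that coordinate across the input channels.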

Summer AI Seminar Series, POSTECH
POSTECH, Pohang, Korea