VIQA 2020 : The First International Workshop on Video Question Answering and Image Question Answering


When Jan 11, 2021 - Jan 11, 2021
Where Milan, Italy
Submission Deadline Oct 17, 2020
Notification Due Nov 10, 2020
Final Version Due Nov 15, 2020
Categories    visual question answering   computer vision   video   image

Call For Papers

Image Question Answering and Video Question Answering involve building models that can analyze the visual content of an image or a video and produce a meaningful answer to questions about that content. Both tasks involve spatial, frame-level reasoning. Video Question Answering additionally requires temporal, video-level reasoning, which further raises the difficulty of the task. Solving these tasks would demonstrate that models can jointly analyze and reason over visual and textual content at a human level, by learning to pinpoint objects of interest in a video (or image) and to identify and reason about their interactions. Image and Video Question Answering thus represent a challenging but fundamental task for both the Computer Vision and Natural Language Processing communities.

The first Video and Image Question Answering (VIQA) Workshop will be held at ICPR2020 and will focus on these two fundamental tasks, while covering several tasks in both the Computer Vision and NLP fields. It aims to bring together people from both academia and industry who are interested in these topics, in order to stimulate the sharing of state-of-the-art approaches, best practices, and future directions among participants. Moreover, a special issue on Video and Image Question Answering will be organized in a top journal.

Topics of interest relate to Image and Video Question Answering and include, but are not limited to:

Datasets and evaluation
Deep learning methods for vision and language
Egocentric visual question answering
Image question answering
Multimodal question answering
Representation learning
Transfer learning for vision and language
Video analysis and understanding
Video question answering
Video summarization
Vision and language and/or other modalities
Vision applications and systems
Visual reasoning and logical representation

