Abstract: Over the last few years, Convolutional Neural Networks (CNNs) have been shown to deliver excellent results in a broad range of low- and high-level vision tasks, spanning effectively the whole spectrum of computer vision problems. In this talk we will present recent research progress along two complementary directions. In the first part we will present research efforts on integrating established computer vision ideas with CNNs, thereby allowing us to incorporate task-specific domain knowledge in CNNs. We will present CNN-based adaptations of structured prediction techniques that use discrete (DenseCRF - DeepLab) and continuous (Deep Gaussian CRF) energy-based formulations, and will also present methods to incorporate ideas from multi-scale processing, Multiple-Instance Learning, Spectral Clustering, and Metric Learning into CNNs. In the second part of the talk we will turn to designing a generic architecture that can tackle a multitude of tasks jointly, aiming at a ‘Swiss knife’ for computer vision. We call this network an ‘UberNet’ to underline its overarching nature. We will introduce techniques that allow us to train an UberNet using datasets with diverse annotations, while also handling the memory limitations of current hardware. The proposed architecture is able to jointly address (a) boundary detection, (b) saliency detection, (c) normal estimation, (d) semantic segmentation, (e) human part segmentation, (f) human boundary detection, and (g) region proposal generation and object detection in 0.7 seconds per frame, with a level of performance that is comparable to the current state of the art on these tasks.
Bio: Iasonas Kokkinos obtained the Diploma of Engineering in 2001 and the Ph.D. degree in 2006 from the School of Electrical and Computer Engineering of the National Technical University of Athens in Greece, and the Habilitation degree in 2013 from Université Paris-Est. In 2006 he joined the University of California at Los Angeles as a postdoctoral scholar, and in 2008 he joined the faculty of the Department of Applied Mathematics of Ecole Centrale Paris (CentraleSupelec), working as an associate professor in the Center for Visual Computing of CentraleSupelec and as an affiliate researcher at INRIA-Saclay. In 2016 he joined University College London and Facebook Artificial Intelligence Research. His research is currently focused on deep learning for computer vision, in particular on structured prediction for deep learning and multi-task learning architectures. He has been awarded a young researcher grant by the French National Research Agency, serves regularly as a reviewer for all major computer vision conferences and journals, has served as associate editor for the Image and Vision Computing journal, and is now serving as associate editor for the Computer Vision and Image Understanding journal.