Course Information

Course Title: Deep Learning for Computer Vision: Seminal Work

Course ID: 263-5904-00S

Lecturers: Dr. Zuria Bauer

Teaching Assistants: Zador Pataki, Philipp Lindenberger, Paper Supervisors

Venue: Mo 16-18h, CAB G 57

Course Description:

This seminar covers seminal papers on deep learning for computer vision. Students will present and discuss the papers and gain an understanding of the most influential research in this area, both past and present. The objectives of this seminar are twofold. First, it aims to provide a solid understanding of key contributions to the field of deep learning for vision, including a historical perspective as well as recent work. Second, students will learn to read and analyse original research papers critically and judge their impact, as well as how to give a scientific presentation and lead a discussion on their topic.

Each student chooses one paper from the provided collection to present during the seminar. The seminar assistants will support students in preparing their presentations.

Important Information

Recommended Textbooks for Reference
Image Processing
  • Class Material: Class material will be posted on Moodle. Assignments are also submitted through Moodle.
  • Assignments: Each student will present one paper. Everybody is encouraged to read each paper before it is presented and to engage in the discussion following the presentation. To foster interesting discussions, each paper will also be assigned to a "critic" who studies the paper and briefly presents a summary of its weaknesses.
  • Grading: Each student will be graded on their presentation (70%) and their participation in the discussions assigned to them (20%). A small participation grade (10%) rewards students who ask questions about papers, including papers they are not assigned to.
  • Attendance: Attendance is required to pass the course (3 absences allowed). The class is held in person unless otherwise stated.

Paper List:

Paper Title Supervisor
Attention Is All You Need
Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation
Deep Residual Learning for Image Recognition
Denoising Diffusion Probabilistic Models
Depth Map Prediction from a Single Image using a Multi-Scale Deep Network
DROID-SLAM: Deep Visual SLAM for Monocular, Stereo, and RGB-D Cameras
Dropout: A Simple Way to Prevent Neural Networks from Overfitting
DUSt3R: Geometric 3D Vision Made Easy
Emerging Properties in Self-Supervised Vision Transformers (DINO)
High-Resolution Image Synthesis with Latent Diffusion Models
ImageNet Classification with Deep Convolutional Neural Networks
Learning Transferable Visual Models From Natural Language Supervision
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NetVLAD: CNN architecture for weakly supervised place recognition
OpenScene: 3D Scene Understanding with Open Vocabularies
Segment Anything
SuperPoint: Self-Supervised Interest Point Detection and Description
Understanding Batch Normalization
U-Net: Convolutional Networks for Biomedical Image Segmentation
Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?
You Only Look Once: Unified, Real-Time Object Detection