Overview

The automatic analysis and understanding of images and videos, a field called Computer Vision, is important in applications including security, healthcare, entertainment, and mobility. The recent success of deep learning methods has revolutionized the field, bringing new developments closer to deployments that benefit end users. This course introduces students to traditional computer vision topics before presenting deep learning methods for computer vision. It covers the basics as well as recent advancements in these areas, helping students become proficient in applying these methods to real-world applications. The course assumes that the student has already completed a full course in machine learning, preferably along with some introduction to deep learning, and builds on these topics with a focus on computer vision.

INTENDED AUDIENCE: Senior undergraduate students and post-graduate students

PREREQUISITES:
○ Completion of a basic course in Machine Learning
○ (Recommended, but not mandatory) Completion of a course in Deep Learning, or exposure to topics in neural networks
○ Knowledge of the basics of probability, linear algebra, and calculus
○ Programming experience, preferably in Python

If you are unsure whether you meet the background requirements for the course, please look at Assignment 0 (both theory and programming). If you are comfortable solving/following these assignments, you are ready for the course.

INDUSTRY SUPPORT: All companies that use computer vision in their products/services (Microsoft, Google, Facebook, Apple, TCS, Cognizant, L&T, etc.)

Syllabus

Week 1: Introduction and Overview
○ Course Overview and Motivation; Introduction to Image Formation, Capture and Representation; Linear Filtering, Correlation, Convolution

Week 2: Visual Features and Representations
○ Edge, Blobs, Corner Detection; Scale Space and Scale Selection; SIFT, SURF; HoG, LBP, etc.

Week 3: Visual Matching
○ Bag-of-words, VLAD; RANSAC, Hough transform; Pyramid Matching; Optical Flow

Week 4: Deep Learning Review
○ Review of Deep Learning, Multi-layer Perceptrons, Backpropagation

Week 5: Convolutional Neural Networks (CNNs)
○ Introduction to CNNs; Evolution of CNN Architectures: AlexNet, ZFNet, VGG, InceptionNets, ResNets, DenseNets

Week 6: Visualization and Understanding CNNs
○ Visualization of Kernels; Backprop-to-image/Deconvolution Methods; Deep Dream, Hallucination, Neural Style Transfer; CAM, Grad-CAM, Grad-CAM++; Recent Methods (IG, Segment-IG, SmoothGrad)

Week 7: CNNs for Recognition, Verification, Detection, Segmentation
○ CNNs for Recognition and Verification (Siamese Networks, Triplet Loss, Contrastive Loss, Ranking Loss); CNNs for Detection: Background of Object Detection, R-CNN, Fast R-CNN, Faster R-CNN, YOLO, SSD, RetinaNet; CNNs for Segmentation: FCN, SegNet, U-Net, Mask R-CNN

Week 8: Recurrent Neural Networks (RNNs)
○ Review of RNNs; CNN + RNN Models for Video Understanding: Spatio-temporal Models, Action/Activity Recognition

Week 9: Attention Models
○ Introduction to Attention Models in Vision; Vision and Language: Image Captioning, Visual QA, Visual Dialog; Spatial Transformers; Transformer Networks

Week 10: Deep Generative Models
○ Review of (Popular) Deep Generative Models: GANs, VAEs; Other Generative Models: PixelRNNs, NADE, Normalizing Flows, etc.

Week 11: Variants and Applications of Generative Models in Vision
○ Applications: Image Editing, Inpainting, Superresolution, 3D Object Generation, Security; Variants: CycleGANs, Progressive GANs, StackGANs, Pix2Pix, etc.

Week 12: Recent Trends
○ Zero-shot, One-shot, Few-shot Learning; Self-supervised Learning; Reinforcement Learning in Vision; Other Recent Topics and Applications