
Deep Learning for Computer Vision

WHAT WILL BE COVERED?

ANY BACKGROUND REQUIRED?

HOW WILL I BE GRADED?

Prof. G.S. Jison Hsu 徐繼聖


• Artificial Vision Laboratory
• National Taiwan University of
Science and Technology



About Me and My TAs
Jison G.S. Hsu, E1-445, Ext. 3234
• My interests: computer vision and deep learning
• Home Page: https://sites.google.com/site/jisonhsuprofile2/
• Office hours (OH): 13:30~14:30 on Wednesday
• E-mail: jison@mail.ntust.edu.tw

TA:
• E1-248, E1-246 Ext. 7246 or 3738, preliminary OH: 13:00~15:00 Friday.
• Wei-Chun Hsieh (Chris): chrishsie84@gmail.com
• Wei-Jun Lin (Jimmy): Jimmmarathon@gmail.com
• Tzu-Yang Su (Young): ades95207@gmail.com



Pattern Recognition Lab at E1-248
[Figure: NTUST campus map; labels: 2F, E1-223, E1-245]



Teach Computers to Understand Pictures, Fei-Fei Li

http://yt1.piee.pw/4cwlrh [18:02]
Deep Learning for Computer Vision (1/3)
• Supervised and Unsupervised Learning

- What is Supervised and Unsupervised Learning: https://www.youtube.com/watch?v=rHeaoaiBM6Y&ab_channel=EyeonTech

• Classification and Regression

- Training and Testing Sets: https://reurl.cc/ER7dzm

- Performance Measurements (Precision, Recall, F1 Score): https://reurl.cc/AOqZ38

• Convolutional Neural Networks

- What is a Neural Network: https://reurl.cc/AOqvz3


Deep Learning for Computer Vision (2/3)
• Detection

- Introduction to Object Detection: https://reurl.cc/V16VWQ

- Faster-RCNN: https://reurl.cc/Qb3xD9

- Mask-RCNN: https://reurl.cc/V1WNrZ

- YOLO algorithm: https://reurl.cc/leYgE9

- Detectron2: https://reurl.cc/W1d98O



Deep Learning for Computer Vision (3/3)
• Few Shot Learning

- Introduction to Few Shot Learning: https://reurl.cc/3YMXkM

• Transformer

- Self-attention: https://reurl.cc/YX3q7a

- Introduction to Transformer: https://reurl.cc/dWanED



Deep Learning for Computer Vision
• Deep Learning Tool Tutorials
  - Coding for Beginners:
    - Python, https://reurl.cc/eOxYQj
    - PyTorch, https://reurl.cc/5plnVV
    - TensorFlow, https://reurl.cc/D395e5
  - Deep Learning Software:
    - PyTorch Webpage, https://pytorch.org/
    - TensorFlow Webpage, https://www.tensorflow.org/
  - Data Processing Tools:
    - GitHub: LabelImg (Bounding Box), https://github.com/tzutalin/labelImg
    - GitHub: Labelme (Segmentation), https://github.com/wkentaro/labelme



How Are You Graded?

• Mid-term Exam, 11/01: 20%
• Mid-term Presentation, 11/15: 15%
• Final Presentation, 12/13: 20%
• Final Exam, 12/20: 20%
• In-Class/Take-Home Exercises: 25%



Schedule and Contents
9/6 First class, Introduction, Classification, Regression, Precision & Recall
9/13 Perceptron, Activation Function, Gradient Descent, Backpropagation
9/20 Convolution, Loss Functions, Convolutional Neural Networks (CNNs), Batch Normalization
9/27 Fast-RCNN, Faster-RCNN
10/4 Segmentation, Mask-RCNN
10/11 YOLOv2, YOLOv3
10/18 YOLOv4, YOLOv5
10/25 Detectron2
11/1 Mid-term Exam
11/8 Invited Talk
11/15 1st Project Presentation
11/22 AE, VAE, Diffusion Model
11/29 Transformer
12/6 Vision Transformer, Swin Transformer, Few Shot Learning
12/13 Final Project Presentation
12/20 Final Exam



Your Semester Project
• You will form a study group of three. Please submit the names of your teammates
by 9/27 (two weeks from today). The sooner you form your group, the earlier you
can start working on your assignment. If you have not formed a group by 9/27,
we will assign you to one.

• We will assign a problem and a possible solution to each group. You will study
the possible solution in depth, and try to improve it or find other possible
solutions for comparison. Your problem and solutions will be presented in the
mid-term and final presentations.



Project List
Topic | Paper | Type | Conference | Year
1 | Rank consistent ordinal regression for neural networks with application to age estimation | Facial Age Estimation | PRL | 2020
2 | When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework | Facial Age Estimation | CVPR | 2021
3 | MagFace: A Universal Representation for Face Recognition and Quality Assessment | Face Recognition | CVPR | 2021
4 | AdaFace: Quality Adaptive Margin for Face Recognition | Face Recognition | CVPR | 2022
5 | Detecting Deepfakes with Self-Blended Images | Face Recognition | CVPR | 2022
6 | MSML: Enhancing Occlusion-Robustness by Multi-Scale Segmentation-Based Mask Learning for Face Recognition | Face Recognition | AAAI | 2022



Project List
Topic | Paper | Type | Conference | Year
7 | Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition | Scene Text Recognition | CVPR | 2021
8 | CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition | Scene Text Recognition | IJCV | 2023
9 | Towards Accurate Scene Text Recognition with Semantic Reasoning Networks | Scene Text Recognition | CVPR | 2020
10 | Fourier Contour Embedding for Arbitrary-Shaped Text Detection | Scene Text Detection | CVPR | 2021
11 | Real-time Scene Text Detection with Differentiable Binarization | Scene Text Detection | AAAI | 2020
12 | DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer | Scene Text Detection | AAAI | 2023



Project List
Topic | Paper | Type | Conference | Year
13 | YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors | Object Detection | CVPR | 2023
14 | Deformable DETR: Deformable Transformers for End-to-End Object Detection | Object Detection | ICLR | 2021
15 | Instance-aware, Context-focused, and Memory-efficient Weakly Supervised Object Detection | Object Detection | CVPR | 2020
16 | End-to-End Object Detection with Transformers | Segmentation | ECCV | 2020
17 | SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers | Segmentation | NeurIPS | 2021
18 | MedSegDiff: Medical Image Segmentation with Diffusion Model | Segmentation | MIDL | 2023
19 | AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting | Scene Text Spotting | ECCV | 2020
20 | SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition | Scene Text Spotting | CVPR | 2022

Presentation and Problem Solving Skills
• Abundant material on every topic is available online. The following are a few
samples:

• [CVPR 2021 Oral] MagFace: A Universal Representation for Face Recognition and
Quality Assessment

- https://www.youtube.com/watch?v=BcwC3VBZWJU&t=52s&ab_channel=MengIrving

• [CVPR 2021 Oral] Dictionary-guided Scene Text Recognition

- https://www.youtube.com/watch?v=3h-RRloXLMI&ab_channel=VinAIResearch



Presentation Example

CVPR 2022: Dual-Generator Face Reenactment


