▪ Style transfer
▪ Generative networks creating images and voxels
▪ Adversarial networks (DCGAN) – still early but promising
▪ Artomatix
▪ Allegorithmic
▪ Autodesk
2
STYLE TRANSFER
Something Fun
▪ Doodle a masterpiece!
▪ Uses a CNN to take the “style” from one image (the style image) and apply it to another (the content image)
5
GAMEWORKS: MATERIALS & TEXTURES
Using DL for Game Development & Content Creation
▪ Set of tools targeting the game industry using machine learning and deep learning
▪ Launched at the Game Developers Conference in March; the tools run as a web service
▪ Sign up for the Beta at: https://gwmt.nvidia.com
▪ Tools in this initial release:
▪ Photo to Material: 2Shot
▪ Texture Multiplier
▪ Super-Resolution
6
PHOTO TO MATERIAL
The 2Shot Tool
▪ https://mediatech.aalto.fi/publications/graphics/TwoShotSVBRDF/
▪ Or align later
[Figure: output maps: diffuse albedo, specular, normals, glossiness, anisotropy]
8
TEXTURE MULTIPLIER
Organic variations of textures
▪ https://arxiv.org/pdf/1505.07376.pdf
▪ Artomatix
▪ Similar product “Texture Mutation”
▪ https://artomatix.com/
9
SUPER RESOLUTION
10
SUPER RESOLUTION
Zoom.. ENHANCE!
“Zoom in on the license plate.” “Sure!”
“Can you enhance that?” “OK!”
11
SUPER RESOLUTION
The task at hand
▪ Given a low-resolution image (W × H), construct a high-resolution image (n*W × n*H)
[Diagram: low-resolution image → Upscale (magic?) → high-resolution image]
12
UPSCALE: CREATE MORE PIXELS
An ill-posed task?
[Figure: the pixels of the given image are spread over the grid of the upscaled image; every pixel in between is unknown (?)]
13
TRADITIONAL APPROACH
▪ Interpolation (bicubic, lanczos, etc.)
▪ Interpolation + Sharpening (and other filtration)
▪ Too many possibilities (an 8x8 grayscale image has $256^{8 \cdot 8} = 2^{512} \approx 1.3 \times 10^{154}$ possible pixel combinations!)
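The two points on this slide can be checked with a short sketch. The first part counts the possible 8x8 grayscale images (256 levels per pixel) using exact integer arithmetic. The second part is a minimal bilinear interpolation, written from scratch as an illustration; `upscale_bilinear` and the sample values are hypothetical names chosen here, not part of any tool mentioned in the deck.

```python
import numpy as np

# Number of distinct 8x8 grayscale images with 256 levels per pixel.
num_images = 256 ** (8 * 8)
exponent = len(str(num_images)) - 1  # power of ten of the leading digit


def upscale_bilinear(img, n):
    """Upscale a 2-D grayscale array by an integer factor n (bilinear)."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    # Positions of the output pixel centers in source coordinates.
    ys = np.clip((np.arange(h * n) + 0.5) / n - 0.5, 0, h - 1)
    xs = np.clip((np.arange(w * n) + 0.5) / n - 0.5, 0, w - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]  # vertical blend weights
    wx = (xs - x0)[None, :]  # horizontal blend weights
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy


lr_img = np.array([[0.0, 100.0], [100.0, 200.0]])
hr_img = upscale_bilinear(lr_img, 4)  # 2x2 -> 8x8
print(exponent, hr_img.shape)
```

Interpolation only blends the given pixels, so the result never leaves the value range of the input; it cannot add detail that was not already there.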
14
A NEW APPROACH
First: narrow the possible set
▪ Photos
▪ Natural images
▪ Textures
Compress → Reconstruct, using prior information + constraints
16
PATCH-BASED MAPPING: TRAINING
[Diagram: training images → (LR, HR) pairs of patches → training → model params]
▪ The model maps a low-resolution patch to a high-resolution patch
17
PATCH-BASED MAPPING
LR patch $x_L$ → Encode → Decode → HR patch $x_H$
18
PATCH-BASED MAPPING: SPARSE CODING
LR patch $x_L$ → Encode → Sparse code → Decode → HR patch $x_H$
19
PATCH FEATURES & RECONSTRUCTION
An image patch can be reconstructed as a sparse linear combination of features
Features are learned from the dataset over time
$x = Dz = d_1 z_1 + \dots + d_K z_K$
$x$ - patch
$D$ - dictionary
$z$ - sparse code
Example: $x = 0.8 \cdot d_{36} + 0.3 \cdot d_{42} + 0.5 \cdot d_{63}$
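The reconstruction of a patch from a dictionary and a sparse code can be sketched in a few lines of NumPy. The dictionary here is random rather than learned, and the patch size (8x8) and dictionary size (128) are assumptions made only for illustration; the three coefficients are the ones from the slide's example, applied to 0-based column indices.

```python
import numpy as np

rng = np.random.default_rng(0)
patch_size = 8 * 8  # flattened 8x8 patch (assumed size)
num_atoms = 128     # K, number of dictionary features (assumed)

# Dictionary D: one unit-norm feature per column (random, not learned).
D = rng.normal(size=(patch_size, num_atoms))
D /= np.linalg.norm(D, axis=0, keepdims=True)

# Sparse code z: all zeros except three coefficients.
z = np.zeros(num_atoms)
z[[36, 42, 63]] = [0.8, 0.3, 0.5]

# Reconstruction x = D z, which only touches the three active features.
x = D @ z
x_manual = 0.8 * D[:, 36] + 0.3 * D[:, 42] + 0.5 * D[:, 63]
print(np.allclose(x, x_manual), np.count_nonzero(z))
```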
20
GENERALIZED PATCH-BASED MAPPING
LR patch → Mapping → high-level representation of the LR patch → Mapping in feature space → high-level representation of the HR patch → Mapping → HR patch
The high-level representations are the “features”
21
GENERALIZED PATCH-BASED MAPPING
LR patch → Mapping ($W_1$) → Mapping in feature space ($W_2$) → Mapping ($W_3$) → HR patch
$W_1$, $W_2$, $W_3$ - trainable parameters
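A minimal sketch of the three-stage mapping, assuming each stage is a matrix multiplication and that a ReLU non-linearity follows the first two stages (the slide does not specify either). The patch sizes, feature dimension and the name `map_patch` are hypothetical, and the weights are random, so the output is not a meaningful image.

```python
import numpy as np

rng = np.random.default_rng(0)
lr_side, hr_side, feat_dim = 5, 10, 32  # assumed sizes (2x upscaling)

# Trainable parameters W1, W2, W3 (random here, i.e. untrained).
W1 = rng.normal(scale=0.1, size=(feat_dim, lr_side * lr_side))
W2 = rng.normal(scale=0.1, size=(feat_dim, feat_dim))
W3 = rng.normal(scale=0.1, size=(hr_side * hr_side, feat_dim))


def relu(a):
    return np.maximum(a, 0.0)


def map_patch(lr_patch):
    """LR patch -> LR features -> HR features -> HR patch."""
    f_lr = relu(W1 @ lr_patch.ravel())  # mapping into feature space
    f_hr = relu(W2 @ f_lr)              # mapping inside feature space
    return (W3 @ f_hr).reshape(hr_side, hr_side)  # mapping back to pixels


hr_patch = map_patch(rng.random((lr_side, lr_side)))
print(hr_patch.shape)
```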
22
MAPPING OF THE WHOLE IMAGE
Using Convolutions
LR image → Convolutional operators → HR image
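Applying one patch mapping at every pixel position is what a convolution does. The sketch below illustrates this with a single hand-written filter, assuming an odd kernel size and edge padding; `conv2d_same` is a hypothetical helper, and like most deep learning frameworks it computes a cross-correlation (the kernel is not flipped).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv2d_same(img, kernel):
    """Apply `kernel` to the patch around every pixel; output size = input size."""
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    windows = sliding_window_view(padded, (kh, kw))  # shape (H, W, kh, kw)
    return np.einsum("ijkl,kl->ij", windows, kernel)


rng = np.random.default_rng(0)
img = rng.random((6, 6))
kernel = rng.normal(size=(3, 3))
out = conv2d_same(img, kernel)

# The output at (2, 3) is the same filter applied to the 3x3 patch around it.
patch_result = np.sum(img[1:4, 2:5] * kernel)
print(np.isclose(out[2, 3], patch_result))
```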
23
AUTO-ENCODERS
24
AUTO-ENCODER
Encode Decode
features
25
AUTO-ENCODER
[Diagram: $x$ → auto-encoder → $y$]
Parameters: $W$
Inference: $y = F_W(x)$
Training: $W = \arg\min_W \sum_i \mathrm{Dist}(x_i, F_W(x_i))$
$x_i$ - training set
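The training objective can be illustrated with a toy linear auto-encoder trained by plain gradient descent. This is a sketch under several assumptions not stated on the slide: random Gaussian data stands in for the training set, `Dist` is taken to be the mean squared error, and the layer sizes and step size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 16))  # training set x_i (toy data)

# Parameters W = (W_enc, W_dec): 16 inputs -> 4 features -> 16 outputs.
W_enc = rng.normal(scale=0.1, size=(16, 4))
W_dec = rng.normal(scale=0.1, size=(4, 16))


def forward(X, W_enc, W_dec):
    Z = X @ W_enc  # encode
    Y = Z @ W_dec  # decode: y = F_W(x)
    return Z, Y


step_size = 0.2
losses = []
for _ in range(500):
    Z, Y = forward(X, W_enc, W_dec)
    losses.append(np.mean((X - Y) ** 2))  # Dist = mean squared error
    G = 2.0 * (Y - X) / X.size            # gradient of the loss w.r.t. Y
    grad_dec = Z.T @ G
    grad_enc = X.T @ (G @ W_dec.T)
    W_dec -= step_size * grad_dec
    W_enc -= step_size * grad_enc

print(losses[0], losses[-1])
```

Because the 4-dimensional bottleneck cannot represent all 16 input dimensions, the loss goes down during training but does not reach zero.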
26
AUTO-ENCODER
▪ Our encoder is LOSSY by definition
[Diagram: input → Encode, with information loss]
27
SUPER-RESOLUTION AUTO-ENCODER
[Diagram: $x$ → auto-encoder → $y$]
Parameters: $W$
Inference: $y = F_W(x)$
Training: $W = \arg\min_W \sum_i \mathrm{Dist}(x_i, F_W(x_i))$
$x_i$ - training set
28
SUPER RESOLUTION AE: TRAINING
$x$ → Downscaling $D$ → LR image $\hat{x}$ → SR AE $F_W$ (parameters $W$) → $y$
$x_i$ - training set
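The training setup needs an LR input for every image in the training set, obtained by downscaling. A minimal sketch, assuming the downscaling is a simple box average over n x n blocks (the slide does not say which filter is used); `downscale_box` and `make_pairs` are hypothetical helper names.

```python
import numpy as np


def downscale_box(hr, n):
    """Downscale a 2-D image by averaging each n x n block of pixels."""
    h, w = hr.shape
    if h % n or w % n:
        raise ValueError("image size must be a multiple of n")
    return hr.reshape(h // n, n, w // n, n).mean(axis=(1, 3))


def make_pairs(images, n):
    """Build (LR input, HR target) pairs from a list of training images."""
    return [(downscale_box(x, n), x) for x in images]


hr = np.arange(16, dtype=np.float64).reshape(4, 4)
lr_input, target = make_pairs([hr], 2)[0]
print(lr_input)
```

During training, the network output for `lr_input` would be compared against `target` with the chosen loss.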
29
SUPER RESOLUTION AE: INFERENCE
Given LR image $\hat{x}$ → SR AE $F_W$ (parameters $W$) → Constructed HR image $y$
$y = F_W(\hat{x})$
30
SUPER-RESOLUTION: ILL-POSED TASK?
31
THE LOSS FUNCTION
32
THE LOSS FUNCTION
Measuring the “distance” from a good result
$W = \arg\min_W \sum_i D(x_i, F_W(x_i))$
33
LOSS FUNCTION
MSE (Mean Squared Error): $\frac{1}{N} \lVert x - F(x) \rVert^2$
34
LOSS FUNCTION: PSNR
MSE (Mean Squared Error): $\frac{1}{N} \lVert x - F(x) \rVert^2$
PSNR (Peak Signal-to-Noise Ratio): $10 \cdot \log_{10} \frac{MAX^2}{MSE}$
35
LOSS FUNCTION: HFEN
MSE (Mean Squared Error): $\frac{1}{N} \lVert x - F(x) \rVert^2$
PSNR (Peak Signal-to-Noise Ratio): $10 \cdot \log_{10} \frac{MAX^2}{MSE}$
HFEN (High Frequency Error Norm, see Ref A): $\lVert HP(x - F(x)) \rVert_2$, where $HP$ is a high-pass filter
Perceptual loss
Ref A: http://ieeexplore.ieee.org/document/5617283/
36
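The three measures can be sketched in NumPy as follows. MSE and PSNR follow the formulas on the slide. For HFEN, the high-pass filter is assumed here to be a 3x3 Laplacian kernel with edge padding, chosen only for brevity; the filter used in Ref A or in the actual tools may differ.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Assumed high-pass filter: 3x3 Laplacian (its coefficients sum to zero).
LAPLACIAN = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=np.float64)


def mse(x, y):
    d = np.asarray(x, dtype=np.float64) - np.asarray(y, dtype=np.float64)
    return np.mean(d ** 2)


def psnr(x, y, max_val=255.0):
    err = mse(x, y)
    return np.inf if err == 0 else 10.0 * np.log10(max_val ** 2 / err)


def high_pass(img):
    padded = np.pad(np.asarray(img, dtype=np.float64), 1, mode="edge")
    windows = sliding_window_view(padded, (3, 3))  # shape (H, W, 3, 3)
    return np.einsum("ijkl,kl->ij", windows, LAPLACIAN)


def hfen(x, y):
    d = np.asarray(x, dtype=np.float64) - np.asarray(y, dtype=np.float64)
    return np.linalg.norm(high_pass(d))  # L2 norm of the filtered error


rng = np.random.default_rng(0)
x = rng.random((16, 16)) * 200.0
y = x + 5.0  # same image with a constant brightness offset
print(mse(x, y), psnr(x, y), hfen(x, y))
```

In this example the constant offset gives a non-zero MSE, while the high-pass filter removes it, so HFEN is (numerically) zero: HFEN only responds to errors in edges and fine detail.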
REGULAR LOSS
[Images: two example results, each upscaled 4x]
37
REGULAR LOSS + PERCEPTUAL LOSS
[Images: two example results, each upscaled 4x]
38
WARNING… THIS IS EXPERIMENTAL!
39
SUPER-RESOLUTION: GAN-BASED LOSS
[Diagram: the network output $F(x)$ or a real image is passed as $y$ to the discriminator $D(y)$]