Lecture 13:
Generative Models
Administrative
Midterm grades released on Gradescope this week
A3 due next Friday, 5/26
HyperQuest deadline extended to Sunday 5/21, 11:59pm
Poster session is June 6
Overview
● Unsupervised Learning
● Generative Models
○ PixelRNN and PixelCNN
○ Variational Autoencoders (VAE)
○ Generative Adversarial Networks (GAN)
Supervised vs Unsupervised Learning

Supervised Learning
Data: (x, y); x is data, y is a label
Goal: Learn a function to map x -> y
Examples: classification, regression, object detection, semantic segmentation, image captioning, etc.
- Classification: label a whole image, e.g. "Cat" (image is CC0 public domain)
- Object detection: localize and label objects, e.g. "DOG, DOG, CAT" (image is CC0 public domain)
- Semantic segmentation: label every pixel, e.g. GRASS, CAT, TREE, SKY
- Image captioning: e.g. "A cat sitting on a suitcase on the floor" (caption generated using neuraltalk2; image is CC0 public domain)

Unsupervised Learning
Data: x; just data, no labels! Training data is cheap.
Goal: Learn some underlying hidden structure of the data
Examples: clustering, dimensionality reduction, feature learning, density estimation, etc.
- K-means clustering (image is CC0 public domain)
- Principal Component Analysis (dimensionality reduction, e.g. 3-d -> 2-d; image from Matthias Scholz is CC0 public domain)
- Autoencoders (feature learning)
- Density estimation: 1-d and 2-d (1-d figure copyright Ian Goodfellow, 2016, reproduced with permission; 2-d density images are CC0 public domain)

Holy grail: solve unsupervised learning => understand the structure of the visual world.
Generative Models

Given training data, generate new samples from the same distribution.

Training data ~ p_data(x); generated samples ~ p_model(x).
We want to learn p_model(x) similar to p_data(x).

This addresses density estimation, a core problem in unsupervised learning. Several flavors:
- Explicit density estimation: explicitly define and solve for p_model(x)
- Implicit density estimation: learn a model that can sample from p_model(x) without explicitly defining it

Why Generative Models?
- Realistic samples for artwork, super-resolution, colorization, etc.
- Generative models of time-series data can be used for simulation and planning (reinforcement learning applications!)
- Training generative models can also enable inference of latent representations that can be useful as general features

Figures from L-R are copyright: (1) Alec Radford et al. 2016; (2) David Berthelot et al. 2017; (3) Phillip Isola et al. 2017. Reproduced with the authors' permission.
Taxonomy of Generative Models

Generative models
- Explicit density
  - Tractable density: Fully Visible Belief Nets (NADE, MADE, PixelRNN/CNN); change of variables models (nonlinear ICA)
  - Approximate density
    - Variational: Variational Autoencoder
    - Markov Chain: Boltzmann Machine
- Implicit density
  - Direct: GAN
  - Markov Chain: GSN

Figure copyright and adapted from Ian Goodfellow, Tutorial on Generative Adversarial Networks, 2017.

Today we discuss the three currently most popular types of generative models: PixelRNN/CNN, VAEs, and GANs.
PixelRNN and PixelCNN
Fully visible belief network

Explicit density model. Use the chain rule to decompose the likelihood of an image x into a product of 1-d distributions:

    p(x) = ∏_{i=1}^{n} p(x_i | x_1, ..., x_{i-1})

where p(x) is the likelihood of image x and each factor is the probability of the i'th pixel value given all previous pixels. Then maximize the likelihood of the training data.

The distribution over pixel values is complex => express it using a neural network! We will also need to define an ordering of the "previous pixels".
PixelRNN  [van den Oord et al. 2016]

Generate image pixels starting from a corner. The dependency on previous pixels is modeled using an RNN (LSTM).

Drawback: sequential generation is slow!
PixelCNN  [van den Oord et al. 2016]

Still generate image pixels starting from a corner, but the dependency on previous pixels is now modeled using a CNN over a context region (figure copyright van den Oord et al., 2016, reproduced with permission).

Training: maximize the likelihood of the training images, with a softmax loss at each pixel. Training is faster than PixelRNN, since the convolutions can be parallelized (the context-region values are known from the training images). Generation must still proceed sequentially => still slow.
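As a concrete illustration of the "context region" idea, here is a minimal sketch (assuming PyTorch) of the masked convolutions a PixelCNN layer uses so that each output position only depends on pixels above it and to its left. The layer widths and the 256-way softmax over pixel values are illustrative choices for this sketch, not the exact architecture from the paper.

```python
import torch
import torch.nn as nn

class MaskedConv2d(nn.Conv2d):
    """Convolution whose kernel is masked so each position only sees pixels
    above it and to its left in raster-scan order. Mask type 'A' also hides
    the center pixel (first layer); type 'B' keeps it (later layers)."""
    def __init__(self, mask_type, *args, **kwargs):
        super().__init__(*args, **kwargs)
        assert mask_type in ('A', 'B')
        kH, kW = self.kernel_size
        mask = torch.ones(kH, kW)
        # Hide the center row from the center pixel (type 'A') or just after it (type 'B') onward.
        mask[kH // 2, kW // 2 + (mask_type == 'B'):] = 0
        mask[kH // 2 + 1:, :] = 0                      # hide all rows below the center
        self.register_buffer('mask', mask[None, None])  # shape (1, 1, kH, kW)

    def forward(self, x):
        self.weight.data *= self.mask                   # zero out weights on "future" pixels
        return super().forward(x)

# Tiny PixelCNN over 8-bit grayscale images: at every position it predicts a
# 256-way softmax over the pixel value given the pixels before it.
pixelcnn = nn.Sequential(
    MaskedConv2d('A', 1, 64, kernel_size=7, padding=3), nn.ReLU(),
    MaskedConv2d('B', 64, 64, kernel_size=3, padding=1), nn.ReLU(),
    MaskedConv2d('B', 64, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 256, kernel_size=1),                  # logits over 256 pixel values
)

x = torch.randint(0, 256, (8, 1, 28, 28)).float() / 255.0  # a minibatch of images
logits = pixelcnn(x)                                        # shape (8, 256, 28, 28)
```

Training would then minimize the per-pixel cross-entropy between these logits and the true pixel values, which is the "softmax loss at each pixel" above; generation still loops over pixels one at a time, feeding each sampled value back in.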
Generation Samples

32x32 CIFAR-10 and 32x32 ImageNet samples.
Figures copyright Aäron van den Oord et al., 2016. Reproduced with permission.
PixelRNN and PixelCNN

Improving PixelCNN performance:
- Gated convolutional layers
- Short-cut connections
- Discretized logistic loss
- Multi-scale
- Training tricks
- Etc.
See van den Oord et al. NIPS 2016 and Salimans et al. 2017 (PixelCNN++).

Pros:
- Can explicitly compute the likelihood p(x)
- Explicit likelihood of the training data gives a good evaluation metric
- Good samples

Con:
- Sequential generation => slow
Variational
Autoencoders (VAE)
So far...

PixelCNNs define a tractable density function and optimize the likelihood of the training data:

    p_θ(x) = ∏_{i=1}^{n} p_θ(x_i | x_1, ..., x_{i-1})

VAEs define an intractable density function with latent z:

    p_θ(x) = ∫ p_θ(z) p_θ(x|z) dz

We cannot optimize this directly; instead we derive and optimize a lower bound on the likelihood.
Some background first: Autoencoders

An unsupervised approach for learning a lower-dimensional feature representation from unlabeled training data: an encoder maps the input data x to features z.

Encoder: originally linear + nonlinearity (sigmoid); later deep, fully-connected; later ReLU CNN.

z is usually smaller than x (dimensionality reduction).
Q: Why dimensionality reduction?
A: We want the features to capture meaningful factors of variation in the data.
How do we learn this feature representation? Train such that the features can be used to reconstruct the original data: "autoencoding" means encoding the data itself. A decoder maps the features z back to reconstructed input data x̂.

Decoder: originally linear + nonlinearity (sigmoid); later deep, fully-connected; later ReLU CNN (upconv). For example: encoder = 4-layer conv net, decoder = 4-layer upconv net.

L2 loss function: ||x − x̂||². Train so that the features can be used to reconstruct the original data. Doesn't use labels!
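A minimal sketch of this encoder/decoder/reconstruction-loss setup, assuming PyTorch; the layer sizes are illustrative and smaller than the 4-layer conv/upconv networks mentioned above.

```python
import torch
import torch.nn as nn

# Encoder: conv net mapping 32x32 RGB images to a low-dimensional feature vector z.
encoder = nn.Sequential(
    nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),            # 32x32 -> 16x16
    nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),           # 16x16 -> 8x8
    nn.Flatten(),
    nn.Linear(64 * 8 * 8, 128),                                     # features z
)

# Decoder: mirror of the encoder, using transposed ("up") convolutions.
decoder = nn.Sequential(
    nn.Linear(128, 64 * 8 * 8), nn.ReLU(),
    nn.Unflatten(1, (64, 8, 8)),
    nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),  # 8x8 -> 16x16
    nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),              # 16x16 -> 32x32
)

x = torch.randn(16, 3, 32, 32)     # a minibatch of unlabeled images
x_hat = decoder(encoder(x))        # reconstruction
loss = ((x - x_hat) ** 2).mean()   # L2 reconstruction loss; no labels anywhere
loss.backward()
```

After training, one keeps only the encoder, as described next.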
After training, throw away the decoder. The encoder can then be used to initialize a supervised model: attach a classifier on top of the features to predict a label (plane, dog, deer, bird, truck, ...), apply a loss function (softmax, etc.), and fine-tune the encoder jointly with the classifier on the final task (sometimes with small data).
Autoencoders can reconstruct data, and can learn features to initialize a supervised model. The features capture factors of variation in the training data. But can we generate new images from an autoencoder?
Variational Autoencoders
Kingma and Welling, "Auto-Encoding Variational Bayes", ICLR 2014

A probabilistic spin on autoencoders that will let us sample from the model to generate data!

Assume the training data {x^(i)} is generated from an underlying unobserved (latent) representation z: sample z from a true prior p_θ*(z), then sample x from the true conditional p_θ*(x | z).

Intuition (remember from autoencoders!): x is an image, z is the latent factors used to generate x: attributes, orientation, etc.

We want to estimate the true parameters θ* of this generative model. How should we represent the model?
- Choose the prior p(z) to be simple, e.g. Gaussian. This is reasonable for latent attributes such as pose or how much smile.
- The conditional p(x|z) is complex (it generates an image) => represent it with a neural network: the decoder network.

How do we train the model? Remember the strategy for training generative models from fully visible belief nets: learn the model parameters to maximize the likelihood of the training data, now with the latent z marginalized out:

    p_θ(x) = ∫ p_θ(z) p_θ(x|z) dz

Q: What is the problem with this? The integral is intractable!
Variational Autoencoders: Intractability

Data likelihood: p_θ(x) = ∫ p_θ(z) p_θ(x|z) dz
- The simple Gaussian prior p_θ(z): fine ✔
- The decoder neural network p_θ(x|z): fine for any given z ✔
- But it is intractable to compute p(x|z) for every z, so the integral cannot be evaluated.

The posterior density is also intractable: p_θ(z|x) = p_θ(x|z) p_θ(z) / p_θ(x), because the intractable data likelihood p_θ(x) appears in the denominator.

Solution: in addition to the decoder network modeling p_θ(x|z), define an additional encoder network q_φ(z|x) that approximates p_θ(z|x). We will see that this allows us to derive a lower bound on the data likelihood that is tractable, and which we can optimize.
Variational Autoencoders

Since we are modeling probabilistic generation of data, the encoder and decoder networks are probabilistic:
- Encoder network q_φ(z|x) (parameters φ): outputs the mean μ_{z|x} and (diagonal) covariance Σ_{z|x} of z given x; sample z from N(μ_{z|x}, Σ_{z|x}).
- Decoder network p_θ(x|z) (parameters θ): outputs the mean μ_{x|z} and (diagonal) covariance Σ_{x|z} of x given z; sample x|z from N(μ_{x|z}, Σ_{x|z}).

The encoder and decoder networks are also called the "recognition"/"inference" and "generation" networks.
Variational Autoencoders

Now equipped with our encoder and decoder networks, let's work out the (log) data likelihood log p_θ(x^(i)).

Taking the expectation of log p_θ(x^(i)) with respect to z sampled from the encoder q_φ(z|x^(i)) will come in handy: applying Bayes' rule and then multiplying and dividing by q_φ(z|x^(i)) rewrites the log-likelihood as three terms, two of which are nice KL divergences:
- E_z[log p_θ(x^(i)|z)]: the decoder network gives p_θ(x|z), so we can compute an estimate of this term through sampling. (The sampling can be made differentiable through the reparameterization trick; see the paper.)
- D_KL(q_φ(z|x^(i)) || p_θ(z)): a KL divergence between two Gaussians (the encoder distribution and the prior on z), which has a nice closed-form solution!
- D_KL(q_φ(z|x^(i)) || p_θ(z|x^(i))): p_θ(z|x) is intractable (as we saw earlier), so we can't compute this term :( But we know a KL divergence is always >= 0.

The first two terms therefore form a tractable lower bound on the data likelihood, which we can take gradients of and optimize (p_θ(x|z) is differentiable and the closed-form KL term is differentiable).

This is the variational lower bound ("ELBO"). Training: maximize the lower bound. The first term says: reconstruct the input data. The second term says: make the approximate posterior distribution close to the prior.
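The equations on these slides are images in the original deck; written out, the derivation they step through (the standard one from Kingma and Welling, 2014) is:

```latex
\begin{aligned}
\log p_\theta(x^{(i)})
  &= \mathbf{E}_{z \sim q_\phi(z|x^{(i)})}\left[\log p_\theta(x^{(i)})\right]
     && (p_\theta(x^{(i)}) \text{ does not depend on } z) \\
  &= \mathbf{E}_{z}\left[\log \frac{p_\theta(x^{(i)}\mid z)\,p_\theta(z)}{p_\theta(z\mid x^{(i)})}\right]
     && (\text{Bayes' rule}) \\
  &= \mathbf{E}_{z}\left[\log \frac{p_\theta(x^{(i)}\mid z)\,p_\theta(z)}{p_\theta(z\mid x^{(i)})}
       \cdot \frac{q_\phi(z\mid x^{(i)})}{q_\phi(z\mid x^{(i)})}\right]
     && (\text{multiply by a constant}) \\
  &= \mathbf{E}_{z}\left[\log p_\theta(x^{(i)}\mid z)\right]
     - D_{KL}\!\left(q_\phi(z\mid x^{(i)}) \,\|\, p_\theta(z)\right)
     + D_{KL}\!\left(q_\phi(z\mid x^{(i)}) \,\|\, p_\theta(z\mid x^{(i)})\right),
\end{aligned}
```

so the variational lower bound and the training objective are

```latex
\mathcal{L}(x^{(i)}, \theta, \phi)
  = \mathbf{E}_{z}\left[\log p_\theta(x^{(i)}\mid z)\right]
  - D_{KL}\!\left(q_\phi(z\mid x^{(i)}) \,\|\, p_\theta(z)\right)
  \;\le\; \log p_\theta(x^{(i)}),
\qquad
\theta^*, \phi^* = \arg\max_{\theta,\phi} \sum_i \mathcal{L}(x^{(i)}, \theta, \phi).
```

For a diagonal-Gaussian encoder and a unit-Gaussian prior, the KL term has the closed form
D_KL(N(μ, diag(σ²)) || N(0, I)) = −½ Σ_j (1 + log σ_j² − μ_j² − σ_j²), which is what the code sketch later uses.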
Variational Autoencoders

Putting it all together: maximizing the likelihood lower bound. Let's look at computing the bound (a forward pass) for a given minibatch of input data:

1. Pass the input data x through the encoder network to get μ_{z|x} and Σ_{z|x}. These give the KL term that makes the approximate posterior distribution close to the prior.
2. Sample z from q_φ(z|x) = N(μ_{z|x}, Σ_{z|x}).
3. Pass z through the decoder network to get μ_{x|z} and Σ_{x|z}.
4. Sample x|z from p_θ(x|z) = N(μ_{x|z}, Σ_{x|z}) and maximize the likelihood of the original input being reconstructed.

For every minibatch of input data: compute this forward pass, and then backprop! A code sketch follows below.
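A minimal sketch of this forward pass and the resulting loss (the negative ELBO), assuming PyTorch. The slides describe a Gaussian decoder; this sketch uses the common Bernoulli-decoder simplification for images with values in [0, 1], and all layer sizes are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

latent_dim, data_dim, hidden = 20, 784, 400   # illustrative sizes

# Encoder q_phi(z|x): outputs mean and log-variance of the diagonal Gaussian over z.
enc = nn.Sequential(nn.Linear(data_dim, hidden), nn.ReLU())
enc_mu = nn.Linear(hidden, latent_dim)
enc_logvar = nn.Linear(hidden, latent_dim)

# Decoder p_theta(x|z): here outputs per-pixel Bernoulli parameters.
dec = nn.Sequential(nn.Linear(latent_dim, hidden), nn.ReLU(),
                    nn.Linear(hidden, data_dim), nn.Sigmoid())

def negative_elbo(x):
    h = enc(x)
    mu, logvar = enc_mu(h), enc_logvar(h)
    # Reparameterization trick: z = mu + sigma * eps keeps the sampling differentiable.
    z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
    x_hat = dec(z)
    # Reconstruction term: estimate of E_z[log p(x|z)] (Bernoulli log-likelihood here).
    recon = -F.binary_cross_entropy(x_hat, x, reduction='sum')
    # KL(q(z|x) || N(0, I)): closed form for diagonal Gaussians.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return -(recon - kl)   # maximizing the ELBO = minimizing its negative

x = torch.rand(128, data_dim)   # a minibatch of (flattened) images in [0, 1]
loss = negative_elbo(x)
loss.backward()                 # then step an optimizer over the enc/dec parameters
```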
Variational Autoencoders: Generating Data!

To generate data, use the decoder network, but now sample z from the prior: sample z ~ p(z), then sample x|z from the decoder p_θ(x|z).
(Kingma and Welling, "Auto-Encoding Variational Bayes", ICLR 2014)

Data manifold for 2-d z: varying z1 and z2 sweeps out smooth variations in the generated images.

The diagonal prior on z => independent latent variables, so different dimensions of z encode interpretable factors of variation: in the faces model, varying z1 changes the degree of smile and varying z2 changes the head pose.

q_φ(z|x) also gives a good feature representation that can be computed for other tasks!

Samples: 32x32 CIFAR-10 and Labeled Faces in the Wild.
Figures copyright (L) Dirk Kingma et al. 2016; (R) Anders Larsen et al. 2017. Reproduced with permission.
Variational Autoencoders

Probabilistic spin on traditional autoencoders => allows generating data.
Defines an intractable density => derive and optimize a (variational) lower bound.

Pros:
- Principled approach to generative models
- Allows inference of q(z|x), which can be a useful feature representation for other tasks

Cons:
- Maximizes a lower bound of the likelihood: okay, but not as good an evaluation as PixelRNN/PixelCNN
- Samples are blurrier and lower quality compared to the state of the art (GANs)

Active areas of research:
- More flexible approximations, e.g. richer approximate posteriors instead of a diagonal Gaussian
- Incorporating structure in latent variables
Generative Adversarial
Networks (GAN)
So far...

PixelCNNs define a tractable density function and optimize the likelihood of the training data. VAEs define an intractable density function with latent z; it cannot be optimized directly, so we derive and optimize a lower bound on the likelihood instead.

What if we give up on explicitly modeling the density, and just want the ability to sample?

GANs: don't work with any explicit density function! Instead, take a game-theoretic approach: learn to generate from the training distribution through a 2-player game.
Generative Adversarial Networks
Ian Goodfellow et al., "Generative Adversarial Nets", NIPS 2014

Problem: we want to sample from a complex, high-dimensional training distribution. There is no direct way to do this!

Solution: sample from a simple distribution, e.g. random noise, and learn a transformation to the training distribution.

Q: What can we use to represent this complex transformation?
A: A neural network! A generator network takes random noise z as input and outputs a sample from the training distribution.
Training GANs: Two-player game
Ian Goodfellow et al., "Generative Adversarial Nets", NIPS 2014

Generator network: tries to fool the discriminator by generating real-looking images.
Discriminator network: tries to distinguish between real and fake images.

Random noise z is fed into the generator network to produce fake images; the discriminator network sees both fake images (from the generator) and real images (from the training set) and outputs "real or fake".
(Fake and real images copyright Emily Denton et al. 2015. Reproduced with permission.)

Train jointly in a minimax game, with the minimax objective written out below. In it, D_{θ_d}(x) is the discriminator output for real data x and D_{θ_d}(G_{θ_g}(z)) is the discriminator output for generated fake data G(z); the discriminator outputs a likelihood in (0,1) of the image being real.

- The discriminator (θ_d) wants to maximize the objective such that D(x) is close to 1 (real) and D(G(z)) is close to 0 (fake).
- The generator (θ_g) wants to minimize the objective such that D(G(z)) is close to 1 (the discriminator is fooled into thinking the generated G(z) is real).
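The minimax objective itself is an image in the original slides; written out (Goodfellow et al., NIPS 2014) it is:

```latex
\min_{\theta_g}\,\max_{\theta_d}\;
\Big[\,
\mathbf{E}_{x \sim p_{\text{data}}}\,\log D_{\theta_d}(x)
\;+\;
\mathbf{E}_{z \sim p(z)}\,\log\!\big(1 - D_{\theta_d}(G_{\theta_g}(z))\big)
\,\Big]
```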
Training GANs: Two-player game

Alternate between:
1. Gradient ascent on the discriminator:
   max_{θ_d} [ E_{x~p_data} log D_{θ_d}(x) + E_{z~p(z)} log(1 − D_{θ_d}(G_{θ_g}(z))) ]
2. Gradient descent on the generator:
   min_{θ_g} E_{z~p(z)} log(1 − D_{θ_d}(G_{θ_g}(z)))

In practice, optimizing this generator objective does not work well! When a sample is likely fake, we want to learn from it to improve the generator, but the gradient of log(1 − D(G(z))) in that region is relatively flat (low gradient signal); the gradient signal is dominated by the region where the sample is already good (high gradient signal).

Instead: gradient ascent on the generator with a different objective:
   max_{θ_g} E_{z~p(z)} log D_{θ_d}(G_{θ_g}(z))
Instead of minimizing the likelihood of the discriminator being correct, we now maximize the likelihood of the discriminator being wrong. This is the same objective of fooling the discriminator, but now with a higher gradient signal for bad samples => works much better! Standard in practice.

Aside: jointly training two networks is challenging and can be unstable. Choosing objectives with better loss landscapes helps training and is an active area of research.
Training GANs: Two-player game

Putting it together, the GAN training algorithm alternates k steps of updating the discriminator (gradient ascent on its objective) with one step of updating the generator (gradient ascent on the flipped objective above). Some find k = 1 more stable, others use k > 1; there is no best rule. Recent work (e.g. Wasserstein GAN) alleviates this problem and gives better stability! A training-loop sketch follows below.
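A minimal sketch of this alternating loop with the non-saturating generator objective, assuming PyTorch; the MLP generator and discriminator, the Adam hyperparameters, and the data shapes are illustrative choices, not the setup from the paper.

```python
import torch
import torch.nn as nn

noise_dim, data_dim = 64, 784   # illustrative sizes

G = nn.Sequential(nn.Linear(noise_dim, 256), nn.ReLU(),
                  nn.Linear(256, data_dim), nn.Tanh())          # generator
D = nn.Sequential(nn.Linear(data_dim, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1), nn.Sigmoid())              # discriminator

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(real_x, k=1):
    batch = real_x.size(0)
    ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

    # 1. Gradient ascent on the discriminator (k steps): maximizing
    #    log D(x) + log(1 - D(G(z))) == minimizing BCE against the true labels.
    for _ in range(k):
        z = torch.randn(batch, noise_dim)
        d_loss = bce(D(real_x), ones) + bce(D(G(z).detach()), zeros)
        opt_d.zero_grad()
        d_loss.backward()
        opt_d.step()

    # 2. Non-saturating generator update: maximize log D(G(z))
    #    instead of minimizing log(1 - D(G(z))).
    z = torch.randn(batch, noise_dim)
    g_loss = bce(D(G(z)), ones)   # labels flipped to "real"
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()

# Usage: call train_step on each minibatch of flattened real images scaled to [-1, 1].
real_x = torch.rand(128, data_dim) * 2 - 1
train_step(real_x)
```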
Training GANs: Two-player game

After training, use the generator network to generate new images: feed random noise z through the generator and take its output; the discriminator is no longer needed.
(Fake and real images copyright Emily Denton et al. 2015. Reproduced with permission.)
Generative Adversarial Nets
Ian Goodfellow et al., "Generative Adversarial Nets", NIPS 2014

Generated samples, shown with the nearest neighbor from the training set alongside; a second set of generated samples is from CIFAR-10.
Figures copyright Ian Goodfellow et al., 2014. Reproduced with permission.
Generative Adversarial Nets: Convolutional Architectures
Radford et al., "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks", ICLR 2016

The generator is an upsampling network with fractionally-strided convolutions; the discriminator is a convolutional network. Samples from the model look amazing, and interpolating between random points in latent space produces smooth transitions.
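A sketch of what such an upsampling generator looks like, assuming PyTorch; the channel widths here are illustrative and differ from the exact DCGAN architecture.

```python
import torch
import torch.nn as nn

# DCGAN-style generator: project a noise vector and repeatedly double the spatial
# resolution with fractionally-strided (transposed) convolutions, ending in a
# tanh over RGB pixel values.
G = nn.Sequential(
    nn.ConvTranspose2d(100, 512, 4, stride=1, padding=0), nn.BatchNorm2d(512), nn.ReLU(),  # 1x1 -> 4x4
    nn.ConvTranspose2d(512, 256, 4, stride=2, padding=1), nn.BatchNorm2d(256), nn.ReLU(),  # 4x4 -> 8x8
    nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.BatchNorm2d(128), nn.ReLU(),  # 8x8 -> 16x16
    nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.BatchNorm2d(64), nn.ReLU(),    # 16x16 -> 32x32
    nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),                          # 32x32 -> 64x64
)

z = torch.randn(8, 100, 1, 1)   # one 100-d noise vector per image
fake_images = G(z)              # shape (8, 3, 64, 64)
```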
Generative Adversarial Nets: Interpretable Vector Math
Radford et al., ICLR 2016

Take samples from the model, average their z vectors, and do arithmetic on the averages:
- Smiling woman − Neutral woman + Neutral man => Smiling man
- Glasses man − No glasses man + No glasses woman => Woman with glasses
2017: Year of the GAN

- Better training and generation: LSGAN (Mao et al. 2017), BEGAN (Berthelot et al. 2017)
- Source -> Target domain transfer: CycleGAN (Zhu et al. 2017)
- Text -> Image synthesis: Reed et al. 2017
- Many GAN applications: Pix2pix (Isola et al. 2017); many examples at https://phillipi.github.io/pix2pix/
"The GAN Zoo"

https://github.com/hindupuravinash/the-gan-zoo

See also https://github.com/soumith/ganhacks for tips and tricks for training GANs.
GANs

Don't work with an explicit density function. Instead, take a game-theoretic approach: learn to generate from the training distribution through a 2-player game.

Pros:
- Beautiful, state-of-the-art samples!

Cons:
- Trickier / more unstable to train
- Can't solve inference queries such as p(x), p(z|x)

Active areas of research:
- Better loss functions, more stable training (Wasserstein GAN, LSGAN, many others)
- Conditional GANs, GANs for all kinds of applications
Recap

Generative Models
- PixelRNN and PixelCNN: explicit density model, optimizes exact likelihood, good samples. But inefficient sequential generation.
- Variational Autoencoders (VAE): optimize a variational lower bound on the likelihood. Useful latent representation, allows inference queries. But current sample quality is not the best.
- Generative Adversarial Networks (GANs): game-theoretic approach, best samples! But can be tricky and unstable to train, and no inference queries.

There is also recent work on combinations of these model types, e.g. Adversarial Autoencoders (Makhzani et al. 2015) and PixelVAE (Gulrajani et al. 2016).

Next time: Reinforcement Learning
 
Advanced Node.JS Meetup
Advanced Node.JS MeetupAdvanced Node.JS Meetup
Advanced Node.JS Meetup
 
Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015
 
Recommender Systems, Matrices and Graphs
Recommender Systems, Matrices and GraphsRecommender Systems, Matrices and Graphs
Recommender Systems, Matrices and Graphs
 
How we sleep well at night using Hystrix at Finn.no
How we sleep well at night using Hystrix at Finn.noHow we sleep well at night using Hystrix at Finn.no
How we sleep well at night using Hystrix at Finn.no
 

Similar to Cs231n 2017 lecture13 Generative Model

Cs231n 2017 lecture12 Visualizing and Understanding
Cs231n 2017 lecture12 Visualizing and UnderstandingCs231n 2017 lecture12 Visualizing and Understanding
Cs231n 2017 lecture12 Visualizing and UnderstandingYanbin Kong
 
Cs231n 2017 lecture10 Recurrent Neural Networks
Cs231n 2017 lecture10 Recurrent Neural NetworksCs231n 2017 lecture10 Recurrent Neural Networks
Cs231n 2017 lecture10 Recurrent Neural NetworksYanbin Kong
 
Recurrent Neural Network and natural languages processing.pdf
Recurrent Neural Network and natural languages processing.pdfRecurrent Neural Network and natural languages processing.pdf
Recurrent Neural Network and natural languages processing.pdfDayakerP
 
Deep learning: challenges and applications
Deep learning: challenges and  applicationsDeep learning: challenges and  applications
Deep learning: challenges and applicationsAboul Ella Hassanien
 
Learning occam razor
Learning occam razorLearning occam razor
Learning occam razorMinakshi Atre
 
Deep Learning Explained
Deep Learning ExplainedDeep Learning Explained
Deep Learning ExplainedMelanie Swan
 
Research in Deep Learning: A Perspective from NSF
Research in Deep Learning: A Perspective from NSFResearch in Deep Learning: A Perspective from NSF
Research in Deep Learning: A Perspective from NSFDESMOND YUEN
 
Application of deep leaning to computer vision
Application of deep leaning to computer visionApplication of deep leaning to computer vision
Application of deep leaning to computer visionDjamal Abide, MSc
 
Automatic Attendance System using Deep Learning Framework
Automatic Attendance System using Deep Learning FrameworkAutomatic Attendance System using Deep Learning Framework
Automatic Attendance System using Deep Learning FrameworkPinaki Ranjan Sarkar
 
Deep Learning for Artificial Intelligence (AI)
Deep Learning for Artificial Intelligence (AI)Deep Learning for Artificial Intelligence (AI)
Deep Learning for Artificial Intelligence (AI)Er. Shiva K. Shrestha
 
PhD Day: Entity Linking using Generic Linked Data Datasets
PhD Day: Entity Linking using Generic Linked Data DatasetsPhD Day: Entity Linking using Generic Linked Data Datasets
PhD Day: Entity Linking using Generic Linked Data DatasetsBianca Pereira
 
GAN - Theory and Applications
GAN - Theory and ApplicationsGAN - Theory and Applications
GAN - Theory and ApplicationsEmanuele Ghelfi
 
Neural Semi-supervised Learning under Domain Shift
Neural Semi-supervised Learning under Domain ShiftNeural Semi-supervised Learning under Domain Shift
Neural Semi-supervised Learning under Domain ShiftSebastian Ruder
 
EuroSciPy 2019 - GANs: Theory and Applications
EuroSciPy 2019 - GANs: Theory and ApplicationsEuroSciPy 2019 - GANs: Theory and Applications
EuroSciPy 2019 - GANs: Theory and ApplicationsEmanuele Ghelfi
 
Learning Probabilistic Relational Models
Learning Probabilistic Relational ModelsLearning Probabilistic Relational Models
Learning Probabilistic Relational ModelsUniversity of Nantes
 
The Success of Deep Generative Models
The Success of Deep Generative ModelsThe Success of Deep Generative Models
The Success of Deep Generative Modelsinside-BigData.com
 
Introduction to the Artificial Intelligence and Computer Vision revolution
Introduction to the Artificial Intelligence and Computer Vision revolutionIntroduction to the Artificial Intelligence and Computer Vision revolution
Introduction to the Artificial Intelligence and Computer Vision revolutionDarian Frajberg
 
FL-MSRE: A Few-Shot Learning based Approach to Multimodal Social Relation Ex...
FL-MSRE: A Few-Shot Learning based Approach  to Multimodal Social Relation Ex...FL-MSRE: A Few-Shot Learning based Approach  to Multimodal Social Relation Ex...
FL-MSRE: A Few-Shot Learning based Approach to Multimodal Social Relation Ex...Takato Hayashi
 

Similar to Cs231n 2017 lecture13 Generative Model (20)

Cs231n 2017 lecture12 Visualizing and Understanding
Cs231n 2017 lecture12 Visualizing and UnderstandingCs231n 2017 lecture12 Visualizing and Understanding
Cs231n 2017 lecture12 Visualizing and Understanding
 
Object detection stanford
Object detection stanfordObject detection stanford
Object detection stanford
 
Cs231n 2017 lecture10 Recurrent Neural Networks
Cs231n 2017 lecture10 Recurrent Neural NetworksCs231n 2017 lecture10 Recurrent Neural Networks
Cs231n 2017 lecture10 Recurrent Neural Networks
 
Recurrent Neural Network and natural languages processing.pdf
Recurrent Neural Network and natural languages processing.pdfRecurrent Neural Network and natural languages processing.pdf
Recurrent Neural Network and natural languages processing.pdf
 
Deep learning: challenges and applications
Deep learning: challenges and  applicationsDeep learning: challenges and  applications
Deep learning: challenges and applications
 
Learning occam razor
Learning occam razorLearning occam razor
Learning occam razor
 
Deep Learning Explained
Deep Learning ExplainedDeep Learning Explained
Deep Learning Explained
 
Research in Deep Learning: A Perspective from NSF
Research in Deep Learning: A Perspective from NSFResearch in Deep Learning: A Perspective from NSF
Research in Deep Learning: A Perspective from NSF
 
Application of deep leaning to computer vision
Application of deep leaning to computer visionApplication of deep leaning to computer vision
Application of deep leaning to computer vision
 
Automatic Attendance System using Deep Learning Framework
Automatic Attendance System using Deep Learning FrameworkAutomatic Attendance System using Deep Learning Framework
Automatic Attendance System using Deep Learning Framework
 
Deep Learning for Artificial Intelligence (AI)
Deep Learning for Artificial Intelligence (AI)Deep Learning for Artificial Intelligence (AI)
Deep Learning for Artificial Intelligence (AI)
 
Deep learning
Deep learningDeep learning
Deep learning
 
PhD Day: Entity Linking using Generic Linked Data Datasets
PhD Day: Entity Linking using Generic Linked Data DatasetsPhD Day: Entity Linking using Generic Linked Data Datasets
PhD Day: Entity Linking using Generic Linked Data Datasets
 
GAN - Theory and Applications
GAN - Theory and ApplicationsGAN - Theory and Applications
GAN - Theory and Applications
 
Neural Semi-supervised Learning under Domain Shift
Neural Semi-supervised Learning under Domain ShiftNeural Semi-supervised Learning under Domain Shift
Neural Semi-supervised Learning under Domain Shift
 
EuroSciPy 2019 - GANs: Theory and Applications
EuroSciPy 2019 - GANs: Theory and ApplicationsEuroSciPy 2019 - GANs: Theory and Applications
EuroSciPy 2019 - GANs: Theory and Applications
 
Learning Probabilistic Relational Models
Learning Probabilistic Relational ModelsLearning Probabilistic Relational Models
Learning Probabilistic Relational Models
 
The Success of Deep Generative Models
The Success of Deep Generative ModelsThe Success of Deep Generative Models
The Success of Deep Generative Models
 
Introduction to the Artificial Intelligence and Computer Vision revolution
Introduction to the Artificial Intelligence and Computer Vision revolutionIntroduction to the Artificial Intelligence and Computer Vision revolution
Introduction to the Artificial Intelligence and Computer Vision revolution
 
FL-MSRE: A Few-Shot Learning based Approach to Multimodal Social Relation Ex...
FL-MSRE: A Few-Shot Learning based Approach  to Multimodal Social Relation Ex...FL-MSRE: A Few-Shot Learning based Approach  to Multimodal Social Relation Ex...
FL-MSRE: A Few-Shot Learning based Approach to Multimodal Social Relation Ex...
 

More from Yanbin Kong

[Public] gerrit concepts and workflows
[Public] gerrit   concepts and workflows[Public] gerrit   concepts and workflows
[Public] gerrit concepts and workflowsYanbin Kong
 
Emotion coaching introduction
Emotion coaching introductionEmotion coaching introduction
Emotion coaching introductionYanbin Kong
 
A New Golden Age for Computer Architecture
A New Golden Age for Computer ArchitectureA New Golden Age for Computer Architecture
A New Golden Age for Computer ArchitectureYanbin Kong
 
Cs231n 2017 lecture9 CNN Architecture
Cs231n 2017 lecture9 CNN ArchitectureCs231n 2017 lecture9 CNN Architecture
Cs231n 2017 lecture9 CNN ArchitectureYanbin Kong
 
iPhone5c的最后猜测
iPhone5c的最后猜测iPhone5c的最后猜测
iPhone5c的最后猜测Yanbin Kong
 
Mega guess on i phone5c
Mega guess on i phone5cMega guess on i phone5c
Mega guess on i phone5cYanbin Kong
 

More from Yanbin Kong (6)

[Public] gerrit concepts and workflows
[Public] gerrit   concepts and workflows[Public] gerrit   concepts and workflows
[Public] gerrit concepts and workflows
 
Emotion coaching introduction
Emotion coaching introductionEmotion coaching introduction
Emotion coaching introduction
 
A New Golden Age for Computer Architecture
A New Golden Age for Computer ArchitectureA New Golden Age for Computer Architecture
A New Golden Age for Computer Architecture
 
Cs231n 2017 lecture9 CNN Architecture
Cs231n 2017 lecture9 CNN ArchitectureCs231n 2017 lecture9 CNN Architecture
Cs231n 2017 lecture9 CNN Architecture
 
iPhone5c的最后猜测
iPhone5c的最后猜测iPhone5c的最后猜测
iPhone5c的最后猜测
 
Mega guess on i phone5c
Mega guess on i phone5cMega guess on i phone5c
Mega guess on i phone5c
 

Recently uploaded

wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...KarteekMane1
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxHaritikaChhatwal1
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectBoston Institute of Analytics
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
convolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfconvolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfSubhamKumar3239
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
INTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processingINTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processingsocarem879
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesTimothy Spann
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 

Recently uploaded (20)

Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptx
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis Project
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
convolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfconvolutional neural network and its applications.pdf
convolutional neural network and its applications.pdf
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
INTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processingINTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processing
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 

Cs231n 2017 lecture13 Generative Model

  • 1. Lecture 13: Generative Models (Fei-Fei Li & Justin Johnson & Serena Yeung, CS231n, May 18, 2017).
  • 2. Administrative: Midterm grades released on Gradescope this week. A3 due next Friday, 5/26. HyperQuest deadline extended to Sunday 5/21, 11:59pm. Poster session is June 6.
  • 3. Overview: Unsupervised Learning; Generative Models: PixelRNN and PixelCNN, Variational Autoencoders (VAE), Generative Adversarial Networks (GAN).
  • 4. Supervised vs Unsupervised Learning. Supervised Learning: Data: (x, y), where x is data and y is label. Goal: learn a function to map x -> y. Examples: classification, regression, object detection, semantic segmentation, image captioning, etc.
  • 5. Supervised learning example: classification ("Cat"). This image is CC0 public domain.
  • 6. Supervised learning example: object detection ("DOG, DOG, CAT"). This image is CC0 public domain.
  • 7. Supervised learning example: semantic segmentation ("GRASS, CAT, TREE, SKY").
  • 8. Supervised learning example: image captioning ("A cat sitting on a suitcase on the floor"). Caption generated using neuraltalk2; image is CC0 public domain.
  • 9. Unsupervised Learning: Data: x, just data, no labels! Goal: learn some underlying hidden structure of the data. Examples: clustering, dimensionality reduction, feature learning, density estimation, etc.
  • 10. Unsupervised learning example: K-means clustering. This image is CC0 public domain.
  • 11. Unsupervised learning example: Principal Component Analysis (dimensionality reduction), projecting 3-d data down to 2-d. Image from Matthias Scholz, CC0 public domain.
  • 12. Unsupervised learning example: autoencoders (feature learning).
  • 13. Unsupervised learning example: density estimation (1-d and 2-d). 2-d density images, left and right, are CC0 public domain; 1-d density figure copyright Ian Goodfellow, 2016, reproduced with permission.
  • 14. Supervised vs unsupervised learning, side by side: supervised learning maps labeled data (x, y) to a function x -> y; unsupervised learning finds hidden structure in unlabeled data x.
  • 15. Unsupervised learning: training data is cheap. Holy grail: solve unsupervised learning => understand the structure of the visual world.
  • 16. Generative Models: given training data, generate new samples from the same distribution. Training data ~ p_data(x), generated samples ~ p_model(x); we want to learn p_model(x) similar to p_data(x).
  • 17. This addresses density estimation, a core problem in unsupervised learning. Several flavors: explicit density estimation (explicitly define and solve for p_model(x)) and implicit density estimation (learn a model that can sample from p_model(x) without explicitly defining it).
  • 18. Why generative models? Realistic samples for artwork, super-resolution, colorization, etc.; generative models of time-series data can be used for simulation and planning (reinforcement learning applications!); training generative models can also enable inference of latent representations that can be useful as general features. Figures, left to right, copyright (1) Alec Radford et al. 2016, (2) David Berthelot et al. 2017, (3) Phillip Isola et al. 2017; reproduced with the authors' permission.
  • 19. Taxonomy of generative models: explicit density, split into tractable density (fully visible belief nets: NADE, MADE, PixelRNN/CNN; change of variables models, i.e. nonlinear ICA) and approximate density (variational: Variational Autoencoder; Markov chain: Boltzmann machine); and implicit density, split into direct (GAN) and Markov chain (GSN). Figure copyright and adapted from Ian Goodfellow, Tutorial on Generative Adversarial Networks, 2017.
  • 20. Today: discuss the 3 most popular types of generative models today (PixelRNN/CNN, VAE, GAN).
  • 21. PixelRNN and PixelCNN.
  • 22. Fully visible belief network: an explicit density model. Use the chain rule to decompose the likelihood of an image x into a product of 1-d distributions: the likelihood of image x is the product over pixels of the probability of the i'th pixel value given all previous pixels. Then maximize the likelihood of the training data.
  • 23. The distribution over pixel values is complex => express it using a neural network!
  • 24. We will also need to define an ordering of the "previous pixels". (The factorization these slides show as an equation is written out below.)
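For reference, the factorization referred to on slides 22-24 (an equation image in the original deck) is the standard fully visible belief network decomposition:

    p(x) = \prod_{i=1}^{n} p(x_i \mid x_1, \ldots, x_{i-1})

Training then maximizes the log-likelihood of the training images, \sum_i \log p_\theta(x_i \mid x_1, \ldots, x_{i-1}), with each conditional distribution expressed by a neural network.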
  • 25-28. PixelRNN [van den Oord et al. 2016]: generate image pixels starting from the corner; the dependency on previous pixels is modeled using an RNN (LSTM). Drawback: sequential generation is slow!
  • 29. PixelCNN [van den Oord et al. 2016]: still generate image pixels starting from the corner, but the dependency on previous pixels is now modeled using a CNN over a context region. Figure copyright van den Oord et al., 2016; reproduced with permission.
  • 30. Training: maximize the likelihood of the training images, with a softmax loss at each pixel.
  • 31. Training is faster than PixelRNN (convolutions can be parallelized since the context-region values are known from the training images), but generation must still proceed sequentially => still slow. (A rough sketch of the masked-convolution idea follows below.)
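As a rough illustration of "a CNN over the context region": a masked convolution lets every output pixel see only the pixels above it and to its left. This is a minimal sketch assuming PyTorch; the two-layer model and all sizes are placeholders, not the architecture from the paper.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MaskedConv2d(nn.Conv2d):
        # Convolution whose kernel is zeroed at the current pixel and at every
        # "future" pixel in raster-scan order, so each output depends only on
        # previously generated pixels (rows above, and pixels to the left).
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            kH, kW = self.kernel_size
            mask = torch.ones(kH, kW)
            mask[kH // 2, kW // 2:] = 0   # current pixel and pixels to its right
            mask[kH // 2 + 1:, :] = 0     # all rows below
            self.register_buffer("mask", mask[None, None])

        def forward(self, x):
            return F.conv2d(x, self.weight * self.mask, self.bias,
                            self.stride, self.padding)

    # Toy model over 28x28 grayscale images with 256 possible values per pixel.
    model = nn.Sequential(
        MaskedConv2d(1, 64, kernel_size=7, padding=3), nn.ReLU(),
        MaskedConv2d(64, 256, kernel_size=3, padding=1),   # 256 logits per pixel
    )

    x = torch.randint(0, 256, (8, 1, 28, 28))       # a minibatch of training images
    logits = model(x.float() / 255.0)               # shape (8, 256, 28, 28)
    loss = F.cross_entropy(logits, x[:, 0])         # softmax loss at each pixel
    loss.backward()

Because every pixel's context is known during training, the per-pixel softmax losses are computed in one parallel forward pass; sampling, in contrast, must still loop over pixels one at a time.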
  • 32. Generation samples: 32x32 CIFAR-10 and 32x32 ImageNet. Figures copyright Aaron van den Oord et al., 2016; reproduced with permission.
  • 33. PixelRNN and PixelCNN. Pros: can explicitly compute the likelihood p(x); the explicit likelihood of training data gives a good evaluation metric; good samples. Con: sequential generation => slow. Improving PixelCNN performance: gated convolutional layers, short-cut connections, discretized logistic loss, multi-scale, training tricks, etc.; see van den Oord et al., NIPS 2016, and Salimans et al. 2017 (PixelCNN++).
  • 34. Variational Autoencoders (VAE).
  • 35. So far: PixelCNNs define a tractable density function and optimize the likelihood of the training data.
  • 36. VAEs instead define an intractable density function with latent z: it cannot be optimized directly, so we derive and optimize a lower bound on the likelihood instead.
  • 37. Some background first: autoencoders. An unsupervised approach for learning a lower-dimensional feature representation from unlabeled training data: an encoder maps input data x to features z.
  • 38. The encoder was originally linear + nonlinearity (sigmoid); later deep, fully-connected; later a ReLU CNN.
  • 39. z is usually smaller than x (dimensionality reduction). Q: why dimensionality reduction?
  • 40. A: we want the features to capture meaningful factors of variation in the data.
  • 41. How do we learn this feature representation?
  • 42. Train such that the features can be used to reconstruct the original data ("autoencoding": encoding itself): a decoder maps the features back to reconstructed input data.
  • 43. The decoder was originally linear + nonlinearity (sigmoid); later deep, fully-connected; later a ReLU CNN (upconv).
  • 44. Example: encoder is a 4-layer conv net, decoder is a 4-layer upconv net; the input data is reconstructed from the features.
  • 45. L2 loss function between input and reconstruction: train such that the features can be used to reconstruct the original data.
  • 46. Note that this doesn't use labels! (A minimal autoencoder sketch follows below.)
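A minimal sketch of the autoencoder training described on these slides, assuming PyTorch and flattened 28x28 inputs; the slides use a 4-layer conv encoder and 4-layer upconv decoder, while fully-connected layers are used here only to keep the sketch short.

    import torch
    import torch.nn as nn

    encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32))   # x -> z
    decoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784))   # z -> x_hat

    opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

    x = torch.rand(64, 784)           # a minibatch of (flattened) images; no labels needed
    z = encoder(x)                    # lower-dimensional features
    x_hat = decoder(z)                # reconstruction
    loss = ((x - x_hat) ** 2).mean()  # L2 reconstruction loss ||x - x_hat||^2
    opt.zero_grad()
    loss.backward()
    opt.step()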
  • 47. After training, throw away the decoder.
  • 48. The encoder can be used to initialize a supervised model: fine-tune the encoder jointly with a classifier (softmax or other loss) to predict labels (plane, dog, deer, bird, truck) and train for the final task, sometimes with small data.
  • 49. Autoencoders can reconstruct data and can learn features to initialize a supervised model; the features capture factors of variation in the training data. But can we generate new images from an autoencoder?
  • 50. Variational Autoencoders: a probabilistic spin on autoencoders that will let us sample from the model to generate data! (Kingma and Welling, "Auto-Encoding Variational Bayes", ICLR 2014.)
  • 51. Assume the training data is generated from an underlying unobserved (latent) representation z: sample z from the true prior, then sample x from the true conditional given z.
  • 52. Intuition (remember from autoencoders!): x is an image, z is the latent factors used to generate x: attributes, orientation, etc.
  • 53. We want to estimate the true parameters of this generative model.
  • 54. How should we represent this model?
  • 55. Choose the prior p(z) to be simple, e.g. Gaussian. Reasonable for latent attributes, e.g. pose, how much smile.
  • 56. The conditional p(x|z) is complex (it generates an image) => represent it with a neural network: the decoder network.
  • 57. How do we train the model?
  • 58. Remember the strategy for training generative models from FVBNs: learn model parameters to maximize the likelihood of the training data.
  • 59. Now with latent z.
  • 60. Q: what is the problem with this?
  • 61. Intractable!
  • 62. Variational Autoencoders: intractability. Consider the data likelihood.
  • 63. The simple Gaussian prior p(z) is fine ✔.
  • 64. The decoder neural network term p(x|z) is fine ✔ ✔.
  • 65. But it is intractable to compute p(x|z) for every z, i.e. to integrate over all z!
  • 66. The posterior density is also intractable.
  • 67. In the posterior, the conditional and prior terms are tractable ✔ ✔, but the denominator is the intractable data likelihood. (Both equations are written out below.)
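The two equations these slides annotate (shown as images in the deck) are the marginal data likelihood and the posterior, in the paper's notation:

    p_\theta(x) = \int p_\theta(z)\, p_\theta(x \mid z)\, dz
    \qquad
    p_\theta(z \mid x) = \frac{p_\theta(x \mid z)\, p_\theta(z)}{p_\theta(x)}

The prior and the decoder's conditional are easy to evaluate for any single z, but the integral over all z cannot be computed, and the intractable p_\theta(x) in the denominator makes the posterior intractable as well.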
  • 68. Solution: in addition to the decoder network modeling pθ(x|z), define an additional encoder network qɸ(z|x) that approximates pθ(z|x). We will see that this allows us to derive a lower bound on the data likelihood that is tractable and that we can optimize.
  • 69. Since we're modeling probabilistic generation of data, the encoder and decoder networks are probabilistic: the encoder network (parameters ɸ) outputs the mean and (diagonal) covariance of z | x, and the decoder network (parameters θ) outputs the mean and (diagonal) covariance of x | z.
  • 70. Sample z from the encoder's distribution, and sample x|z from the decoder's distribution.
  • 71. The encoder and decoder networks are also called "recognition"/"inference" and "generation" networks. (A sketch of such networks follows below.)
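A minimal sketch of these probabilistic networks, assuming PyTorch; every layer width and the choice of fully-connected layers are illustrative. Each network outputs a mean and the log of a diagonal variance.

    import torch
    import torch.nn as nn

    class Encoder(nn.Module):          # q_phi(z | x): outputs mu_{z|x}, log sigma^2_{z|x}
        def __init__(self, x_dim=784, h_dim=256, z_dim=20):
            super().__init__()
            self.hidden = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
            self.mu = nn.Linear(h_dim, z_dim)
            self.logvar = nn.Linear(h_dim, z_dim)

        def forward(self, x):
            h = self.hidden(x)
            return self.mu(h), self.logvar(h)

    class Decoder(nn.Module):          # p_theta(x | z): outputs mu_{x|z}, log sigma^2_{x|z}
        def __init__(self, x_dim=784, h_dim=256, z_dim=20):
            super().__init__()
            self.hidden = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU())
            self.mu = nn.Linear(h_dim, x_dim)
            self.logvar = nn.Linear(h_dim, x_dim)

        def forward(self, z):
            h = self.hidden(z)
            return self.mu(h), self.logvar(h)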
  • 72-77. Now equipped with our encoder and decoder networks, let's work out the (log) data likelihood. (The derivation is built up equation by equation on the slides; taking the expectation with respect to z, using the encoder network, will come in handy later.)
  • 78. The expectation with respect to z (using the encoder network) lets us write nice KL terms.
  • 79. The first term: the decoder network gives pθ(x|z), so we can compute an estimate of this term through sampling (the sampling is differentiable through the reparameterization trick; see the paper). The second, KL term (between the Gaussians for the encoder and the z prior) has a nice closed-form solution! The third KL term involves pθ(z|x), which is intractable (as we saw earlier), so we can't compute it, but we know a KL divergence is always >= 0. (The closed-form KL is written out below.)
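For reference, the closed-form KL term mentioned here, between the encoder's diagonal Gaussian qɸ(z|x) = N(μ, diag(σ²)) and a standard normal prior (Appendix B of Kingma and Welling), is:

    D_{KL}\big(q_\phi(z \mid x)\,\|\,\mathcal{N}(0, I)\big)
      = -\tfrac{1}{2} \sum_{j=1}^{J} \big(1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2\big)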
  • 80. This gives a tractable lower bound which we can take the gradient of and optimize! (pθ(x|z) is differentiable, and the KL term is differentiable.)
  • 81. This is the variational lower bound ("ELBO"). Training: maximize the lower bound.
  • 82. The first term reconstructs the input data; the second term makes the approximate posterior distribution close to the prior. (The full bound is written out below.)
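Putting the terms together, the variational lower bound ("ELBO") these slides arrive at is:

    \log p_\theta(x) \;\ge\; \mathcal{L}(x; \theta, \phi)
      = \mathbb{E}_{z \sim q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
        \;-\; D_{KL}\big(q_\phi(z \mid x)\,\|\,p_\theta(z)\big)

Training maximizes L jointly over θ and ɸ; the first term is the reconstruction term, and the second keeps the approximate posterior close to the prior.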
  • 83. Putting it all together: maximizing the likelihood lower bound.
  • 84. Let's look at computing the bound (the forward pass) for a given minibatch of input data.
  • 85. Pass the input data through the encoder network.
  • 86. The encoder output feeds the term that makes the approximate posterior distribution close to the prior.
  • 87. Sample z from the encoder's distribution.
  • 88. Pass z through the decoder network.
  • 89. Sample x|z from the decoder's distribution, and maximize the likelihood of the original input being reconstructed.
  • 90. For every minibatch of input data: compute this forward pass, and then backprop! (A sketch of this training step follows below.)
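A compact sketch of this minibatch forward pass and backprop, assuming PyTorch; the fully-connected sizes are placeholders, and a Bernoulli decoder with a binary cross-entropy reconstruction term is used here for brevity (the slides' decoder outputs the mean and diagonal covariance of a Gaussian).

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    x_dim, h_dim, z_dim = 784, 256, 20

    # Encoder q_phi(z|x) outputs (mu, logvar); decoder p_theta(x|z) outputs Bernoulli logits.
    enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU(), nn.Linear(h_dim, 2 * z_dim))
    dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(), nn.Linear(h_dim, x_dim))
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)

    x = torch.rand(64, x_dim)                        # a minibatch of images scaled to [0, 1]

    mu, logvar = enc(x).chunk(2, dim=1)              # parameters of q_phi(z|x)
    eps = torch.randn_like(mu)                       # reparameterization trick:
    z = mu + torch.exp(0.5 * logvar) * eps           #   z = mu + sigma * eps (differentiable)

    logits = dec(z)
    recon = -F.binary_cross_entropy_with_logits(logits, x, reduction="sum")  # ~ E[log p_theta(x|z)]
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())             # closed-form KL to N(0, I)

    loss = (kl - recon) / x.size(0)                  # negative ELBO, averaged over the minibatch
    opt.zero_grad()
    loss.backward()
    opt.step()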
  • 91-92. Variational Autoencoders: generating data! Use the decoder network, but now sample z from the prior, then sample x|z. (A short generation sketch follows below.)
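Generation then needs only the trained decoder; continuing from the previous sketch (and therefore reusing its dec and z_dim, both of which were illustrative):

    with torch.no_grad():
        z = torch.randn(16, z_dim)                        # sample z from the prior N(0, I)
        samples = torch.sigmoid(dec(z)).view(16, 28, 28)  # decode into 16 new 28x28 images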
  • 93. Data manifold for a 2-d z: vary z1 along one axis and z2 along the other.
  • 94. A diagonal prior on z => independent latent variables; different dimensions of z encode interpretable factors of variation, e.g. degree of smile (vary z1) and head pose (vary z2).
  • 95. This is also a good feature representation that can be computed using qɸ(z|x)!
  • 96. VAE samples: 32x32 CIFAR-10 and Labeled Faces in the Wild. Figures copyright (L) Dirk Kingma et al. 2016, (R) Anders Larsen et al. 2017; reproduced with permission.
  • 97. Variational Autoencoders: a probabilistic spin on traditional autoencoders that allows generating data; defines an intractable density => derive and optimize a (variational) lower bound. Pros: a principled approach to generative models; allows inference of q(z|x), which can be a useful feature representation for other tasks. Cons: maximizes a lower bound on the likelihood: okay, but not as good an evaluation as PixelRNN/PixelCNN; samples are blurrier and lower quality compared to the state of the art (GANs). Active areas of research: more flexible approximations, e.g. a richer approximate posterior instead of a diagonal Gaussian; incorporating structure in the latent variables.
  • 98. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 201798 Generative Adversarial Networks (GAN)
  • 99. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 So far... 99 PixelCNNs define tractable density function, optimize likelihood of training data: VAEs define intractable density function with latent z: Cannot optimize directly, derive and optimize lower bound on likelihood instead
  • 100. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 So far... PixelCNNs define tractable density function, optimize likelihood of training data: VAEs define intractable density function with latent z: Cannot optimize directly, derive and optimize lower bound on likelihood instead 10 0 What if we give up on explicitly modeling density, and just want ability to sample?
  • 101. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 So far... PixelCNNs define tractable density function, optimize likelihood of training data: VAEs define intractable density function with latent z: Cannot optimize directly, derive and optimize lower bound on likelihood instead 10 1 What if we give up on explicitly modeling density, and just want ability to sample? GANs: don’t work with any explicit density function! Instead, take game-theoretic approach: learn to generate from training distribution through 2-player game
  • 102. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 Generative Adversarial Networks 10 2 Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 Problem: Want to sample from complex, high-dimensional training distribution. No direct way to do this! Solution: Sample from a simple distribution, e.g. random noise. Learn transformation to training distribution. Q: What can we use to represent this complex transformation?
  • 103. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 Problem: Want to sample from complex, high-dimensional training distribution. No direct way to do this! Solution: Sample from a simple distribution, e.g. random noise. Learn transformation to training distribution. Generative Adversarial Networks 10 3 zInput: Random noise Generator Network Output: Sample from training distribution Q: What can we use to represent this complex transformation? A: A neural network! Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014
  • 104. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 Training GANs: Two-player game 10 4 Generator network: try to fool the discriminator by generating real-looking images Discriminator network: try to distinguish between real and fake images Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014
  • 105. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 Training GANs: Two-player game 10 5 Generator network: try to fool the discriminator by generating real-looking images Discriminator network: try to distinguish between real and fake images zRandom noise Generator Network Discriminator Network Fake Images (from generator) Real Images (from training set) Real or Fake Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 Fake and real images copyright Emily Denton et al. 2015. Reproduced with permission.
  • 106. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 Training GANs: Two-player game 10 6 Generator network: try to fool the discriminator by generating real-looking images Discriminator network: try to distinguish between real and fake images Train jointly in minimax game Minimax objective function: Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014
  • 107. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 Training GANs: Two-player game 10 7 Generator network: try to fool the discriminator by generating real-looking images Discriminator network: try to distinguish between real and fake images Train jointly in minimax game Minimax objective function: Discriminator output for real data x Discriminator output for generated fake data G(z) Discriminator outputs likelihood in (0,1) of real image Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014
  • 108. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 Training GANs: Two-player game 108 Generator network: try to fool the discriminator by generating real-looking images Discriminator network: try to distinguish between real and fake images Train jointly in minimax game Minimax objective function: Discriminator output for real data x Discriminator output for generated fake data G(z) Discriminator outputs likelihood in (0,1) of real image - Discriminator (θd) wants to maximize objective such that D(x) is close to 1 (real) and D(G(z)) is close to 0 (fake) - Generator (θg) wants to minimize objective such that D(G(z)) is close to 1 (discriminator is fooled into thinking generated G(z) is real) Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014
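For reference, the minimax objective that appears as an equation on these slides is, in the notation of Goodfellow et al. 2014:

    \min_{\theta_g} \max_{\theta_d} \;
        \mathbb{E}_{x \sim p_{\mathrm{data}}} \big[ \log D_{\theta_d}(x) \big]
      + \mathbb{E}_{z \sim p(z)} \big[ \log \big( 1 - D_{\theta_d}(G_{\theta_g}(z)) \big) \big]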
  • 109. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 Training GANs: Two-player game 10 9 Minimax objective function: Alternate between: 1. Gradient ascent on discriminator 2. Gradient descent on generator Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014
  • 110. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 Training GANs: Two-player game 11 0 Minimax objective function: Alternate between: 1. Gradient ascent on discriminator 2. Gradient descent on generator In practice, optimizing this generator objective does not work well! Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 When sample is likely fake, want to learn from it to improve generator. But gradient in this region is relatively flat! Gradient signal dominated by region where sample is already good
  • 111. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 Training GANs: Two-player game 11 1 Minimax objective function: Alternate between: 1. Gradient ascent on discriminator 2. Instead: Gradient ascent on generator, different objective Instead of minimizing likelihood of discriminator being correct, now maximize likelihood of discriminator being wrong. Same objective of fooling discriminator, but now higher gradient signal for bad samples => works much better! Standard in practice. Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 High gradient signal Low gradient signal
  • 112. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 Training GANs: Two-player game 112 Minimax objective function: Alternate between: 1. Gradient ascent on discriminator 2. Instead: Gradient ascent on generator, different objective Instead of minimizing likelihood of discriminator being correct, now maximize likelihood of discriminator being wrong. Same objective of fooling discriminator, but now higher gradient signal for bad samples => works much better! Standard in practice. Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 High gradient signal Low gradient signal Aside: Jointly training two networks is challenging and can be unstable. Choosing objectives with better loss landscapes helps training and is an active area of research.
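Written out, the two generator objectives being contrasted are:

    \text{Original (saturating):}\quad
        \min_{\theta_g} \; \mathbb{E}_{z \sim p(z)} \big[ \log \big( 1 - D_{\theta_d}(G_{\theta_g}(z)) \big) \big]

    \text{Non-saturating (standard in practice):}\quad
        \max_{\theta_g} \; \mathbb{E}_{z \sim p(z)} \big[ \log D_{\theta_d}(G_{\theta_g}(z)) \big]

When D(G(z)) is near 0 (a clearly fake sample), the first objective's gradient is nearly flat while the second's is large, which is why the second learns much faster on bad samples.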
  • 113. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 Training GANs: Two-player game 11 3 Putting it together: GAN training algorithm Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014
  • 114. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 Training GANs: Two-player game 114 Putting it together: GAN training algorithm Some find k=1 more stable, others use k > 1; there is no best rule. Recent work (e.g. Wasserstein GAN) alleviates this problem and gives better stability! Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014
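A minimal sketch of this alternating training loop in PyTorch, assuming G maps noise vectors to images and D outputs a probability in (0, 1); the function and variable names are illustrative, and the generator uses the non-saturating objective from the previous slides.

    import torch
    import torch.nn.functional as F

    def gan_training_step(G, D, opt_g, opt_d, real_batch, z_dim, k=1):
        batch_size = real_batch.size(0)
        ones = torch.ones(batch_size, 1)
        zeros = torch.zeros(batch_size, 1)

        # 1. k steps of gradient ascent on the discriminator
        #    (implemented as descent on the equivalent cross-entropy loss)
        for _ in range(k):
            z = torch.randn(batch_size, z_dim)
            fake = G(z).detach()                    # do not backprop into G on this step
            d_loss = F.binary_cross_entropy(D(real_batch), ones) \
                   + F.binary_cross_entropy(D(fake), zeros)
            opt_d.zero_grad(); d_loss.backward(); opt_d.step()

        # 2. One step on the generator: maximize log D(G(z))
        #    (equivalently, minimize cross-entropy of D(G(z)) against the "real" label)
        z = torch.randn(batch_size, z_dim)
        g_loss = F.binary_cross_entropy(D(G(z)), ones)
        opt_g.zero_grad(); g_loss.backward(); opt_g.step()
        return d_loss.item(), g_loss.item()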
  • 115. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 Training GANs: Two-player game 11 5 Generator network: try to fool the discriminator by generating real-looking images Discriminator network: try to distinguish between real and fake images zRandom noise Generator Network Discriminator Network Fake Images (from generator) Real Images (from training set) Real or Fake After training, use generator network to generate new images Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 Fake and real images copyright Emily Denton et al. 2015. Reproduced with permission.
  • 116. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 Generative Adversarial Nets 11 6 Nearest neighbor from training set Generated samples Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 Figures copyright Ian Goodfellow et al., 2014. Reproduced with permission.
  • 117. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 Generative Adversarial Nets 11 7 Nearest neighbor from training set Generated samples (CIFAR-10) Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 Figures copyright Ian Goodfellow et al., 2014. Reproduced with permission.
  • 118. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 Generative Adversarial Nets: Convolutional Architectures 11 8 Radford et al, “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks”, ICLR 2016 Generator is an upsampling network with fractionally-strided convolutions Discriminator is a convolutional network
  • 119. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 11 9 Radford et al, “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks”, ICLR 2016 Generator Generative Adversarial Nets: Convolutional Architectures
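A rough PyTorch sketch of a generator in this style: project the noise vector, then upsample with fractionally-strided (transposed) convolutions up to a 64x64x3 image. The channel counts and layer options below are assumptions for illustration, not the paper's verbatim configuration.

    import torch
    import torch.nn as nn

    class DCGANGenerator(nn.Module):
        def __init__(self, z_dim=100):
            super().__init__()
            self.net = nn.Sequential(
                # z (reshaped to z_dim x 1 x 1) -> 1024 x 4 x 4
                nn.ConvTranspose2d(z_dim, 1024, 4, stride=1, padding=0),
                nn.BatchNorm2d(1024), nn.ReLU(True),
                # 1024 x 4 x 4 -> 512 x 8 x 8
                nn.ConvTranspose2d(1024, 512, 4, stride=2, padding=1),
                nn.BatchNorm2d(512), nn.ReLU(True),
                # 512 x 8 x 8 -> 256 x 16 x 16
                nn.ConvTranspose2d(512, 256, 4, stride=2, padding=1),
                nn.BatchNorm2d(256), nn.ReLU(True),
                # 256 x 16 x 16 -> 128 x 32 x 32
                nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1),
                nn.BatchNorm2d(128), nn.ReLU(True),
                # 128 x 32 x 32 -> 3 x 64 x 64, tanh keeps pixels in [-1, 1]
                nn.ConvTranspose2d(128, 3, 4, stride=2, padding=1),
                nn.Tanh(),
            )

        def forward(self, z):
            return self.net(z.view(z.size(0), -1, 1, 1))

For example, DCGANGenerator()(torch.randn(16, 100)) returns 16 images of shape 3 x 64 x 64.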
  • 120. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 12 0 Radford et al, ICLR 2016 Samples from the model look amazing! Generative Adversarial Nets: Convolutional Architectures
  • 121. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 12 1 Radford et al, ICLR 2016 Interpolating between random points in latent space Generative Adversarial Nets: Convolutional Architectures
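Schematically, the interpolations come from walking along a straight line between two latent codes and decoding every point; here G is a trained generator and z_dim = 100 is an assumption.

    import torch

    with torch.no_grad():
        z0, z1 = torch.randn(1, 100), torch.randn(1, 100)
        alphas = torch.linspace(0, 1, 10).view(-1, 1)    # 10 interpolation steps
        z_interp = (1 - alphas) * z0 + alphas * z1       # points on the line from z0 to z1
        images = G(z_interp)                             # smooth transitions between two samples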
  • 122. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 Generative Adversarial Nets: Interpretable Vector Math 12 2 Smiling woman Neutral woman Neutral man Samples from the model Radford et al, ICLR 2016
  • 123. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 12 3 Smiling woman Neutral woman Neutral man Samples from the model Average Z vectors, do arithmetic Radford et al, ICLR 2016 Generative Adversarial Nets: Interpretable Vector Math
  • 124. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 12 4 Smiling woman Neutral woman Neutral man Smiling Man Samples from the model Average Z vectors, do arithmetic Radford et al, ICLR 2016 Generative Adversarial Nets: Interpretable Vector Math
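A schematic version of this vector arithmetic, assuming z_smiling_woman, z_neutral_woman, and z_neutral_man are tensors of latent codes (one row per sample in each group) and G is the trained generator:

    import torch

    with torch.no_grad():
        mean_sw = z_smiling_woman.mean(dim=0)
        mean_nw = z_neutral_woman.mean(dim=0)
        mean_nm = z_neutral_man.mean(dim=0)
        # smiling woman - neutral woman + neutral man -> "smiling man" direction
        z_smiling_man = mean_sw - mean_nw + mean_nm
        image = G(z_smiling_man.unsqueeze(0))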
  • 125. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 12 5 Radford et al, ICLR 2016 Glasses man No glasses man No glasses woman Generative Adversarial Nets: Interpretable Vector Math
  • 126. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 12 6 Glasses man No glasses man No glasses woman Woman with glasses Radford et al, ICLR 2016 Generative Adversarial Nets: Interpretable Vector Math
  • 127. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 127 2017: Year of the GAN Better training and generation: LSGAN, Mao et al. 2017; BEGAN, Berthelot et al. 2017. Source->Target domain transfer: CycleGAN, Zhu et al. 2017. Many GAN applications: Pix2pix, Isola et al. 2017, many examples at https://phillipi.github.io/pix2pix/; Text -> Image Synthesis, Reed et al. 2017.
  • 128. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 “The GAN Zoo” 12 8 https://github.com/hindupuravinash/the-gan-zoo
  • 129. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 “The GAN Zoo” 129 https://github.com/hindupuravinash/the-gan-zoo See also: https://github.com/soumith/ganhacks for tips and tricks for training GANs
  • 130. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 GANs 13 0 Don’t work with an explicit density function Take game-theoretic approach: learn to generate from training distribution through 2-player game Pros: - Beautiful, state-of-the-art samples! Cons: - Trickier / more unstable to train - Can’t solve inference queries such as p(x), p(z|x) Active areas of research: - Better loss functions, more stable training (Wasserstein GAN, LSGAN, many others) - Conditional GANs, GANs for all kinds of applications
  • 131. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 Recap 13 1 Generative Models - PixelRNN and PixelCNN - Variational Autoencoders (VAE) - Generative Adversarial Networks (GANs) Explicit density model, optimizes exact likelihood, good samples. But inefficient sequential generation. Optimize variational lower bound on likelihood. Useful latent representation, inference queries. But current sample quality not the best. Game-theoretic approach, best samples! But can be tricky and unstable to train, no inference queries.
  • 132. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 Recap 132 Generative Models - PixelRNN and PixelCNN - Variational Autoencoders (VAE) - Generative Adversarial Networks (GANs) Explicit density model, optimizes exact likelihood, good samples. But inefficient sequential generation. Optimize variational lower bound on likelihood. Useful latent representation, inference queries. But current sample quality not the best. Game-theoretic approach, best samples! But can be tricky and unstable to train, no inference queries. There is also recent work on combining these types of models! E.g. Adversarial Autoencoders (Makhzani et al. 2015) and PixelVAE (Gulrajani et al. 2016)
  • 133. Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 13 - May 18, 2017 Recap 13 3 Generative Models - PixelRNN and PixelCNN - Variational Autoencoders (VAE) - Generative Adversarial Networks (GANs) Explicit density model, optimizes exact likelihood, good samples. But inefficient sequential generation. Optimize variational lower bound on likelihood. Useful latent representation, inference queries. But current sample quality not the best. Game-theoretic approach, best samples! But can be tricky and unstable to train, no inference queries. Next time: Reinforcement Learning