Slides by Víctor Garcia about the paper:
Nguyen, Anh, Jason Yosinski, Yoshua Bengio, Alexey Dosovitskiy, and Jeff Clune. "Plug & play generative networks: Conditional iterative generation of images in latent space." arXiv preprint arXiv:1612.00005 (2016).
Generating high-resolution, photo-realistic images has been a long-standing goal in machine learning. Recently, Nguyen et al. (2016) showed one interesting way to synthesize novel images by performing gradient ascent in the latent space of a generator network to maximize the activations of one or multiple neurons in a separate classifier network. In this paper we extend this method by introducing an additional prior on the latent code, improving both sample quality and sample diversity, leading to a state-of-the-art generative model that produces high quality images at higher resolutions (227x227) than previous generative models, and does so for all 1000 ImageNet categories. In addition, we provide a unified probabilistic interpretation of related activation maximization methods and call the general class of models "Plug and Play Generative Networks". PPGNs are composed of 1) a generator network G that is capable of drawing a wide range of image types and 2) a replaceable "condition" network C that tells the generator what to draw. We demonstrate the generation of images conditioned on a class (when C is an ImageNet or MIT Places classification network) and also conditioned on a caption (when C is an image captioning network). Our method also improves the state of the art of Multifaceted Feature Visualization, which generates the set of synthetic inputs that activate a neuron in order to better understand how deep neural networks operate. Finally, we show that our model performs reasonably well at the task of image inpainting. While image models are used in this paper, the approach is modality-agnostic and can be applied to many types of data.
Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space (UPC Reading Group)
1. Plug & Play Generative Networks:
Conditional Iterative Generation of Images in Latent Space
Anh Nguyen, Jason Yosinski, Yoshua
Bengio, Alexey Dosovitskiy, Jeff Clune
[GitHub] [Arxiv]
Slides by Víctor Garcia
UPC Computer Vision Reading Group (27/01/2017)
2. Index
● Introduction
● Probabilistic Interpretation of the method
● Methods and Experiments
○ PPGN-x: DAE model of p(x)
○ DGN-AM: sampling without a learned prior
○ PPGN-h: Generator and DAE model of p(h)
○ Joint PPGN-h: joint Generator and DAE
● Further Experiments
○ Image Generation: Captioning
○ Image Generation: Multifaceted Feature Visualization
○ Image inpainting
● Conclusions
3. Introduction
Interpretation of different frameworks that generate images by maximizing:
p(x, y) = p(x) · p(y|x)
p(x) is the prior: it encourages the sample to look realistic.
p(y|x) is the condition: it encourages the sample to look like it belongs to a particular class.
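In log space this product becomes a sum, so ascending log p(x, y) splits into the two gradient terms that the sampler in the next slides follows:

$$\log p(x, y) = \log p(x) + \log p(y \mid x)$$
$$\nabla_x \log p(x, y) = \nabla_x \log p(x) + \nabla_x \log p(y \mid x)$$

The first gradient pushes the sample towards realistic images; the second pushes it towards images the classifier assigns to class y.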
7. Index
● Introduction
● Probabilistic Interpretation of the method
● Methods and Experiments
○ PPGN-x: DAE model of p(x)
○ DGN-AM: sampling without a learned prior
○ PPGN-h: Generator and DAE model of p(h)
○ Joint PPGN-h: joint Generator and DAE
● Further Experiments
○ Image Generation: Captioning
○ Image Generation: Multifaceted Feature Visualization
○ Image inpainting
● Conclusions
8. Probabilistic Interpretation of the method
The Metropolis-adjusted Langevin algorithm (MALA) is an MCMC algorithm for iteratively producing random samples from a distribution p(x). The paper uses an approximate variant (MALA-approx) with the update:
x_{t+1} = x_t + ε_12 ∇log p(x_t) + N(0, ε_3^2)
Here x_t is the current state, x_{t+1} is the future state, ∇log p(x_t) is the gradient towards the natural manifold of p(x), and N(0, ε_3^2) is added noise.
15. Probabilistic Interpretation of the method
For class-conditional sampling the gradient of log p(x, y) splits into a prior term and a condition term:
x_{t+1} = x_t + ε_1 ∂log p(x_t)/∂x_t + ε_2 ∂log p(y = y_c | x_t)/∂x_t + N(0, ε_3^2)
The ε_1 term is a step towards a more generic image (the prior p(x)), the ε_2 term is a step towards an image that causes the classifier to produce a higher score for class y_c, and the last term is noise.
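A minimal sketch of this sampling loop in Python; the two gradient callables and the step sizes are placeholders standing in for the real networks, not the paper's implementation:

import numpy as np

def mala_approx(x0, grad_log_prior, grad_log_condition,
                eps1=1e-2, eps2=1.0, eps3=1e-5, n_steps=200):
    # Iterate: prior step + condition step + Gaussian noise.
    x = x0.copy()
    for _ in range(n_steps):
        x = (x
             + eps1 * grad_log_prior(x)        # step towards a more generic image
             + eps2 * grad_log_condition(x)    # step towards a higher score for class y_c
             + np.random.normal(0.0, eps3, size=x.shape))
    return x

In the PPGN variants below, the prior gradient is supplied by a denoising autoencoder and the condition gradient by backpropagation through the classifier.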
19. Index
● Introduction
● Probabilistic Interpretation of the method
● Methods and Experiments
○ PPGN-x: DAE model of p(x)
○ DGN-AM: sampling without a learned prior
○ PPGN-h: Generator and DAE model of p(h)
○ Joint PPGN-h: joint Generator and DAE
● Further Experiments
○ Image Generation: Captioning
○ Image Generation: Multifaceted Feature Visualization
○ Image inpainting
● Conclusions
21. Index
● Introduction
● Probabilistic Interpretation of the method
● Methods and Experiments
○ PPGN-x: DAE model of p(x)
○ DGN-AM: sampling without a learned prior
○ PPGN-h: Generator and DAE model of p(h)
○ Joint PPGN-h: joint Generator and DAE
● Further Experiments
○ Image Generation: Captioning
○ Image Generation: Multifaceted Feature Visualization
○ Image inpainting
● Conclusions
22. Method | PPGN-x: DAE model of p(x)
What is a Denoising Autoencoder?
Diagram: the input x is corrupted with Gaussian noise N(0, σ^2) to give x_noise, mapped to a hidden representation h(x), and decoded back to a reconstruction R(x). The DAE is trained so that R(x_noise) matches the clean x.
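A minimal denoising-autoencoder sketch in PyTorch, only to illustrate the x → x_noise → h(x) → R(x) pipeline (a toy fully-connected version on flattened inputs, not the paper's architecture):

import torch
import torch.nn as nn

class DAE(nn.Module):
    def __init__(self, dim=784, hidden=256, sigma=0.1):
        super().__init__()
        self.sigma = sigma
        self.encoder = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())  # x_noise -> h(x)
        self.decoder = nn.Linear(hidden, dim)                            # h(x) -> R(x)

    def forward(self, x):
        x_noise = x + self.sigma * torch.randn_like(x)   # corrupt x with N(0, sigma^2)
        return self.decoder(self.encoder(x_noise))       # reconstruction R(x)

dae = DAE()
x = torch.rand(32, 784)
loss = nn.functional.mse_loss(dae(x), x)   # train the reconstruction to match the clean input
loss.backward()

Once trained, (R(x) - x) can be used as an approximation of the gradient of log p(x), which is what PPGN-x exploits during sampling.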
27. Method | PPGN-x: DAE model of p(x)
1) The data is poorly modeled and the samples are blurry 2) The chain changes slowly (poor mixing)
28. Index
● Introduction
● Probabilistic Interpretation of the method
● Methods and Experiments
○ PPGN-x: DAE model of p(x)
○ DGN-AM: sampling without a learned prior
○ PPGN-h: Generator and DAE model of p(h)
○ Joint PPGN-h: joint Generator and DAE
● Further Experiments
○ Image Generation: Captioning
○ Image Generation: Multifaceted Feature Visualization
○ Image inpainting
● Conclusions
29. Method | DGN-AM: sampling without a learned prior
Deep Generator Network-based Activation Maximization
Sampling is faster if we move over the latent subspace h (the fc6 features of AlexNet) instead of the image space x
30. Method | DGN-AM: sampling without a learned prior
Deep Generator Network-based Activation Maximization
Training setup for G: the generator maps AlexNet fc6 codes back to images, and a real/fake discriminator provides an adversarial signal during training
31. Method | DGN-AM: sampling without a learned prior
Once the network G is trained, we can write the MALA-approx update in the latent space h. In DGN-AM there is no learned prior term and no noise term, only the conditioned gradient:
h_{t+1} = h_t + ε_2 ∂log p(y = y_c | G(h_t))/∂h_t
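A sketch of one DGN-AM update step in PyTorch, assuming a pretrained generator and classifier are available as callables (the names and the step size are illustrative):

import torch

def dgn_am_step(h, generator, classifier, class_idx, eps2=1.0):
    # Gradient ascent on log p(y = class_idx | G(h)); no prior term, no noise term.
    h = h.detach().requires_grad_(True)
    log_probs = torch.log_softmax(classifier(generator(h)), dim=1)
    log_probs[:, class_idx].sum().backward()
    return (h + eps2 * h.grad).detach()

Because there is no noise and no learned prior, repeated steps tend to converge to the same image, which is the low-diversity behaviour noted on the next slide.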
35. Method | DGN-AM: sampling without a learned prior
+ Different modes are reached from different starting points
- Converges to nearly the same image after many steps
- Low mixing speed
36. Index
● Introduction
● Probabilistic Interpretation of the method
● Methods and Experiments
○ PPGN-x: DAE model of p(x)
○ DGN-AM: sampling without a learned prior
○ PPGN-h: Generator and DAE model of p(h)
○ Joint PPGN-h: joint Generator and DAE
● Further Experiments
○ Image Generation: Captioning
○ Image Generation: Multifaceted Feature Visualization
○ Image inpainting
● Conclusions
37. Method | PPGN-h: Generator and DAE model of p(h)
A 7-layer DAE is added to model the prior p(h) in order to increase the mixing speed
38. Method | PPGN-h: Generator and DAE model of p(h)
The update now contains a prior p(h) term (approximated by the DAE reconstruction R(h_t) - h_t), the conditioned gradient, and noise:
h_{t+1} = h_t + ε_1 (R(h_t) - h_t) + ε_2 ∂log p(y = y_c | G(h_t))/∂h_t + N(0, ε_3^2)
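A sketch of the corresponding PPGN-h step, where the prior gradient is approximated by the DAE reconstruction error R(h) - h (module names and step sizes are placeholders, not the released code):

import torch

def ppgn_h_step(h, dae, generator, classifier, class_idx,
                eps1=1e-5, eps2=1.0, eps3=1e-5):
    # h <- h + eps1*(R(h) - h) + eps2 * d log p(y|G(h))/dh + N(0, eps3^2)
    h = h.detach().requires_grad_(True)
    log_probs = torch.log_softmax(classifier(generator(h)), dim=1)
    log_probs[:, class_idx].sum().backward()
    with torch.no_grad():
        prior_step = eps1 * (dae(h) - h)        # pull h towards the manifold of p(h)
        cond_step = eps2 * h.grad               # push towards the target class
        noise = eps3 * torch.randn_like(h)
        return h + prior_step + cond_step + noise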
39. Method | PPGN-h: Generator and DAE model of p(h)
- Similar to the previous case: low diversity
- The p(h) model learned by the DAE is too simple
40. Index
● Introduction
● Probabilistic Interpretation of the method
● Methods and Experiments
○ PPGN-x: DAE model of p(x)
○ DGN-AM: sampling without a learned prior
○ PPGN-h: Generator and DAE model of p(h)
○ Joint PPGN-h: joint Generator and DAE
● Further Experiments
○ Image Generation: Captioning
○ Image Generation: Multifaceted Feature Visualization
○ Image inpainting
● Conclusions
41. Method | Joint PPGN-h: joint Generator and DAE
To model p(h) in a more expressive way, the DAE over h is built from the networks we already have:
DAE over h: h/fc6 → x → h/fc6, where the generator G maps the code h to an image x and the encoder E maps x back to h/fc6.
With the same existing networks, the generator G is trained to act as a DAE in conjunction with the encoder E.
44. Method | Joint PPGN-h: joint Generator and DAE
The encoder E and the classifier that conditions the sampling are taken from AlexNet. The sampling equation is the same as before; the DAE reconstruction R(h) is now implemented by the composition E(G(h)).
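A sketch of how the reconstruction R(h) is obtained in the joint model, as the composition of the generator and the encoder (hypothetical handles, with optional noise on h as used during training):

import torch

def joint_reconstruction(h, generator, encoder, sigma=0.0):
    # R(h) = E(G(h + noise)): decode the code to an image, then re-encode it to fc6.
    h_noisy = h + sigma * torch.randn_like(h)
    return encoder(generator(h_noisy))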
48. Method | Joint PPGN-h: joint Generator and DAE
Noise sweeps
For the last model we test the reconstruction of different h/fc6 vectors when adding different levels of noise N(0, σ^2) to the fc6 code.
51. Method | Joint PPGN-h: joint Generator and DAE
Noise sweeps
Even when the mapping is done with a lot of noise, a large part of the image information can still be recovered: many noisy codes map to one similar reconstruction (many → one).
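A sketch of the noise-sweep experiment, with the trained E and G passed in as callables (the sigma values are illustrative):

import torch

def noise_sweep(x, encoder, generator, sigmas=(0.0, 1.0, 5.0, 10.0)):
    # Reconstruct the same image from its fc6 code corrupted with increasing noise.
    h = encoder(x)
    return [generator(h + s * torch.randn_like(h)) for s in sigmas]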
52. Method | Joint PPGN-h: joint Generator and DAE
Combination of Losses
Comparison of the reconstructions obtained with different combinations of training losses against the real images.
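A hedged sketch of how such a combination of losses for training G can be assembled; the weights, the feature layer, and the discriminator interface are placeholders, not the authors' training code:

import torch
import torch.nn.functional as F

def generator_loss(x_real, h, generator, encoder, discriminator,
                   w_img=1.0, w_feat=1.0, w_adv=0.1):
    # Pixel reconstruction + feature reconstruction + adversarial term.
    x_fake = generator(h)
    l_img = F.mse_loss(x_fake, x_real)                       # image-space reconstruction
    l_feat = F.mse_loss(encoder(x_fake), encoder(x_real))    # feature-space reconstruction
    l_adv = -torch.log(discriminator(x_fake) + 1e-8).mean()  # fool the real/fake discriminator
    return w_img * l_img + w_feat * l_feat + w_adv * l_adv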
53. Method | Joint PPGN-h: joint Generator and DAE
Combination of Losses
54. Method | Joint PPGN-h: joint Generator and DAE
Combination of Losses
58. Index
● Introduction
● Probabilistic Interpretation of the method
● Methods and Experiments
○ PPGN-x: DAE model of p(x)
○ DGN-AM: sampling without a learned prior
○ PPGN-h: Generator and DAE model of p(h)
○ Joint PPGN-h: joint Generator and DAE
● Further Experiments
○ Image Generation: Captioning
○ Image Generation: Multifaceted Feature Visualization
○ Image inpainting
● Conclusions
61. Index
● Introduction
● Probabilistic Interpretation of the method
● Methods and Experiments
○ PPGN-x: DAE model of p(x)
○ DGN-AM: sampling without a learned prior
○ PPGN-h: Generator and DAE model of p(h)
○ Joint PPGN-h: joint Generator and DAE
● Further Experiments
○ Image Generation: Captioning
○ Image Generation: Multifaceted Feature Visualization
○ Image inpainting
● Conclusions
64. Index
● Introduction
● Probabilistic Interpretation of the method
● Methods and Experiments
○ PPGN-x: DAE model of p(x)
○ DGN-AM: sampling without a learned prior
○ PPGN-h: Generator and DAE model of p(h)
○ Joint PPGN-h: joint Generator and DAE
● Further Experiments
○ Image Generation: Captioning
○ Image Generation: Multifaceted Feature Visualization
○ Image inpainting
● Conclusions
70. Conclusions
● Using only a GAN loss for the reconstruction, the generator collapses onto fewer modes, far from the original p(x).
● Using extra losses it is possible to reconstruct the images better, even for 1000 classes and at higher resolution. The one-to-one mapping between images and latent codes helps to prevent the missing modes typical of latent-variable models.
● It would be great to also learn the embedding space for these high-resolution, multi-class images, instead of using a space learned with supervision.