Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks
1 Introduction
This work uses a cascade of convolutional networks (ConvNets) within a Laplacian pyramid framework to generate images in a coarse-to-fine fashion. At each level of the pyramid, a separate ConvNet is trained using the Generative Adversarial Network (GAN) approach, so that each model captures image structure at a particular scale. Instead of generating an image in a single pass, generation proceeds sequentially: first, a random vector is sampled and fed to a ConvNet, which outputs a small image at the coarsest level. The next stage samples the structure of the image at its level, conditioned on the image generated before. Subsequent levels continue this process, always conditioning on the output from the previous scale, until the final level is reached.
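The coarse-to-fine sampling procedure above can be sketched as a short loop. This is a minimal sketch, not the paper's implementation: the generator callables, the `upsample` operator, and the noise dimensionality are all assumed interfaces introduced here for illustration.

```python
import numpy as np

def lapgan_sample(generators, noise_dim, upsample):
    """Coarse-to-fine sampling sketch (hypothetical interfaces).

    generators: list ordered coarsest-to-finest; the first maps a noise
    vector to a small image, every later one maps (noise, upsampled
    previous image) to a high-frequency residual for its level.
    """
    z = np.random.randn(noise_dim)
    image = generators[0](z)            # coarsest level: unconditional GAN
    for G_k in generators[1:]:
        low_pass = upsample(image)      # conditioning variable u(I_{k+1})
        z = np.random.randn(noise_dim)
        h = G_k(z, low_pass)            # generated residual at this scale
        image = low_pass + h            # I_k = u(I_{k+1}) + h_k
    return image
```

Each iteration doubles the resolution, so a K-level pyramid turns a small seed image into a full-size sample.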
2 Approach
2.1 Generative Adversarial Network
The method pits two networks against one another: a generative model G that captures the data distribution, and a discriminative model D that distinguishes between samples drawn from G and samples from the image data. G takes a noise vector drawn from a distribution p_noise(z) and outputs an image h'. D takes an image as input, stochastically chosen to be either h', as generated by G, or h, a real image drawn from the training data distribution p_data(h). D outputs a scalar probability, which is trained to be high if the input was real and low if it was generated by G.
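The two losses implied by this setup can be written down directly. The sketch below uses the common non-saturating generator loss rather than the raw minimax form; `D` and `G` are assumed callables (D returns a probability in (0, 1)), not the paper's actual networks.

```python
import numpy as np

def gan_losses(D, G, h_real, z):
    """Per-sample GAN losses for a discriminator D and generator G (sketch)."""
    h_fake = G(z)
    # Discriminator: push D(h) -> 1 on real data, D(G(z)) -> 0 on samples
    loss_D = -np.log(D(h_real)) - np.log(1.0 - D(h_fake))
    # Generator (non-saturating form): fool D, i.e. push D(G(z)) -> 1
    loss_G = -np.log(D(h_fake))
    return loss_D, loss_G
```

At the equilibrium point where D outputs 0.5 everywhere, the discriminator loss equals 2·log 2, the classic GAN fixed point.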
2.2 Laplacian Pyramid
Let d(.) be a downsampling operation: for an image I of shape (W, H), d(I) has shape (W/2, H/2).
Let u(.) be an upsampling operation: for an image I of shape (W/2, H/2), u(I) has shape (W, H).
The coefficients h_k at each level k of the Laplacian pyramid are constructed by taking the difference between adjacent levels in the Gaussian pyramid: h_k = I_k - u(I_{k+1}), where I_0 = I and I_{k+1} = d(I_k).
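The pyramid construction and its inverse can be made concrete with simple stand-ins for d(.) and u(.). Average pooling and pixel replication below are illustrative choices, not the blur-based operators used in the paper; the reconstruction identity I_k = u(I_{k+1}) + h_k holds exactly regardless of which d and u are used.

```python
import numpy as np

def d(img):
    """Downsample by 2 via average pooling (stand-in; assumes even dims)."""
    H, W = img.shape
    return img.reshape(H // 2, 2, W // 2, 2).mean(axis=(1, 3))

def u(img):
    """Upsample by 2 via pixel replication (stand-in for smooth upsampling)."""
    return np.kron(img, np.ones((2, 2)))

def laplacian_pyramid(I, levels):
    """Gaussian levels I_k, coefficients h_k = I_k - u(I_{k+1})."""
    gauss = [I]
    for _ in range(levels):
        gauss.append(d(gauss[-1]))
    coeffs = [g - u(g_next) for g, g_next in zip(gauss[:-1], gauss[1:])]
    return coeffs, gauss[-1]            # band-pass coeffs + coarsest image

def reconstruct(coeffs, low):
    """Invert the pyramid: I_k = u(I_{k+1}) + h_k, from coarse to fine."""
    img = low
    for h in reversed(coeffs):
        img = u(img) + h
    return img
```

Reconstruction is lossless, which is what lets LAPGAN assemble a full image from generated coefficients alone.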
2.3 Laplacian Generative Adversarial Network
We have a set of generative ConvNet models {G_0, G_1, ..., G_K}, each of which captures the distribution of coefficients h_k for natural images at a different level of the Laplacian pyramid.
Note that the models at all levels except the final one are conditional generative models: in addition to the noise vector z_k, they take an upsampled version of the image I'_{k+1} generated at the previous (coarser) level as a conditioning variable.
The generative models {G_0, G_1, ..., G_K} are trained using the conditional GAN approach at each level of the pyramid. We construct a Laplacian pyramid from each training image I. At each level we make a stochastic decision to either (i) construct the coefficients h_k via the standard Laplacian pyramid construction, or (ii) generate them using G_k.
G_k takes as input the low-pass image l_k = u(I_{k+1}), as well as a noise vector z_k. D_k takes as input h_k or h'_k, along with the low-pass image l_k, and must predict whether the coefficient it sees is real or generated.
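The per-level stochastic decision can be sketched as a single training step. The interfaces here (G_k, D_k, the d/u operators, the 50/50 coin flip) are assumptions made for illustration; the actual networks and update rules are those of the paper.

```python
import numpy as np

def lapgan_training_step(I_k, G_k, D_k, d, u, noise_dim, rng):
    """One LAPGAN training decision at pyramid level k (sketch).

    With probability 1/2 the discriminator sees a real coefficient
    h_k = I_k - u(d(I_k)); otherwise a generated h'_k = G_k(z_k, l_k).
    Either way D_k also receives the low-pass image l_k = u(d(I_k)).
    """
    l_k = u(d(I_k))                     # low-pass conditioning image
    if rng.random() < 0.5:
        h = I_k - l_k                   # real Laplacian coefficient
        label = 1
    else:
        z_k = rng.standard_normal(noise_dim)
        h = G_k(z_k, l_k)               # generated coefficient
        label = 0
    p = D_k(h, l_k)                     # D_k's probability input is real
    return p, label
```

A full training loop would backpropagate the GAN losses through D_k (and, on generated samples, through G_k) for every level of the pyramid independently.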