DRAW: A Recurrent Neural Network For Image Generation

1 Introduction

We draw pictures not at once, but in a sequential, iterative fashion. This work proposes an architecture that creates a scene over a sequence of time steps, successively refining its sketches.

The core of DRAW is a pair of recurrent neural networks: an encoder that compresses the real images and a decoder that reconstitutes images after receiving codes. The loss function is a variational upper bound on the negative log-likelihood of the data.

DRAW generates images step by step, selectively attending to parts of the image while ignoring others.

The DRAW architecture is similar to other variational auto-encoders: an encoder network determines a distribution over latent codes that capture salient information about the input data; a decoder network receives samples from the code distribution and uses them to condition its own distribution over images.

2 The DRAW Network

2.1 Network Architecture

f:id:PDFangeltop1:20160205193820p:plain

Q(Zt|h(t,enc)) is a diagonal Gaussian.
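To make the diagonal Gaussian concrete, here is a minimal sketch of drawing one latent sample with the reparameterisation z = mu + sigma * eps. The function name and the choice to pass log(sigma) rather than sigma are assumptions for illustration. In the real model mu and log(sigma) would come from linear layers on the encoder output.

```python
import math
import random


def sample_diagonal_gaussian(mu, log_sigma, rng=random):
    """Draw z ~ N(mu, diag(sigma^2)) as z = mu + sigma * eps, eps ~ N(0, I).

    mu and log_sigma are plain lists of floats of equal length.
    (Hypothetical helper, not code from the paper.)
    """
    if len(mu) != len(log_sigma):
        raise ValueError("mu and log_sigma must have the same length")
    # Each dimension is sampled independently because the covariance is diagonal.
    return [m + math.exp(s) * rng.gauss(0.0, 1.0) for m, s in zip(mu, log_sigma)]


# Usage: a 3-dimensional latent sample with a seeded generator.
z = sample_diagonal_gaussian([0.0, 1.0, -1.0], [0.0, -1.0, -2.0], random.Random(0))
```

Writing the sample as a deterministic function of mu, sigma and external noise is what lets gradients flow through the sampling step during training.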

f:id:PDFangeltop1:20160205194026p:plain

f:id:PDFangeltop1:20160205194103p:plain

f:id:PDFangeltop1:20160205194154p:plain

2.2 Loss Function

The final canvas matrix CT is used to parameterise a model D(X|CT) of the input data. For binary inputs, D is a Bernoulli distribution whose means are given by the sigmoid of CT.

f:id:PDFangeltop1:20160205194615p:plain reconstruction loss

f:id:PDFangeltop1:20160205194623p:plain latent loss

f:id:PDFangeltop1:20160205194630p:plain

f:id:PDFangeltop1:20160205194640p:plain
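The two loss terms above can be sketched in plain Python. This is only an illustration under two assumptions: the pixels are binary, and the latent prior is a standard Gaussian, so the KL term has a closed form. The function names are made up for this example.

```python
import math


def bernoulli_reconstruction_loss(x, canvas):
    """Reconstruction loss -log D(x | c_T) with Bernoulli means sigmoid(c_T).

    x: flat list of 0/1 pixels, canvas: flat list of final canvas values.
    """
    loss = 0.0
    for xi, ci in zip(x, canvas):
        # softplus(c) - x*c equals -[x*log(s) + (1-x)*log(1-s)] with s = sigmoid(c);
        # this form avoids overflow for large |c|.
        softplus = max(ci, 0.0) + math.log1p(math.exp(-abs(ci)))
        loss += softplus - xi * ci
    return loss


def latent_loss(mus, log_sigmas):
    """Latent loss: sum over time steps of KL(N(mu_t, sigma_t^2) || N(0, I)).

    mus, log_sigmas: lists (one entry per time step) of lists of floats.
    Assumes a standard Gaussian prior.
    """
    total = 0.0
    for mu_t, log_sigma_t in zip(mus, log_sigmas):
        for m, s in zip(mu_t, log_sigma_t):
            total += 0.5 * (m * m + math.exp(2.0 * s) - 2.0 * s - 1.0)
    return total


def total_loss(x, canvas, mus, log_sigmas):
    """Total loss for one example: reconstruction loss plus latent loss."""
    return bernoulli_reconstruction_loss(x, canvas) + latent_loss(mus, log_sigmas)
```

With a canvas of zeros every pixel has mean 0.5, so each pixel contributes log 2 to the reconstruction loss. The latent loss is zero exactly when every step has mu = 0 and sigma = 1.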

2.3 Stochastic data generation

f:id:PDFangeltop1:20160205200318p:plain

2.4 Read and Write Operation

f:id:PDFangeltop1:20160205201544p:plain

An N×N grid of Gaussian filters is positioned on the image by specifying the coordinates of the grid center and the stride distance between adjacent filters.

f:id:PDFangeltop1:20160205201854p:plain
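The filter grid can be sketched as follows. This assumes the centre of filter i (for i = 1..N) along one axis sits at g + (i - N/2 - 0.5) * delta, where g is the grid centre and delta the stride, and that all filters share one isotropic variance. The function names are illustrative, and the sketch leaves out the intensity scaling and the error image.

```python
import math


def filter_centres(g, delta, n):
    """Centres of n filters along one axis: g + (i - n/2 - 0.5) * delta, i = 1..n."""
    return [g + (i - n / 2 - 0.5) * delta for i in range(1, n + 1)]


def filterbank(g, delta, sigma2, n, size):
    """n x size matrix of 1-D Gaussian filters, each row normalised to sum to 1."""
    bank = []
    for mu in filter_centres(g, delta, n):
        row = [math.exp(-((a - mu) ** 2) / (2.0 * sigma2)) for a in range(size)]
        norm = sum(row) or 1.0  # guard: a filter far off the image sums to zero
        bank.append([v / norm for v in row])
    return bank


def read_patch(image, g_x, g_y, delta, sigma2, n):
    """Extract an n x n patch as F_Y * image * F_X^T.

    image is a list of rows (height x width) of floats.
    """
    height, width = len(image), len(image[0])
    f_x = filterbank(g_x, delta, sigma2, n, width)
    f_y = filterbank(g_y, delta, sigma2, n, height)
    # First apply the vertical filters: rows = F_Y * image (n x width).
    rows = [[sum(fy[b] * image[b][a] for b in range(height)) for a in range(width)]
            for fy in f_y]
    # Then the horizontal filters: patch = rows * F_X^T (n x n).
    return [[sum(r[a] * fx[a] for a in range(width)) for fx in f_x] for r in rows]
```

A large stride spreads the filters over the whole image and gives a coarse view. A small stride zooms in on a region around the grid centre. Writing uses the same filterbanks transposed, to place an N×N patch back onto the canvas.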