Sequence Level Training with Recurrent Neural Networks

0 Beam Search Pseudo-Code 1 Introduction In the conventional seq2seq approach, the model is trained to predict the next word given the previous ground-truth words as input. At test time, the resulting model is used to generate the ent…
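Since the post opens with beam search pseudo-code, here is a minimal runnable sketch of the decoding loop it refers to; the `step_fn(prefix)` interface and the toy bigram table are illustrative assumptions, not the paper's actual decoder.

```python
# Minimal beam search sketch (hypothetical interface): step_fn(prefix) is assumed
# to return a dict mapping each candidate next token to its log-probability.
import math

def beam_search(step_fn, start_token, end_token, beam_width=3, max_len=20):
    # Each hypothesis is a (log_prob, token_list) pair; start from the start token.
    beams = [(0.0, [start_token])]
    finished = []
    for _ in range(max_len):
        candidates = []
        for log_prob, tokens in beams:
            if tokens[-1] == end_token:
                finished.append((log_prob, tokens))
                continue
            for token, token_lp in step_fn(tokens).items():
                candidates.append((log_prob + token_lp, tokens + [token]))
        if not candidates:
            break
        # Keep only the beam_width highest-scoring partial sequences.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
    finished.extend(beams)
    return max(finished, key=lambda c: c[0])

# Toy usage with a fixed bigram table standing in for the trained decoder.
if __name__ == "__main__":
    table = {
        "<s>": {"the": math.log(0.6), "a": math.log(0.4)},
        "the": {"cat": math.log(0.7), "</s>": math.log(0.3)},
        "a":   {"dog": math.log(0.9), "</s>": math.log(0.1)},
        "cat": {"</s>": math.log(1.0)},
        "dog": {"</s>": math.log(1.0)},
    }
    step = lambda prefix: table[prefix[-1]]
    print(beam_search(step, "<s>", "</s>"))
```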

Generative Adversarial Text to Image Synthesis

1 Introduction This work proposes a method to translate text into image pixels. One thorny remaining issue not solved by deep learning alone is that the distribution of images conditioned on a text is highly multimodal, in the sense that t…
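As a rough illustration of the conditioning setup described above (not the paper's exact architecture), the sketch below shows how a generator can take a noise vector together with a text embedding, so that different noise samples yield different images for the same caption; all layer sizes and names are assumptions.

```python
# A minimal text-conditioned generator sketch: the noise z and the caption
# embedding are concatenated before being mapped to pixels, so the same caption
# with different z gives different images (the multimodality mentioned above).
import torch
import torch.nn as nn

class TextConditionedGenerator(nn.Module):
    def __init__(self, z_dim=100, text_dim=128, img_pixels=64 * 64 * 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim + text_dim, 512),
            nn.ReLU(),
            nn.Linear(512, img_pixels),
            nn.Tanh(),  # pixel values in [-1, 1]
        )

    def forward(self, z, text_emb):
        # Condition on the caption by concatenating its embedding with the noise.
        return self.net(torch.cat([z, text_emb], dim=1))

gen = TextConditionedGenerator()
z = torch.randn(4, 100)          # four different noise samples
text_emb = torch.randn(4, 128)   # stand-in for a learned caption embedding
images = gen(z, text_emb)        # -> (4, 64*64*3); same caption, different images
```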

A Neural Algorithm of Artistic Style

1 Introduction This paper proposes an algorithm to separate a picture into style and content, which can be used for image synthesis. For example, given a scenery photograph A and a picture of any famous artwork B, we can recombine A …
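The sketch below illustrates one common way to formalise this style/content split: content is compared through raw CNN feature activations, style through their Gram matrices. The numpy arrays stand in for real CNN features, and the shapes are illustrative assumptions.

```python
# Content loss compares activations directly; style loss compares Gram matrices
# (channel-wise feature correlations) of the same activations.
import numpy as np

def gram_matrix(features):
    # features: (channels, height, width) activations from one CNN layer.
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (h * w)          # (channels, channels) style statistics

def content_loss(f_generated, f_content):
    return np.mean((f_generated - f_content) ** 2)

def style_loss(f_generated, f_style):
    return np.mean((gram_matrix(f_generated) - gram_matrix(f_style)) ** 2)

# Toy check: identical feature maps give zero loss for both terms.
feat = np.random.randn(8, 16, 16)
assert content_loss(feat, feat) == 0.0 and style_loss(feat, feat) == 0.0
```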

Generating Images with Recurrent Adversarial Networks

1 Introduction This work integrates GANs with sequential generation. Taking the usual sampled noise as input at time T, the GAN generates the current part and writes it onto the canvas. All parts along the t…

Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks

1 Introduction This work uses a cascade of convolutional networks (ConvNets) within the Laplacian pyramid framework to generate images in a coarse-to-fine fashion. At each level of the pyramid, an independent ConvNet is trained using Generative …
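For reference, here is a minimal numpy sketch of the Laplacian pyramid decomposition and the coarse-to-fine reconstruction that the per-level generators plug into; the box downsampling and nearest-neighbour upsampling are simplifying assumptions in place of proper Gaussian filtering.

```python
# Each pyramid level stores the residual between an image and an upsampled coarse
# copy; reconstruction adds the residuals back from coarse to fine.
import numpy as np

def downsample(img):
    # 2x2 average pooling (assumes even height and width).
    return img.reshape(img.shape[0] // 2, 2, img.shape[1] // 2, 2).mean(axis=(1, 3))

def upsample(img):
    # Nearest-neighbour upsampling by a factor of 2.
    return img.repeat(2, axis=0).repeat(2, axis=1)

def build_laplacian_pyramid(img, levels=3):
    pyramid, current = [], img
    for _ in range(levels):
        coarse = downsample(current)
        pyramid.append(current - upsample(coarse))   # high-frequency residual
        current = coarse
    pyramid.append(current)                          # low-frequency base image
    return pyramid

def reconstruct(pyramid):
    img = pyramid[-1]
    for residual in reversed(pyramid[:-1]):
        img = upsample(img) + residual               # coarse-to-fine refinement
    return img

img = np.random.rand(32, 32)
assert np.allclose(reconstruct(build_laplacian_pyramid(img)), img)
```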

Paper Reading: Image Captioning with Semantic Attention

1 Introduction The contribution of this paper is an attention mechanism, integrated into an RNN framework, over the visual attributes of the given image; the model uses the attention over visual attributes as well as the informatio…

DenseCap: Fully Convolutional Localization Networks for Dense Captioning

1 Introduction This paper addresses object localization and image captioning jointly by proposing a fully convolutional localization network (FCLN). The architecture is composed of a ConvNet, a novel dense localization layer, and an RNN l…

Spatial Transformer Networks

1 Introduction A desirable property of a system that can reason about images is the ability to disentangle object pose and part deformation from texture and shape. To overcome the drawback that CNNs lack the ability to be spatially in…
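A compact sketch of the transformation step the paper introduces: predicted affine parameters define a sampling grid and the input is resampled over it. Nearest-neighbour sampling in numpy is used here as a simplification of the paper's differentiable bilinear sampler; all sizes are illustrative.

```python
# Generate a sampling grid from affine parameters, then resample the input over it.
import numpy as np

def affine_grid(theta, height, width):
    # theta: (2, 3) affine matrix in normalised [-1, 1] coordinates.
    ys, xs = np.meshgrid(np.linspace(-1, 1, height),
                         np.linspace(-1, 1, width), indexing="ij")
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(height * width)])  # (3, H*W)
    return (theta @ coords).T.reshape(height, width, 2)   # sampling (x, y) per pixel

def sample(img, grid):
    h, w = img.shape
    # Map normalised coordinates back to pixel indices and clip to the image.
    xs = np.clip(((grid[..., 0] + 1) / 2 * (w - 1)).round().astype(int), 0, w - 1)
    ys = np.clip(((grid[..., 1] + 1) / 2 * (h - 1)).round().astype(int), 0, h - 1)
    return img[ys, xs]

img = np.arange(16.0).reshape(4, 4)
identity = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
assert np.allclose(sample(img, affine_grid(identity, 4, 4)), img)
```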

Generating Images from Captions with Attention

1 Introduction This work proposes a sequential deep learning model to generate images from captions: the model draws a patch on a canvas at each time step and attends to the relevant word at each step. 2 Related work Deep discriminative models vs …

DRAW: A Recurrent Neural Network For Image Generation

1 Introduction We draw pictures not all at once, but in a sequential, iterative fashion. This work proposes an architecture to create a scene over a series of time steps, refining the sketch successively. The core of DRAW is a pair of recurrent neura…
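A minimal sketch of the sequential canvas idea (decoder side only, with a single LSTM and no attention window, which are simplifications of the paper's encoder/decoder pair): at every step the recurrent decoder writes an additive update onto a running canvas.

```python
# The image is refined over several iterations rather than produced in one shot.
import torch
import torch.nn as nn

class DrawDecoderSketch(nn.Module):
    def __init__(self, z_dim=10, hidden=64, img_dim=28 * 28, steps=8):
        super().__init__()
        self.steps = steps
        self.rnn = nn.LSTMCell(z_dim, hidden)
        self.write = nn.Linear(hidden, img_dim)   # "write" head onto the canvas

    def forward(self, z_seq):
        batch = z_seq[0].size(0)
        h = torch.zeros(batch, self.rnn.hidden_size)
        c = torch.zeros_like(h)
        canvas = torch.zeros(batch, self.write.out_features)
        for t in range(self.steps):
            h, c = self.rnn(z_seq[t], (h, c))   # recurrent state carries the sketch so far
            canvas = canvas + self.write(h)     # additive refinement of the canvas
        return torch.sigmoid(canvas)            # final image in [0, 1]

model = DrawDecoderSketch()
latents = [torch.randn(4, 10) for _ in range(8)]   # one latent sample per time step
imgs = model(latents)                              # -> (4, 784)
```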

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

Have I really mastered the various machine-learning models with latent variables? Neural networks, RBMs, the various probabilistic graphical models; latent-variable inference methods such as the EM algorithm, variational inference, and mean field; the convex optimization behind these algorithms; deriving the formulas. 1 Introduction In the past, to solve the image captioning task, one always extracts features from im…
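A small sketch of the soft visual attention step that gives the paper its name: score each image region against the decoder's current hidden state, softmax the scores, and feed the weighted sum of region features into the next word prediction. The bilinear scoring function and all dimensions are illustrative assumptions.

```python
# Soft attention over a grid of region features, conditioned on the decoder state.
import numpy as np

def soft_attention(regions, hidden, W):
    # regions: (num_regions, feat_dim), hidden: (hidden_dim,), W: (hidden_dim, feat_dim)
    scores = regions @ (W.T @ hidden)        # one relevance score per region
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                 # attention distribution over regions
    context = weights @ regions              # expected region feature
    return context, weights

rng = np.random.default_rng(0)
regions = rng.normal(size=(14 * 14, 512))    # e.g. conv-layer grid features
hidden = rng.normal(size=256)
W = rng.normal(size=(256, 512))
context, weights = soft_attention(regions, hidden, W)
assert np.isclose(weights.sum(), 1.0) and context.shape == (512,)
```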

Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models

1 Introduction This work uses the encoder-decoder framework to solve the problem of image-caption generation. For the encoder, the model learns a joint sentence-image embedding, where sentence embeddings are encoded using an LSTM and ima…
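A minimal sketch of the joint embedding objective: both modalities are projected into a shared space and trained with a pairwise ranking loss so matching image-caption pairs score higher than mismatched ones by a margin. The random vectors below stand in for the LSTM sentence embeddings and CNN image embeddings; shapes and the margin value are assumptions.

```python
# Pairwise ranking loss over cosine similarities in the shared embedding space.
import numpy as np

def ranking_loss(img_emb, sent_emb, margin=0.2):
    # Rows are embeddings; matching image-caption pairs share a row index.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    sent = sent_emb / np.linalg.norm(sent_emb, axis=1, keepdims=True)
    scores = img @ sent.T                     # (N, N) cosine similarities
    pos = np.diag(scores)                     # matching-pair scores
    # Hinge on every mismatched pair, in both retrieval directions.
    cost_s = np.maximum(0, margin + scores - pos[:, None])   # caption retrieval
    cost_i = np.maximum(0, margin + scores - pos[None, :])   # image retrieval
    np.fill_diagonal(cost_s, 0)
    np.fill_diagonal(cost_i, 0)
    return cost_s.sum() + cost_i.sum()

rng = np.random.default_rng(1)
img_emb = rng.normal(size=(8, 128))           # stand-in CNN image embeddings
sent_emb = rng.normal(size=(8, 128))          # stand-in LSTM sentence embeddings
print(ranking_loss(img_emb, sent_emb))
```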

Deep Visual-Semantic Alignments for Generating Image Descriptions

1 Introduction This paper uses a CNN to learn image-region embeddings and a bidirectional RNN to learn sentence embeddings, associates them in a common multimodal space, and uses a structured objective to align the two modalities. Then it proposes a mutl…

Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data

1 Introduction In the past, image caption models could only be trained on paired image-sentence corpora. To address this limitation, the authors propose a Deep Compositional Captioner that can generate descriptions of objects which don'…

Generation and Comprehension of Unambiguous Object Descriptions

1 Introduction The standard image captioning task suffers from the difficulty of evaluation: there is no convincing evaluation metric that says one generated caption is definitively better than another. So this work does not generate description fr…

Paper Reading -> Learning Like a Child: Fast Novel Concept Learning from Sentence Descriptions of Images

1 Introduction This paper addresses the problem of generating descriptions from images. What distinguishes it from other work is that the authors propose a method to handle new concepts not seen in the training set. More specifi…

Some papers about image captioning

1 Deep Visual-Semantic Alignments for Generating Image Descriptions (CVPR 2015) 2 Long-term Recurrent Convolutional Networks for Visual Recognition and Description (CVPR 2015) 3 Show and Tell: A Neural Image Caption Generator 4 Unifying Visual…

Deep Fragment Embeddings for Bidirectional Image Sentence Mapping

1 Introduction: This model works at a finer level, embedding fragments of images (objects) and fragments of sentences into a common space. The paper states that both the global level of images and sentences and the finer level of their resp…

Multimodal Convolutional Neural Network for Matching Image and Sentence (Accepted by ICCV 2015)

This paper provides an end-to-end framework to match "image representation" and "word composition". More specifically, it consists of an image CNN encoding the image content and a matching CNN learning the joint representation of image …

Starting a blog today

I started this blog today. It will mainly be about my own research: I plan to write about ideas that come to mind, summaries of papers I have read, implementation notes, and so on.