2016-02-04から1日間の記事一覧
1 Introduction This work use the framework of encoder-decode models to solve the problem of image-caption generation. For encoder, the model learn the joint sentence-image embedding where sentence embeddings are encoded using LSTM, and ima…
1 Introduction This paper uses CNN to learn image regions embedding, bidirectional RNN to learn sentence embeddings, associate them into a common multimodal space, and a structured objective to align two modalities. Then it proposes a mutl…