Beyond Caption to Narrative: Video Captioning with Multiple Sentences
We attempt to generate video captions that convey richer contents by temporally segmenting the video with action localization, generating multiple captions from a single video, and connecting them with natural language processing techniques, in order to generate a story-like caption. We show that our proposed method can generate captions that are richer in contents.
Andrew Shin, Katsunori Ohnishi, Tatsuya Harada
ICIP 2016