HIGHLIGHTS
- who: RESHMI S. BHOOSHAN and colleagues from the (UNIVERSITY) have published the article "A Multimodal Framework For Video Caption Generation" in the journal (JOURNAL).
- what: This work proposes a video caption generation framework that combines a discrete wavelet convolutional neural architecture with multimodal feature attention. The deep neural network architecture uses a Discrete Wavelet Transform (DWT) based CNN to extract finer visual details from the video frames, which enables better video caption generation. Of this analysis and a comparative study with the state-of-the-art video . . .
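
To make the DWT step concrete, here is a minimal sketch of a single-level 2D Haar wavelet decomposition of one grayscale frame. The summary does not say which wavelet, how many decomposition levels, or how the sub-bands are fed to the CNN, so the Haar choice, the function name `haar_dwt2`, and the sub-band labels are assumptions for illustration only, not the authors' implementation.

```python
def haar_dwt2(frame):
    """Single-level 2D Haar DWT of a grayscale frame (hypothetical helper).

    `frame` is a list of equal-length rows of numbers with even height and
    width. Returns four half-resolution sub-bands (ll, lh, hl, hh).
    """
    if not frame or not frame[0]:
        raise ValueError("frame must be non-empty")
    rows, cols = len(frame), len(frame[0])
    if rows % 2 or cols % 2 or any(len(r) != cols for r in frame):
        raise ValueError("frame must be rectangular with even height and width")

    ll, lh, hl, hh = [], [], [], []
    for i in range(0, rows, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for j in range(0, cols, 2):
            # One 2x2 block:  a b
            #                 c d
            a, b = frame[i][j], frame[i][j + 1]
            c, d = frame[i + 1][j], frame[i + 1][j + 1]
            ll_row.append((a + b + c + d) / 2)  # coarse approximation
            lh_row.append((a + b - c - d) / 2)  # top vs bottom difference
            hl_row.append((a - b + c - d) / 2)  # left vs right difference
            hh_row.append((a - b - c + d) / 2)  # diagonal difference
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


if __name__ == "__main__":
    # A tiny frame with a vertical edge between its two columns.
    for name, band in zip(("ll", "lh", "hl", "hh"), haar_dwt2([[0, 1], [0, 1]])):
        print(name, band)
```

The three detail sub-bands carry the edge and texture information that plain downsampling would discard. A wavelet-based CNN could take them as extra input channels alongside the approximation band, which is one plausible reading of "finer visual details" here.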