A multimodal framework for video caption generation

HIGHLIGHTS

  • who: Reshmi S. Bhooshan and colleagues from (UNIVERSITY) published the article "A Multimodal Framework for Video Caption Generation" in the journal (JOURNAL).
  • what: This work proposes a video caption generation framework that combines a discrete wavelet convolutional neural architecture with multimodal feature attention. The deep neural network uses a Discrete Wavelet Transform (DWT) based CNN to extract finer visual details from the video frames, which enables better video caption generation. Of this analysis and a comparative study with the state-of-the-art video . . .
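
To make the DWT step concrete, here is a minimal sketch of a single-level 2D Haar wavelet decomposition of one grayscale frame. It is an illustration only: the summary does not say which wavelet the authors use or how the sub-bands enter the CNN, so the Haar filter, the function name `haar_dwt2`, and the nested-list frame format are all assumptions.

```python
import math


def haar_dwt2(frame):
    """Single-level 2D Haar DWT of a grayscale frame (hypothetical helper).

    `frame` is a list of rows of numbers with even height and width.
    Returns four half-resolution sub-bands (ll, lh, hl, hh):
    ll is the coarse approximation, the other three hold detail
    coefficients that capture fine edges and texture.
    """
    rows, cols = len(frame), len(frame[0])
    if rows % 2 or cols % 2:
        raise ValueError("frame height and width must be even")
    s = math.sqrt(2.0)  # orthonormal Haar scaling

    # Row pass: pairwise sum (low-pass) and difference (high-pass) along the width.
    lo = [[(r[j] + r[j + 1]) / s for j in range(0, cols, 2)] for r in frame]
    hi = [[(r[j] - r[j + 1]) / s for j in range(0, cols, 2)] for r in frame]

    def column_pass(band, sign):
        # Pairwise sum (sign=+1) or difference (sign=-1) along the height.
        return [
            [(band[i][j] + sign * band[i + 1][j]) / s for j in range(len(band[0]))]
            for i in range(0, rows, 2)
        ]

    ll = column_pass(lo, +1)  # low-pass rows, low-pass columns
    lh = column_pass(lo, -1)  # low-pass rows, high-pass columns
    hl = column_pass(hi, +1)  # high-pass rows, low-pass columns
    hh = column_pass(hi, -1)  # high-pass rows, high-pass columns
    return ll, lh, hl, hh
```

In a DWT-based CNN, sub-bands like these would typically be stacked as extra input channels or used in place of pooling, so that high-frequency detail is kept rather than discarded during downsampling. Whether the paper does either is not stated in this summary.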

     
