Fine-grained cross-modal semantic consistency in natural conservation image data from a multi-task perspective

By sp_admin / 14th August 2024

HIGHLIGHTS

What: The authors propose momentum encoding from a multi-task perspective as a bridge for information, which improves the quality of the mutual information representation and optimizes the distribution of feature points within the cross-modal shared semantic space. Their goal is not only to retrieve a single image but also to attach an essential description to each retrieved image. To this end, they propose a multi-task model that jointly trains cross-modal image-text retrieval and image captioning. The objective of the paper is to preserve semantic consistency in the context of fine-grained visual . . .
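
The summary does not say how the momentum encoder is updated. As a rough illustration only, the sketch below assumes the common exponential-moving-average rule, in which a slowly changing copy of the encoder weights tracks the trained ones. The function name `momentum_update`, the coefficient `m`, and the toy weights are hypothetical and are not taken from the paper.

```python
from typing import List


def momentum_update(online: List[float], momentum: List[float], m: float = 0.99) -> List[float]:
    """Return EMA-updated momentum weights: m * momentum + (1 - m) * online.

    Assumption: this is the generic moving-average rule, not necessarily
    the update used by the authors.
    """
    if len(online) != len(momentum):
        raise ValueError("parameter lists must have the same length")
    if not 0.0 <= m <= 1.0:
        raise ValueError("m must lie in [0, 1]")
    # Each momentum weight moves a fraction (1 - m) toward its online counterpart.
    return [m * k + (1.0 - m) * q for q, k in zip(online, momentum)]


# Hypothetical toy weights, for illustration only.
online_weights = [1.0, 2.0]
momentum_weights = [0.0, 0.0]
updated = momentum_update(online_weights, momentum_weights, m=0.5)
```

With `m` close to 1 the momentum weights change slowly, which is the usual reason for using such an encoder as a stable source of features shared across tasks.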
