[ICLR 2025] - Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion
image-classification clip image-retrieval vlm ovi text-retrieval multimodal vision-language oti contrastive-learning textual-inversion vision-language-model siglip iclr2025 modality-gap modality-inversion intra-modal inter-modal intra-modal-misalignment visual-inversion
Updated Feb 7, 2025