C-VQA: Counterfactual Reasoning VQA Dataset

This is the code and data for C-VQA.

Dataset

The dataset directory is C-VQA. You can find the questions in .csv files.
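For instance, the question files can be inspected with Python's built-in csv module. Note this is just a sketch: the column names (`image_id`, `question`, `answer`) and the sample row below are hypothetical; check the actual headers of the .csv files in the C-VQA directory.

```python
import csv

# Hypothetical sample in the assumed format; the real C-VQA .csv
# files may use different column names.
sample = """image_id,question,answer
0001.jpg,What if the TV was off?,The room would be dark
"""

with open("sample_questions.csv", "w") as f:
    f.write(sample)

# DictReader maps each row to the header columns.
with open("sample_questions.csv") as f:
    rows = list(csv.DictReader(f))

for row in rows:
    print(row["image_id"], "->", row["question"])
```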

Download Images

After cloning:

pip install gdown
bash download_images.sh

Scripts

The scripts directory contains all scripts required to run the models in the paper.

Before running a script, install the corresponding model and obtain its weights. Then put the script in the root directory of that model.

Please change PATH_TO_IMAGES in the scripts to the actual directory of images.

Please change PATH_TO_MODEL in the scripts for ViperGPT with different code generators to the actual directory of models.
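One way to apply these substitutions is with sed. The paths below are examples only, and `demo_script.py` stands in for one of the repo's actual scripts; substitute your own script name and directories.

```shell
# Create a stand-in script containing the placeholders (for illustration;
# in practice you would edit one of the repo's scripts directly).
echo 'IMAGES = "PATH_TO_IMAGES"' > demo_script.py
echo 'MODEL = "PATH_TO_MODEL"' >> demo_script.py

# Replace the placeholders with your actual directories (example paths).
sed -i 's|PATH_TO_IMAGES|/data/c-vqa/images|g' demo_script.py
sed -i 's|PATH_TO_MODEL|/data/models/code-generator|g' demo_script.py

cat demo_script.py
```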

For example, to run BLIP-2 on C-VQA, run this command in the root directory of LAVIS:

python run_eval_lavis.py --model-name blip2_t5 --model-type pretrain_flant5xxl --query PATH_TO_CSV_FILE

Download Code Generator Models

Change YOUR_HUGGINGFACE_TOKEN in download_model.py to your Hugging Face token. Then run:

pip install huggingface_hub
python download_model.py

You can add more code generators in download_model.py by adding models in repo_ids and local_dirs.
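A minimal sketch of what this paired-list pattern might look like (the actual download_model.py may differ; the model names, local paths, and token below are placeholders): each entry in `repo_ids` is downloaded into the directory at the same index of `local_dirs` via `huggingface_hub.snapshot_download`.

```python
TOKEN = "YOUR_HUGGINGFACE_TOKEN"  # replace with your Hugging Face token

# Add more code generators by appending a repo id and a matching
# local directory at the same index. Entries below are examples.
repo_ids = [
    "codellama/CodeLlama-7b-hf",
]
local_dirs = [
    "models/codellama-7b",
]

def download_all(repo_ids, local_dirs, token):
    # Imported lazily so the lists can be edited without the package installed.
    from huggingface_hub import snapshot_download

    for repo_id, local_dir in zip(repo_ids, local_dirs):
        snapshot_download(repo_id=repo_id, local_dir=local_dir, token=token)

if __name__ == "__main__":
    download_all(repo_ids, local_dirs, TOKEN)
```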

Citation

If this code is useful for your research, please consider citing our work.

@InProceedings{zhang2023cvqa,
    author    = {Zhang, Letian and Zhai, Xiaotong and Zhao, Zhongkai and Wen, Xin and Zhao, Bingchen},
    title     = {What If the TV Was Off? Examining Counterfactual Reasoning Abilities of Multi-Modal Language Models},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    year      = {2023}
}
