Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector (CVPR'2020)

Abstract

Conventional methods for object detection typically require a substantial amount of training data and preparing such high-quality training data is very labor-intensive. In this paper, we propose a novel few-shot object detection network that aims at detecting objects of unseen categories with only a few annotated examples. Central to our method are our Attention-RPN, Multi-Relation Detector and Contrastive Training strategy, which exploit the similarity between the few shot support set and query set to detect novel objects while suppressing false detection in the background. To train our network, we contribute a new dataset that contains 1000 categories of various objects with high-quality annotations. To the best of our knowledge, this is one of the first datasets specifically designed for few-shot object detection. Once our few-shot network is trained, it can detect objects of unseen categories without further training or finetuning. Our method is general and has a wide range of potential applications. We produce a new state-of-the-art performance on different datasets in the few-shot setting. The dataset link is https://github.com/fanq15/Few-Shot-Object-Detection-Dataset.

Citation

@inproceedings{fan2020fsod,
    title={Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector},
    author={Fan, Qi and Zhuo, Wei and Tang, Chi-Keung and Tai, Yu-Wing},
    booktitle={CVPR},
    year={2020}
}

Note: All reported results use the data split released by the official TFA repo. Currently, each setting is evaluated with only one fixed few-shot dataset. Please refer to Data Preparation for more details about the dataset and data preparation.

How to reproduce Attention RPN

Following the original implementation, training consists of two steps:

  • Step1: Base training

    • Use all the images and annotations of the base classes to train a base model.
  • Step2: Few-shot fine-tuning

    • Use the base model from step1 as the model initialization and fine-tune it further on the few-shot datasets.

An example of the VOC split1 1-shot setting with 8 GPUs:

# step1: base training for voc split1
bash ./tools/detection/dist_train.sh \
    configs/detection/attention_rpn/voc/split1/attention-rpn_r50_c4_voc-split1_base-training.py 8

# step2: few shot fine-tuning
bash ./tools/detection/dist_train.sh \
    configs/detection/attention_rpn/voc/split1/attention-rpn_r50_c4_voc-split1_1shot-fine-tuning.py 8

Note:

  • The default output path of the base model in step1 is work_dirs/{BASE TRAINING CONFIG}/latest.pth. If the model is saved to a different path, update the load_from argument in the step2 few-shot fine-tuning config instead of using resume_from.
  • To use a pre-trained checkpoint, set load_from to the path of the downloaded checkpoint.

Results on VOC dataset

Note:

  • The paper does not report experiments on the VOC dataset. Therefore, we use the VOC setting of TFA to evaluate the method.
  • Some implementation details should be noted:
    • The training batch size is 8x2 for all the VOC experiments and 4x2 for all the COCO experiments (following the official repo).
    • Only the roi head is trained during few-shot fine-tuning for the VOC experiments.
    • The number of iterations and the training strategy for the VOC experiments may not be optimal.
  • The performance of base training and of the few-shot setting can be unstable, even with the same random seed. To reproduce the reported few-shot results, it is highly recommended to use the released model for few-shot fine-tuning.
  • Difficult samples are not used in base training or in the few-shot setting.

Base Training

| Arch | Split | Base AP50 | ckpt | log |
| --- | --- | --- | --- | --- |
| r50 c4 | 1 | 71.9 | ckpt | log |
| r50 c4 | 2 | 73.5 | ckpt | log |
| r50 c4 | 3 | 73.4 | ckpt | log |

Few Shot Finetuning

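| Arch | Split | Shot | Novel AP50 | ckpt | log |
| --- | --- | --- | --- | --- | --- |
| r50 c4 | 1 | 1 | 35.0 | ckpt | log |
| r50 c4 | 1 | 2 | 36.0 | ckpt | log |
| r50 c4 | 1 | 3 | 39.1 | ckpt | log |
| r50 c4 | 1 | 5 | 51.7 | ckpt | log |
| r50 c4 | 1 | 10 | 55.7 | ckpt | log |
| r50 c4 | 2 | 1 | 20.8 | ckpt | log |
| r50 c4 | 2 | 2 | 23.4 | ckpt | log |
| r50 c4 | 2 | 3 | 35.9 | ckpt | log |
| r50 c4 | 2 | 5 | 37.0 | ckpt | log |
| r50 c4 | 2 | 10 | 43.3 | ckpt | log |
| r50 c4 | 3 | 1 | 31.9 | ckpt | log |
| r50 c4 | 3 | 2 | 30.8 | ckpt | log |
| r50 c4 | 3 | 3 | 38.2 | ckpt | log |
| r50 c4 | 3 | 5 | 48.9 | ckpt | log |
| r50 c4 | 3 | 10 | 51.6 | ckpt | log |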
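
Few-shot VOC results are often summarized by averaging over the three splits. The sketch below does that for the numbers in the table above. The helper name `mean_ap50_per_shot` is hypothetical, and the averages are a derived convenience, not figures reported by the authors.

```python
from collections import defaultdict

# (split, shot, novel AP50) rows copied from the few-shot fine-tuning table above.
RESULTS = [
    (1, 1, 35.0), (1, 2, 36.0), (1, 3, 39.1), (1, 5, 51.7), (1, 10, 55.7),
    (2, 1, 20.8), (2, 2, 23.4), (2, 3, 35.9), (2, 5, 37.0), (2, 10, 43.3),
    (3, 1, 31.9), (3, 2, 30.8), (3, 3, 38.2), (3, 5, 48.9), (3, 10, 51.6),
]


def mean_ap50_per_shot(rows):
    """Average novel AP50 over the splits, grouped by shot count."""
    by_shot = defaultdict(list)
    for _split, shot, ap50 in rows:
        by_shot[shot].append(ap50)
    return {shot: sum(vals) / len(vals) for shot, vals in sorted(by_shot.items())}


for shot, mean in mean_ap50_per_shot(RESULTS).items():
    print(f"{shot:>2}-shot: mean novel AP50 over splits = {mean:.1f}")
```

Because each setting is evaluated on a single fixed few-shot dataset, these means say nothing about run-to-run variance.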

Results on COCO dataset

Note:

  • Following the original implementation, the training batch size is 4x2 for all the COCO experiments.
  • The official implementation uses a different COCO data split from TFA, and we report the results for both settings. To reproduce the results with the official data split (coco 17), please refer to Data Preparation for more details about data preparation.
  • The performance of base training and of the few-shot setting can be unstable, even with the same random seed. To reproduce the reported few-shot results, it is highly recommended to use the released model for few-shot fine-tuning.
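
The batch size notation above can be made concrete with a little arithmetic. The sketch below reads "4x2" as 4 GPUs times 2 samples per GPU, which is the usual OpenMMLab convention and an assumption here, and computes how many samples each GPU would need in order to keep the same total batch size on a different number of GPUs. The helper names are hypothetical.

```python
from typing import Tuple


def parse_batch_spec(spec: str) -> Tuple[int, int]:
    """Split a spec such as '4x2' into (num_gpus, samples_per_gpu)."""
    gpus, per_gpu = spec.lower().split("x")
    return int(gpus), int(per_gpu)


def samples_per_gpu_for(spec: str, available_gpus: int) -> int:
    """Samples per GPU needed to keep the total batch size of `spec`."""
    gpus, per_gpu = parse_batch_spec(spec)
    total = gpus * per_gpu
    if available_gpus <= 0 or total % available_gpus != 0:
        raise ValueError(
            f"total batch size {total} cannot be split evenly over "
            f"{available_gpus} GPU(s)"
        )
    return total // available_gpus


# COCO setting (4x2): total batch size 8.
print(samples_per_gpu_for("4x2", 4))
print(samples_per_gpu_for("4x2", 2))
# VOC setting (8x2): total batch size 16, here spread over 4 GPUs.
print(samples_per_gpu_for("8x2", 4))
```

Keeping the total batch size fixed is the conservative choice when the GPU count differs from the reference setting. Given the instability noted above, matching the reported numbers is still not guaranteed.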

Base Training

| Arch | Data source | Base mAP | ckpt | log |
| --- | --- | --- | --- | --- |
| r50 c4 | TFA | 23.6 | ckpt | log |
| r50 c4 | official repo | 24.0 | ckpt | log |

Few Shot Finetuning

| Arch | Data source | Shot | Novel mAP | ckpt | log |
| --- | --- | --- | --- | --- | --- |
| r50 c4 | TFA | 10 | 9.2 | ckpt | log |
| r50 c4 | TFA | 30 | 14.8 | ckpt | log |
| r50 c4 | official repo | 10 | 11.6 | ckpt | log |