Merge pull request #19 from songtianhui/main
Support Remove Stage
songtianhui authored Jan 26, 2024
2 parents a45cc9e + 91a0916 commit 94ca2e6
Showing 30 changed files with 478 additions and 407 deletions.
232 changes: 117 additions & 115 deletions README.md
@@ -1,115 +1,117 @@
# MixFormerV2
The official implementation of the NeurIPS 2023 paper: [**MixFormerV2: Efficient Fully Transformer Tracking**](https://arxiv.org/abs/2305.15896).

## Model Framework
![model](tracking/model.png)

## Distillation Training Pipeline
![training](tracking/training.png)


## News

- **[Sep 22, 2023]** MixFormerV2 is accepted by **NeurIPS 2023**! :tada:

- **[May 31, 2023]** We released two versions of the pretrained model, which can be accessed on [Google Drive](https://drive.google.com/drive/folders/1soQMZyvIcY7YrYrGdk6MCstTPlMXNd30?usp=sharing).

- **[May 26, 2023]** Code is available now!


## Highlights

### :sparkles: Efficient Fully Transformer Tracking Framework

MixFormerV2 is a unified, fully transformer-based tracking model without any dense convolutional operations or complex score prediction modules. We propose four special prediction tokens to capture the correlation between the target template and the search area.
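
As a rough, hedged illustration of this idea (a minimal sketch under assumed names and shapes, not the repository's actual module), the prediction tokens can be viewed as learnable embeddings concatenated with the template and search tokens, with a lightweight MLP head reading box coordinates from their outputs:

``` python
import torch
import torch.nn as nn

class PredictionTokenSketch(nn.Module):
    """Toy illustration of token-based box prediction (not the official code)."""

    def __init__(self, dim=768, num_pred_tokens=4, depth=4, num_heads=12):
        super().__init__()
        # Four learnable prediction tokens, one per box coordinate in this sketch.
        self.pred_tokens = nn.Parameter(torch.zeros(1, num_pred_tokens, dim))
        layer = nn.TransformerEncoderLayer(dim, num_heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, depth)
        self.box_head = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, template_tokens, search_tokens):
        # template_tokens: (B, N_t, C), search_tokens: (B, N_s, C)
        b = search_tokens.size(0)
        pred = self.pred_tokens.expand(b, -1, -1)
        # Mixed attention over template, search, and prediction tokens in one sequence.
        x = self.backbone(torch.cat([template_tokens, search_tokens, pred], dim=1))
        pred_out = x[:, -pred.size(1):]             # outputs of the prediction tokens
        box = self.box_head(pred_out).squeeze(-1)   # (B, 4) normalized coordinates
        return box.sigmoid()
```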

### :sparkles: A New Distillation-based Model Reduction Paradigm

To further improve efficiency, we present a new distillation-based model reduction paradigm for tracking, consisting of a dense-to-sparse stage and a deep-to-shallow stage.
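
A hedged sketch of what such a two-part distillation objective could look like in PyTorch is given below; the KL/MSE choices, the temperature, and the layer pairing are assumptions made for the example, not the paper's exact formulation:

``` python
import torch
import torch.nn.functional as F

def distillation_losses(student_box_logits, teacher_box_probs,
                        student_feats, teacher_feats,
                        temperature=2.0, feat_weight=1.0):
    """Toy two-part distillation objective (illustrative only).

    Dense-to-sparse: the student's sparse, token-based coordinate logits are
    pushed towards the teacher's dense-head coordinate distributions.
    Deep-to-shallow: selected student layers mimic selected teacher layers.
    """
    # Dense-to-sparse: KL divergence between coordinate distributions.
    # teacher_box_probs is assumed to already be a probability distribution.
    log_p_student = F.log_softmax(student_box_logits / temperature, dim=-1)
    dense_to_sparse = F.kl_div(log_p_student, teacher_box_probs, reduction="batchmean")

    # Deep-to-shallow: feature mimicking between matched layers
    # (the layer pairing is an arbitrary example).
    deep_to_shallow = sum(
        F.mse_loss(s, t) for s, t in zip(student_feats, teacher_feats)
    ) / max(len(student_feats), 1)

    return dense_to_sparse + feat_weight * deep_to_shallow
```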

### :sparkles: Strong Performance and Fast Inference Speed

MixFormerV2 performs well on multiple benchmarks, achieving **70.6%** AUC on LaSOT and **57.4%** AUC on TNL2k while running at 165 FPS on a GPU. To the best of our knowledge, MixFormerV2-S is the **first** transformer-based one-stream tracker that achieves real-time speed on a CPU.


## Install the environment
Use Anaconda:
``` bash
conda create -n mixformer2 python=3.6
conda activate mixformer2
bash install_requirements.sh
```

## Data Preparation
Put the tracking datasets in `./data`. The directory should look like this:
```
${MixFormerV2_ROOT}
-- data
-- lasot
|-- airplane
|-- basketball
|-- bear
...
-- got10k
|-- test
|-- train
|-- val
-- coco
|-- annotations
|-- train2017
-- trackingnet
|-- TRAIN_0
|-- TRAIN_1
...
|-- TRAIN_11
|-- TEST
```

## Set project paths
Run the following command to set the paths for this project:
```
python tracking/create_default_local_file.py --workspace_dir . --data_dir ./data --save_dir .
```
After running this command, you can also modify the paths by editing these two files:
```
lib/train/admin/local.py # paths about training
lib/test/evaluation/local.py # paths about testing
```
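
The snippet below is only a hypothetical illustration of the kind of path settings these files hold; the actual variable and attribute names are whatever `create_default_local_file.py` generates, so edit the real files rather than copying this:

``` python
# Hypothetical sketch of the path settings kept in the generated local.py files.
# These names are illustrative only.
class LocalPathsSketch:
    def __init__(self):
        self.workspace_dir = '.'        # --workspace_dir
        self.data_dir = './data'        # --data_dir (holds lasot/, got10k/, coco/, trackingnet/)
        self.save_dir = '.'             # --save_dir (checkpoints, logs, results)
        self.lasot_path = './data/lasot'
        self.got10k_path = './data/got10k'
        self.coco_path = './data/coco'
        self.trackingnet_path = './data/trackingnet'
```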

## Train MixFormerV2

Training runs on multiple GPUs with DDP.
You can follow the instructions (currently in Chinese) in [training.md](tutorials/training_zh.md).
Example scripts can be found in `tracking/train_mixformer.sh`.

``` bash
bash tracking/train_mixformer.sh
```

## Test and evaluate MixFormerV2 on benchmarks
- LaSOT/GOT10k-test/TrackingNet/OTB100/UAV123/TNL2k. More details of test settings can be found in `tracking/test_mixformer.sh`.

``` bash
bash tracking/test_mixformer.sh
```


## TODO
- [x] Progressive eliminating version of training.
- [ ] Fast version of test forwarding.

## Contact
Tianhui Song: 191098194@smail.nju.edu.cn

Yutao Cui: cuiyutao@smail.nju.edu.cn


## Citation
``` bibtex
@misc{mixformerv2,
      title={MixFormerV2: Efficient Fully Transformer Tracking},
      author={Yutao Cui and Tianhui Song and Gangshan Wu and Limin Wang},
      year={2023},
      eprint={2305.15896},
      archivePrefix={arXiv}
}
```
3 changes: 2 additions & 1 deletion experiments/mixformer2_vit/student_288_depth12.yaml
@@ -44,7 +44,7 @@ MODEL:
DEPTH: 12
MLP_RATIO: 4
PRETRAINED: True
PRETRAINED_PATH: './models/mae_pretrain_vit_base.pth' #'/data0/cyt/experiments/trackmae/models/mae_pretrain_vit_base.pth'
PRETRAINED_PATH: './models/mae_pretrain_vit_base.pth'
HEAD_TYPE: MLP
HIDDEN_DIM: 768
PREDICT_MASK: false
@@ -68,6 +68,7 @@ TRAIN:
DECAY_RATE: 400
VAL_EPOCH_INTERVAL: 5
WEIGHT_DECAY: 0.0001
FIND_UNUSED_PARAMETERS: false
TEST:
EPOCH: 500
SEARCH_FACTOR: 4.5
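
The newly added `FIND_UNUSED_PARAMETERS` option presumably toggles PyTorch DDP's `find_unused_parameters` argument, which matters once some parameters (for example, blocks removed during the eliminating stage) take no part in a forward pass. A minimal sketch of how such a flag is typically forwarded to `DistributedDataParallel`, assuming a yacs-style `cfg` object (not the repository's actual training code):

``` python
import torch
from torch.nn.parallel import DistributedDataParallel as DDP

def wrap_model_for_ddp(model, cfg, local_rank):
    """Illustrative only: how a FIND_UNUSED_PARAMETERS-style flag is usually
    forwarded to DistributedDataParallel. The cfg access pattern is assumed."""
    find_unused = bool(getattr(cfg.TRAIN, "FIND_UNUSED_PARAMETERS", False))
    model = model.cuda(local_rank)
    # find_unused_parameters=True lets DDP skip gradient reduction for parameters
    # that did not participate in the forward pass (at some extra overhead).
    return DDP(model, device_ids=[local_rank], find_unused_parameters=find_unused)
```

Leaving the flag at `false` avoids the extra graph traversal overhead when every parameter receives gradients.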
73 changes: 1 addition & 72 deletions experiments/mixformer2_vit/teacher_288_depth12.yaml
@@ -1,42 +1,3 @@
DATA:
MAX_SAMPLE_INTERVAL: 200
MEAN:
- 0.485
- 0.456
- 0.406
SEARCH:
CENTER_JITTER: 4.5
FACTOR: 5.0 #4.5
SCALE_JITTER: 0.5
SIZE: 288
STD:
- 0.229
- 0.224
- 0.225
TEMPLATE:
CENTER_JITTER: 0
FACTOR: 2.0
SCALE_JITTER: 0
SIZE: 128
NUMBER: 2
TRAIN:
DATASETS_NAME:
- GOT10K_vottrain
- LASOT
- COCO17
- TRACKINGNET
DATASETS_RATIO:
- 1
- 1
- 1
- 1
SAMPLE_PER_EPOCH: 60000
VAL:
DATASETS_NAME:
- GOT10K_votval
DATASETS_RATIO:
- 1
SAMPLE_PER_EPOCH: 10000
MODEL:
VIT_TYPE: base_patch16
FEAT_SZ: 72
@@ -47,36 +8,4 @@ MODEL:
PRETRAINED_PATH: './models/mae_pretrain_vit_base.pth' #'/data0/cyt/experiments/trackmae/models/mae_pretrain_vit_base.pth'
HEAD_TYPE: MLP
HIDDEN_DIM: 768
PREDICT_MASK: false
TRAIN:
BACKBONE_MULTIPLIER: 0.1
BATCH_SIZE: 2 # 8 for 2080ti (maybe 10), 32 for tesla V100(32 G)
DEEP_SUPERVISION: false
EPOCH: 500
IOU_WEIGHT: 2.0
GRAD_CLIP_NORM: 0.1
L1_WEIGHT: 5.0
CORNER_WEIGHT: 5.0
FEAT_WEIGHT: 0.0
LR: 0.0004
LR_DROP_EPOCH: 400
NUM_WORKER: 8
OPTIMIZER: ADAMW
PRINT_INTERVAL: 50
SCHEDULER:
TYPE: step
DECAY_RATE: 400
VAL_EPOCH_INTERVAL: 5
WEIGHT_DECAY: 0.0001
TEST:
EPOCH: 500
SEARCH_FACTOR: 4.5
SEARCH_SIZE: 288
TEMPLATE_FACTOR: 2.0
TEMPLATE_SIZE: 128
UPDATE_INTERVALS:
LASOT: [200]
GOT10K_TEST: [200]
TRACKINGNET: [25]
VOT20: [10]
VOT20LT: [200]
PREDICT_MASK: False
@@ -51,7 +51,7 @@ MODEL:
HIDDEN_DIM: 768
FEAT_SZ: 96
PREDICT_MASK: false
PRETRAINED_STAGE1: True
PRETRAINED_STATIC: True
TRAIN:
BACKBONE_MULTIPLIER: 0.1
BATCH_SIZE: 32 # 8 for 2080ti (maybe 10), 32 for tesla V100(32 G)
2 changes: 1 addition & 1 deletion experiments/mixformer2_vit_online/288_depth8_score.yaml
@@ -51,7 +51,7 @@ MODEL:
HIDDEN_DIM: 768
FEAT_SZ: 96
PREDICT_MASK: false
PRETRAINED_STAGE1: True
PRETRAINED_STATIC: True
TRAIN:
BACKBONE_MULTIPLIER: 0.1
BATCH_SIZE: 32 # 8 for 2080ti (maybe 10), 32 for tesla V100(32 G)