Self-Supervised Learning of Pose-Informed Latents
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
pip install -e .
Two dataloaders are provided for Objectron: a "file-based" one, used for prototyping on machines with a fast SSD, and an "hdf5"-based one, used for the formal experiments.
The HDF5 dataloader expects a pre-processed HDF5 file that contains the full-frame images and annotations in the same package.
To generate the required HDF5 file, you need to have downloaded the Objectron tf.records
dataset, and use the simpose/data/objectron/
For the file-based dataloader, bounding box crops are taken for each object instance in the Objectron raw video sequences. The "raw" Objectron annotations are used directly, without further processing. This dataloader allows reducing the original 2 TB dataset to a more tractable 2 GB dataset for quick experiment cycles.
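The cropping step described above can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes each object instance comes with 2D keypoints normalized to [0, 1], and the function name `crop_box_from_keypoints` and its `margin` parameter are hypothetical.

```python
from typing import Iterable, Tuple


def crop_box_from_keypoints(
    keypoints: Iterable[Tuple[float, float]],
    width: int,
    height: int,
    margin: float = 0.0,
) -> Tuple[int, int, int, int]:
    """Return a (left, top, right, bottom) pixel box enclosing the keypoints.

    Assumption: keypoints are (x, y) pairs normalized to [0, 1].
    The box is clamped to the image bounds.
    """
    xs, ys = zip(*keypoints)
    # Optionally expand the tight box by a fraction of its size on each side.
    pad_x = (max(xs) - min(xs)) * margin
    pad_y = (max(ys) - min(ys)) * margin
    left = max(0, int((min(xs) - pad_x) * width))
    top = max(0, int((min(ys) - pad_y) * height))
    # The +1 makes the right/bottom edges exclusive, as in array slicing.
    right = min(width, int((max(xs) + pad_x) * width) + 1)
    bottom = min(height, int((max(ys) + pad_y) * height) + 1)
    return left, top, right, bottom
```

With a frame held as a NumPy-style array, the crop would then be `frame[top:bottom, left:right]`.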
TODO: Add UCF-101 notebooks.
Assuming you have the prepared HDF5 data in the ~/datasets/objectron folder, you can start pre-training with the following command:
python siampose/ --data=~/datasets/objectron --output=output --config=configs/config-pretrain-8gb.yaml
During pre-training, category prediction accuracy is used as a proxy for model quality.
To evaluate on the zero-shot pose estimation task, you must first generate the embeddings using the main program.
python siampose/ --data=~/datasets/objectron \
--output=output/pretrain_224 \
This will generate embeddings for all images in the training and validation sets. Take care to use the same split as in training; otherwise training data will leak into the evaluation and inflate the results.
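One common way to keep a split identical across runs is to derive it deterministically from a stable identifier rather than from a random shuffle. The sketch below illustrates that idea only; it is not how this repository builds its split, and `assign_split`, the `val_fraction` parameter, and the use of a per-sequence ID are all assumptions.

```python
import hashlib


def assign_split(sequence_id: str, val_fraction: float = 0.2) -> str:
    """Deterministically map a sequence ID to 'train' or 'val'.

    The same ID always lands in the same split, regardless of run,
    machine, or the order in which sequences are listed.
    """
    digest = hashlib.sha256(sequence_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned 64-bit integer bucket.
    bucket = int.from_bytes(digest[:8], "big")
    return "val" if bucket < val_fraction * 2**64 else "train"
```

Splitting by sequence rather than by frame also keeps near-duplicate frames of one video from ending up on both sides of the split.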
Once the embeddings are generated, the evaluation script can be launched.
python siampose/ output/pretrain_224 --subset_size=5000 --cpu
3D IoU @ 50% precision will be reported for each individual category in Objectron.
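As a rough illustration of the reported metric, the sketch below takes one plausible reading of it: the fraction of predictions, per category, whose 3D IoU with the ground-truth box is at least 0.5. The evaluation script's exact definition may differ, the function name `precision_at_iou` is hypothetical, and the IoU values are assumed to be computed elsewhere.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def precision_at_iou(
    results: Iterable[Tuple[str, float]],
    threshold: float = 0.5,
) -> Dict[str, float]:
    """Per-category fraction of predictions with IoU >= threshold.

    `results` holds (category, iou) pairs, one per evaluated prediction.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for category, iou in results:
        totals[category] += 1
        if iou >= threshold:
            hits[category] += 1
    return {category: hits[category] / totals[category] for category in totals}
```

For example, `precision_at_iou([("chair", 0.7), ("chair", 0.3), ("cup", 0.5)])` gives 0.5 for "chair" and 1.0 for "cup".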
The qualitative evaluation notebooks can be found in the "notebooks" folder.