Multi-label image annotator trained on a subset of the Corel-5k dataset. I manually labeled the images using labelImg. For test images, a text file is generated that annotates each image with the most significant objects present in it.
Results
- Image Files: Both training and validation data are in the images folder.
- Annotation XMLs: Manually annotated XML files are in the annotations/xmls folder. annotations/train.txt contains the training image names and annotations/test.txt contains the validation image names. I wrote this script to map each label to an integer id. This mapping is written in annotations/label_map.pbtxt.
- Inference: For testing purposes, images are kept in the test_images folder. After the program finishes running, annotated images are generated in the output/test_images folder.
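The label-mapping script itself is not reproduced here, so the following is only a minimal sketch of how such a mapping could be built. It assumes the XML files are in the Pascal VOC layout that labelImg writes (one `<object><name>` element per labeled box) and that ids start at 1, since the Object Detection API reserves 0 for the background class. The function names `collect_labels` and `write_label_map` are hypothetical.

```python
import glob
import os
import xml.etree.ElementTree as ET


def collect_labels(xml_dir):
    """Return the sorted unique object names found in the XML files of xml_dir."""
    labels = set()
    for xml_path in glob.glob(os.path.join(xml_dir, "*.xml")):
        root = ET.parse(xml_path).getroot()
        for obj in root.findall("object"):
            name = obj.findtext("name")
            if name:
                labels.add(name.strip())
    return sorted(labels)


def write_label_map(labels, out_path):
    """Write one item block per label, with ids starting at 1 (0 is background)."""
    with open(out_path, "w") as f:
        for idx, name in enumerate(labels, start=1):
            f.write("item {\n  id: %d\n  name: '%s'\n}\n\n" % (idx, name))
```

With the folder layout above, it could be called as `write_label_map(collect_labels("annotations/xmls"), "annotations/label_map.pbtxt")`. Sorting the labels keeps the ids stable between runs.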
First, with Python and pip installed, install the script's requirements:
pip install -r requirements.txt
Then you must compile the Protobuf libraries:
protoc object_detection/protos/*.proto --python_out=.
Add models and models/slim to your PYTHONPATH:
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
Note: This must be run every time you open a terminal, or added to your ~/.bashrc file.
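A quick way to confirm that the export took effect is to check whether the relevant packages can be found from Python. This is a small sketch, not part of the repo; it assumes the two top-level packages exposed by the export are `object_detection` and `nets` (the latter living under slim), and the helper name `missing_modules` is hypothetical.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed package names: object_detection (models) and nets (models/slim).
missing = missing_modules(["object_detection", "nets"])
if missing:
    print("Not importable, check PYTHONPATH:", ", ".join(missing))
else:
    print("PYTHONPATH looks correct.")
```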
Run the script:
python object_detection/create_tf_record.py
Once the script finishes running, you will end up with a train.record and a val.record file. This is what we will use to train the model.
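The contents of create_tf_record.py are not shown here, so the following is only a sketch of one step such a script typically performs: reading a labelImg (Pascal VOC style) annotation and converting the pixel box coordinates to the normalized [0, 1] range that the Object Detection API expects in its records. The function name `parse_voc_boxes` and the returned dictionary layout are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def parse_voc_boxes(xml_text):
    """Parse Pascal VOC style XML into a list of boxes with normalized coordinates."""
    root = ET.fromstring(xml_text)
    width = float(root.findtext("size/width"))
    height = float(root.findtext("size/height"))
    boxes = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        boxes.append({
            "label": obj.findtext("name"),
            # Divide pixel coordinates by the image size to normalize them.
            "xmin": float(box.findtext("xmin")) / width,
            "ymin": float(box.findtext("ymin")) / height,
            "xmax": float(box.findtext("xmax")) / width,
            "ymax": float(box.findtext("ymax")) / height,
        })
    return boxes
```

Each returned dictionary corresponds to one labeled object; an image with several objects yields several entries, which is what makes the annotation multi-label.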
Training an object detector from scratch can take days, even when using multiple GPUs! In order to speed up training, we’ll take an object detector trained on a different dataset and reuse some of its parameters to initialize our new model.
I used faster_rcnn_resnet101_coco from the model zoo for the demo.
Extract the files and move all the model.ckpt files to the models home directory.
Run the following script to train the model:
python object_detection/train.py \
--logtostderr \
--train_dir=train \
--pipeline_config_path=faster_rcnn_resnet101.config
When your model is ready depends on your training data: the more data, the more steps you’ll need. You can test your model every ~5k steps to make sure you’re on the right path.
You can find checkpoints for your model in the train folder.
Move the model.ckpt files with the highest number to the root of the repo:
model.ckpt-STEP_NUMBER.data-00000-of-00001
model.ckpt-STEP_NUMBER.index
model.ckpt-STEP_NUMBER.meta
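Picking the highest step by eye is error-prone because file names sort lexicographically (900 sorts after 2500). A small sketch that finds the latest checkpoint prefix numerically, assuming the file naming shown above; the helper name `latest_checkpoint_prefix` is hypothetical.

```python
import os
import re

# Matches names such as model.ckpt-2500.index and captures the step number.
CKPT_INDEX = re.compile(r"^model\.ckpt-(\d+)\.index$")


def latest_checkpoint_prefix(train_dir):
    """Return 'model.ckpt-N' for the largest step N in train_dir, or None if none exist."""
    steps = []
    for filename in os.listdir(train_dir):
        match = CKPT_INDEX.match(filename)
        if match:
            steps.append(int(match.group(1)))
    if not steps:
        return None
    return "model.ckpt-%d" % max(steps)
```

Calling `latest_checkpoint_prefix("train")` gives the prefix shared by the three files to move.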
In order to use the model, you first need to convert the checkpoint files (model.ckpt-STEP_NUMBER.*) into a frozen inference graph by running this command:
python object_detection/export_inference_graph.py \
--input_type image_tensor \
--pipeline_config_path faster_rcnn_resnet101.config \
--trained_checkpoint_prefix model.ckpt-STEP_NUMBER \
--output_directory output_inference_graph
You should see a new output_inference_graph directory with a frozen_inference_graph.pb file.
To test the model, just run the following command:
python object_detection/object_detection_runner.py
It will run your object detection model found at output_inference_graph/frozen_inference_graph.pb on all the images in the test_images directory and output the results in the output/test_images directory.
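How the runner decides which objects are "most significant" is not shown here. One plausible approach, sketched below under that assumption, is to keep only labels whose detection score clears a threshold and list them best first. The 0.5 threshold, the `name: label, label` line format, and both function names are illustrative choices, not the repo's actual behavior.

```python
def significant_labels(classes, scores, id_to_name, min_score=0.5):
    """Return unique label names with a score of at least min_score, highest score first."""
    best = {}
    for class_id, score in zip(classes, scores):
        if score >= min_score:
            name = id_to_name.get(class_id, "unknown")
            # Keep the best score seen for each label so duplicates collapse.
            best[name] = max(score, best.get(name, 0.0))
    return sorted(best, key=lambda name: best[name], reverse=True)


def format_annotation_line(image_name, labels):
    """Format one line of the annotation text file for a single image."""
    return "%s: %s" % (image_name, ", ".join(labels))
```

Here `classes` and `scores` stand for the per-detection class ids and confidences returned by the model, and `id_to_name` is the inverse of the mapping stored in label_map.pbtxt.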
I followed this excellent blog post on how to use the Google Object Detection API with a custom dataset.