This repository contains template code that can be used as a starting point for computer vision projects. All frameworks, libraries, and data sets are open source and publicly available. Some common tasks included here are:
- Image Classification
- Object Detection
- Instance Segmentation
- Gradient-weighted Class Activation Mapping
The Dentex Challenge 2023 aims to provide insights into the effectiveness of AI in dental radiology analysis and its potential to improve dental practice by comparing frameworks that simultaneously detect abnormal teeth, provide dental enumeration, and assign the associated diagnosis on panoramic dental X-rays. The dataset comprises panoramic dental X-rays obtained from three different institutions under standard clinical conditions but with varying equipment and imaging protocols, resulting in diverse image quality that reflects heterogeneous clinical practice. It includes X-rays from patients aged 12 and above, randomly selected from the hospitals' databases while ensuring patient privacy and confidentiality. A detailed description of the data and the annotation protocol can be found on the Dentex Challenge website. The data set is publicly available for download from the Zenodo open-access data repository.
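Annotations for detection tasks like this are commonly distributed in COCO format; the exact Dentex layout is documented on the challenge website, so the sketch below uses a tiny synthetic COCO-style file purely for illustration (the file name, fields, and the single "tooth" category are all made up):

```python
import json
import tempfile
from pathlib import Path

# A tiny synthetic COCO-style annotation file, for illustration only --
# see the Dentex Challenge website for the actual annotation protocol.
sample = {
    "images": [{"id": 1, "file_name": "xray_001.png"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [10, 20, 30, 40]}
    ],
    "categories": [{"id": 1, "name": "tooth"}],
}
path = Path(tempfile.mkdtemp()) / "annotations.json"
path.write_text(json.dumps(sample))

def summarize_annotations(p):
    """Count the top-level entries in a COCO-style annotation file."""
    coco = json.loads(Path(p).read_text())
    return {k: len(coco.get(k, [])) for k in ("images", "annotations", "categories")}

print(summarize_annotations(path))  # -> {'images': 1, 'annotations': 1, 'categories': 1}
```

A quick summary like this is a useful sanity check after downloading a data set, before any training code touches it.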
The most convenient way to get started with this repository is to run the code examples in a Docker container.
The Dockerfile and docker-compose.yml files included in the repository can be used to create a Docker image and run a Docker container, which together provide a reproducible Python development environment for computer vision experimentation. This environment includes Ubuntu 22.04, Python 3.10.12, PyTorch 2.2 and Tensorflow 2.15 with NVIDIA CUDA 12.1, and a Jupyter Lab server, making it well-suited for training and evaluating custom models.
Here's a step-by-step guide on how to use this setup:
- Install Docker on your machine.
- Clone the GitHub project repository to download the contents of the repository:
git clone git@github.com:ccb-hms/computervision.git
- Navigate to the repository's directory: use
cd computervision
to change your current directory to the repository's directory.
- Build the Docker image: use the command
docker compose build
to build a Docker image from the Dockerfile in the current directory. This image includes all the specifications from the Dockerfile, such as Ubuntu 22.04, Python 3.10.12, PyTorch 2.2 and TensorFlow 2.15 with CUDA, and a Jupyter Lab server.
- Run
docker compose up
to start the Docker container based on the configuration in the docker-compose.yml file. This also downloads a TensorFlow 2 image with the TensorBoard server for tracking and visualizing important metrics such as loss and accuracy. The default docker-compose.yml file expects a GPU accelerator and the NVIDIA Container Toolkit installed on the local machine. Without a GPU, training the neural networks in the example notebooks will be extremely slow. However, with the following command, the containers can be run without GPU support:
docker compose -f docker-compose-cpu.yml up
- Access Jupyter Lab: click on the link that starts with
localhost:8888
in the output of the last command.
- Access TensorBoard: open a web browser and go to localhost:6006 to access the TensorBoard server. Real-time visualizations of important metrics will show up once model training is started.
- Data sets and model checkpoints use a
./data
folder inside the root of the repository. The location and the name of this directory are defined by the environment variable DATA_ROOT.
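Inside notebooks, the data directory can then be resolved from this variable. The helper below is an illustrative sketch, not part of the repository; the function name and the ./data fallback are assumptions:

```python
import os
from pathlib import Path

def get_data_root(default: str = "./data") -> Path:
    """Resolve the data directory from the DATA_ROOT environment
    variable, falling back to ./data in the current directory."""
    return Path(os.environ.get("DATA_ROOT", default)).resolve()

data_root = get_data_root()
print(data_root)
```

Reading the location from the environment keeps notebook code identical whether it runs inside the container or on a local installation.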
The NVIDIA Container Toolkit is a set of tools designed to enable GPU-accelerated applications to run within Docker containers. This toolkit facilitates the integration of NVIDIA GPUs with container runtimes, allowing developers and data scientists to harness the power of GPU computing in containerized environments. See the NVIDIA Container Toolkit page for installation instructions.
For installation in a local environment, we use Pipenv to provide a reproducible, isolated application environment. Mac/Windows users should install Pipenv into their main Python environment as instructed. Pipenv is a packaging tool for Python that solves some common problems associated with the typical workflow using pip, virtualenv, and the good old requirements.txt. It combines the functionalities of pip and virtualenv into one tool, providing a smooth and convenient workflow for developers.
Follow the recommendations below for installation on O2, the HPC platform at HMS. For local Linux-based environments, omit the instructions for loading modules.
The notebooks use an environment variable called DATA_ROOT to keep track of the data files. For use with a Docker container, this variable is defined in the Dockerfile as DATA_ROOT=/app/docker. If you do not use Docker, you can either set the DATA_ROOT variable yourself or run the bash script in computervision/bash_scripts/create_env:
cd computervision/bash_scripts
chmod +x ./create_env
source ./create_env
This creates a .env file in the project directory, which is then automatically read by Pipenv when the Jupyter Lab server is started with:
pipenv run jupyter lab
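A .env file holds simple KEY=VALUE lines. The sketch below shows what such a file looks like and how it can be parsed by hand; the DATA_ROOT value is an example, not necessarily what create_env writes:

```python
import tempfile
from pathlib import Path

# Write a minimal .env file; the DATA_ROOT value below is an example,
# not necessarily what the create_env script produces.
env_file = Path(tempfile.mkdtemp()) / ".env"
env_file.write_text("DATA_ROOT=./data\n")

# Pipenv reads the .env file automatically; parsing it by hand:
env = dict(
    line.split("=", 1)
    for line in env_file.read_text().splitlines()
    if line and not line.startswith("#")
)
print(env["DATA_ROOT"])  # -> ./data
```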
O2 is the Linux-based high-performance computing platform at Harvard Medical School. The platform is managed by the Research Computing Group, part of HMS IT, and documented on the O2 documentation website. The cluster does not support Docker at this time. To install this package, including the detectron2 library, follow the instructions to install on O2.
Label Studio is an open-source data labeling tool for labeling, annotating, and exploring many different data types. Additionally, the tool includes a powerful machine learning interface that can be used for new model training, active learning, supervised learning, and many other training techniques.
- Multi-type annotations: Label Studio supports multiple types of annotations, including labeling for audio, video, images, text, and time series data. These annotations can be used for tasks such as object detection, semantic segmentation, and text classification among others.
- Customizable: The label interface can be customized using a configuration API.
- Machine Learning backend: Label Studio allows integration with machine learning models. You can pre-label data using model predictions and then manually adjust the results.
- Data Import and Export: Label Studio supports various data sources for import and export. You can import data from Amazon S3, Google Cloud Storage, or a local file system, and export it in popular formats like COCO, Pascal VOC, or YOLO.
- Collaboration: It supports multiple users, making it suitable for collaborative projects.
- Scalability: Label Studio can be deployed in any environment, be it on a local machine or in a distributed setting, making it a scalable solution.
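To illustrate the configuration API mentioned above, a minimal labeling configuration for bounding-box annotation on images could look like the sketch below. The tag structure follows Label Studio's XML templating format; the label values are made-up examples and would be replaced by the classes of the project at hand:

```xml
<View>
  <!-- The image to annotate; $image refers to a field in the imported task -->
  <Image name="image" value="$image"/>
  <!-- Bounding-box labels drawn on the image -->
  <RectangleLabels name="label" toName="image">
    <Label value="Caries" background="red"/>
    <Label value="Impacted" background="blue"/>
  </RectangleLabels>
</View>
```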
The tool is included in this repository as a submodule. When you clone the main project, the directory that contains the submodule is included by default, but without its files. Those can be fetched when needed:
# Clone the main project if not already done
git clone git@github.com:ccb-hms/computervision.git
# CD into the computervision/label-studio directory
cd computervision/label-studio
# Initialize and fetch the submodule contents
git submodule init
git submodule update
Label Studio can be run as a server application in a Docker container. The process is the same as described above for the main repository.
# CD into the computervision/label-studio directory
cd computervision/label-studio
# Create the Label Studio image
docker compose build
# Run the Label Studio server
docker compose up
Once installed, open a web browser and go to localhost:8080 to access the Label Studio server. For more detail, see the installation instructions.