
UNIMORE AImageLab Zip MONKEY Challenge Solution - ImmunoZip

ImmunoZip Logo

Overview of the MONKEY Challenge: Detection of Inflammation in Kidney Biopsies

The MONKEY (Machine-learning for Optimal detection of iNflammatory cells in the KidnEY) challenge aims to develop automated methods for detecting and classifying inflammatory cells in kidney transplant biopsies. This initiative seeks to enhance the consistency and efficiency of histopathological assessments, particularly in the context of the Banff classification system.

Important

Please read the Monkey Challenge Website and watch the Official Seminar on YouTube.

For a medical background introduction, please read this doc: pathology and background.

Kick-off Webinar

MONKEY Challenge Webinar

Challenge Overview

example challenge

The challenge comprises two primary tasks:

  1. Detection of Mononuclear Leukocytes (MNLs): Identifying mononuclear inflammatory cells in biopsy images.
  2. Classification of Inflammatory Cells: Distinguishing between monocytes and lymphocytes within the detected cells.

Installation & Inference

⚠️ Important

To test inference with the pretrained models included in this repo, we highly suggest using Docker. This way you build a Docker container just as in the challenge, and it is tested with a single input WSI .tif file and the corresponding tissue-mask .tif file.

System Requirements

We recommend a UNIX/Linux machine with at least 32 GB of RAM, an NVIDIA CUDA-compatible GPU with at least 16 GB of VRAM, and 60 GB or more of free disk space.

We installed CUDA 12.1 and used both an HPC cluster with various NVIDIA GPUs (from 16 to 48 GB of VRAM) and a local machine with Docker and the NVIDIA Container Toolkit to test the challenge container.

The Docker algorithm ran without issues on the Grand Challenge platform using the ml.g4dn.xlarge instance, which comprises a single NVIDIA T4 GPU, 4 vCPUs, 16 GiB of memory, and a 125 GB NVMe SSD.
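
As a quick, optional sanity check of the GPU setup before running anything heavy, a short PyTorch snippet like the following (illustrative, not part of the repository) confirms CUDA visibility and reports the available VRAM:

    # Optional GPU sanity check (illustrative, not part of the repository).
    import torch

    if torch.cuda.is_available():
        props = torch.cuda.get_device_properties(0)
        print(f"GPU: {props.name}, VRAM: {props.total_memory / 1024**3:.1f} GiB")
    else:
        print("No CUDA-compatible GPU detected")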

Steps

  1. Download the backbone model weights (VIT-256 & SAM-H) from the CellViT-plus-plus repository here.

  2. Put the CellViT++ backbone weights in the respective folders:

    • CellViT-SAM-H-x40-AMP.pth goes in the docker_inference_grand_challenge/resources/backbones/SAM-H folder.
    • CellViT-256-x40-AMP.pth goes in the docker_inference_grand_challenge/resources/backbones/VIT-256 folder.
  3. The fine-tuned ensemble weights are already provided in the docker_inference_grand_challenge/example_model folder, while additional fallback model weights and the ones used before the ensemble submission are in the docker_inference_grand_challenge/resources/models folder.

  4. Put your WSI .tif image in the docker_inference_grand_challenge/test/input/images/kidney-transplant-biopsy-wsi-pas folder and the WSI tissue-mask .tif file in the docker_inference_grand_challenge/test/input/images/tissue-mask folder.

Option A - Inference with docker (recommended)

  1. Run test_run.sh in the docker_inference_grand_challenge folder. The script builds a Docker container with all the required dependencies and then runs inference via the inference.py entrypoint script.

  2. If successful, 3 JSON files with the predictions will be written to the docker_inference_grand_challenge/test/output folder.

  3. (OPTIONAL) If inference was successful, you can use the save.sh script to save the compressed container for upload to the Grand Challenge website. You can also compress the models inside the docker_inference_grand_challenge/example_model folder to upload them separately on the Grand Challenge platform. You can do that by running:

      tar -czvf ensemble_compressed_models.tar.gz -C /docker_inference_grand_challenge/example_model/ .
    

Option B - Inference using Conda env (less reproducible)

  1. Install the conda env and other requirements:

     conda env create -f environment_verbose.yaml
     conda activate cellvit_env
     pip install -r requirements.txt
    
  2. Re-install the correct/different version of torch, torchvision and torchaudio depending on your CUDA version (we used CUDA 12.1, as in the original CellViT++ repo):

     pip install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 --index-url https://download.pytorch.org/whl/cu121
    
  3. Install ASAP

  4. Install Openslide and WholeSlideData:

     pip install openslide-python openslide-bin
     pip install git+https://github.com/DIAGNijmegen/pathology-whole-slide-data@main
    
  5. Substitute path_to_ASAP_installation with the actual ASAP installation path and path_to_conda_env with the path to your cellvit_env environment, then run the command:

     echo "path_to_ASAP_installation/ASAP/bin" > path_to_conda_env/lib/python3.10/site-packages/asap.pth
    
  6. In the docker_inference_grand_challenge/inference.py script, set the flag on line 226 to False:

    DOCKER_INFERENCE = False # NOTE: Set to False if running locally without Docker
    
  7. Run the docker_inference_grand_challenge/inference.py script.

  8. If successful, 3 JSON files with the predictions will be in the docker_inference_grand_challenge/test_output folder.

Architecture and Inference Pipeline

Note

The documentation for this repository is still being written and is not yet complete. Thank you for your patience.

Our approach is based on the state-of-the-art CellViT-plus-plus framework, which leverages a pre-trained foundational model backbone for nuclei detection, segmentation, and classification in whole slide images (WSIs). We enhance the system by fine-tuning a multi-layer perceptron (MLP) classifier to assign one of three classes to every detected nucleus: monocytes, lymphocytes, and an additional "other" class. The "other" class is generated semi-automatically using the CellViT SAM-H model, augmenting the training dataset for the MONKEY challenge.


Detection and Classification Workflow

1. WSI Patchification

  • We create a custom patchified dataset from the input WSI using the Whole Slide Data library.
  • The patches and their associated region-of-interest (ROI) masks are stored in a temporary folder.
  • This step ensures efficient processing of extremely large WSIs.
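
As an illustration of this step, the sketch below extracts tiles with OpenSlide (which the Conda setup installs); the actual pipeline uses the Whole Slide Data library, and the tile size, output folder, and file naming here are placeholder assumptions.

    # Illustrative patchification sketch using OpenSlide; the real pipeline uses
    # the Whole Slide Data library. Tile size and paths are placeholder values.
    import os
    import openslide

    def patchify(wsi_path, out_dir, tile=1024):
        os.makedirs(out_dir, exist_ok=True)
        slide = openslide.OpenSlide(wsi_path)
        width, height = slide.dimensions                      # level-0 size in pixels
        for y in range(0, height, tile):
            for x in range(0, width, tile):
                patch = slide.read_region((x, y), 0, (tile, tile)).convert("RGB")
                # keep the global (x, y) offset in the file name so detections can
                # later be mapped back to WSI coordinates
                patch.save(os.path.join(out_dir, f"patch_x{x}_y{y}.png"))
        slide.close()

    # patchify("kidney_biopsy.tif", "tmp_patches")            # placeholder paths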

2. Nuclei Detection and Embedding Extraction

  • The pre-trained CellViT-plus-plus backbone is used to detect nuclei in the patches.
  • The model also extracts high-quality feature embeddings for each detected nucleus.
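
The sketch below only illustrates the data flow of this step: `backbone` is a hypothetical stand-in for the CellViT++ model, assumed here to return per-patch nucleus centroids and embeddings; it is not the actual CellViT++ API.

    # Data-flow sketch only: `backbone` is a hypothetical stand-in for the
    # CellViT++ model, assumed to return nucleus centroids and embeddings.
    import torch

    def extract_nuclei(backbone, patches):
        """Collect (global centroid, embedding) pairs from (patch, offset) tuples."""
        detections = []
        with torch.no_grad():
            for patch, (x_off, y_off) in patches:
                centroids, embeddings = backbone(patch)        # hypothetical interface
                for (cx, cy), emb in zip(centroids.tolist(), embeddings):
                    # shift patch-local coordinates to global WSI coordinates
                    detections.append(((cx + x_off, cy + y_off), emb))
        return detections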

3. MLP Classifier Fine-tuning

  • A multi-layer perceptron (MLP) classifier is fine-tuned on a custom dataset composed of extracted nuclei embeddings.
  • Since the foundational model detects all nuclei, we generate a third annotated class ("other") using CellViT SAM-H.
  • This additional class improves classifier robustness on the MONKEY challenge dataset.
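
As a sketch of what such a classifier head can look like, the snippet below defines a small PyTorch MLP over fixed-size nucleus embeddings with three output classes (lymphocyte, monocyte, other); the embedding dimension, hidden width, and training hyperparameters are illustrative, not the values used in this repository.

    # Illustrative 3-class MLP head over nucleus embeddings; embedding size,
    # hidden width, and hyperparameters are placeholders, not the repo's settings.
    import torch
    import torch.nn as nn

    class NucleusMLP(nn.Module):
        def __init__(self, embed_dim=1280, hidden=256, num_classes=3):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(embed_dim, hidden),
                nn.ReLU(),
                nn.Dropout(0.2),
                nn.Linear(hidden, num_classes),                # lymphocyte / monocyte / other
            )

        def forward(self, x):
            return self.net(x)

    # Minimal fine-tuning step on frozen-backbone embeddings (dummy data).
    model = NucleusMLP()
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    criterion = nn.CrossEntropyLoss()
    embeddings, labels = torch.randn(8, 1280), torch.randint(0, 3, (8,))
    loss = criterion(model(embeddings), labels)
    loss.backward()
    optimizer.step()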

Inference with Ensemble or Single Model

At inference time, we support two modes:

1. Ensemble Inference

  • Each of the 5-fold models runs independently on the patchified dataset.
  • Global cell prediction dictionaries from each model are merged using a KDTree-based clustering strategy.
  • Within each cluster:
    • Majority voting determines the final predicted class.
    • Averaged probabilities (only from agreeing predictions) reduce variance.
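
A simplified sketch of this merge is shown below using scipy's cKDTree; the merge radius, the per-detection tuple layout, and the single-pass clustering are assumptions for illustration and do not mirror the exact logic in inference.py.

    # Simplified KDTree-based merge: cluster detections within a radius, take a
    # majority vote on the class, and average the probabilities of the agreeing
    # predictions. Radius and data layout are illustrative assumptions.
    from collections import Counter
    import numpy as np
    from scipy.spatial import cKDTree

    def merge_detections(per_model_preds, radius=15.0):
        """per_model_preds: one list per fold of (x, y, class_label, probability)
        tuples in pixel coordinates."""
        flat = [p for preds in per_model_preds for p in preds]
        coords = np.array([(x, y) for x, y, _, _ in flat])
        tree = cKDTree(coords)
        visited, merged = set(), []
        for i in range(len(flat)):
            if i in visited:
                continue
            cluster = tree.query_ball_point(coords[i], r=radius)
            visited.update(cluster)
            labels = [flat[j][2] for j in cluster]
            majority = Counter(labels).most_common(1)[0][0]
            agree = [j for j in cluster if flat[j][2] == majority]
            x_mean, y_mean = coords[agree].mean(axis=0)
            prob_mean = float(np.mean([flat[j][3] for j in agree]))
            merged.append((float(x_mean), float(y_mean), majority, prob_mean))
        return merged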

2. Single Model Inference

  • If only one model is provided, the pipeline runs without ensemble merging.

Postprocessing

1. ROI Filtering

  • Only predictions inside the tissue region (defined by the WSI mask) are retained.

2. Overlapping Detections

  • KDTree-based deduplication removes overlapping predictions within a specified radius.

3. Coordinate Conversion

  • Using the base microns-per-pixel (MPP) resolution, final pixel coordinates are converted to millimeters.
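
As a worked example of the coordinate conversion: a detection at base-level pixel coordinates is mapped to millimeters by multiplying by the microns-per-pixel value and dividing by 1000; the 0.24 µm/px figure below is a typical 40x value used purely for illustration.

    # Pixel-to-millimeter conversion at the base resolution; 0.24 µm/px is a
    # typical 40x MPP value used here only for illustration.
    def px_to_mm(x_px, y_px, mpp):
        """Convert base-level pixel coordinates to millimeters (1 mm = 1000 µm)."""
        return x_px * mpp / 1000.0, y_px * mpp / 1000.0

    x_mm, y_mm = px_to_mm(12_500, 8_200, mpp=0.24)             # -> (3.0, 1.968)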

Annotation Generation

  • The final, filtered detections are parsed into three JSON files, corresponding to the three required classes.
  • These JSON files serve as the final output for evaluation.
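
The sketch below shows one way to serialize a per-class prediction list into a Grand-Challenge-style "multiple points" JSON; the exact keys and output file names expected by the MONKEY evaluation are defined by the challenge, so the schema and the commented file name here are assumptions, not the authoritative format.

    # Sketch of writing one per-class prediction file in a Grand-Challenge-style
    # "multiple points" layout; keys and the file name are illustrative assumptions.
    import json

    def write_points_json(path, name, points_mm):
        """points_mm: list of (x_mm, y_mm, probability) tuples for one class."""
        payload = {
            "name": name,
            "type": "Multiple points",
            "points": [
                {"name": str(i), "point": [x, y, 0.0], "probability": p}
                for i, (x, y, p) in enumerate(points_mm)
            ],
            "version": {"major": 1, "minor": 0},
        }
        with open(path, "w") as f:
            json.dump(payload, f, indent=2)

    # write_points_json("test/output/detected-lymphocytes.json",   # placeholder name
    #                   "lymphocytes", [(3.0, 1.968, 0.91)])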

Example of semi-automatic annotation using the CellViT SAM-H backbone

example_3_classes

Acknowledgments and Citations

This repository makes use of multiple frameworks and models developed by external research teams. We acknowledge their contributions and provide citations for the works that influenced this project.

Referenced Works

  • CellViT & CellViT++

    • Hörst, F., et al. (2024). CellViT: Vision Transformers for precise cell segmentation and classification. Medical Image Analysis, 94, 103143.
      DOI:10.1016/j.media.2024.103143
    • Hörst, F., et al. (2025). CellViT++: Energy-Efficient and Adaptive Cell Segmentation and Classification Using Foundation Models. arXiv.
      DOI:10.48550/ARXIV.2501.05269
  • VIT256 & HIPT

    • Mahmood Lab. HIPT: Hierarchical Image Pyramid Transformer for Histopathology.
      Licensed under Apache 2.0 with Commons Clause.
      Source Repository
  • Segment Anything Model (SAM)

    • Kirillov, A., et al. (2023). Segment Anything. Meta AI Research.
      Licensed under Apache 2.0.
      Source Repository

These works have significantly contributed to the development of our approach in the MONKEY Challenge. If you use this repository, please ensure proper attribution to these sources.