Finetune (#227)
* ft attn and loss

* wip

* loss func works

* ft text on dataset

* ft region

* wip

* working ft region

* added region loss to finetune_region.py

* correct eos token

* clean

* clean

* clean

* clean

* ear

* works with hf safetensors

* Readme
EthanReid authored Feb 6, 2025
1 parent f981200 commit 5dc35df
Showing 8 changed files with 689 additions and 495 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -6,3 +6,5 @@ data
poetry.lock
dist
clients/python/moondream/torch
.vscode/launch.json
wandb
117 changes: 117 additions & 0 deletions moondream/finetune/README.md
@@ -0,0 +1,117 @@
# Finetuning Moondream 2B

This readme will walk you through the process of finetuning the text and region encoders of the Moondream 2B model.

> Make sure to run all commands from the root directory of the project.
## Initial Setup

### Clone and Setup Environment
```bash
git clone https://github.com/vikhyat/moondream
cd moondream
python -m venv .venv
source .venv/bin/activate
```

### Install Dependencies
```bash
# Install base requirements
pip install -r requirements.txt

# Install finetuning specific dependencies
pip install safetensors datasets bitsandbytes tqdm wandb einops
```

## Downloading the Base Model

Download `model.safetensors` from the [Hugging Face repository](https://huggingface.co/vikhyatk/moondream2/tree/main) and place it in the `models` directory as `moondream_base.safetensors`.

```bash
# Create models directory
mkdir -p models

# Download it with wget and save it under the expected name (run from the root moondream directory)
wget -O models/moondream_base.safetensors https://huggingface.co/vikhyatk/moondream2/resolve/main/model.safetensors
```
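A large download can end up truncated or replaced by an HTML error page, so it may be worth checking the file before starting a run. The sketch below reads only the header of a `.safetensors` file using the standard library: the format starts with an 8-byte little-endian length followed by a JSON header that maps tensor names to their metadata. The helper names `read_safetensors_header` and `summarize_checkpoint` are hypothetical and not part of the Moondream repository.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with Path(path).open("rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit header length.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # Next header_len bytes: UTF-8 JSON mapping tensor names to metadata.
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize_checkpoint(path):
    """Return (tensor count, sorted tensor names), skipping __metadata__."""
    header = read_safetensors_header(path)
    names = sorted(key for key in header if key != "__metadata__")
    return len(names), names
```

For example, `summarize_checkpoint("models/moondream_base.safetensors")` should return a non-zero tensor count if the download is a valid checkpoint; a JSON or `struct` error suggests the file is not what you expected.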

## Weights & Biases

We use Weights & Biases (wandb) to track finetuning progress.

To set it up to track your runs, run `wandb login`.

This will take you through creating an account if you don't have one set up already. Enter your API key and you're ready to go.

## Finetuning the Text Encoder

For this example, we will be teaching Moondream to describe images.

Given the prompt:
`\n\nQuestion: Describe this image.\n\nAnswer:`

The finetuned model returns a more detailed caption of the image than you would get from the base model.

1. Double check that you've updated `MODEL_PATH` in `moondream/finetune/finetune_text.py` to point to the base Moondream model.
2. Double check that the save path ends in `.safetensors`, otherwise the run will fail.
> Navigate to line 150 in `moondream/finetune/finetune_text.py`:
```python
# Add save path
save_file(
    model.state_dict(),
    "",  # update this line, e.g. "models/moondream_text_finetuned.safetensors"
)
```
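Since a bad save path makes the run fail, it can be worth validating the path before training starts rather than finding out afterwards. A minimal sketch, assuming you call it yourself near the top of the script; `check_save_path` is a hypothetical helper and not part of the Moondream repository.

```python
from pathlib import Path


def check_save_path(save_path):
    """Validate a checkpoint save path before training starts.

    Hypothetical helper, not part of the Moondream repository.
    """
    if not save_path:
        raise ValueError("save path is empty; set it before starting the run")
    path = Path(save_path)
    if path.suffix != ".safetensors":
        raise ValueError(f"save path must end in .safetensors, got {save_path!r}")
    # Create the parent directory (e.g. models/) so the final save does not
    # fail on a missing folder.
    path.parent.mkdir(parents=True, exist_ok=True)
    return path
```

Calling `check_save_path("models/moondream_text_finetuned.safetensors")` returns the path unchanged as a `Path` object, while an empty string or a different extension raises `ValueError` immediately.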

### Start Text Finetuning
```bash
python -m moondream.finetune.finetune_text
```

The process will output a finetuned version of Moondream into your save path. Example output: `models/moondream_text_finetuned.safetensors`

### Test the Finetuned Text Encoder

You can test the finetuned model's performance with the following command (run from the root moondream directory).

This will return the caption of the image.

```bash
# Remember to update the paths
python -m moondream.torch.sample --model [FINETUNED_MODEL_PATH] --image "[DATASET_DIRECTORY]/test/[IMAGE_NAME]" --prompt "\n\nQuestion: Describe this image.\n\nAnswer:" --endpoint query
```

## Finetuning the Region Encoder

For this example, we will be teaching Moondream to detect railroad cracks in images of a railway.

The dataset trains the model so that, given the prompt:
`\n\nDetect: crack\n\n`

it returns the coordinates of a detected crack in the following format:
```{'objects': [{'x_min': [X_MIN], 'y_min': [Y_MIN], 'x_max': [X_MAX], 'y_max': [Y_MAX]}]}```
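To draw or crop a detection you usually need pixel coordinates. The sketch below converts a result in the format above into integer pixel boxes. It assumes the returned values are fractions of the image width and height in the range 0 to 1; this README does not state that, so check it against your own output first. `to_pixel_boxes` is a hypothetical helper name.

```python
def to_pixel_boxes(result, image_width, image_height):
    """Convert detection output into (x_min, y_min, x_max, y_max) pixel tuples.

    Assumes coordinates are fractions of the image size (0 to 1); this is an
    assumption, not something the README confirms.
    """
    boxes = []
    for obj in result["objects"]:
        boxes.append((
            round(obj["x_min"] * image_width),
            round(obj["y_min"] * image_height),
            round(obj["x_max"] * image_width),
            round(obj["y_max"] * image_height),
        ))
    return boxes
```

An empty `objects` list simply yields an empty list of boxes, which is a convenient way to represent "no crack detected".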

### Setup Dataset Dependencies

1. Visit https://universe.roboflow.com/research-zwl99/railwayvision
2. Download the dataset in COCO JSON format into a relevant directory (e.g. `datasets`).
3. Update the paths to `annotation_file` (line 169) and `img_dir` (line 170) in `finetune_region.py` to point at the dataset:
   - `annotation_file` should point to `<dataset_directory>/train/_annotations.coco.json`
   - `img_dir` should point to `<dataset_directory>/train/`
4. Double check that you've updated `MODEL_PATH` in `moondream/finetune/finetune_region.py` to point to the base Moondream model.
5. Double check that the save path ends in `.safetensors`, otherwise the run will fail.
> Navigate to line 262 in `moondream/finetune/finetune_region.py`:
```python
# Add save path
save_file(
    model.state_dict(),
    "",  # update this line, e.g. "models/moondream_region_finetuned.safetensors"
)
```
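Before training, it can help to sanity check the downloaded annotation file. COCO JSON stores each box as `[x, y, width, height]` in pixels, keyed to an image by `image_id`. The sketch below groups boxes per image file and converts them to corner coordinates divided by the image size. Whether `finetune_region.py` normalizes boxes in the same way is an assumption; `load_coco_boxes` is a hypothetical helper written for illustration only.

```python
import json
from collections import defaultdict


def load_coco_boxes(annotation_file):
    """Map each image file name to a list of normalized corner boxes.

    Hypothetical helper; the normalization is an assumption about what the
    finetuning script expects.
    """
    with open(annotation_file, "r", encoding="utf-8") as f:
        coco = json.load(f)

    images = {img["id"]: img for img in coco["images"]}
    boxes = defaultdict(list)
    for ann in coco["annotations"]:
        img = images[ann["image_id"]]
        x, y, w, h = ann["bbox"]  # COCO: [x, y, width, height] in pixels
        boxes[img["file_name"]].append({
            "x_min": x / img["width"],
            "y_min": y / img["height"],
            "x_max": (x + w) / img["width"],
            "y_max": (y + h) / img["height"],
        })
    return dict(boxes)
```

Images without any annotation do not appear in the returned dictionary, so comparing its size with the number of files in `img_dir` is a quick way to see how many images carry no labelled crack.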

### Start Region Finetuning
```bash
python -m moondream.finetune.finetune_region
```

The process will output a finetuned version of Moondream into your save path. Example output: `models/moondream_region_finetuned.safetensors`
Empty file added moondream/finetune/__init__.py
Empty file.