Configuring and running experiments

Overview

The large number of component variants supported in this repository creates the need for configuring many components and their parameters before running a specific experiment. We rely on features provided by Hydra to make this process easier.

At the core, three main Hydra configs provide the base configuration for the main types of experiments: train.yaml (generic training), eval.yaml (running evaluation), and unlearn.yaml (unlearning training). These are then extended by experiment-specific configs and command-line overrides. To make things easier, we provide experiment configs for common use cases, such as LLaMA-2 unlearning on TOFU and LLaMA-2 evaluation on MUSE, which set the required datasets, models, and base train and eval configs.
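The layering described above (base config, then experiment config, then command-line overrides) can be illustrated with a small sketch. This is not Hydra's implementation: it uses plain dictionaries, does not model config-group selection such as trainer=NPO, and keeps all override values as strings. The dictionaries `base_unlearn` and `experiment` are hypothetical stand-ins for the YAML files, and `deep_merge` and `apply_cli_overrides` are illustrative helper names.

```python
import copy


def deep_merge(base, override):
    """Return a new dict in which values from `override` win over `base`, recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def apply_cli_overrides(config, overrides):
    """Apply dotted key=value strings such as 'trainer.args.learning_rate=5e-5'."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, _, value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value  # kept as a string; Hydra itself parses types
    return result


# Hypothetical, heavily simplified stand-ins for unlearn.yaml and an experiment config.
base_unlearn = {
    "trainer": {"name": "finetune", "args": {"learning_rate": "1e-5"}},
    "task_name": "default",
}
experiment = {"trainer": {"name": "NPO"}, "forget_split": "forget10"}

# Precedence: base config < experiment config < command-line overrides.
config = apply_cli_overrides(
    deep_merge(base_unlearn, experiment),
    ["trainer.args.learning_rate=5e-5", "task_name=demo"],
)
print(config)
```

The point of the sketch is only the ordering: a value set on the command line replaces the one from the experiment config, which in turn replaces the base default.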


Example Commands

## runs finetuning using experiment details from configs/experiment/finetune/tofu/default.yaml
python src/train.py --config-name=train.yaml experiment=finetune/tofu/default

## runs unlearning training using experiment details from configs/experiment/unlearn/tofu/default.yaml
python src/train.py --config-name=unlearn.yaml experiment=unlearn/tofu/default

## runs an evaluation using experiment details from configs/experiment/eval/muse/default.yaml
python src/eval.py --config-name=eval.yaml experiment=eval/muse/default
## Note: eval.yaml is the default config set in src/eval.py, so the --config-name argument can be omitted

## a more fully specified configuration for an unlearning experiment
python src/train.py --config-name=unlearn.yaml experiment=unlearn/muse/default data_split=Books \
trainer=NPO trainer.method_args.retain_loss_type=KL task_name=llama2_books_NPO_KL \
retain_logs_path=saves/eval/muse_books_retain/MUSE_EVAL.json

## an even more extensively filled out configuration for an unlearning experiment
python src/train.py --config-name=unlearn.yaml \
experiment=unlearn/tofu/default.yaml \
task_name=NPO_unlearn_tofu_llama_8 \
model=Llama-3.1-8B-Instruct \
model.model_args.pretrained_model_name_or_path=saves/finetune/path_model_llama \
trainer=NPO trainer.args.per_device_train_batch_size=4 \
forget_split=forget05 retain_split=retain95 \
retain_logs_path=saves/eval/tofu_retain95/TOFU_EVAL.json \
paths.output_dir=saves/unlearn/NPO/evals

Note: Unlearning experiments support evaluation during unlearning training, but only when training runs on a single GPU. When multiple GPUs are used for training, checkpoints must be saved and evaluated after training.


Commonly Overridden Arguments

To understand the structure of an evaluation config and the kind of available parameters for overriding, refer to: configs/experiment/examples/tofu_eval.yaml.

To understand the structure of an unlearning config and the kind of available parameters for overriding, refer to: configs/experiment/examples/muse_unlearn.yaml.

The following tables list the most commonly used arguments while running experiments.


Model Settings

| Argument | Description and examples |
| --- | --- |
| model | Selects the model. Example: model=Llama-2-7b-hf |
| model.model_args.pretrained_model_name_or_path | Specifies the model checkpoint or HuggingFace ID. |
| model.tokenizer_args.pretrained_model_name_or_path | Specifies the tokenizer location. Make sure this matches the model above, providing the model path if needed. |
| model.template_args | Optional chat templating parameters (e.g., start/end tags). Example: apply_chat_template: false, user_start_tag: "[INST] " |

Trainer Settings

| Argument | Description and examples |
| --- | --- |
| trainer | Overall trainer or unlearning method selection, decides the finetuning algorithm. Example: trainer=NPO or trainer=finetune |
| trainer.args | Main training hyperparameters like per_device_train_batch_size, per_device_eval_batch_size, gradient_accumulation_steps, learning_rate, num_train_epochs, optim and other HuggingFace TrainingArguments. |
| trainer.method_args | Method-specific parameters for unlearning trainers. Example: retain_loss_type, NPO hyperparams like gamma, alpha, beta etc. |

Data Settings

| Argument | Description and examples |
| --- | --- |
| data | Overall data configuration/format. Example: data=unlearn, data=finetune. |
| data.forget, data.retain, data.anchor etc. | Set the sub-datasets of the overall dataset using data.forget=MUSE_forget data.retain=MUSE_retain. Set which sub-dataset to index over (others are randomly sampled) using data.anchor=forget. |
| data_split/forget_split/retain_split | These arguments are custom to specific datasets and are used to populate dataset paths. data_split specifies the overall dataset split or type. Example: data_split=News or data_split=Books. forget_split/retain_split select sub-parts of the dataset. Example: forget_split=forget01 retain_split=retain99 |
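The examples in this document pair forget01 with retain99 and forget05 with retain95, and point retain_logs_path at a directory named after the retain split. A minimal sketch of keeping these three overrides consistent is shown below. It assumes the two percentages always sum to 100 and that the logs path follows the pattern seen in the example commands; both are inferred from the examples here rather than confirmed, and `matching_retain_split` and `tofu_split_overrides` are hypothetical helper names.

```python
import re


def matching_retain_split(forget_split: str) -> str:
    """Derive the retain split name, assuming forgetNN pairs with retain(100 - NN)."""
    match = re.fullmatch(r"forget(\d{2})", forget_split)
    if match is None:
        raise ValueError(f"unexpected forget split name: {forget_split!r}")
    forget_pct = int(match.group(1))
    if not 0 < forget_pct < 100:
        raise ValueError(f"forget percentage out of range: {forget_pct}")
    return f"retain{100 - forget_pct:02d}"


def tofu_split_overrides(forget_split: str) -> list:
    """Build consistent split overrides; the logs path pattern is an assumption."""
    retain_split = matching_retain_split(forget_split)
    return [
        f"forget_split={forget_split}",
        f"retain_split={retain_split}",
        f"retain_logs_path=saves/eval/tofu_{retain_split}/TOFU_EVAL.json",
    ]


print(" ".join(tofu_split_overrides("forget05")))
```

Check the dataset's actual split names before relying on the derived values.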

Experiment Settings

| Argument | Description and examples |
| --- | --- |
| task_name | Experiment identifier used to generate custom output paths. Example: task_name=llama2_books_NPO_KL. |
| eval | Overall evaluation benchmark configuration selection. Example: eval=muse. |
| retain_logs_path | Path from which to load the eval logs of retain models, which are used by some evaluation metrics. Example: retain_logs_path=saves/eval/muse_books_retain/MUSE_EVAL.json. |
| paths | Contains attributes used to decide path configuration, like paths.output_dir=$LOCAL_PATH. |
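When many of the arguments listed above are overridden at once, it can help to keep them in one place and generate the command from them. The following is a minimal sketch of that idea, not part of the repository: `build_command` is a hypothetical helper, and the override values are taken from the example commands in this document.

```python
import shlex


def build_command(script, config_name, experiment, overrides):
    """Assemble the argument list for a Hydra run from a dict of key=value overrides."""
    args = ["python", script, f"--config-name={config_name}", f"experiment={experiment}"]
    args += [f"{key}={value}" for key, value in overrides.items()]
    return args


cmd = build_command(
    "src/train.py",
    "unlearn.yaml",
    "unlearn/tofu/default",
    {
        "trainer": "NPO",
        "trainer.args.per_device_train_batch_size": 4,
        "forget_split": "forget05",
        "retain_split": "retain95",
        "task_name": "NPO_unlearn_tofu_llama_8",
    },
)

# shlex.join quotes any argument that would need it in a POSIX shell.
print(shlex.join(cmd))
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting issues for values containing spaces.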

Simple Finetuning

In addition to unlearning-based finetuning, we also support simple finetuning on a given dataset.

These use src/train.py with the train.yaml config to set up a standard supervised training environment. Parameters such as learning rate, batch size, and optimizer settings can be adjusted via experiment-specific configs or command-line overrides.

Example:

python src/train.py --config-name=train.yaml experiment=finetune/tofu/default \
  trainer.args.learning_rate=5e-5 task_name=llama3.2-1B_finetune_example

Distributed Training

Distributed training configurations enable scaling experiments across multiple devices or nodes. In most cases, default distributed settings from configs/accelerate/default_config.yaml are sufficient. You can run distributed training with a default command such as:

CUDA_VISIBLE_DEVICES=0,1 accelerate launch --config_file configs/accelerate/default_config.yaml --main_process_port 18765 \
  src/train.py --config-name=unlearn.yaml experiment=unlearn/muse/default.yaml

Note: Evaluation runs are designed to work only on a single GPU (this includes running evaluation during training). To run an evaluation job, modify your command to make only one GPU visible (assuming one GPU is enough for inference):

CUDA_VISIBLE_DEVICES=0 python src/eval.py experiment=eval/muse/default.yaml
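The single-GPU and multi-GPU rules above can be captured in a small sketch that picks the launcher from the number of GPUs and refuses multi-GPU evaluation. `launch_command` is a hypothetical helper and not part of the repository; it only reuses the flags, config path, and port that appear in the commands in this document, and it assumes evaluation scripts can be recognised by the file name eval.py.

```python
ACCELERATE_CONFIG = "configs/accelerate/default_config.yaml"


def launch_command(gpu_ids, script, hydra_args, port=18765):
    """Return a shell command string that respects the single-GPU evaluation rule."""
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    # Assumption: evaluation entry points are identified by their file name.
    is_eval = script.endswith("eval.py")
    if is_eval and len(gpu_ids) != 1:
        raise ValueError("evaluation runs must use exactly one GPU")

    env = "CUDA_VISIBLE_DEVICES=" + ",".join(str(gpu) for gpu in gpu_ids)
    if len(gpu_ids) > 1:
        # Multi-GPU training goes through accelerate, as in the command above.
        launcher = [
            "accelerate", "launch",
            "--config_file", ACCELERATE_CONFIG,
            "--main_process_port", str(port),
        ]
    else:
        launcher = ["python"]
    return " ".join([env, *launcher, script, *hydra_args])


print(launch_command([0, 1], "src/train.py",
                     ["--config-name=unlearn.yaml", "experiment=unlearn/muse/default"]))
print(launch_command([0], "src/eval.py", ["experiment=eval/muse/default"]))
```

Note that a multi-GPU unlearning run launched this way still needs its checkpoints evaluated afterwards in a separate single-GPU run.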