OpenUnlearning

An easily extensible framework unifying LLM unlearning evaluation benchmarks.


📖 Overview

We provide efficient and streamlined implementations of the TOFU and MUSE unlearning benchmarks while supporting 5 unlearning methods, 3+ datasets, 6+ evaluation metrics, and 7+ LLMs. Each of these components can be easily extended with new variants.

We invite the LLM unlearning community to collaborate by adding new benchmarks, unlearning methods, datasets, and evaluation metrics here to expand OpenUnlearning's features, gather feedback from wider usage, and drive progress in the field.

🗃️ Available Components

We provide several variants for each of the components in the unlearning pipeline.

| Component | Available Options |
|---|---|
| Benchmarks | TOFU, MUSE |
| Unlearning Methods | GradAscent, GradDiff, NPO, SimNPO, DPO |
| Evaluation Metrics | Verbatim Probability, Verbatim ROUGE, QA-ROUGE, MIA Attacks, TruthRatio, Model Utility |
| Datasets | MUSE-News (BBC), MUSE-Books (Harry Potter), TOFU (different splits) |
| Model Families | LLaMA 3.2, LLaMA 3.1, LLaMA-2, Phi-3.5, ICLM (from MUSE), Phi-1.5, Gemma |

⚡ Quickstart

🛠️ Environment Setup

conda create -n unlearning python=3.11
conda activate unlearning
pip install ".[flash-attn]"
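
To sanity-check the install (optional), the one-liners below verify that PyTorch sees a GPU and that flash-attn imports cleanly; they assume a CUDA-capable machine:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import flash_attn"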

💾 Data Setup

Download the log files containing metric results for the models used in the supported benchmarks, including the retain-model logs that unlearned models are compared against.

python setup_data.py # populates saves/eval with evaluation results of the uploaded models
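
If the download succeeded, the populated directory can be inspected directly (the exact sub-folder layout depends on which benchmark and model logs were fetched):

ls saves/eval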

🧪 Running Experiments

We provide an easily configurable interface for running evaluations by leveraging Hydra configs. For more detailed documentation of aspects such as running experiments, commonly overridden arguments, interfacing with configurations, distributed training, and simple finetuning of models, refer to docs/experiments.md.
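
Hydra overrides are passed on the command line as key=value pairs, with dots reaching into nested configs. As a rough illustration, a run might combine a trainer choice with a nested override (trainer.args.learning_rate here is a hypothetical key; check the trainer configs under configs/ for the actual names):

python src/train.py --config-name=unlearn.yaml experiment=unlearn/tofu/default \
  trainer=GradAscent trainer.args.learning_rate=1e-5  # learning_rate key is hypothetical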

🚀 Perform Unlearning

An example command for launching an unlearning process with GradAscent on the TOFU forget10 split:

python src/train.py --config-name=unlearn.yaml experiment=unlearn/tofu/default \
  forget_split=forget10 retain_split=retain90 trainer=GradAscent
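
Any other supported method from the components table can be swapped in via the trainer argument; assuming the trainer config names match the method names listed above (e.g. NPO), the same run with NPO would be:

python src/train.py --config-name=unlearn.yaml experiment=unlearn/tofu/default \
  forget_split=forget10 retain_split=retain90 trainer=NPO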

📊 Perform an Evaluation

An example command for launching a TOFU evaluation on the forget10 split:

python src/eval.py --config-name=eval.yaml experiment=eval/tofu/default \
  model=Llama-3.2-1B-Instruct \
  model.model_args.pretrained_model_name_or_path=open-unlearning/tofu_Llama-3.2-1B-Instruct_full
  • experiment: path to the evaluation configuration, configs/experiment/eval/tofu/default.yaml.
  • model: sets up the model and tokenizer configs for the Llama-3.2-1B-Instruct model.
  • model.model_args.pretrained_model_name_or_path: overrides the default experiment config to evaluate a model given by a HuggingFace ID (a local checkpoint path works as well; see the example after this list).
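
For instance, to evaluate a locally saved checkpoint instead of a HuggingFace model, point the same override at a directory on disk (the path below is purely illustrative):

python src/eval.py --config-name=eval.yaml experiment=eval/tofu/default \
  model=Llama-3.2-1B-Instruct \
  model.model_args.pretrained_model_name_or_path=saves/unlearn/my_unlearned_model  # hypothetical local path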

For more details about creating and running evaluations, refer to docs/evaluation.md.

📜 Running Baseline Experiments

The scripts below execute standard baseline unlearning experiments on the TOFU and MUSE datasets, evaluated using their corresponding benchmarks. The expected results for these are in docs/results.md.

bash scripts/tofu_unlearn.sh
bash scripts/muse_unlearn.sh

➕ How to Add New Components

Adding a new component (trainer, evaluation metric, benchmark, model, or dataset) requires defining a new class, registering it, and creating a configuration file. Learn more about adding new components in docs/components.md.
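
As a purely illustrative sketch of that define-register-configure pattern (the class and helper names below are hypothetical and do not correspond to OpenUnlearning's actual API; see docs/components.md for the real interfaces):

# Hypothetical sketch of a component registry; not OpenUnlearning's real code.
TRAINER_REGISTRY = {}

def register_trainer(name):
    """Register a trainer class under the name referenced by its config."""
    def decorator(cls):
        TRAINER_REGISTRY[name] = cls
        return cls
    return decorator

@register_trainer("MyNewMethod")
class MyNewMethodTrainer:
    def __init__(self, **trainer_args):
        self.trainer_args = trainer_args

    def compute_loss(self, model, inputs):
        # Implement the unlearning objective for the new method here.
        raise NotImplementedError

# A matching config file (e.g. configs/trainer/MyNewMethod.yaml) would then let
# experiments select the new trainer with trainer=MyNewMethod.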

Please feel free to raise a pull request for new features after setting up the environment in development mode:

pip install ".[flash-attn, dev]"

📚 Further Documentation

For more in-depth information on specific aspects of the framework, refer to the following documents:

| Documentation | Contains |
|---|---|
| docs/components.md | Instructions on how to add new components such as trainers, benchmarks, metrics, models, datasets, etc. |
| docs/evaluation.md | Detailed instructions on creating and running evaluation metrics and benchmarks. |
| docs/experiments.md | Guide to running experiments in various configurations and settings, including distributed training, fine-tuning, and overriding arguments. |
| docs/hydra.md | Explanation of the Hydra features used in configuration management for experiments. |
| docs/results.md | Reference results from various unlearning methods run using this framework on the TOFU and MUSE benchmarks. |

🔗 Support & Contributors

Developed and maintained by Vineeth Dorna (@Dornavineeth) and Anmol Mekala (@molereddy).

If you encounter any issues or have questions, feel free to raise an issue in the repository 🛠️.

📝 Citation

This repo is inspired by LLaMA-Factory. We acknowledge the TOFU and MUSE benchmarks, which served as the foundation for our re-implementation.


If you use OpenUnlearning in your research, please cite:

@misc{openunlearning2025,
  title={OpenUnlearning: A Unified Framework for LLM Unlearning Benchmarks},
  author={Dorna, Vineeth and Mekala, Anmol and Maini, Pratyush and Zhao, Wenlong},
  year={2025},
  note={\url{https://github.com/locuslab/open-unlearning}}
}
@inproceedings{maini2024tofu,
  title={TOFU: A Task of Fictitious Unlearning for LLMs},
  author={Maini, Pratyush and Feng, Zhili and Schwarzschild, Avi and Lipton, Zachary Chase and Kolter, J Zico},
  booktitle={First Conference on Language Modeling},
  year={2024}
}

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.
