Skip to content

This repo contains source code for "In-Context Learning with Differentially Private Text Sanitization in Large Language Models" (accepted to the Findings of DSPP 2024 )

Notifications You must be signed in to change notification settings

larryfans/ICL_DP_Text-Sanitization

Repository files navigation

ICL_DP_Text-Sanitization

model_figure

Introduction

This repo contains source code for "In-Context Learning with Differentially Private Text Sanitization in Large Language Models" (accepted to the Findings of DSPP 2024 )

With the increasing popularity of cloud-based large language models, users often inadvertently input text containing personal information during interactions, leading to significant privacy concerns. To address this challenge, we propose an in-context learning(ICL) based on differential privacy(DP) to protect users' instances and context information formally. The core idea entails enhancing local differentially private text sanitization and using token mapping relationships to remap private responses effectively. We conduct the experimental evaluation by comparing our method with Custext, two-shot, and zero-shot. The findings indicate that our method can attain a competitive edge while maintaining robust privacy protections.

Set Environment

  1. In models.py, set you openai key:
openai.api_key = '/api_key/'
openai.base_url = "/base_url/"
  1. Install dependency
pip install -r requirements.txt

The first time of use stopwords from the NLTK package, you need to execute the following code in Python

nltk.download('stopwords')

How to Run Our Code

bash run.sh

ICL Prompt of Summary Task

Prompt_figure

About

This repo contains source code for "In-Context Learning with Differentially Private Text Sanitization in Large Language Models" (accepted to the Findings of DSPP 2024 )

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published