From c0dcc5f0f6ce9df0f62828e1a653032ddd50f13f Mon Sep 17 00:00:00 2001
From: yyy-Apple <50064641+yyy-Apple@users.noreply.github.com>
Date: Wed, 2 Mar 2022 00:31:11 +0800
Subject: [PATCH] update README.md

---
 train/README.md                | 2 +-
 train/reproduce/data/README.md | 7 +++++++
 2 files changed, 8 insertions(+), 1 deletion(-)
 create mode 100644 train/reproduce/data/README.md

diff --git a/train/README.md b/train/README.md
index 0b434cb..51b0d5e 100644
--- a/train/README.md
+++ b/train/README.md
@@ -18,7 +18,7 @@ python bart.py --train_file train.json --validation_file val.json --output_dir m
 Then you can use your custom BARTScore for evaluation.
 
 ## Reproduce
-To reproduce our results, please see the folder [`reproduce`](reproduce). Due to limited computing resources, we sharded the BART into multiple GPUs and trained the model. See [`reproduce/finetune.py`](reproduce/finetune.py) for details.
+To reproduce our results, please see the folder [`reproduce`](reproduce). Due to limited computing resources, we sharded the BART into multiple GPUs and trained the model, please see [`reproduce/finetune.py`](reproduce/finetune.py) for details.
 
 
 
diff --git a/train/reproduce/data/README.md b/train/reproduce/data/README.md
new file mode 100644
index 0000000..041966c
--- /dev/null
+++ b/train/reproduce/data/README.md
@@ -0,0 +1,7 @@
+# Data
+
+The Parabank2 dataset can be found here: [https://nlp.jhu.edu/parabank/](https://nlp.jhu.edu/parabank/). Please reformat the data into a `parabank2.json` file like this if you want to use our provided script:
+```
+{"text": "1994 German Grand Prix", "summary": "1994 Grand Prix of Germany"}
+{"text": "2004 OFC Nations Cup", "summary": "2004 Ocean Cup of Nations"}
+```
\ No newline at end of file
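
Note on the `parabank2.json` format added above: the file is one JSON object per line, each with a `text` key and a `summary` key. The following is a minimal sketch of one way to produce it, not the authors' script. It assumes the downloaded data is tab-separated with the source sentence and its paraphrase in two known columns; that layout, the helper names `to_bartscore_records` and `convert`, and the `src_col`/`tgt_col` parameters are all hypothetical and should be adjusted to the actual download.

```python
import json
from typing import Iterable, Iterator


def to_bartscore_records(
    lines: Iterable[str], src_col: int = 0, tgt_col: int = 1
) -> Iterator[str]:
    """Yield one JSON string per usable row, with "text" and "summary" keys.

    Assumption: each input line is tab-separated and the source sentence and
    its paraphrase sit at src_col and tgt_col. Check the real file first.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= max(src_col, tgt_col):
            continue  # row does not contain both columns; skip it
        text = fields[src_col].strip()
        summary = fields[tgt_col].strip()
        if text and summary:
            # ensure_ascii=False keeps non-ASCII characters readable
            yield json.dumps({"text": text, "summary": summary}, ensure_ascii=False)


def convert(in_path: str, out_path: str = "parabank2.json", **columns: int) -> int:
    """Write one JSON object per line to out_path; return the number written."""
    count = 0
    with open(in_path, encoding="utf-8") as fin, \
            open(out_path, "w", encoding="utf-8") as fout:
        for record in to_bartscore_records(fin, **columns):
            fout.write(record + "\n")
            count += 1
    return count
```

If the pair sits in other columns, pass them explicitly, for example `convert("parabank2.tsv", src_col=1, tgt_col=2)`; the input file name here is likewise only an illustration.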