Aaalan-Zhang/11711-fall24-rag-for-medicine

A-MedRAG

This is the repository for our 11-711 ANLP final project, A-MedRAG: Adaptive Retrieval-Augmented Generation Framework for Medical Question Answering. In this project, we propose an adaptive RAG framework built on top of the MedRAG toolkit. Our approach aims for a better trade-off between performance and computational cost than the earlier MedRAG and i-MedRAG methods.

Instructions for Reproducibility

Environment Setup

  • Please install the required packages by running the following command:

    pip install -r requirements.txt
  • Git LFS is required to download the corpora the first time they are loaded.

  • Java is required for using BM25.
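As a quick sanity check before running anything, the two non-Python prerequisites above can be looked up on PATH. The following is a minimal sketch, not part of this repository: the helper name `find_missing_tools` is hypothetical, and it assumes Git LFS is installed as the `git-lfs` executable and Java as `java`.

```python
import shutil


def find_missing_tools(tools=("git-lfs", "java")):
    """Return the names of executables from `tools` that are not on PATH.

    The default names are assumptions about how Git LFS and Java are
    installed on this machine; adjust them if your setup differs.
    """
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("Git LFS and Java were both found on PATH.")
```

This only confirms that the executables can be found; it does not check their versions.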

Reproduce Experiment Results

To reproduce our results across different approaches (CoT, MedRAG, i-MedRAG, and our proposed adaptive RAG), please see the commands in run_rag.sh.

Example:

To run the A-MedRAG method using Llama3.1-8B on the MMLU dataset:

python MedRAG_alan/src/pipeline/pipeline.py --input_json_path "MIRAGE_alan/benchmark.json" --output_folder_path "MIRAGE_alan/prediction" --dataset_name "mmlu" --model_name "/data/models/huggingface/meta-llama/Llama-3.1-8B-Instruct" --store_desc "a-rag/meta-llama" --is_rag --is_agent
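If you want to launch the same command from a script (for example, to loop over several datasets), one option is to assemble the argument list in Python. This is a sketch under assumptions: the helper names `build_pipeline_command` and `run_pipeline` are hypothetical, and only the flags and paths shown in the command above are used. See run_rag.sh for the actual set of commands for the other methods.

```python
import shlex
import subprocess
import sys

# Script path taken from the example command above.
PIPELINE = "MedRAG_alan/src/pipeline/pipeline.py"


def build_pipeline_command(dataset_name, model_name, store_desc):
    """Assemble the A-MedRAG example command as an argument list."""
    return [
        sys.executable, PIPELINE,
        "--input_json_path", "MIRAGE_alan/benchmark.json",
        "--output_folder_path", "MIRAGE_alan/prediction",
        "--dataset_name", dataset_name,
        "--model_name", model_name,
        "--store_desc", store_desc,
        "--is_rag",
        "--is_agent",
    ]


def run_pipeline(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cmd = build_pipeline_command(
        dataset_name="mmlu",
        model_name="/data/models/huggingface/meta-llama/Llama-3.1-8B-Instruct",
        store_desc="a-rag/meta-llama",
    )
    run_pipeline(cmd)  # dry run by default; pass dry_run=False to launch
```

Passing the arguments as a list avoids shell quoting problems with paths, and the dry-run default lets you inspect the command before starting a long job.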

Evaluation

To evaluate the results generated by the above command, please see the commands in run_eval.sh. Make sure the working directory is MIRAGE_alan. Example:

To evaluate the results generated by A-MedRAG using Llama3.1-8B on the MMLU dataset:

python src/evaluate.py --results_dir ./prediction --llm_name meta-llama/Llama-3.1-8B-Instruct --agent --corpus_name Textbooks
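Because the evaluation script expects MIRAGE_alan as the working directory, a small wrapper can set that directory explicitly instead of relying on where the shell happens to be. This is a sketch, not the repository's own tooling: `build_eval_command` and `run_eval` are hypothetical names, and the flags are copied from the example command above.

```python
import subprocess
import sys
from pathlib import Path


def build_eval_command(llm_name, corpus_name, results_dir="./prediction"):
    """Assemble the evaluation example command as an argument list."""
    return [
        sys.executable, "src/evaluate.py",
        "--results_dir", results_dir,
        "--llm_name", llm_name,
        "--agent",
        "--corpus_name", corpus_name,
    ]


def run_eval(cmd, workdir="MIRAGE_alan"):
    """Run the command with `workdir` as the working directory."""
    workdir = Path(workdir)
    # Fail early with a clear message if the directory is not the expected one.
    if not (workdir / "src" / "evaluate.py").is_file():
        raise FileNotFoundError(f"src/evaluate.py not found under {workdir}")
    return subprocess.run(cmd, cwd=workdir, check=True)


if __name__ == "__main__":
    cmd = build_eval_command("meta-llama/Llama-3.1-8B-Instruct", "Textbooks")
    print(" ".join(cmd))  # call run_eval(cmd) from the repository root to execute
```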

Structure

Members

The team members are (ordered by last name, then first name):

Haojun Liu (haojunli)

Qingyang Liu (qliu3)

Chenglin Zhang (chengliz)

Contributions

All three members participated in brainstorming on the topics. Qingyang Liu submitted the proposal quiz survey on Canvas.

For Project 3, Qingyang Liu worked on the literature survey and wrote Sections 1, 2, and 3 of the report. Chenglin Zhang and Haojun Liu reproduced the results of the MedRAG paper. Chenglin Zhang wrote Sections 4 and 5 of the report (except for Section 4.4). Haojun Liu wrote Section 4.4.

For Project 4:

  • Qingyang Liu worked on reproducing baseline results and experimenting with different agent-based approaches, and wrote Sections 2, 3, and 7 of the report.
  • Chenglin Zhang worked on proof-of-concept experiments with the baseline methods and on result error analysis, and wrote Sections 1, 4, 5, 6, 7, and 8 of the report.
  • Haojun Liu worked on the major implementation of the A-MedRAG pipeline, running experiments and evaluation using A-MedRAG, and result error analysis, and wrote Section 7 of the report.
